1 Introduction

We investigate two incompressible fluids with homogeneous densities \(0<\rho _{-}<\rho _+\) under the influence of gravity, modelled by the Euler equations in the Boussinesq approximation

$$\begin{aligned} \partial _t v +{{\,\mathrm{div}\,}}(v\otimes v) +\nabla p&=-\rho gA e_n,\nonumber \\ {{\,\mathrm{div}\,}}v&=0,\nonumber \\ \partial _t \rho + {{\,\mathrm{div}\,}}(\rho v)&=0. \end{aligned}$$
(1.1)

The equations are considered on a bounded domain \(\Omega \subset {\mathbb {R}}^n\) and a time interval [0, T), \(T>0\). The function \(\rho :\Omega \times [0,T)\rightarrow {\mathbb {R}}\) is the normalized fluid density, i.e. \(\rho \in \{\pm 1\}\) a.e., \(v:\Omega \times [0,T)\rightarrow {\mathbb {R}}^n\) is the velocity field and \(p:\Omega \times [0,T)\rightarrow {\mathbb {R}}\) the pressure of the fluid. Furthermore, \(e_n\in {\mathbb {R}}^n\) denotes the nth coordinate vector, \(g>0\) the gravitational constant and

$$\begin{aligned} A:=\frac{\rho _+-\rho _{-}}{\rho _++\rho _{-}} \end{aligned}$$

is the Atwood number. The incompressibility condition is complemented by the no-penetration boundary condition

$$\begin{aligned} v\cdot \nu =0 \quad {\text {on }} \partial \Omega \times [0,T), \end{aligned}$$
(1.2)

where \(\nu \) denotes the exterior unit normal of the boundary of \(\Omega \), which is assumed to be sufficiently smooth. We will mostly consider (1.1), (1.2) with the unstable interface as initial data, i.e.,

$$\begin{aligned} \rho (x,0)={{\,\mathrm{sign}\,}}(x_n),\quad v(x,0)=0,\quad x\in \Omega . \end{aligned}$$
(1.3)

This initial data is a classical instance of the Rayleigh–Taylor instability which occurs whenever a lighter fluid or gas is accelerated into a heavier one—a situation which appears in various research areas and applications, see [1, 2, 39, 40] for an overview. Its linear (in)stability analysis goes back to Rayleigh [32] and Taylor [36].

1.1 General aspects of the Boussinesq system

System (1.1) arises from the actual inhomogeneous incompressible Euler equations

$$\begin{aligned} \partial _t (\tilde{\rho } v) +{{\,\mathrm{div}\,}}(\tilde{\rho }v\otimes v) +\nabla \tilde{p}&=-\tilde{\rho } g e_n,\nonumber \\ {{\,\mathrm{div}\,}}v&=0,\nonumber \\ \partial _t \tilde{\rho } + {{\,\mathrm{div}\,}}(\tilde{\rho } v)&=0, \end{aligned}$$
(1.4)

via the normalization \(\tilde{\rho }=\frac{1}{2}(\rho _++\rho _{-}) +\frac{1}{2}(\rho _+-\rho _{-})\rho \), such that \(\rho \in \left\{ \,{\pm 1}\,\right\} \), and by the Boussinesq approximation, i.e., by neglecting the density difference \(\rho _+-\rho _{-}\) in the acceleration term on the left-hand side of the first equation. The Boussinesq approximation therefore is only applicable in the regime of small Atwood number \(A\ll 1\).
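As a hedged sketch of this reduction (the mean density \(\bar{\rho }\) and the explicit formula for \(p\) below are our notation and are not taken from the text), write \(\bar{\rho }:=\frac{1}{2}(\rho _++\rho _{-})\), so that the normalization reads \(\tilde{\rho }=\bar{\rho }(1+A\rho )\) and \(\rho =\pm 1\) corresponds to \(\tilde{\rho }=\rho _\pm \). Replacing \(\tilde{\rho }\) by \(\bar{\rho }\) in the acceleration terms of (1.4) and dividing by \(\bar{\rho }\) gives

$$\begin{aligned} \partial _t v+\mathrm{div}\,(v\otimes v)+\frac{1}{\bar{\rho }}\nabla \tilde{p}&=-(1+A\rho )g e_n,\\ \text {i.e.}\quad \partial _t v+\mathrm{div}\,(v\otimes v)+\nabla \left( \frac{\tilde{p}}{\bar{\rho }}+gx_n\right)&=-\rho gA e_n, \end{aligned}$$

which is the first equation of (1.1) with \(p:=\frac{\tilde{p}}{\bar{\rho }}+gx_n\). Likewise \(\partial _t\tilde{\rho }+\mathrm{div}\,(\tilde{\rho }v)=\bar{\rho }A\left( \partial _t\rho +\mathrm{div}\,(\rho v)\right) +\bar{\rho }\,\mathrm{div}\,v\), so under \(\mathrm{div}\,v=0\) the continuity equation for \(\tilde{\rho }\) is equivalent to the one for \(\rho \).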

In the present paper we consider the inviscid and indiffusive Boussinesq system (1.1). More generally one can also add different sorts of diffusion terms in the momentum balance and/or the mass balance. In dimension 2 global well-posedness results of sufficiently regular solutions and for different types of diffusion terms have been established in [7, 15, 22, 24], while finite time singularity formation for equation (1.1), i.e. without any diffusive terms, has recently been shown in [18].

Local well-posedness statements for (1.1) considered in different sufficiently regular classes can be found in [8, 14, 18]. Note however that the horizontal interface (1.3) does not belong to these classes. On the other hand if one adds the diffusion terms \(-\nu \Delta v\), \(\nu >0\) and \(-\mu \Delta \rho \), \(\mu >0\) to the equations, local well-posedness has also been established for \(L^p\) initial data [4].

Contrary to local well-posedness the articles [3, 10] address the question of non-uniqueness of solutions. More precisely, [3] shows the existence of wild solutions for (1.1) considered with \(g=0\), i.e. for the homogeneous Euler equations augmented by a transport equation for a passive tracer. In [10] the effect of the Coriolis force and a diffusion term in the continuity equation is added to (1.1) and non-uniqueness of weak solutions to given initial data is proven. Moreover, the non-uniqueness of admissible weak solutions is also shown in [10] by construction of a suitable initial velocity field \(v_0\) to a given initial density \(\rho _0\in L^\infty (\Omega )\cap {\mathcal {C}}^2(\Omega )\). The non-uniqueness results [3, 10] both rely on the method of convex integration introduced by De Lellis and Székelyhidi to the context of fluid dynamics [16, 17].

1.2 Heuristic outline of results

In this article we also address the inviscid Boussinesq system by means of convex integration, but our main goal here, as in [21] for the Euler equations (1.4), is to investigate nonlinear instability aspects of the Rayleigh–Taylor configuration (1.3) by providing the existence of solutions to the problem (1.1), (1.2), (1.3) that reflect a turbulent mixing. The oscillatory behaviour of solutions obtained by convex integration has also been utilized as an instance of turbulent mixing in the context of the Kelvin-Helmholtz instability [27, 34] and the Muskat problem for the incompressible porous media equation [5, 6, 11, 20, 23, 26, 29, 33]. Compared to [3, 10] this requires the explicit knowledge of the relaxation associated with (1.1), which will be given here.

Using this relaxation we construct solutions to (1.1), (1.2), (1.3) considered on an n-dimensional box \(\Omega =(0,1)^{n-1}\times (-L,L)\), \(n\ge 2\), which at any time \(t>0\) with \(\frac{1}{3}gAt^2\le L\) are turbulently mixing in the space region \(\mathscr {U}(t):=\left\{ \,{x\in \Omega :\left| {x_n}\right| <\frac{1}{3}gAt^2}\,\right\} \). The solutions all have an underlying self-similar subsolution in common, whose \(\rho \)-component is linear inside the mixing zone, i.e. \(\rho _{sub}(x,t)=\frac{3x_n}{gAt^2}\) for \(x\in \mathscr {U}(t)\).

The subsolution and hence the growth rate of the mixing zone \(\frac{1}{3}gAt^2\) is selected uniquely and independently of the dimension by asking for maximal initial energy dissipation among all self-similar subsolutions. In particular the induced solutions are admissible with respect to the initial energy. The usage of maximal energy dissipation is motivated by the entropy rate admissibility criterion for hyperbolic conservation laws [13] and has been investigated in [27, 34] in the context of Euler subsolutions emanating from vortex-sheet initial data, as well as in [9, 19] for compressible Euler systems. As in [27] we focus here on maximal initial dissipation, i.e. the selection applies to small times.

Beyond small times, we show that the subsolutions can be extended in an admissible way past the time where the mixing zone hits the boundary. In fact we provide two possible extensions which in the long-time limit converge to two different states of the fluid: the fully mixed, isotropic state without any turbulent motion, in which the two different fluids can be seen as forming a single homogeneous fluid at rest, and the demixed stationary configuration, in which the two fluids are also at rest but completely separated, with the heavier fluid below the lighter. In other words this shows the existence of turbulent heteroclinic solutions emanating from the unstable interface configuration.

1.3 Brief comparison to experiments

There are numerous works addressing the Rayleigh–Taylor instability in different settings by means of experiments, numerical simulations and theoretical investigations for reduced models. For further reading we simply refer to the references given in the reviews [1, 2, 39, 40]. At this point we would only like to briefly compare our solutions for (1.1) with the results of the experiments carried out in [31] at Atwood number \(A\sim 7.5\cdot 10^{-4}\).

First of all both the experiments and our solutions have a growth rate for the mixing zone like \(\alpha gAt^2\). While the criterion of maximal initial energy dissipation selects \(\alpha =\frac{1}{3}\) for our solutions, the actual constant observed in [31] is \(\alpha =0.07\). Concerning self-similarity it is written in [31]: “The saturation of \(\alpha \) at late time to a constant value of 0.07 suggests that the flow reaches self-similarity in these experiments.” Another quantity we can easily compare is the ratio between dissipated energy and released potential energy. We will see in Sect. 4 (Remark 4.2) that up to an arbitrarily small error our solutions show a ratio of \(\frac{D}{P_{rel.}}=\frac{1}{3}\), while the ratio measured in [31] in dimension 3 is \(\frac{D}{P_{rel.}}=0.49\).

We conclude that the solutions selected by maximal initial energy dissipation stand the comparison to actual experiments on a qualitative level, but the interesting question remains whether the gap between \(\alpha =\frac{1}{3}\) vs. \(\alpha =0.07\) and \(\frac{D}{P_{rel.}}=\frac{1}{3}\) vs. \(\frac{D}{P_{rel.}}=0.49\) can be narrowed in the future. In particular, it would be interesting to see whether the measured values correspond perhaps to the optimization of some other mathematical quantity (other than the initial energy dissipation).

1.4 The role of the energy as a prescribed quantity

As in some other previous works of convex integration in fluid mechanics (e.g. [16, 17, 21]), there is a microscopic quantity which one has to prescribe in a continuous way in order to implement the convex integration. For instance, in the case of the homogeneous density incompressible Euler equations, this quantity was the kinetic energy \(\frac{1}{2}|v|^2\); respectively in the case of the inhomogeneous incompressible Euler equations in [21] it was the quantity \(\frac{1}{2}\rho |v+gt e_n|^2\), which corresponded to the kinetic energy of a transformed system, and which can be seen as the kinetic energy of the original system plus a linear function of the momentum and the density. In our case the appropriate prescribed quantity which gives rise to the subsolutions mentioned before is \(\frac{1}{2}|v|^2+\frac{1}{3}\rho gAx_n\), i.e. the kinetic energy plus a fraction of the potential energy of the system.

In fact, both our convex integration strategy and the construction of our subsolutions from Sect. 4 can be carried out while prescribing the quantity \(\frac{1}{2}|v|^2+\epsilon \rho gAx_n\), for any \(\epsilon \in [0,1]\). However, setting for instance \(\epsilon =1\), i.e. prescribing the total energy of the system, leads to solutions which are not admissible. The value \(\epsilon =\frac{1}{3}\) is obtained by the process of maximizing the initial energy dissipation.

For further details, see the discussion in Sects. 2.1, 2.2 and the construction in Sect. 4.

1.5 Comparison to [21]

In [21] the authors together with L. Székelyhidi have addressed the Rayleigh–Taylor instability for the inhomogeneous incompressible Euler equations (1.4). In the aforementioned paper we obtained the existence of admissible turbulently mixing solutions for a sufficiently high density ratio \(\frac{\rho _+}{\rho _{-}}\), which translates to the Atwood number A being in the so-called “ultra high” range, \(A\ge 0.845\), i.e. far away from the Boussinesq range.

The proof there also relied on the explicit computation of the relaxation and convex integration within the Tartar framework. The computations for the convex hull in Sect. 3.3 resemble the computations done in [21]. While in the aforementioned paper a transformation of (1.4) onto an accelerated domain could be used in order to fit the system exactly into the Tartar framework (by which we mean that the gravity term in the momentum equation disappeared), here we can no longer use this transformation due to the Boussinesq approximation, and instead construct localized plane waves for an inhomogeneous linear system, see Sect. 3.1.

However, the main difference to [21] is the way subsolutions are constructed and selected. In [21] we reduced the relaxed system to a conservation law, which has some similarities to the conservation law appearing in [30] in a different approach to relax the incompressible porous media equation, and picked the unique entropy solution as our subsolution density profile. The subsolution found this way is self-similar and admissible for big enough A. Here instead, we consider the whole zoo of self-similar subsolutions and set up a variational problem whose unique minimizer corresponds to the subsolution maximizing the initial energy dissipation.

As illustrated in the previous subsection, contrary to [21] strictly speaking we do not provide one relaxation of the nonlinear system, but several relaxations, which differ in the amount of allowed turbulent behaviour in the local energy density, see Sect. 2.2.

The discussion of possible long time limits in Sect. 4.3 for the subsolution is not part of [21].

1.6 Outline of the paper

In Sect. 2 we formulate our results concerning the relaxation of (1.1) and the investigation of subsolutions in a precise way. Section 3 contains the steps needed to carry out convex integration in the Tartar framework and Sect. 4 contains the construction and selection of self-similar subsolutions.

2 Statement of results

Let \(\Omega \subset {\mathbb {R}}^n\) be a bounded domain and \(T>0\). Our notion of solution to system (1.1), (1.2) on \(\Omega \times [0,T)\) for general initial data \(\rho (\cdot ,0)=\rho _0\), \(v(\cdot ,0)=v_0\) with

$$\begin{aligned} \rho _0\in L^\infty (\Omega ),\quad \rho _0\in \{\pm 1\}~a.e., \quad v_0\in L^2(\Omega ;{\mathbb {R}}^n),\quad {{\,\mathrm{div}\,}}v_0=0\; {\text { weakly}},\nonumber \\ \end{aligned}$$
(2.1)

is as follows.

Definition 2.1

(Weak solutions) Let \((\rho _0,v_0)\) be as in (2.1). We say that \((\rho ,v)\in L^\infty (\Omega \times (0,T))\times L^2(\Omega \times (0,T);{\mathbb {R}}^n)\) is a weak solution to (1.1), (1.2) with initial data \((\rho _0,v_0)\) if for any test functions \(\Phi \in C^\infty _c(\Omega \times [0,T);{\mathbb {R}}^n) \), \(\Psi \in C^\infty _c(\overline{\Omega }\times [0,T)) \), such that \(\Phi \) is divergence-free, we have

$$\begin{aligned} \int _0^T\int _{\Omega } \left[ v \cdot \partial _t \Phi + \langle v\otimes v,\nabla \Phi \rangle - gA \rho \Phi _n \right] \ dx \ dt +\int _{\Omega }v_0 (x)\cdot \Phi (x,0)\ dx=0,\\ \int _0^T\int _\Omega v\cdot \nabla \Psi \,dx\,dt=0,\\ \int _0^T\int _{\Omega } \left[ \rho \partial _t \Psi + \rho v\cdot \nabla \Psi \right] \ dx \ dt +\int _{\Omega } \rho _0 (x) \Psi (x,0)\ dx=0, \end{aligned}$$

and if \(\rho (x,t)\in \left\{ \,{\pm 1}\,\right\} \) for a.e. \((x,t)\in \Omega \times (0,T)\).

Observe that the definition of v being weakly divergence-free includes the no-flux boundary condition. Moreover, for a smooth vector field v the condition \(\rho \in \{\pm 1\}\) automatically holds true, because then the density is transported along the flow associated with v, but for weaker notions of solutions this property in general is lost, see for example [28]. Furthermore, a (in general distributional) pressure p can be recovered from \((\rho ,v)\) as in the case of the homogeneous Euler equations, see [37].

The local energy density function \({\mathcal {E}}\in L^1(\Omega \times (0,T))\) associated with a weak solution \((\rho ,v)\) reads

$$\begin{aligned} {\mathcal {E}}(x,t):=\frac{1}{2}\left| {v(x,t)}\right| ^2+\rho (x,t)gAx_n. \end{aligned}$$
(2.2)

Indeed, testing a sufficiently smooth solution of (1.1) with v one sees that the total energy \(\int _{\Omega }{\mathcal {E}}(x,t)\,dx\) is independent of t. However, this property in general fails to be true for weak solutions of Euler type equations, see [16] for Euler and [10] for the Boussinesq system. In order to rule out unphysical solutions due to an increase in energy and in view of the weak-strong uniqueness principle in various equations in fluid dynamics [38] we require the solutions to satisfy the following admissibility condition.

Definition 2.2

(Admissible weak solutions) A weak solution \((\rho ,v)\) in the sense of Definition 2.1 is called admissible provided it satisfies the weak energy inequality

$$\begin{aligned} \int _\Omega {\mathcal {E}}(x,t)\,dx\le \int _\Omega \frac{1}{2}\left| {v_0(x)}\right| ^2+\rho _0(x)gAx_n\,dx \quad {\text { for a.e. }}\; t\in (0,T). \end{aligned}$$
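The conservation of the total energy for smooth solutions, mentioned before Definition 2.2, can be sketched as follows; this is only the formal computation, assuming that \((\rho ,v,p)\) is smooth up to the boundary, so that all integrations by parts are justified. Using \({{\,\mathrm{div}\,}}v=0\) and \(v\cdot \nu =0\), the terms \(\int _\Omega v\cdot {{\,\mathrm{div}\,}}(v\otimes v)\,dx=\int _\Omega v\cdot \nabla \frac{1}{2}\left| {v}\right| ^2\,dx\) and \(\int _\Omega v\cdot \nabla p\,dx\) vanish, and therefore

$$\begin{aligned} \frac{d}{dt}\int _\Omega \frac{1}{2}\left| {v}\right| ^2\,dx&=-gA\int _\Omega \rho v_n\,dx,\\ \frac{d}{dt}\int _\Omega \rho gAx_n\,dx&=-gA\int _\Omega x_n\,{{\,\mathrm{div}\,}}(\rho v)\,dx=gA\int _\Omega \rho v\cdot \nabla x_n\,dx=gA\int _\Omega \rho v_n\,dx. \end{aligned}$$

Adding both lines gives \(\frac{d}{dt}\int _\Omega {\mathcal {E}}(x,t)\,dx=0\), so the inequality in Definition 2.2 holds with equality in the smooth case.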

2.1 The relaxation

Next we will reformulate equation (1.1) as a differential inclusion and state its relaxation. Let \({\mathcal {S}}^{n\times n}\) be the set of all symmetric \(n\times n\) matrices, \({\mathcal {S}}_0^{n\times n}\subset {\mathcal {S}}^{n\times n}\) the subset of matrices with vanishing trace and \({{\,\mathrm{id}\,}}\in {\mathcal {S}}^{n\times n}\) be the identity matrix. We also write \(\lambda _{\mathrm{max}}(S),\lambda _{\mathrm{min}}(S)\) for the maximal, minimal resp., eigenvalue of \(S\in {\mathcal {S}}^{n \times n}\), and the trace free part of S is denoted by \(S^\circ :=S-\frac{1}{n}{{\,\mathrm{tr}\,}}(S){{\,\mathrm{id}\,}}\).

Consider on \(\Omega \times (0,T)\) the linear system

$$\begin{aligned} \partial _tv+{{\,\mathrm{div}\,}}\sigma +\nabla p&=-\rho gA e_n,\nonumber \\ {{\,\mathrm{div}\,}}v&=0,\nonumber \\ \partial _t\rho +{{\,\mathrm{div}\,}}m&=0, \end{aligned}$$
(2.3)

complemented with the boundary conditions

$$\begin{aligned} v\cdot \nu =0,\quad m\cdot \nu =0\quad {\text {on }} \partial \Omega \times (0,T), \end{aligned}$$
(2.4)

for \(z:=(\rho ,v,m,\sigma ,p)\) taking values in \(Z:={\mathbb {R}}\times {\mathbb {R}}^n\times {\mathbb {R}}^n\times {\mathcal {S}}_0^{n\times n}\times {\mathbb {R}}\), and define

$$\begin{aligned} K_{(x,t)}:=\left\{ \,{z\in Z:\rho \in \{\pm 1\},~m=\rho v, ~v\otimes v-\sigma =e(x,t)[\rho ]{{\,\mathrm{id}\,}}}\,\right\} \end{aligned}$$
(2.5)

for a given function \(e:\Omega \times (0,T)\times {\mathbb {R}}\rightarrow {\mathbb {R}}\), \((x,t,r)\mapsto e(x,t)[r]\), which is affine linear in r. A brief discussion on possible choices of e and some general constraints can be found in Sect. 2.2 below.

Now if \(z:\Omega \times (0,T)\rightarrow Z\) is a weak solution of (2.3), (2.4) to some initial data \((\rho _0,v_0)\) as in (2.1), see Definition 2.3 below for the precise definition, and if for almost every \((x,t)\in \Omega \times (0,T)\) there holds \(z(x,t)\in K_{(x,t)}\), then \((\rho ,v)\) defines a solution to the original equation (1.1) in the sense of Definition 2.1 for the same initial data and with energy density function given by

$$\begin{aligned} {\mathcal {E}}(x,t)=\frac{n}{2}e(x,t)[\rho (x,t)]+\rho (x,t)gAx_n. \end{aligned}$$

Conversely, if \((\rho ,v)\) with associated pressure p is a weak solution in the sense of Definition 2.1, then \(z=\left( \rho ,v,\rho v,(v\otimes v)^\circ ,p+\frac{1}{n}\left| {v}\right| ^2\right) \) is a weak solution of (2.3), (2.4) and z pointwise a.e. takes values in the set \(K_{(x,t)}\) defined with respect to the function \(e(x,t)[r]=\frac{1}{n}\left| {v(x,t)}\right| ^2\).
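Both directions rest on a short trace computation, which we sketch here under the stated definitions only (the name \(p_{lin}\) for the pressure component of z is introduced just for this remark). If \(z\in K_{(x,t)}\), then taking the trace of \(v\otimes v-\sigma =e(x,t)[\rho ]{{\,\mathrm{id}\,}}\) and using \({{\,\mathrm{tr}\,}}\sigma =0\) yields

$$\begin{aligned} \left| {v}\right| ^2=n\,e(x,t)[\rho ],\qquad \sigma =v\otimes v-\frac{1}{n}\left| {v}\right| ^2{{\,\mathrm{id}\,}}=(v\otimes v)^\circ ,\\ {{\,\mathrm{div}\,}}\sigma +\nabla p_{lin}={{\,\mathrm{div}\,}}(v\otimes v)+\nabla \left( p_{lin}-\frac{1}{n}\left| {v}\right| ^2\right) . \end{aligned}$$

Hence \(\frac{1}{2}\left| {v}\right| ^2=\frac{n}{2}e(x,t)[\rho ]\), which explains the stated form of \({\mathcal {E}}\), and the pressure of (1.1) differs from \(p_{lin}\) by \(\frac{1}{n}\left| {v}\right| ^2\), in agreement with the last component of z in the converse statement. Together with \(m=\rho v\) the third equation of (2.3) becomes the continuity equation of (1.1).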

For the relaxation of (2.3), (2.5) let \(Z_0:=\left\{ \,{z\in Z:\rho \in (-1,1)}\,\right\} \), as well as \(T_+,T_{-},Q:Z_0\rightarrow {\mathbb {R}}\), \(M:Z_0\rightarrow {\mathcal {S}}^{n \times n}\),

$$\begin{aligned} M(z)=\frac{v\otimes v-\rho (m\otimes v+v\otimes m) +m\otimes m}{1-\rho ^2}-\sigma ,\nonumber \\ Q(z)=\lambda _{\mathrm{max}}(M(z)),\quad T_{\pm }(z) =\frac{\left| {m\pm v}\right| ^2}{n(\rho \pm 1)^2}, \end{aligned}$$
(2.6)

and define for \((x,t)\in \Omega \times (0,T)\) the open set

$$\begin{aligned} U_{(x,t)}:=\left\{ \,{z\in Z:\rho \in (-1,1),~T_\pm (z)<e(x,t)[\pm 1],~Q(z)<e(x,t)[\rho ]}\,\right\} . \end{aligned}$$
(2.7)

In the course of the article we will show that \(U_{(x,t)}\) is the interior of the convex hull of \(K_{(x,t)}\). That in particular means that if \((\rho _k,v_k)_{k\in {\mathbb {N}}}\) is a sequence of weak solutions with \(v_k\in L^\infty (\Omega \times (0,T);{\mathbb {R}}^n)\) and such that the following convergences hold true \((\rho _k,v_k,\rho _kv_k,(v_k\otimes v_k)^\circ )\overset{*}{\rightharpoonup }(\rho ,v,m,\sigma )\) in \(L^\infty (\Omega \times (0,T);{\mathbb {R}}\times {\mathbb {R}}^n\times {\mathbb {R}}^n\times {\mathcal {S}}_0^{n\times n})\) and \(\frac{1}{n}\left| {v_k}\right| ^2\rightarrow e\) in \(L^\infty (\Omega \times (0,T))\), then there exists a pressure p, such that \((\rho ,v,m,\sigma ,p)\) is a weak solution of (2.3), while pointwise a.e. taking values in \(\overline{U}_{(x,t)}\), where \(U_{(x,t)}\) is defined with respect to e.
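As a consistency check for (2.6), (2.7), and not as a substitute for the proof, one can evaluate \(T_\pm \) and \(M\) on a convex combination of two states of \(K_{(x,t)}\); the weights \(\alpha ,\beta \) and the states \(z_\pm \) are notation introduced only for this example. Let \(z_\pm \in K_{(x,t)}\) have densities \(\pm 1\), velocities \(v_\pm \) and stresses \(\sigma _\pm \), so that \(m_\pm =\pm v_\pm \) and \(v_\pm \otimes v_\pm -\sigma _\pm =e(x,t)[\pm 1]{{\,\mathrm{id}\,}}\), and let \(z=\alpha z_++\beta z_{-}\) with \(\alpha ,\beta \in (0,1)\), \(\alpha +\beta =1\). Then \(\rho =\alpha -\beta \), \(1+\rho =2\alpha \), \(1-\rho =2\beta \), \(v=\alpha v_++\beta v_{-}\), \(m=\alpha v_+-\beta v_{-}\), and a direct computation gives

$$\begin{aligned} T_+(z)&=\frac{\left| {2\alpha v_+}\right| ^2}{n(2\alpha )^2}=\frac{1}{n}\left| {v_+}\right| ^2=e(x,t)[1],\qquad T_{-}(z)=\frac{\left| {2\beta v_{-}}\right| ^2}{n(2\beta )^2}=e(x,t)[-1],\\ v\otimes v-\rho (m\otimes v+v\otimes m)+m\otimes m&=4\alpha \beta \left( \alpha \, v_+\otimes v_++\beta \, v_{-}\otimes v_{-}\right) ,\\ M(z)&=\alpha (v_+\otimes v_+-\sigma _+)+\beta (v_{-}\otimes v_{-}-\sigma _{-})=e(x,t)[\rho ]{{\,\mathrm{id}\,}}. \end{aligned}$$

In the last step we used \(1-\rho ^2=4\alpha \beta \) and that e is affine in r. Thus \(Q(z)=e(x,t)[\rho ]\), and all inequalities in (2.7) hold with equality: such two-state mixtures lie in \(\overline{U}_{(x,t)}\) but not in the open set \(U_{(x,t)}\) itself.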

With the help of the linear system (2.3) and the sets (2.7) we are ready to formulate the notion of subsolutions to (1.1), as well as our general convex integration result. Doing this the following projection turns out to be convenient: for \(z=(\rho ,v,m,\sigma ,p)\in Z\) let

$$\begin{aligned} \pi (z):=(\rho ,v,m,\sigma )\in {\mathbb {R}}\times {\mathbb {R}}^n\times {\mathbb {R}}^n \times {\mathcal {S}}_0^{n \times n}. \end{aligned}$$
(2.8)

Definition 2.3

(Subsolutions) Let \(e:\Omega \times (0,T)\times [-1,1]\rightarrow {\mathbb {R}}\) be bounded and affine linear in the last component. We say that \(z=(\rho ,v,m,\sigma ,p):\Omega \times (0,T)\rightarrow Z\) is a subsolution of (1.1) associated with e and initial data \((\rho _0,v_0)\) as in (2.1) if and only if \(\pi (z)\in L^\infty (\Omega \times (0,T);\pi (Z))\), p is a distribution, z solves (2.3), (2.4) in the sense that v is weakly divergence-free (as in Definition 2.1),

$$\begin{aligned} \int _0^T\int _{\Omega } \left[ v \cdot \partial _t \Phi +\langle \sigma ,\nabla \Phi \rangle - gA \rho \Phi _n \right] \ dx \ dt +\int _{\Omega }v_0 (x)\cdot \Phi (x,0)\ dx=0,\\ \int _0^T\int _{\Omega } \left[ \rho \partial _t \Psi + m\cdot \nabla \Psi \right] \ dx \ dt +\int _{\Omega } \rho _0 (x) \Psi (x,0)\ dx=0, \end{aligned}$$

for any test functions \(\Phi \in C^\infty _c(\Omega \times [0,T);{\mathbb {R}}^n)\), \({{\,\mathrm{div}\,}}\Phi =0\), \(\Psi \in C^\infty _c(\overline{\Omega }\times [0,T))\), and if there exists an open set \(\mathscr {U}\subset \Omega \times (0,T)\), such that the two restricted maps \(\mathscr {U}\ni (x,t)\mapsto \pi (z(x,t))\in \pi (Z)\) and \(\mathscr {U}\times {\mathbb {R}}\ni (x,t,r)\mapsto e(x,t)[r]\in {\mathbb {R}}\) are continuous, and if there holds \(z(x,t)\in U_{(x,t)}\) for all \((x,t)\in \mathscr {U}\), as well as \(z(x,t)\in K_{(x,t)}\) for a.e. \((x,t)\in \Omega \times (0,T){\setminus }\mathscr {U}\). The open set \(\mathscr {U}\) is called the mixing zone of z, and in analogy to solutions we call the subsolution admissible provided

$$\begin{aligned} {\mathcal {E}}_{sub}(x,t):=\frac{n}{2}e(x,t)[\rho (x,t)]+\rho (x,t) gAx_n \end{aligned}$$
(2.9)

satisfies

$$\begin{aligned} \int _\Omega {\mathcal {E}}_{sub}(x,t)\, dx \le \int _\Omega \frac{1}{2}\left| {v_0(x)}\right| ^2 +\rho _0(x)gAx_n\,dx \quad {\text { for a.e. }} t\in (0,T). \end{aligned}$$
(2.10)

Before formulating our convex integration theorem we would like to point out the following observation, which follows from Lemma 8 in [17].

Remark 2.4

Without loss of generality the \(\rho \)-component of any subsolution or solution is contained in \({\mathcal {C}}^0([0,T];L^2_w(\Omega ))\). That is, for any \(w\in L^2(\Omega )\) the function \([0,T]\ni t\mapsto \int _\Omega \rho (x,t)w(x)\,dx\in {\mathbb {R}}\) is continuous.

More precisely, [17, Lemma 8] gives \(\rho \in {\mathcal {C}}^0((0,T);L^2_w(\Omega ))\), but looking into the proof one sees that the functions in [17, equation (90)] can be uniquely extended to \({\mathcal {C}}^0([0,T])\).

Observe also that outside the mixing zone \(\mathscr {U}\) the components \((\rho ,v)\) of a subsolution z already solve the Euler-Boussinesq equation (1.1).

Theorem 2.5

Let \(z=(\rho ,v,m,\sigma ,p)\) be a subsolution associated with e and initial data \((\rho _0,v_0)\) satisfying (2.1), where \(e:\Omega \times (0,T)\times [-1,1]\rightarrow {\mathbb {R}}\) is given by

$$\begin{aligned} e(x,t)[r]=e_0(x,t)+re_1(x,t) \end{aligned}$$

with \(e_0\in L^\infty (\Omega \times (0,T))\), \(e_1\in L^\infty (\Omega \times (0,T))\cap {\mathcal {C}}^0([0,T];L^2(\Omega ))\). Then for any error function \(\delta :[0,T]\rightarrow {\mathbb {R}}\), \(\delta (0)=0\), \(\delta (t)>0\), \(t>0\) there exist infinitely many weak solutions \((\rho _{sol},v_{sol})\) of (1.1), (1.2) with initial data \((\rho _0,v_0)\) having the properties

  a) \((\rho _{sol},v_{sol})=(\rho ,v)\) a.e. on \(\Omega \times (0,T){\setminus }\mathscr {U}\),

  b) the local energy density defined in (2.2) for a.e. \((x,t)\in \Omega \times (0,T)\) is given by

    $$\begin{aligned} {\mathcal {E}}_{sol}(x,t)=\frac{n}{2}e(x,t)[\rho _{sol}(x,t)]+\rho _{sol}(x,t)gAx_n, \end{aligned}$$

  c) for any \(t\in [0,T]\) there holds

    $$\begin{aligned} \left| {\int _\Omega \left( \frac{n}{2}e_1(x,t)+gAx_n\right) (\rho (x,t)-\rho _{sol}(x,t)) \,dx}\right| <\delta (t), \end{aligned}$$

  d) for any \(t\in (0,T)\) and any open ball \(B\subset \Omega \) with \(B\times \{t\}\subset \mathscr {U}\) there holds

    $$\begin{aligned} \int _B (1-\rho _{sol}(x,t))\,dx\int _B (1+\rho _{sol}(x,t))\,dx> 0. \end{aligned}$$

Moreover, among these solutions one can find a sequence \((\rho _k,v_k)\), \(k\in {\mathbb {N}}\), such that \(\rho _k\rightarrow \rho \) in \({\mathcal {C}}^0([0,T];L^2_w(\Omega ))\) and \(v_k\rightharpoonup v\) in \(L^2(\Omega \times (0,T))\).

Remark 2.6

(Admissibility) Observe that by Remark 2.4 and the assumption on \(e_1\) the integral on the left-hand side in Thm. 2.5 c) defines a continuous function on [0, T]. Moreover, for a.e. \(t\in (0,T)\) the energy difference between the subsolution and the solutions is precisely given by this term, i.e.

$$\begin{aligned} \int _\Omega {\mathcal {E}}_{sub}(x,t)-{\mathcal {E}}_{sol}(x,t)\,dx=\int _\Omega \left( \frac{n}{2}e_1(x,t)+gAx_n\right) (\rho (x,t)-\rho _{sol}(x,t))\,dx \end{aligned}$$

for a.e. \(t\in (0,T)\). In particular, if the subsolution is admissible with strict inequality in (2.10) for a.e. \(t\in (0,T)\), then by a suitable choice of error function \(\delta (t)\) one sees that property c) implies the admissibility of the induced solutions \((\rho _{sol},v_{sol})\) in the sense of Definition 2.2.

However, one should note that \(\int _\Omega {\mathcal {E}}_{sol}(x,t)\,dx\) and \(\int _\Omega {\mathcal {E}}_{sub}(x,t)\,dx\) are only defined for a.e. \(t\in (0,T)\) due to \(e_0\) only being an \(L^\infty \) function. Hence, if one is interested in the notion of admissibility at every time, by which we mean that the inequality in Definition 2.2 holds for all \(t\in [0,T]\) instead of a.e. \(t\in (0,T)\), one can upgrade \(e_0\) to be continuous and use the convex integration strategy from [6, 17] based on a “shifted grid”, which is not used here.
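The energy identity stated in Remark 2.6 can be checked directly from (2.9), Theorem 2.5 b) and the affine form \(e(x,t)[r]=e_0(x,t)+re_1(x,t)\); the following lines are only a sketch of this computation, with \(E_0:=\int _\Omega \frac{1}{2}\left| {v_0}\right| ^2+\rho _0gAx_n\,dx\) as shorthand introduced here. Pointwise a.e. one has

$$\begin{aligned} {\mathcal {E}}_{sub}-{\mathcal {E}}_{sol}&=\frac{n}{2}\left( e(x,t)[\rho ]-e(x,t)[\rho _{sol}]\right) +(\rho -\rho _{sol})gAx_n\\&=\left( \frac{n}{2}e_1(x,t)+gAx_n\right) (\rho -\rho _{sol}), \end{aligned}$$

since the \(e_0\)-terms cancel. Integrating over \(\Omega \) and using property c) gives \(\int _\Omega {\mathcal {E}}_{sol}\,dx<\int _\Omega {\mathcal {E}}_{sub}\,dx+\delta (t)\) for a.e. t. Hence any error function with \(\delta (t)\le E_0-\int _\Omega {\mathcal {E}}_{sub}(x,t)\,dx\) for a.e. \(t\in (0,T)\), whenever such a choice is compatible with the requirements on \(\delta \) in Theorem 2.5, yields \(\int _\Omega {\mathcal {E}}_{sol}\,dx\le E_0\).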

Remark 2.7

(Mixing) The convergence \(\rho _k\rightarrow \rho \) in \({\mathcal {C}}^0([0,T];L^2_w(\Omega ))\) means that for any \(w\in L^2(\Omega )\) there holds

$$\begin{aligned} \sup _{t\in [0,T]}\left| {\int _{\Omega }(\rho _k(x,t)-\rho (x,t))w(x)\,dx}\right| \rightarrow 0. \end{aligned}$$

In that sense at every \(t\in [0,T]\) the subsolution density \(\rho (\cdot ,t)\) can be seen as a coarse grained or averaged density of the induced solutions \(\rho _{sol}(\cdot ,t)\), whose turbulent nature is illustrated by property d), i.e. by the mixing at every time slice.

The proof of Theorem 2.5 will be carried out in Sect. 3 and is based on the convex integration methods introduced by De Lellis and Székelyhidi in [16, 17] and their refinements in [6, 12]. In particular looking at [6] one could in addition also add the “linearly degraded macroscopic behaviour” to the list of properties of the solutions in Theorem 2.5.

2.2 Choices for \(e(x,t)[r]\)

In order to have inside the mixing zone \(\mathscr {U}\) of a subsolution a non-empty interior of the convex hull \(U_{(x,t)}\) we need

$$\begin{aligned} e(x,t)[\pm 1]>0,\quad {\text {for all }} (x,t)\in \mathscr {U}. \end{aligned}$$
(2.11)

In general \(e(x,t)[r]\) has to be non-negative a.e., because this expression coincides up to a positive factor with the kinetic energy of the solutions.

Besides the above conditions one can a priori use for \(e(x,t)[r]\) any function of the type

$$\begin{aligned} e(x,t)[r]=e_0(x,t)+e_1(x,t)r \end{aligned}$$

with \(e_0,e_1\) continuous on \(\mathscr {U}\), but in fact we will only consider such e with

$$\begin{aligned} e_1(x,t)=-\varepsilon gAx_n,\quad \varepsilon \in \left[ 0,\frac{2}{n}\right] . \end{aligned}$$
(2.12)

With this choice the solutions obtained by Theorem 2.5 will have a kinetic energy a.e. given by

$$\begin{aligned} \frac{1}{2}\left| {v_{sol}(x,t)}\right| ^2=\frac{n}{2}e_0(x,t) -\frac{n}{2}\varepsilon gA x_n\rho _{sol}(x,t). \end{aligned}$$

This means that besides the continuous part \(\frac{n}{2}e_0(x,t)\), which can be seen as a non turbulent or averaged part, the kinetic energy density of the solutions absorbs a certain fraction, given by \(\frac{n}{2}\varepsilon \in [0,1]\), of the turbulent oscillations in the potential energy density \(gAx_n\rho _{sol}(x,t)\).
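To connect this with Sect. 1.4, here is a short worked instance, under the assumption that the parameter \(\epsilon \in [0,1]\) used there corresponds to the fraction \(\frac{n}{2}\varepsilon \) in the present notation. Rearranging the kinetic energy formula and inserting it into (2.2) gives

$$\begin{aligned} \frac{1}{2}\left| {v_{sol}}\right| ^2+\frac{n}{2}\varepsilon \, gAx_n\rho _{sol}&=\frac{n}{2}e_0,\\ {\mathcal {E}}_{sol}&=\frac{n}{2}e_0+\left( 1-\frac{n}{2}\varepsilon \right) gAx_n\rho _{sol}. \end{aligned}$$

For \(\varepsilon =\frac{2}{3n}\) the prescribed quantity on the left-hand side of the first line is \(\frac{1}{2}\left| {v_{sol}}\right| ^2+\frac{1}{3}\rho _{sol}gAx_n\), i.e. the expression named in Sect. 1.4, while for \(\varepsilon =\frac{2}{n}\) it is the full energy density \({\mathcal {E}}_{sol}=\frac{n}{2}e_0\). The second line is the integrand that reappears in (2.15).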

A priori also \(\varepsilon \) can be a function depending on \((x,t)\), but we will mostly stick to constant \(\varepsilon \), except for Sect. 4.3.

2.3 Subsolutions

Our second main result addresses the construction and selection of subsolutions associated with the initial data \(\rho _0={\text {sgn}}(x_n)\), \(v_0\equiv 0\). We consider the problem on an n-dimensional box \(\Omega =(0,1)^{n-1}\times (-L,L)\), \(L>0\), \(n\ge 2\) and focus on self-similar subsolutions. For the precise definition let \({\mathcal {F}}\) denote the set of all \(f\in {\mathcal {C}}^1([-1,1])\) satisfying

$$\begin{aligned} f(\pm 1)=\pm 1,~ f'(\pm 1)> 0,~ f(y)\in (-1,1),~ f(-y)=-f(y),~y\in (-1,1) \end{aligned}$$
(2.13)

and let \({\mathcal {A}}\) denote the set of all \(a\in {\mathcal {C}}^2([0,T))\) with

$$\begin{aligned} a(0)=0,~ a(t)>0,~t\in (0,T). \end{aligned}$$
(2.14)

In Sect. 4.1 we will prove the following lemma.

Lemma 2.8

Any triple \((f,a,\varepsilon )\in {\mathcal {F}}\times {\mathcal {A}}\times \left[ 0,\frac{2}{n}\right] \) gives rise to a continuous, piecewise \({\mathcal {C}}^1\) subsolution z with

$$\begin{aligned} \rho (x,t)={\left\{ \begin{array}{ll} 1,&{}x_n\ge a(t),\\ f\left( \frac{x_n}{a(t)}\right) ,&{}x_n\in (-a(t),a(t)),\\ -1,&{}x_n\le -a(t), \end{array}\right. } \end{aligned}$$

\(v\equiv 0\), \(m_i\equiv 0\), \(1\le i\le n-1\), as long as \(a(t)\le L\) and with e having the form \(e(x,t)[r]=e_0(x,t)-\varepsilon gAx_n r\).

We refer to these subsolutions as self-similar subsolutions. Considering subsolutions with \(v\equiv 0\), \(m_i\equiv 0\), \(1\le i\le n-1\) and independent of \(x_1,\ldots ,x_{n-1}\) reflects the interpretation of the subsolution as an \((x_1,\ldots ,x_{n-1})\)-averaged solution. Moreover, we will see that the symmetry condition on f is needed for the existence of self-similar subsolutions for the Boussinesq system. In contrast the subsolution constructed in [21] for the Euler system without Boussinesq approximation is also self-similar, but the profile f is not symmetric.

Note that the associated mixing zone is given by \(\mathscr {U}_a:=\left\{ \,{(x,t):\left| {x_n}\right| <a(t)}\,\right\} \). Note also that at this point the subsolutions are not necessarily admissible.
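The remaining component \(m_n\) is not specified in Lemma 2.8. The following is a sketch of how it is forced by the third equation of (2.3); it is our own computation under the ansatz of the lemma and is not meant to reproduce the construction of Sect. 4.1. With \(y:=\frac{x_n}{a(t)}\) and \(m_n\) depending on \((x_n,t)\) only, inside \(\mathscr {U}_a\) we have

$$\begin{aligned} \partial _t\rho =-\frac{\dot{a}(t)}{a(t)}\,yf'(y),\qquad \partial _{x_n}m_n=\frac{\dot{a}(t)}{a(t)}\,yf'(y),\qquad m_n(x,t)=\dot{a}(t)\int _{-1}^{y}sf'(s)\,ds, \end{aligned}$$

where the constant of integration is fixed by \(m=\rho v=0\) below the mixing zone. Since f is odd, \(sf'(s)\) is odd as well, so that \(m_n\) also vanishes at \(y=1\), which is compatible with \(m=0\) above the mixing zone. For the linear profile \(f(y)=y\) this gives \(m_n=\frac{1}{2}\dot{a}(t)(y^2-1)\le 0\) whenever \(\dot{a}(t)\ge 0\).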

In order to investigate the admissibility let \(z=z_{f,a,\varepsilon }\) be a self-similar subsolution and define the function \(\tilde{e}_{f,a,\varepsilon }:\Omega \times (0,T)\rightarrow {\mathbb {R}}\),

$$\begin{aligned} \tilde{e}_{f,a,\varepsilon }(x,t)&:=\inf \left\{ e_0(x,t):e_0 \in L^\infty (\Omega \times (0,T))\cap {\mathcal {C}}^0(\mathscr {U}_a),\right. \\&\quad \left. z_{f,a,\varepsilon }\; {\text { is a subsolution w.r.t. }} e(x,t)[r]=e_0(x,t) -\varepsilon gAx_n r\right\} . \end{aligned}$$

Hence by this definition, Theorem 2.5 c) and Remark 2.6 the subsolution \(z_{f,a,\varepsilon }\) induces mixing solutions whose total energy \(\int _\Omega {\mathcal {E}}_{sol}(x,t)\,dx\) for a.e. \(t\in (0,T)\) is arbitrarily close to

$$\begin{aligned} E_{f,a,\varepsilon }(t):=\int _\Omega \frac{n}{2}\tilde{e}_{f,a,\varepsilon }(x,t) +\left( 1-\frac{n}{2}\varepsilon \right) gAx_n \rho _{f,a,\varepsilon }(x,t)\,dx. \end{aligned}$$
(2.15)

Note that if \(z_{f,a,\varepsilon }\) is admissible, then \(E_{f,a,\varepsilon }(t)\le E(0)\) for a.e. \(t\in (0,T)\), where \(E(0)=gAL^2\) is the initial energy associated with (1.3). In order to evaluate the initial loss of energy define for \(k=0,\ldots ,4\) the functionals

$$\begin{aligned} J_k(f,a,\varepsilon ):=\lim _{t\rightarrow 0} \frac{E_{f,a,\varepsilon }(t)-E(0)}{t^k}, \end{aligned}$$
(2.16)

whenever the limits exist. We have the following small time selection of a self-similar subsolution.

Theorem 2.9

For any \(f\in {\mathcal {F}}\), \(a\in {\mathcal {A}}\), \(\varepsilon \in \left[ 0,\frac{2}{n}\right] \), such that \(z_{f,a,\varepsilon }\) is admissible there holds \(\dot{a}(0)=0\) and \(J_k(f,a,\varepsilon )=0\), \(k=0,1,2,3\). Moreover, among all admissible self-similar subsolutions the maximal initial dissipation rate

$$\begin{aligned} \inf \left\{ \,{J_4(f,a,\varepsilon ):(f,a,\varepsilon )\in {\mathcal {F}}\times {\mathcal {A}}\times \left[ 0,\frac{2}{n}\right] ,~J_k(f,a,\varepsilon )=0 \;{\text { for }} k=0,1,2,3}\,\right\} \end{aligned}$$

is achieved for \(f(y)=y\), \(a(t)=\frac{1}{3}gAt^2+o(t^2)\), \(\varepsilon =\frac{2}{3n}\). Up to the \(o(t^2)\), the minimizer is unique.

We will see that for \(f(y)=y\), \(a(t)=\frac{1}{3}gAt^2\) and \(\varepsilon =\frac{2}{3n}\) there holds

$$\begin{aligned} E_{f,a,\varepsilon }(t)-E(0)=-\frac{1}{81}g^3A^3t^4 \end{aligned}$$

as long as \(a(t)\le L\), i.e. for all \(t\in \left[ 0,\sqrt{\frac{3L}{gA}}\right] \).
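As a quick numerical sanity check of the two formulas above, the following sketch evaluates the time at which the mixing zone of the selected subsolution reaches \(x_n=\pm L\) and the energy change accumulated up to that time. The function names and the sample values of g, A and L are hypothetical; only the expressions \(a(t)=\frac{1}{3}gAt^2\) and \(E_{f,a,\varepsilon }(t)-E(0)=-\frac{1}{81}g^3A^3t^4\) are taken from the text.

```python
import math

def half_width(g, A, t):
    """Half-width a(t) = g*A*t**2/3 of the mixing zone for the selected profile."""
    return g * A * t**2 / 3.0

def exit_time(g, A, L):
    """Time at which a(t) = L, i.e. sqrt(3L/(gA))."""
    return math.sqrt(3.0 * L / (g * A))

def energy_change(g, A, t):
    """E(t) - E(0) = -(gA)^3 t^4 / 81, valid only while a(t) <= L."""
    return -((g * A) ** 3) * t**4 / 81.0

g, A, L = 9.81, 0.5, 1.0   # illustrative values only, not from the paper
T = exit_time(g, A, L)
print(T, half_width(g, A, T), energy_change(g, A, T))
```

Inserting \(t=\sqrt{3L/(gA)}\) into the energy formula gives \(-\frac{1}{9}gAL^2\), so under these formulas one ninth of the initial energy \(E(0)=gAL^2\) quoted above has been dissipated when the mixing zone reaches the boundary.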

Next we formulate the two statements concerning the extension of the subsolution to all times. We would like to emphasize that for the extensions we no longer use a selection criterion; instead, the constructions involve several choices and, for now, serve only to illustrate possible options for the long-time behaviour.

Proposition 2.10

The minimizing subsolution from Theorem 2.9 with \(o(t^2)=0\) can be extended in an admissible manner to \(\Omega \times (0,+\infty )\) such that it converges to the fully mixed, isotropic state \(z\equiv 0\) as \(t\rightarrow +\infty \) and such that also the associated kinetic energy \(\frac{n}{2}e(x,t)[\rho (x,t)]\) converges to 0.

Proposition 2.11

There exists \(T_{end}\in \left( \sqrt{\frac{3L}{gA}},+\infty \right) \) such that the minimizing subsolution from Theorem 2.9 with \(o(t^2)=0\) can be extended in an admissible manner to \(\Omega \times (0,T_{end})\), and at \(T_{end}\) it reaches the stable configuration \(\rho =-\rho _0\), \((v,m,\sigma )\equiv 0\), \(p=const.\), \(e(\cdot ,T_{end})[\cdot ]\equiv 0\).

In fact, both subsolutions are not only admissible, but satisfy the strong energy inequality, which means that the total energy \(\int _\Omega {\mathcal {E}}_{sub}(x,t)\,dx\) is monotone decreasing w.r.t. time.

Moreover, in the first case the subsolution satisfies \(z(x,t)\in U_{(x,t)}\) for every \(x\in \Omega \) and \(t>\sqrt{\frac{3L}{gA}}\) while the closure of the hull \(\overline{U}_{(x,t)}\) collapses as \(t\rightarrow +\infty \) to the set \([-1,1]\times \{0\} \times \{0\}\times \{0\}\times {\mathbb {R}}\subset Z\) due to the decay of kinetic energy. Thus technically the mixing zone is unbounded here. In the second case we have that \(z(x,T_{end})\) actually is a solution, i.e. \(z(x,T_{end})\in K_{(x,T_{end})}\) for a.e. \(x\in \Omega \). Clearly we can extend this subsolution to all times by \(z(\cdot ,t)=z(\cdot ,T_{end})\) for all \(t>T_{end}\).

3 Convex integration via the Tartar framework

To prove our main result, we will use a version of the Tartar framework, originally introduced in the context of compensated compactness [35], for differential inclusions when the set of nonlinear constraints is not constant (cf. e.g. [6, 12, 17]).

The general strategy of convex integration in the Tartar framework relies on the idea that if one can find a weak solution \(\tilde{z}\) of (2.3) which instead of taking values in \(K_{(x,t)}\) satisfies \(\tilde{z}(x,t)\in {\text {int}} \left( K_{(x,t)}^{co}\right) \), then one may deduce the existence of (infinitely many) solutions z of (2.3), which are near \(\tilde{z}\) in the weak sense while satisfying \(z(x,t)\in K_{(x,t)}\) a.e., by adding some specially constructed perturbations to \(\tilde{z}\). The perturbations rely on localized plane waves as basic building blocks.

3.1 Localized plane waves

For \(\bar{z}\in Z\) we define

$$\begin{aligned} M_\Lambda (\bar{z}):=\begin{pmatrix} \bar{\sigma }+\bar{p}{{\,\mathrm{id}\,}}&{} \bar{v}\\ \bar{v}^T &{} 0\\ \bar{m}^T &{} \bar{\rho } \end{pmatrix}\in {\mathbb {R}}^{(n+2)\times (n+1)}, \end{aligned}$$

such that the wave cone associated with (2.3) can be written as

$$\begin{aligned} \Lambda :=\left\{ \,{\bar{z}\in Z:\ker M_\Lambda (\bar{z}) \ne \{0\}, \quad (\bar{\rho },\bar{v})\ne 0}\,\right\} . \end{aligned}$$
(3.1)

Note that for \(\bar{z}\in \Lambda \) there exists \(\eta =(\xi ,c)\in {\mathbb {R}}^{n+1}{\setminus }\{0\}\) such that every function \(z(x,t)=\bar{z}h((x,t)\cdot \eta )\), \(h\in {\mathcal {C}}^1({\mathbb {R}})\) is a solution of (2.3). This allows us to construct solutions which oscillate in the direction \(\bar{z}\). Note that the condition \((\bar{\rho },\bar{v})\ne 0\) allows us to exclude the degenerate case when \(\xi =0\), which would correspond to having only oscillations in time.

Let us define a restricted wave cone which also eliminates oscillations only in space, i.e.

$$\begin{aligned} \Lambda ':=\left\{ \,{\bar{z}\in \Lambda :\ \ker M_\Lambda (\bar{z}) \cap {\mathbb {R}}^n\times ({\mathbb {R}}{\setminus }\{0\})\ne \emptyset }\,\right\} . \end{aligned}$$
(3.2)

In Lemma 3.2 below we construct localized plane wave-like solutions for (2.3) associated with \(\bar{z}\in \Lambda '\). In order to see that it is enough to consider \(\Lambda '\) instead of \(\Lambda \) we first show the following density lemma.

Lemma 3.1

The restricted cone \(\Lambda '\) is dense in \(\Lambda \).

Proof

Let \(\bar{z}\in \Lambda {\setminus }\Lambda '\). It follows that there exists \(\xi \in S^{n-1}\) such that \(\bar{v}\cdot \xi =0\), and we also have \(\bar{m} \cdot \xi =0\), \((\bar{\sigma }+\bar{p} {{\,\mathrm{id}\,}})\xi =0\).

We define the following sequence. For \(N\ge 1\) let

$$\begin{aligned}&\bar{\rho }_N:=\bar{\rho }+\frac{1}{N},\quad \bar{v}_N:=\bar{v}, \quad \bar{m}_N:=\bar{m}+\frac{1}{N^2}\xi ,\\&\bar{\sigma }_N+\bar{p}_N {{\,\mathrm{id}\,}}:=\bar{\sigma }+\bar{p} {{\,\mathrm{id}\,}}+\frac{1}{N^2\bar{\rho }+N}\left( \xi \otimes \bar{v} +\bar{v}\otimes \xi \right) . \end{aligned}$$

Here and in forthcoming formulas the definition of \(\bar{\sigma }_N\) and \(\bar{p}_N\) is understood in the sense that the symmetric matrix on the right hand side is split into its trace free part and its trace.

It is easy to check that \(\left( \xi ,-\frac{1}{N^2\bar{\rho }+N} \right) \in \ker M_\Lambda (\bar{z}_N)\), therefore \(\bar{z}_N \in \Lambda '\) for \(N\ge 1\). Furthermore, clearly \(\bar{z}_N \rightarrow \bar{z}\) as \(N\rightarrow +\infty \). This concludes the proof. \(\square \)
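The kernel computation in the proof can be replayed in exact rational arithmetic. The sketch below is only an illustration: the helper names and the concrete element \(\bar{z}\) (for \(n=2\), \(\xi =e_1\), chosen so that \((\xi ,0)\) lies in the kernel of \(M_\Lambda (\bar{z})\)) are hypothetical, and the symmetric matrix S stands for \(\bar{\sigma }+\bar{p}{{\,\mathrm{id}\,}}\).

```python
from fractions import Fraction as F

def apply_M(S, v, m, rho, xi, c):
    """Return M_Lambda(z) applied to eta = (xi, c), where S = sigma + p*id."""
    n = len(v)
    top = [sum(S[i][j] * xi[j] for j in range(n)) + v[i] * c for i in range(n)]
    mid = sum(v[j] * xi[j] for j in range(n))
    bot = sum(m[j] * xi[j] for j in range(n)) + rho * c
    return top + [mid, bot]

def perturb(S, v, m, rho, xi, N):
    """Build z_N as in the proof of Lemma 3.1, plus the time component c_N."""
    n = len(v)
    k = F(1) / (N * N * rho + N)          # requires N^2*rho + N != 0
    S_N = [[S[i][j] + k * (xi[i] * v[j] + v[i] * xi[j]) for j in range(n)]
           for i in range(n)]
    m_N = [m[i] + F(1, N * N) * xi[i] for i in range(n)]
    return S_N, v, m_N, rho + F(1, N), -k

# Hypothetical sample data: S*xi = 0, v.xi = 0, m.xi = 0, |xi| = 1.
xi = [F(1), F(0)]
S = [[F(0), F(0)], [F(0), F(3)]]
v = [F(0), F(1)]
m = [F(0), F(2)]
rho = F(1)

for N in (1, 2, 10):
    S_N, v_N, m_N, rho_N, c_N = perturb(S, v, m, rho, xi, N)
    # Each printed vector should consist of zeros, with c_N nonzero.
    print(N, c_N, apply_M(S_N, v_N, m_N, rho_N, xi, c_N))
```

Note that the construction implicitly needs \(N^2\bar{\rho }+N\ne 0\), which holds for all sufficiently large N.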

Recall the definition of the projection \(\pi :Z\rightarrow {\mathbb {R}}\times {\mathbb {R}}^n\times {\mathbb {R}}^n\times {\mathcal {S}}_0^{n\times n}\) from (2.8). We write d for the Euclidean distance function.

Lemma 3.2

There exists \(C>0\) such that for any \(\bar{z}\in \Lambda '\), there exists a sequence \(z_N\in C_c^\infty (B_1(0);Z)\), where \(B_1(0)\subset {\mathbb {R}}^n\times {\mathbb {R}}\), solving the linear system (2.3) and satisfying

  (i) \(d(z_N,[-\bar{z},\bar{z}])\rightarrow 0\) uniformly,

  (ii) \(z_N\rightharpoonup 0\) in \(L^2(B_1(0);Z)\),

  (iii) \(\int _{B_1(0)} |\pi (z_N)|^2\, d(x,t)\ge C|\pi (\bar{z})|^2.\)

Proof

We will construct the desired sequence of solutions as a sum of two sequences \(z_N=\hat{z}_N+\tilde{z}_N\), where \(\hat{z}_N\) will be a localized plane wave for the usual Euler equations which determines \(v_N\), \(\sigma _N\) and \(p_N\) up to a small deviation, while \(\tilde{z}_N\) will take care of \(\rho _N\) and \(m_N\).

Step 1. Euler-type plane waves.

We treat two cases. First, suppose that \(\bar{z}\in \Lambda '\) with \(\bar{v}\ne 0\). It follows from [16, 17] that there exists a sequence \((\hat{v}_N,\hat{\sigma }_N,\hat{p}_N) \subset {\mathcal {C}}_c^\infty (B_1(0);{\mathbb {R}}^n\times {\mathcal {S}}_0^{n\times n}\times {\mathbb {R}})\) satisfying

$$\begin{aligned} \partial _t\hat{v}_N+{{\,\mathrm{div}\,}}\hat{\sigma }_N+\nabla \hat{p}_N=0, \quad {{\,\mathrm{div}\,}}\hat{v}_N=0, \end{aligned}$$

and such that the distance between \((\hat{v}_N(x,t),\hat{\sigma }_N(x,t),\hat{p}_N(x,t))\) and the line segment \([-(\bar{v},\bar{\sigma },\bar{p}),(\bar{v},\bar{\sigma }, \bar{p})]\) converges to 0 uniformly in \((x,t)\), \((\hat{v}_N,\hat{\sigma }_N,\hat{p}_N)\rightharpoonup 0\) in \(L^2\) and \(\int _{B_1(0)}\left| {(\hat{v}_N,\hat{\sigma }_N,\hat{p}_N)}\right| ^2\,d(x,t)\ge \hat{C} \left| {(\bar{v},\bar{\sigma },\bar{p})}\right| ^2\) for a constant \(\hat{C}>0\) independent of \(\bar{z}\). We then define the whole vector \(\hat{z}_N\) by setting \(\hat{\rho }_N=0\) and \(\hat{m}_N=0\). Clearly \(\hat{z}_N\) satisfies (2.3).

In the second case, if \(\bar{z}\in \Lambda '\) is such that \(\bar{v}= 0\), one cannot apply the construction from [16, 17]; however, one may construct a different suitable potential in the following way. We know that by the definition of \(\Lambda '\) there exists \(\eta =(\xi ,c)\in {\mathbb {R}}^n\times {\mathbb {R}}\) with \(c\ne 0\) and \(M_\Lambda (\bar{z})\eta =0\). In particular \(\bar{m}\cdot \xi +\bar{\rho }c=0\). As already discussed before there necessarily holds \(\xi \ne 0\), because otherwise \((\bar{\rho },\bar{v})=0\), which is ruled out by the definition of \(\Lambda '\). Moreover, since \(\bar{v}=0\), we also obtain \((\bar{\sigma }+\bar{p}{{\,\mathrm{id}\,}})\xi =0.\)

If \(n=2\), then this implies that \(\bar{\sigma }+\bar{p}{{\,\mathrm{id}\,}}=k_1\xi ^\perp \otimes \xi ^\perp \) for some \(k_1\in {\mathbb {R}}\). Here \(\xi ^\perp :=(-\xi _2,\xi _1)\). Furthermore, for any \(\Psi \in {\mathcal {C}}^\infty ({\mathbb {R}}^2\times {\mathbb {R}})\), setting

$$\begin{aligned}&\hat{\sigma }+\hat{p}{{\,\mathrm{id}\,}}:=(\nabla ^\perp )^2\Psi =\begin{pmatrix} \partial _{x_2}^2\Psi &{} -\partial _{x_1}\partial _{x_2}\Psi \\ -\partial _{x_1}\partial _{x_2}\Psi &{}\partial _{x_1}^2\Psi \end{pmatrix},\nonumber \\&\hat{v}\equiv 0,\ \hat{\rho }\equiv 0,\ \hat{m}\equiv 0, \end{aligned}$$
(3.3)

yields a solution of (2.3). In particular, setting

$$\begin{aligned} \Psi _N(x,t):=k_1\frac{1}{N^2}\sin (N(x\cdot \xi +tc)) \chi _\varepsilon (x,t), \end{aligned}$$

where \(\chi _\varepsilon \in {\mathcal {C}}^\infty _c(B_1(0))\) satisfies \(\left| {\chi _\varepsilon }\right| \le 1\) on \(B_1(0)\), \(\chi _\varepsilon =1\) on \(B_{1-\varepsilon }(0)\), one obtains that the function \(\hat{z}_N\) associated via (3.3) satisfies

$$\begin{aligned} \hat{\sigma }_N(x,t)+\hat{p}_N(x,t){{\,\mathrm{id}\,}}=-(\bar{\sigma }+\bar{p}{{\,\mathrm{id}\,}}) \sin (N(x\cdot \xi +tc))\chi _\varepsilon (x,t)+O(1/N) \end{aligned}$$

uniformly in \((x,t)\) as \(N\rightarrow +\infty \). The remaining properties in analogy to the first case then follow in the usual way, cf. in particular Lemma 7 in [17].

If \(n=3\), it follows that \((0,\xi )\) is an eigenpair of \(\bar{\sigma }+\bar{p}{{\,\mathrm{id}\,}}\), hence by a spectral decomposition one obtains that \(\bar{\sigma }+\bar{p}{{\,\mathrm{id}\,}}=\lambda _1\nu _1\otimes \nu _1 +\lambda _2\nu _2\otimes \nu _2\), for some \(\lambda _{1,2}\in {\mathbb {R}}\) and \(\nu _{1,2}\perp \xi \). Assume without loss of generality that the second component of \(\xi \) does not vanish, such that one may write \(\nu _{1,2}\) as linear combinations of \(\xi ^\perp _1:=(-\xi _2,\xi _1, 0)^T\) and \(\xi ^\perp _2:=(0,-\xi _3,\xi _2)^T\). Otherwise, i.e. if \(\xi _2=0\), one can use the corresponding pair of linearly independent vectors orthogonal to \(\xi \) associated with \(\xi _1\ne 0\) or \(\xi _3\ne 0\). These linear combinations allow us to deduce that there exist some \(k_1,k_2,k_3\in {\mathbb {R}}\) such that

$$\begin{aligned} \bar{\sigma }+\bar{p}{{\,\mathrm{id}\,}}=k_1\xi _1^\perp \otimes \xi _1^\perp +k_2\xi _2^\perp \otimes \xi _2^\perp +k_3(\xi _1^\perp \otimes \xi _2^\perp +\xi _2^\perp \otimes \xi _1^\perp ). \end{aligned}$$
(3.4)

Observe that, for any \(\Phi \in {\mathcal {C}}^\infty ({\mathbb {R}}^3\times {\mathbb {R}};{\mathbb {R}}^3)\) setting

$$\begin{aligned} \hat{\sigma }+\hat{p}{{\,\mathrm{id}\,}}:=&\begin{pmatrix} \partial _2^2\Phi _1 &{} -\partial _1\partial _2\Phi _1 &{} 0\\ -\partial _1\partial _2\Phi _1 &{} \partial _1^2\Phi _1 &{} 0\\ 0 &{} 0 &{} 0 \end{pmatrix} +\begin{pmatrix} 0 &{} 0 &{} 0\\ 0&{}\partial _3^2\Phi _2 &{} -\partial _2\partial _3\Phi _2 \\ 0&{}-\partial _2\partial _3\Phi _2 &{} \partial _2^2\Phi _2 \end{pmatrix}\nonumber \\&\quad +\begin{pmatrix} 0 &{} \partial _2\partial _3\Phi _3 &{} -\partial _2^2\Phi _3\\ \partial _2\partial _3\Phi _3 &{} -2\partial _1\partial _3\Phi _3 &{} \partial _1\partial _2\Phi _3\\ -\partial _2^2\Phi _3 &{} \partial _1\partial _2\Phi _3 &{} 0 \end{pmatrix},\nonumber \\ \hat{v}\equiv 0,\ \hat{\rho }&\equiv 0,\ \hat{m}\equiv 0, \end{aligned}$$
(3.5)

yields a solution of (2.3). We then choose

$$\begin{aligned} \Phi _{N}(x,t):=(k_1,k_2,k_3)\frac{1}{N^2}\sin (N(x\cdot \xi +tc)) \chi _\varepsilon (x,t), \end{aligned}$$

to obtain by (3.4) that the function \(\hat{z}_N\) associated via (3.5) satisfies

$$\begin{aligned} \hat{\sigma }_N+\hat{p}_N{{\,\mathrm{id}\,}}=-(\bar{\sigma }+\bar{p}{{\,\mathrm{id}\,}}) \sin (N(x\cdot \xi +tc))\chi _\varepsilon (x,t)+O(1/N). \end{aligned}$$

One then concludes as in the case \(n=2\).
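On a plane wave \(\sin (N(x\cdot \xi +tc))\) each second derivative \(\partial _i\partial _j\) acts, to leading order in N, as multiplication by \(-N^2\xi _i\xi _j\). The following sketch, whose helper names are hypothetical, therefore replaces every \(\partial _j\) in the three matrices of (3.5) by \(\xi _j\) and checks two things for sample integer vectors \(\xi \): that the resulting matrices are exactly the three tensor products appearing in (3.4), and that each of them annihilates \(\xi \), which is the symbol-level form of the row-wise divergence of (3.5) vanishing.

```python
def outer(a, b):
    """Tensor product a (x) b as a 3x3 nested list."""
    return [[a[i] * b[j] for j in range(3)] for i in range(3)]

def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(3)] for i in range(3)]

def mat_vec(P, x):
    return [sum(P[i][j] * x[j] for j in range(3)) for i in range(3)]

def symbols(xi):
    """The three blocks of (3.5) with every derivative d_j replaced by xi_j."""
    x1, x2, x3 = xi
    P1 = [[x2 * x2, -x1 * x2, 0], [-x1 * x2, x1 * x1, 0], [0, 0, 0]]
    P2 = [[0, 0, 0], [0, x3 * x3, -x2 * x3], [0, -x2 * x3, x2 * x2]]
    P3 = [[0, x2 * x3, -x2 * x2],
          [x2 * x3, -2 * x1 * x3, x1 * x2],
          [-x2 * x2, x1 * x2, 0]]
    return P1, P2, P3

def check(xi):
    p1 = (-xi[1], xi[0], 0)        # xi_1^perp
    p2 = (0, -xi[2], xi[1])        # xi_2^perp
    P1, P2, P3 = symbols(xi)
    matches = (P1 == outer(p1, p1) and P2 == outer(p2, p2)
               and P3 == add(outer(p1, p2), outer(p2, p1)))
    annihilates = all(mat_vec(P, xi) == [0, 0, 0] for P in (P1, P2, P3))
    return matches, annihilates

print(check((2, 3, -5)))
```

Since both properties are polynomial identities in \(\xi \), a handful of integer samples is only a plausibility check and not a substitute for the direct computation.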

For higher dimensions one may proceed analogously; the details are left to the reader. This concludes the first step of our construction.

Step 2. The potential for \(\pmb {\bar{\rho }}\) and \(\pmb {\bar{m}}\).

We will show that there exists a constant \(\tilde{C}>0\) independent of \(\bar{z}\), and a sequence \(\tilde{z}_N\subset {\mathcal {C}}^\infty _c(B_1(0);Z)\) of solutions of (2.3), such that

  a) \((\tilde{v}_N,\tilde{\sigma }_N,\tilde{p}_N)\rightarrow 0\) uniformly,

  b) \(d\big ((\tilde{\rho }_N,\tilde{m}_N),[-(\bar{\rho },\bar{m}), (\bar{\rho },\bar{m})]\big )\rightarrow 0\) uniformly,

  c) \((\tilde{\rho }_N,\tilde{m}_N)\rightharpoonup 0\) in \(L^2(B_1(0);{\mathbb {R}}\times {\mathbb {R}}^n)\),

  d) \(\int _{B_1(0)}\left| {(\tilde{\rho }_N,\tilde{m}_N)}\right| ^2\,d(x,t) \ge \tilde{C}\left| {(\bar{\rho },\bar{m})}\right| ^2\).

It is clear that \(C:=\min \left\{ \,{\tilde{C},\hat{C}}\,\right\} \) and \(z_N:=\hat{z}_N+\tilde{z}_N\) then satisfies the properties stated in the lemma.

For the existence of \(\tilde{z}_N\) observe that for any \(\Psi \in {\mathcal {C}}^\infty ({\mathbb {R}}^n\times {\mathbb {R}};{\mathcal {S}}^{n\times n})\) the function \(\tilde{z}\) defined by

$$\begin{aligned} \tilde{\rho }:=\partial _t{{\,\mathrm{div}\,}}{{\,\mathrm{div}\,}}\Psi ,\quad \tilde{v} :=gA\partial _{x_n}{{\,\mathrm{div}\,}}\Psi -gAe_n{{\,\mathrm{div}\,}}{{\,\mathrm{div}\,}}\Psi ,\nonumber \\ \tilde{m}:=-\partial _t^2{{\,\mathrm{div}\,}}\Psi ,\quad \tilde{\sigma }+\tilde{p}{{\,\mathrm{id}\,}}:=-gA\partial _t\partial _{x_n}\Psi \end{aligned}$$
(3.6)

satisfies equation (2.3).

As before, there exists \(\eta =(\xi ,c)\in {\mathbb {R}}^n\times {\mathbb {R}}\) with \(c\ne 0\) and \(M_\Lambda (\bar{z})\eta =0\). In particular \(\bar{m}\cdot \xi +\bar{\rho }c=0\) and necessarily \(\xi \ne 0\). The functions \(\tilde{z}_N\) are now defined as in (3.6) with \(\Psi =\Psi _N\) given by

$$\begin{aligned} \Psi _N(x,t):=\frac{1}{N^3}\tilde{M}\sin \left( N(x\cdot \xi +tc)\right) \chi _\varepsilon (x,t), \end{aligned}$$

where \(\chi _\varepsilon \in {\mathcal {C}}^\infty _c(B_1(0))\) again is the usual cut-off function and

$$\begin{aligned} \tilde{M}:=\frac{\bar{m}\cdot \xi }{c^2\left| {\xi }\right| ^4}\xi \otimes \xi -\frac{1}{c^2\left| {\xi }\right| ^2}(\bar{m}\otimes \xi +\xi \otimes \bar{m}) \in {\mathcal {S}}^{n\times n}. \end{aligned}$$

Clearly a) holds true, since the definition of \(\tilde{v}_N\), \(\tilde{\sigma }_N\) and \(\tilde{p}_N\) in (3.6) involves only derivatives of order 2, whereas \(\Psi _N\) carries the prefactor \(N^{-3}\). Moreover,

$$\begin{aligned} \tilde{m}_N(x,t)&=c^2\tilde{M}\xi \cos \left( N(x\cdot \xi +tc)\right) \chi _\varepsilon (x,t)+O(1/N),\\ \tilde{\rho }_N(x,t)&=-c\xi ^T\tilde{M}\xi \cos \left( N(x\cdot \xi +tc)\right) \chi _\varepsilon (x,t)+O(1/N) \end{aligned}$$

uniformly as \(N\rightarrow +\infty \), and \(c^2\tilde{M}\xi =-\bar{m}\), \(-c\xi ^T\tilde{M}\xi =\bar{m}\cdot \xi /c=-\bar{\rho }\). Hence \((\tilde{\rho }_N,\tilde{m}_N)=-(\bar{\rho },\bar{m})\cos \left( N(x\cdot \xi +tc)\right) \chi _\varepsilon +O(1/N)\), which shows property b). Properties c) and d) follow again in the standard fashion. \(\square \)
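The two algebraic facts about \(\tilde{M}\) that enter here, namely \(c^2\tilde{M}\xi =-\bar{m}\) and \(\xi ^T\tilde{M}\xi =-\bar{m}\cdot \xi /c^2\) (so that \(c\,\xi ^T\tilde{M}\xi =\bar{\rho }\) by \(\bar{m}\cdot \xi +\bar{\rho }c=0\)), can be confirmed in exact arithmetic. The sketch below uses hypothetical helper names and arbitrary sample values for \(\bar{m}\), \(\xi \) and c; it does not test the asymptotic expansion itself.

```python
from fractions import Fraction as F

def m_tilde(m, xi, c):
    """Matrix M~ from Step 2 for given m-bar, xi != 0 and c != 0."""
    n = len(xi)
    x2 = sum(x * x for x in xi)                 # |xi|^2
    mx = sum(a * b for a, b in zip(m, xi))      # m-bar . xi
    return [[F(mx) * xi[i] * xi[j] / (c * c * x2 * x2)
             - F(m[i] * xi[j] + xi[i] * m[j]) / (c * c * x2)
             for j in range(n)] for i in range(n)]

def check(m, xi, c):
    """Return (c^2 M~ xi, c xi^T M~ xi, rho-bar) with rho-bar = -(m-bar . xi)/c."""
    n = len(xi)
    M = m_tilde(m, xi, c)
    Mxi = [sum(M[i][j] * xi[j] for j in range(n)) for i in range(n)]
    quad = sum(xi[i] * Mxi[i] for i in range(n))
    rho = -F(sum(a * b for a, b in zip(m, xi))) / c
    return [c * c * y for y in Mxi], c * quad, rho

m, xi, c = (2, -1, 5), (1, 2, -1), F(3)   # sample data, not from the paper
lhs_m, lhs_rho, rho = check(m, xi, c)
print(lhs_m, lhs_rho, rho)   # first entry should equal -m, the last two should agree
```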

Remark 3.3

If one replaces in Lemma 3.2 the space-time ball \(B_1(0)\subset {\mathbb {R}}^n\times {\mathbb {R}}\) by the cylinder \(B_1(0)\times (-1,1)\subset {\mathbb {R}}^n\times {\mathbb {R}}\) and then chooses a cutoff function \(\chi _\varepsilon \in {\mathcal {C}}^\infty _0(B_1(0)\times (-1,1))\) of the form \(\chi _\varepsilon (x,t)=\chi ^a_\varepsilon (x)\chi _\varepsilon ^b(t)\) with \(\chi ^a_\varepsilon \in {\mathcal {C}}^\infty _0(B_1(0))\), \(\chi _\varepsilon ^b\in {\mathcal {C}}^\infty _0(-1,1)\) one sees that the convergence in (ii) improves to \(z_N\rightarrow 0\) in \({\mathcal {C}}^0([-1,1];L^2_w(B_1(0)))\).

3.2 Perturbing along sufficiently long segments

In this subsection we prove that the wave cone \(\Lambda \) is large with respect to \(K_{(x,t)}\), in the sense that any two points in \(K_{(x,t)}\) can be connected with a \(\Lambda \)-segment. Furthermore, this property automatically implies that any point in the interior of the convex hull of \(K_{(x,t)}\) can be perturbed along sufficiently long \(\Lambda \)-segments. The set \(K_{(x,t)}\) has been defined in (2.5).

For simplicity of notation, for the rest of the subsection we will fix a point \((x,t)\in \Omega \times [0,T)\) and write K instead of \(K_{(x,t)}\).

Lemma 3.4

For any \(z_1,z_2\in K\), \(z_1\ne z_2\), \(p_1=p_2\), we have \(\bar{z}:=z_2-z_1\in \Lambda \).

Proof

If \(\bar{\rho }=0\), then \(\bar{v}\ne 0\), because otherwise \(z_1=z_2\). Furthermore, \(\bar{m}=\rho _1\bar{v}\), so all that needs to be checked is that there exists \(\xi \in \bar{v}^\perp \), \(\xi \ne 0\) such that \(\bar{\sigma }\xi =c\bar{v}\), for some \(c\in {\mathbb {R}}\). However, for any \(\xi \in \bar{v}^\perp \) there holds

$$\begin{aligned} \bar{\sigma }\xi&=((v_2\otimes v_2)^\circ -(v_1\otimes v_1)^\circ ) \xi =(v_2\otimes v_2-v_1\otimes v_1)\xi \\&=(v_2\otimes \bar{v}+\bar{v}\otimes v_1)\xi =(v_1\cdot \xi )\bar{v}. \end{aligned}$$

If \(\bar{\rho }\ne 0\), then without loss of generality it is equal to 2, i.e. \(\rho _2=1\), \(\rho _1=-1\), and we obtain as before from \(z_1,z_2\in K\) that \(\bar{\sigma }\xi =(v_1\cdot \xi )\bar{v}\) for any \(\xi \in \bar{v}^\perp \). So it remains to check that \(\bar{m}\cdot \xi =2v_1\cdot \xi \) also holds. We have

$$\begin{aligned} \bar{m}\cdot \xi =(v_2+v_1)\cdot \xi =(\bar{v}+2v_1) \cdot \xi =2(v_1\cdot \xi ) \end{aligned}$$

and the proof is finished. \(\square \)
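The purely algebraic part of the argument, i.e. the identities \((v_2\otimes v_2-v_1\otimes v_1)\xi =(v_1\cdot \xi )\bar{v}\) and \((v_2+v_1)\cdot \xi =2(v_1\cdot \xi )\) for \(\xi \perp \bar{v}\), can be illustrated with a small integer example in \({\mathbb {R}}^3\). The vectors and helper names below are hypothetical; the sketch does not encode the set K from (2.5) and only assumes \(|v_1|=|v_2|\) to mimic the case \(\bar{\rho }=0\).

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def diff_of_squares_times(v1, v2, xi):
    """(v2 (x) v2 - v1 (x) v1) xi, computed componentwise."""
    return tuple(v2[i] * dot(v2, xi) - v1[i] * dot(v1, xi) for i in range(3))

v1, v2 = (1, 2, 2), (2, 1, 2)            # sample velocities with |v1| = |v2| = 3
vbar = tuple(b - a for a, b in zip(v1, v2))
xi = cross(vbar, (0, 0, 1))              # one particular vector orthogonal to vbar

lhs = diff_of_squares_times(v1, v2, xi)
rhs = tuple(dot(v1, xi) * comp for comp in vbar)
vsum = tuple(a + b for a, b in zip(v1, v2))
print(dot(vbar, xi), lhs, rhs, dot(vsum, xi), 2 * dot(v1, xi))
```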

Without having any further information on the convex hull of K, we can prove the following geometric lemma solely based on this property.

Corollary 3.5

For any \(z\in {\text {int}} (K^{co})\) there exists \(\bar{z}\in \Lambda \) such that

$$\begin{aligned} {[}z-\bar{z},z+\bar{z}]\subset {\text {int}} (K^{co}) \quad {\text { and }}\quad |\pi (\bar{z})|\ge \frac{1}{2N}d(\pi (z),\pi (K)), \end{aligned}$$

where \(N={\text {dim}}(Z)\) and d is the Euclidean distance on \(\pi (Z)\).

The proof is the same as those of Lemma 6 from [17], respectively Lemma 4.9 from [21], relying on Carathéodory’s theorem and Lemma 3.4 above, therefore we omit it.

3.3 The convex hull

We now explicitly compute the full \(\Lambda \)-convex hull associated with the differential inclusion (2.3), (2.5), which in our case turns out to coincide with the usual convex hull. The definition of the \(\Lambda \)-convex hull \((K')^\Lambda \) of \(K'\subset Z\) can be recalled for example from [25].

Let us again fix \((x,t)\in \Omega \times [0,T)\) and write K instead of \(K_{(x,t)}\), U instead of \(U_{(x,t)}\) and e[r] instead of \(e(x,t)[r]\) for \(r\in {\mathbb {R}}\). Recall the definition of U in (2.7) and of the functions \(T_\pm \), Q in (2.6).

Proposition 3.6

There holds \(K^\Lambda =K^{co}=\overline{U}\).

In Lemma 3.8 below we will see that the closure of U splits into

$$\begin{aligned} \overline{U}=K_{-}'\cup \overline{U}_0\cup K_+', \end{aligned}$$

where

$$\begin{aligned} \overline{U}_0&:=\left\{ \,{z\in Z:\rho \in (-1,1),~T_\pm (z) \le e[\pm 1],~Q(z)\le e[\rho ]}\,\right\} ,\\ K_\pm '&:=\left\{ \,{z\in Z:\rho =\pm 1,~m=\pm v,~\lambda _{\mathrm{max}}(v\otimes v-\sigma ) \le e[\pm 1]}\,\right\} . \end{aligned}$$

Moreover, Lemma 3.11 actually shows that \(K_\pm '\) is the \(\Lambda \)-convex hull of the sets \(K_\pm :=K\cap \{z\in Z:\rho =\pm 1\}\).

The proof of Proposition 3.6 is organized as in the corresponding section of [21] for the inhomogeneous Euler equation and relies on Lemmas 3.8 and 3.11.

Lemma 3.7

The function Q is convex.

Proof

Since Q is defined as the maximal eigenvalue of the matrix M(z), there holds

$$\begin{aligned} Q(z)=\sup _{\xi \in S^{n-1}}\xi ^TM(z)\xi =\sup _{\xi \in S^{n-1}} \left( g_\xi (z)-\xi ^T\sigma \xi \right) , \end{aligned}$$

where for every fixed \(\xi \in S^{n-1}\) the function \(g_\xi :\left\{ \,{z\in Z:\rho \in (-1,1)}\,\right\} \rightarrow {\mathbb {R}}\) is given by

$$\begin{aligned} g_\xi (z)=\xi ^TM(z)\xi +\xi ^T\sigma \xi =\frac{(v\cdot \xi )^2 -2\rho (m\cdot \xi )(v\cdot \xi )+(m\cdot \xi )^2}{1-\rho ^2}. \end{aligned}$$

We will show that every \(g_\xi \) is convex, such that Q is convex as a supremum of convex functions.

In order to prove the convexity of \(g_\xi \), \(\xi \in S^{n-1}\) fixed, we write \(v=x_1 \xi +v'\), \(m=x_2 \xi +m'\) with \(x_1,x_2\in {\mathbb {R}}\), \(v',m'\in \xi ^\perp \). Then it is enough to show that the function \(g:(-1,1)\times {\mathbb {R}}^2\rightarrow {\mathbb {R}}\),

$$\begin{aligned} g(\rho ,x)=\frac{x_1^2-2\rho x_1x_2+x_2^2}{1-\rho ^2} \end{aligned}$$

is convex. We write \(g(\rho ,x)=x^TA(\rho )x\) with

$$\begin{aligned} A(\rho ):=\frac{1}{1-\rho ^2}\begin{pmatrix} 1 &{} -\rho \\ -\rho &{} 1 \end{pmatrix}. \end{aligned}$$

Let us fix \((\rho ,x)\in (-1,1)\times {\mathbb {R}}^2\) and observe that \(A(\rho )\) is positive definite. Thus the restricted function \(g(\rho ,\cdot )\) is strictly convex; in particular \(D^2g(\rho ,x)[0,y]^2=2y^TA(\rho )y\ge 0\) for all \(y\in {\mathbb {R}}^2\).

It therefore remains to show that \(D^2g(\rho ,x)[1,y]^2\ge 0\) for all \(y\in {\mathbb {R}}^2\), since a general direction \([s,y]\) with \(s\ne 0\) reduces to this case by scaling. By the positive definiteness of \(A(\rho )\) we obtain

$$\begin{aligned} D^2g(\rho ,x)[1,y]^2&=x^TA''(\rho )x+4y^TA'(\rho )x +2y^TA(\rho )y\\&=2\left( y+A(\rho )^{-1}A'(\rho )x\right) ^TA(\rho ) \left( y+A(\rho )^{-1}A'(\rho )x\right) \\&\quad +x^TA''(\rho )x -2x^TA'(\rho )A(\rho )^{-1}A'(\rho )x\\&\ge x^T\left( A''(\rho ) -2A'(\rho )A(\rho )^{-1}A'(\rho )\right) x. \end{aligned}$$

It turns out that in fact \(A''(\rho )=2A'(\rho )A(\rho )^{-1}A'(\rho )\), which shows the convexity of g. Indeed, differentiating

$$\begin{aligned} (1-\rho ^2)A(\rho )=\begin{pmatrix} 1 &{} -\rho \\ -\rho &{} 1 \end{pmatrix} \end{aligned}$$

on both sides yields

$$\begin{aligned} (1-\rho ^2)A'(\rho )&=2\rho A(\rho )+C, \end{aligned}$$
(3.7)
$$\begin{aligned} (1-\rho ^2)^2A''(\rho )&=2(1+3\rho ^2)A(\rho )+4\rho C, \end{aligned}$$
(3.8)

where

$$\begin{aligned} C:=\begin{pmatrix} 0 &{} -1\\ -1 &{} 0 \end{pmatrix}. \end{aligned}$$

Moreover, a straightforward computation shows

$$\begin{aligned} CA(\rho )^{-1}C=(1-\rho ^2)A(\rho )-2\rho C. \end{aligned}$$
(3.9)

Now (3.7)–(3.9) imply the desired identity \(A''(\rho )=2A'(\rho )A^{-1}(\rho )A'(\rho )\). \(\square \)
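The matrix identity \(A''(\rho )=2A'(\rho )A(\rho )^{-1}A'(\rho )\) can also be checked in exact rational arithmetic at sample values of \(\rho \). In the sketch below the closed forms for \(A'\), \(A''\) and \(A^{-1}\) were obtained by differentiating, respectively inverting, the entries of \(A(\rho )\) by hand, so they are an assumption of the example rather than formulas quoted from the text; all function names are hypothetical. The function g is the one defined above.

```python
from fractions import Fraction as F

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def A(r):
    d = 1 - r * r
    return [[1 / d, -r / d], [-r / d, 1 / d]]

def A_inv(r):
    # Hand-computed inverse of A(r); verified against A(r) in the loop below.
    return [[F(1), r], [r, F(1)]]

def dA(r):
    d = 1 - r * r
    a, b = 2 * r / d**2, -(1 + r * r) / d**2
    return [[a, b], [b, a]]

def d2A(r):
    d = 1 - r * r
    a, b = (2 + 6 * r * r) / d**3, -(6 * r + 2 * r**3) / d**3
    return [[a, b], [b, a]]

def g(r, x1, x2):
    return (x1 * x1 - 2 * r * x1 * x2 + x2 * x2) / (1 - r * r)

identity = [[1, 0], [0, 1]]
for r in (F(-1, 2), F(0), F(1, 3), F(9, 10)):
    rhs = [[2 * e for e in row] for row in mul(mul(dA(r), A_inv(r)), dA(r))]
    print(r, mul(A(r), A_inv(r)) == identity, d2A(r) == rhs)
```

A midpoint test such as \(g\big (\tfrac{1}{2}(p+q)\big )\le \tfrac{1}{2}(g(p)+g(q))\) for sample points \(p,q\in (-1,1)\times {\mathbb {R}}^2\) gives an additional, purely numerical, plausibility check of the convexity of g.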

The following lemma implies the inclusion \(K^\Lambda \subset K^{co}\subset \overline{U}\).

Lemma 3.8

The set U is convex and its closure \(\overline{U}\) splits into \(\overline{U}=K_{-}'\cup \overline{U}_0\cup K_+'\). In particular \(K\subset \overline{U}\).

Proof

In Lemma 3.7 we have already shown that Q is a convex function. Using the basic triangle inequality one can directly check that \(T_\pm (z)<e[\pm 1]\) define convex sets. Hence U is convex.

For the stated identity concerning \(\overline{U}\) first of all observe that \(\overline{U}_0\subset \overline{U}\). Next we will show that \(K_\pm '\subset \overline{U}\). Let \(z_*\in K_+'\) for instance and take any \(z'\in K\) with \(\rho '=-1\), as well as a sequence \((\rho _j)_{j\in {\mathbb {N}}}\subset (-1,1)\) with \(\rho _j\rightarrow 1\). The element

$$\begin{aligned} z_j=\frac{1-\rho _j}{2}z'+\frac{1+\rho _j}{2}z_* \end{aligned}$$

clearly converges to \(z_*\) as \(j\rightarrow +\infty \). Using \(z_*\in K_+'\) and \(z'\in K\), \(\rho '=-1\) one sees that

$$\begin{aligned} T_+(z_j)=\frac{1}{n}\left| {v_*}\right| ^2 =\frac{1}{n}{{\,\mathrm{tr}\,}}(v_*\otimes v_*-\sigma _*) \le \lambda _{\mathrm{max}}(v_*\otimes v_*-\sigma _*) \le e[1], \end{aligned}$$

as well as \(T_{-}(z_j)=e[-1]\). For the matrix \(M(z_j)\) we compute

$$\begin{aligned} M(z_j)&=\frac{1-\rho _j}{2}\big (v'\otimes v'-\sigma '\big ) +\frac{1+\rho _j}{2}\big (v_*\otimes v_*-\sigma _*\big )\\&=\frac{1-\rho _j}{2}e[-1]{{\,\mathrm{id}\,}}+\frac{1+\rho _j}{2} \big (v_*\otimes v_*-\sigma _*\big ). \end{aligned}$$

Hence

$$\begin{aligned} Q(z_j)=\lambda _{\mathrm{max}}(M(z_j))\le \frac{1-\rho _j}{2}e[-1] +\frac{1+\rho _j}{2}e[1]=e[\rho _j]. \end{aligned}$$

This shows that every \(z_j\) and therefore also the limit \(z_*\) is contained in \(\overline{U}\). The case \(z_*\in K_{-}'\) works analogously. Thus \(K_{-}'\cup \overline{U}_0\cup K_+'\subset \overline{U}\).

Let now \((z_j)_{j\in {\mathbb {N}}}\subset U\) be convergent to some \(z_*\in Z\). If \(\rho _*\in (-1,1)\), then it is clear that \(z_*\in \overline{U}_0\subset \overline{U}\). Consider the case \(\rho _*=1\). Since on U there holds

$$\begin{aligned} |m\pm v|<\sqrt{ne[\pm 1]}(1\pm \rho ), \end{aligned}$$
(3.10)

it follows that \(m_*=v_*\). Recall that \(e[\pm 1]\ge 0\) from (2.11). Next we rewrite

$$\begin{aligned} M(z)=v\otimes v+(1-\rho ^2)\frac{m-\rho v}{1-\rho ^2} \otimes \frac{m-\rho v}{1-\rho ^2}-\sigma , \end{aligned}$$
(3.11)

and observe that

$$\begin{aligned} |m-\rho v|\le \frac{1-\rho }{2}|m+v| +\frac{1+\rho }{2}|m-v| <\max \left\{ \,{\sqrt{ne[-1]},\sqrt{ne[+1]}}\,\right\} (1-\rho ^2), \end{aligned}$$
(3.12)

by (3.10). Therefore

$$\begin{aligned} \lim _{j\rightarrow +\infty } M(z_j)= v_*\otimes v_*-\sigma _*. \end{aligned}$$

Thus \(\lambda _{\mathrm{max}}(M(z_j))<e[\rho _j]\), \(j\in {\mathbb {N}}\) and the continuity of the maximal eigenvalue function imply \(z_*\in K_+'\). The same procedure again works for the other case \(\rho _*=-1\), such that the statement of the Lemma follows. \(\square \)

In terms of Proposition 3.6 it now remains to prove the inclusion \(\overline{U}\subset K^\Lambda \). The proof of this inclusion will rely on the Krein-Milman theorem for \(\Lambda \)-convex sets [25, Lemma 4.16]. For this we discuss the following \(\Lambda \)-directions.

Lemma 3.9

Let \(z\in Z_0\). The element \(\tilde{z}(z)\in Z\) defined by

$$\begin{aligned}&\tilde{\rho }(z):=1,\quad \tilde{v}(z):=\frac{m-\rho v}{1-\rho ^2}, \quad \tilde{m}(z):=v-\rho \tilde{v}(z), \\&\tilde{\sigma }(z)+\tilde{p}(z){{\,\mathrm{id}\,}}:=\tilde{m}(z)\otimes \tilde{v}(z) +\tilde{v}(z)\otimes \tilde{m}(z) \end{aligned}$$

is contained in \(\Lambda \) and has the property that for every \(t\in (-1-\rho ,1-\rho )\) there holds

$$\begin{aligned} \tilde{z}(z+t\tilde{z}(z))=\tilde{z}(z),\quad T_\pm (z+t\tilde{z}(z)) =T_\pm (z),\quad M(z+t\tilde{z}(z))^\circ =M(z)^\circ . \end{aligned}$$

Proof

The proof is a straightforward adaptation of Lemma 4.6 (ii),(iii) in [21] and is therefore only sketched here. As a nontrivial element \((\xi ,c)\in {\mathbb {R}}^n\times {\mathbb {R}}\) in the kernel of \(M_\Lambda (\tilde{z}(z))\) one can take any \(\xi \ne 0\) contained in the orthogonal complement of \(\tilde{v}(z)\) and set \(c=-\tilde{m}(z)\cdot \xi \).

The stated invariances can be verified directly. Note that for \(T_\pm \) it helps to rewrite

$$\begin{aligned} T_\pm (z)=\frac{1}{n}\left| {v+(\pm 1-\rho )\tilde{v}(z)}\right| ^2, \end{aligned}$$

whereas for \(M^\circ \) identity (3.11) is useful. \(\square \)
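The first two invariances can be tested numerically along a Muskat direction. The following sketch, with hypothetical function names and sample data for \(n=2\), works only with the \((\rho ,v,m)\)-components of z and uses the rewriting of \(T_\pm \) displayed above as its definition of \(T_\pm \); the invariance of \(M^\circ \) is not covered.

```python
from fractions import Fraction as F

def v_tilde(rho, v, m):
    """v~(z) = (m - rho*v) / (1 - rho^2)."""
    return [(mi - rho * vi) / (1 - rho * rho) for vi, mi in zip(v, m)]

def T(sign, rho, v, m):
    """T_pm(z) = |v + (pm 1 - rho) v~(z)|^2 / n, with sign = +1 or -1."""
    vt = v_tilde(rho, v, m)
    w = [vi + (sign - rho) * ti for vi, ti in zip(v, vt)]
    return sum(x * x for x in w) / len(v)

def shift(rho, v, m, t):
    """(rho, v, m)-components of z + t*z~(z), where rho~ = 1 and m~ = v - rho*v~."""
    vt = v_tilde(rho, v, m)
    mt = [vi - rho * ti for vi, ti in zip(v, vt)]
    return (rho + t,
            [vi + t * ti for vi, ti in zip(v, vt)],
            [mi + t * si for mi, si in zip(m, mt)])

rho, v, m = F(1, 3), [F(1), F(-2)], [F(1, 2), F(3)]   # sample point with |rho| < 1
for t in (F(-1), F(1, 4), F(1, 2)):                   # all inside (-1-rho, 1-rho)
    r2, v2, m2 = shift(rho, v, m, t)
    print(t,
          v_tilde(r2, v2, m2) == v_tilde(rho, v, m),
          T(+1, r2, v2, m2) == T(+1, rho, v, m),
          T(-1, r2, v2, m2) == T(-1, rho, v, m))
```

The key cancellation behind all three lines of output is \(m+t\tilde{m}-(\rho +t)(v+t\tilde{v})=\big (1-(\rho +t)^2\big )\tilde{v}\), which shows that \(\tilde{v}\) is unchanged along the segment.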

As in [21] we call \(\tilde{z}(z)\) the Muskat direction associated with z, since it generalizes the density perturbation of the Muskat problem introduced in [33]. Also as in [21] we have the following lemma concerning Euler-type directions preserving the density.

Lemma 3.10

For any pair \((\bar{v},\bar{\sigma })\in {\mathbb {R}}^n\times {\mathcal {S}}_0^{n \times n}\), \(\bar{v}\ne 0\), there exists \(\bar{p}\in {\mathbb {R}}\), such that for all \(\lambda \in {\mathbb {R}}\) the vector \(\bar{z}_\lambda :=(0,\bar{v},\lambda \bar{v},\bar{\sigma },\bar{p})\) belongs to \(\Lambda \). Moreover, for all \(t\in {\mathbb {R}}\) there holds

$$\begin{aligned} T_+(z+t\bar{z}_{-1})=T_+(z),\quad T_{-}(z+t\bar{z}_{+1})=T_{-}(z). \end{aligned}$$

Proof

See the proof of Lemma 4.6 (i),(iv) in [21]. \(\square \)

We have the following results concerning \(\Lambda \)-extreme points of \(\overline{U}\). Recall that \(\pi :Z\rightarrow {\mathbb {R}}\times {\mathbb {R}}^n\times {\mathbb {R}}^n\times {\mathcal {S}}_0^{n\times n}\) is the projection from (2.8).

Lemma 3.11

The set \(\pi (U)\) is bounded by a constant depending only on e and the dimension n. Moreover, for every \(z\in \overline{U}{\setminus } K\) there exists \(\bar{z}\in \Lambda {\setminus }\{0\}\), such that \(z\pm \bar{z}\in \overline{U}\).

Proof

Let \(z\in U\). Clearly \(\left| {\rho }\right| \le 1\) and the two inequalities (3.10) imply a bound on v and m in terms of e and n. Using (3.11), (3.12), we obtain that \(M(z)+\sigma \) is also bounded by means of e and n. In consequence we obtain \(\left| {{{\,\mathrm{tr}\,}}M(z)}\right| \le c(e,n)\). Since the trace is bounded and \(\lambda _{\mathrm{max}}(M(z))=Q(z)<e[\rho ]\), using that \(z\in U\), we get a corresponding bound on the whole spectrum of M(z). Hence, \(M(z)+\sigma \) and M(z) are both uniformly bounded, and therefore \(\left| {\sigma }\right| \le c(e,n)\). This proves that \(\pi (U)\) is bounded.

Next we turn to the perturbation property. Let \(z\in \overline{U}{\setminus } K\) and recall from Lemma 3.8 that \(\overline{U}=\overline{U}_0\cup K_+'\cup K_{-}'\), \(K\subset K_+'\cup K_{-}'\).

If \(z\in K_+'{\setminus } K\), there exists an Euler-type direction \(\bar{z}_{+1}\), i.e. \(\bar{m}=\bar{v}\), from Lemma 3.10, such that \(z\pm \bar{z}_{+1}\in K_+'{\setminus } K\). The proof is the same as in [17] and [21, Lemma 4.8] and is therefore omitted. The case \(z\in K_{-}'{\setminus } K\) is similar.

It remains to look at \(z\in \overline{U}_0\). Let us first check in which cases we can use the associated Muskat direction \(\bar{z}=\tilde{z}(z)\) from Lemma 3.9. By this Lemma the two inequalities \(T_\pm (z+t\tilde{z}(z))\le e[\pm 1]\) remain true for all \(t\in (-1-\rho ,1-\rho )\). Furthermore, a straightforward computation shows that \(Q(z)-e[\rho ]\) can be rewritten as

$$\begin{aligned} Q(z)-e[\rho ]&=\frac{1}{n}{{\,\mathrm{tr}\,}}M(z)+\lambda _{\mathrm{max}}(M(z)^\circ )-e[\rho ]\\&=\frac{1-\rho }{2}T_{-}(z)+\frac{1+\rho }{2}T_+(z)+\lambda _{\mathrm{max}}(M(z)^\circ )\\&\quad -\left( \frac{1-\rho }{2}e[-1]+\frac{1+\rho }{2}e[+1]\right) . \end{aligned}$$

Using Lemma 3.9 once more we therefore obtain

$$\begin{aligned} Q(z+t\tilde{z}(z))-e[\rho +t]=Q(z)-e[\rho ]+\frac{t}{2}\big (T_+(z) -T_{-}(z)+e[-1]-e[+1]\big ). \end{aligned}$$

Thus the desired inequality \(Q(z+t\tilde{z}(z))\le e[\rho +t]\) holds true for \(\left| {t}\right| >0\) sufficiently small in the case where \(Q(z)<e[\rho ]\), but also in the case where \(Q(z)=e[\rho ]\) and \(T_+(z)-e[+1]=T_{-}(z)-e[-1]\).

Therefore it remains to treat the last case \(Q(z)=e[\rho ]\) and \(T_+(z)-e[+1]\ne T_{-}(z)-e[-1]\). Note that this implies \(\lambda _{\mathrm{min}}(M(z))<e[\rho ]\), since otherwise \(e[\rho ]=\lambda _{\mathrm{max}}(M(z))=\lambda _{\mathrm{min}}(M(z))\) yields \(M(z)^\circ =0\) and thus

$$\begin{aligned} e[\rho ]=Q(z)=\frac{1-\rho }{2}T_{-}(z)+\frac{1+\rho }{2}T_+(z). \end{aligned}$$

However, using that \(T_\pm (z)\le e[\pm 1]\), this equality can only hold if \(T_\pm (z)= e[\pm 1]\), which is excluded in the considered case.

Let us assume \(T_{-}(z)-e[-1]> T_+(z)-e[+1]\), the other case is treated similarly. We consider Euler directions from Lemma 3.10 such that \(\bar{m}=\bar{v}\), i.e. \(\bar{z}=\bar{z}_{+1}\) associated with \((\bar{v},\bar{\sigma })\) to be chosen. By said Lemma such Euler directions preserve \(T_{-}\), i.e., \(T_{-}(z+t\bar{z}_{+1})=T_{-}(z)\le e[-1]\) for all \(t\in {\mathbb {R}}\).

Once again proceeding as in [17] or [21, Lemma 4.8], one may easily prove that there exists such an Euler direction which does not affect the maximal eigenvalue of M(z), i.e. such that \(Q(z+t\bar{z})=Q(z)=e[\rho ]\) for small enough \(\left| {t}\right| \). The last condition needed for \(z+t\bar{z}_{+1} \in \overline{U}\) follows from the continuity of \(T_+\), i.e., for all \(\left| {t}\right| \) small enough one has \(T_+(z+t\bar{z})-e[+1] <T_{-}(z)-e[-1]\le 0.\)

\(\square \)

Proof of Proposition 3.6

From Lemma 3.8 one obtains \(K^\Lambda \subset K^{co}\subset \overline{U}\), while Lemma 3.11 implies that the \(\Lambda \)-extreme points of the set \(\overline{U}\), which is compact up to the p-component, are contained in K. The inclusion \(\overline{U}\subset K^\Lambda \) follows from the Krein–Milman theorem for \(\Lambda \)-convex sets, cf. [25, Lemma 4.16]. \(\square \)

3.4 Continuity of constraints

We have the following result regarding the continuity of the nonlinear constraints \(K_{(x,t)}\), given the continuity of the defining function \(e(x,t)[\rho ]\). This serves to have a set of subsolutions which is bounded in \(L^2(\mathscr {D})\), where \(\mathscr {D}:=\Omega \times (0,T)\).

Lemma 3.12

Let \(\mathscr {U}\subset \mathscr {D}\) be open and assume that the map \(\mathscr {D}\times {\mathbb {R}}\rightarrow {\mathbb {R}}\), \((x,t,r)\mapsto e(x,t)[r]\) is continuous and bounded on \(\mathscr {U}\times [-1,1]\). Then the map \((x,t)\mapsto \pi (K_{(x,t)})\) is continuous and bounded on \(\mathscr {U}\) with respect to the Hausdorff metric \(d_{{\mathcal {H}}}\).

Proof

The boundedness of \(\bigcup _{(x,t)\in \mathscr {U}}\pi (K_{(x,t)})\) follows from Lemma 3.11 and the boundedness of e.

Concerning the continuity let us fix \(y:=(x,t)\in \mathscr {U}\) and \(\varepsilon \in (0,1)\). In order to prove \(d_{{\mathcal {H}}}(\pi (K_y),\pi (K_{y'}))<\varepsilon \) for all \(y'=(x',t')\in B_\delta (y)\subset \mathscr {U}\) for a suitable \(\delta =\delta (\varepsilon ,y)>0\) we will use [12, Lemma 3.1] saying that \(d_{{\mathcal {H}}}(\pi (K_y), \pi (K_{y'}))<\varepsilon \) holds true provided for any \(\pi (z)\in \pi (K_y)\) there exists \(\pi (z')\in \pi (K_{y'})\cap B_\varepsilon (\pi (z))\) and vice versa.

First of all observe that by the continuity of e there exists \(\delta \in (0,\varepsilon )\) such that

$$\begin{aligned} \left| {e(y)[\pm 1]-e(y')[\pm 1]}\right|<\varepsilon , \quad \left| \left( ne(y)[\pm 1]\right) ^{1/2} -\left( n e(y')[\pm 1]\right) ^{1/2} \right| <\varepsilon , \end{aligned}$$
(3.13)

for any \(y'\in B_\delta (y)\subset \mathscr {U}\). Let now

$$\begin{aligned} z=(\rho ,v,\rho v, v\otimes v-e(y)[\rho ]{{\,\mathrm{id}\,}},p)\in K_y, \end{aligned}$$

with \(\rho \in \{-1,1\}\) and \(|v|^2=ne(y)[\rho ]\). It follows that \(v=\left( ne(y)[\rho ]\right) ^{1/2}b,\) for some \(b\in S^{n-1}\). For \(y'\in B_\delta (y)\) we define

$$\begin{aligned} z':=(\rho ,v',\rho v', v'\otimes v'-e(y')[\rho ]{{\,\mathrm{id}\,}},p) \end{aligned}$$

by setting \( v':=\left( ne(y')[\rho ]\right) ^{1/2}b\). Note that \(z'\in K_{y'}\).

Furthermore, from (3.13) it follows that

$$\begin{aligned} |v-v'|<\varepsilon ,\quad |m-m'|<\varepsilon ,\quad |\sigma -\sigma '|<(n+1)\varepsilon . \end{aligned}$$

This way we have shown that for any \(y'\in B_\delta (y)\) and any \(z\in K_y\) there exists \(z'\in K_{y'}\cap B_{c\varepsilon }(z)\) for some \(c>0\) depending only on the dimension n. Using the symmetry of this construction, one can similarly prove that for any \(z'\in K_{y'}\) there exists \(z\in K_y\) such that \(|z-z'|<c\varepsilon \). As illustrated above we then conclude \(d_{{\mathcal {H}}}(\pi (K_y),\pi (K_{y'})) <c\varepsilon \) via [12, Lemma 3.1]. \(\square \)
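The construction of \(z'\) from \(z\) can be illustrated numerically. The following Python sketch is not part of the proof: the dimension, the two values standing in for \(e(y)[\rho ]\) and \(e(y')[\rho ]\), and the direction b are arbitrary sample choices, and the Frobenius norm is assumed for the matrix component.

```python
import math

n = 3                          # sample dimension (assumption)
e_y, e_yp = 0.50, 0.52         # stand-ins for e(y)[rho] and e(y')[rho] (assumptions)
rho = 1.0
b = [1 / math.sqrt(3)] * 3     # a unit vector b in S^{n-1}

def scale(c, w):
    return [c * wi for wi in w]

def sigma(v, e):
    # the matrix v (x) v - e * id as a nested list
    return [[v[i] * v[j] - (e if i == j else 0.0) for j in range(n)]
            for i in range(n)]

v = scale(math.sqrt(n * e_y), b)     # |v|^2 = n e(y)[rho]
vp = scale(math.sqrt(n * e_yp), b)   # same direction b, rescaled for y'

dist_v = math.dist(v, vp)
dist_m = math.dist(scale(rho, v), scale(rho, vp))
s, sp = sigma(v, e_y), sigma(vp, e_yp)
dist_sigma = math.sqrt(sum((s[i][j] - sp[i][j]) ** 2
                           for i in range(n) for j in range(n)))

# eps plays the role of the bound in (3.13) for these sample values
eps = 1.01 * max(abs(e_y - e_yp),
                 abs(math.sqrt(n * e_y) - math.sqrt(n * e_yp)))
```

For these sample values the three distances stay below \(\varepsilon \), \(\varepsilon \) and \((n+1)\varepsilon \) respectively, in line with the estimate above.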

3.5 Conclusion

We have now collected all the ingredients for the proof of Theorem 2.5, which follows by the known convex integration procedures in the Tartar framework [16, 17] and its refinements [6, 12]. We refrain from formulating another version of the Tartar framework exactly tailored to our needs and instead only point out the small modifications that need to be done in the existing convex integration theorems in order to conclude Theorem 2.5.

We begin with the functional setup. Let \(\mathscr {D}:=\Omega \times (0,T)\). Fix a function \(e:\mathscr {D}\times [-1,1]\rightarrow {\mathbb {R}}\), a subsolution \(\hat{z}=(\hat{\rho },\hat{v},\hat{m},\hat{\sigma }, \hat{p})\) with initial data \((\rho _0,v_0)\) and mixing zone \(\mathscr {U}\), as well as an error function \(\delta :[0,T]\rightarrow {\mathbb {R}}\) as stated in Theorem 2.5. Define \(X_0\) to be the set of all functions \(\pi (z)=(\rho ,v,m,\sigma )\), such that

  • \(z=(\rho ,v,m,\sigma ,p)\) is a subsolution for e, \((\rho _0,v_0)\) and with the same mixing zone \(\mathscr {U}\), in the sense of Definition 2.3,

  • \(z=\hat{z}\) a.e. on \(\mathscr {D}{\setminus }\mathscr {U}\),

  • there exists \(C(z)\in (0,1)\), such that for all \(t\in [0,T]\) there holds

    $$\begin{aligned} \left| {\int _\Omega \left( \frac{n}{2}e_1(x,t)+gAx_n\right) (\hat{\rho }(x,t)-\rho (x,t))\,dx}\right| \le C(z)\delta (t). \end{aligned}$$
    (3.14)

Recall from Sect. 2.1 that \(e(x,t)[r]=e_0(x,t)+re_1(x,t)\) with \(L^\infty \) functions \(e_0,e_1\), where \(e_1\) is additionally of class \({\mathcal {C}}^0([0,T];L^2(\Omega ))\).

Next we will equip \(X_0\) with a suitable metric. Recall from Remark 2.4 that for any \(\pi (z)\in X_0\) there holds \(\rho \in {\mathcal {C}}^0([0,T];L^2_w(\Omega ))\). Moreover, for every element from \(X_0\) there holds \(\left\| {\rho (\cdot ,t)}\right\| ^2_{L^2(\Omega )} \le \left| {\Omega }\right| \), \(t\in [0,T]\) and \(\left\| {(v,m,\sigma )}\right\| _{L^2 (\mathscr {D})}^2\le c\left| {\Omega }\right| T\) for a constant c depending only on \(\left\| {e_0}\right\| _{L^\infty (\mathscr {D})}, \left\| {e_1}\right\| _{L^\infty (\mathscr {D})}\) and the dimension n. This is due to Lemma 3.11.

Thus we can find two bounded closed balls \(B^{(1)}\) contained in \(L^2(\Omega )\) and \(B^{(2)}\) contained in \(L^2(\mathscr {D};{\mathbb {R}}^n\times {\mathbb {R}}^n\times {\mathcal {S}}_0^{n\times n})\), such that every function \(\pi (z)\in X_0\) satisfies \(\rho (\cdot ,t)\in B^{(1)}\), \(t\in [0,T]\), \((v,m,\sigma )\in B^{(2)}\). As in [6, 17] let \(d^{(i)}\), \(i=1,2\), be a metric on \(B^{(i)}\) metrizing the corresponding weak \(L^2\)-topology and define for \(\pi (z),\pi (z')\in X_0\) the metric

$$\begin{aligned} d_X(\pi (z),\pi (z')):=\sup \left\{ \,{\sup _{t\in [0,T]}d^{(1)} \big (\rho (\cdot ,t),\rho '(\cdot ,t)\big ),d^{(2)} \big ((v,m,\sigma ),(v',m',\sigma ')\big )}\,\right\} . \end{aligned}$$

Finally let X be the closure of \(X_0\) in \({\mathcal {C}}^0([0,T];(B^{(1)},d^{(1)}))\times (B^{(2)},d^{(2)})\) with respect to the metric \(d_X\). Then X is a complete metric space with \(d_X(\pi (z_j),\pi (z))\rightarrow 0\) if and only if \(\rho _j\rightarrow \rho \) in \({\mathcal {C}}^0([0,T];L^2_w(\Omega ))\) and \((v_j,m_j,\sigma _j)\rightharpoonup (v,m,\sigma )\) weakly in \(L^2(\mathscr {D};{\mathbb {R}}^n\times {\mathbb {R}}^n\times {\mathcal {S}}_0^{n\times n})\). Concerning notation we again denote elements from X by \(\pi (z)\).

Note that the \(d_X\) topology is stronger than the topology coming from simply metrizing the weak topology on a bounded closed ball of \(L^2(\mathscr {D};\pi (Z))\). In consequence the functional \(I:X\rightarrow {\mathbb {R}}\),

$$\begin{aligned} I(\pi (z)):=\int _{\mathscr {D}}\left| {\pi (z(x,t))}\right| ^2\,d(x,t) \end{aligned}$$
(3.15)

is still a Baire-1 functional, cf. [12, Section 2.3]. We also define \(J:X\rightarrow {\mathbb {R}}\),

$$\begin{aligned} J(\pi (z)):=\int _{\mathscr {D}}{{\,\mathrm{dist}\,}}\big (\pi (z(x,t)), \pi (K_{(x,t)})\big )^2\,d(x,t). \end{aligned}$$
(3.16)

Note that J is continuous with respect to the strong \(L^2(\mathscr {D};\pi (Z))\) topology.

Lemma 3.13

(Perturbation Lemma) Let \(\alpha >0\). There exists \(\beta >0\), such that for every \(\pi (z)\in X_0\) with \(J(\pi (z))\ge \alpha \) there exists a sequence \((\pi (z_k))_{k\in {\mathbb {N}}}\subset X_0\) with \(d_X(\pi (z_k),\pi (z))\rightarrow 0\) and such that for all \(k\in {\mathbb {N}}\) there holds

$$\begin{aligned} \int _{\mathscr {D}}\left| {\pi (z_k(x,t))-\pi (z(x,t))}\right| ^2\,d(x,t)\ge \beta . \end{aligned}$$
(3.17)

Proof

If we neglect for now property (3.14) in the definition of \(X_0\), it follows as in [12, Lemma 2.4] from Lemmas 3.1, 3.2, 3.12 and Corollary 3.5 that there exists \(\beta (\alpha )>0\) and a sequence \((\pi (z_k))_{k\in {\mathbb {N}}}\subset X_0{\setminus }\{(3.14)\}\) satisfying (3.17) and \(\pi (z_k)\rightharpoonup \pi (z)\) weakly in \(L^2(\mathscr {D};\pi (Z))\). At this point the only difference that prevents us from citing [12, Lemma 2.4] literally is the projection \(\pi \), but as in [21, Lemma 5.3] the projection can be included by canonical modifications.

It therefore remains to improve the convergence of the \(\rho \)-component from \(\rho _k\rightharpoonup \rho \) weakly in \(L^2(\mathscr {D})\) to \(\rho _k\rightarrow \rho \) in \({\mathcal {C}}^0([0,T];L_w^2(\Omega ))\) and to show that the functions \((\pi (z_k))_{k\in {\mathbb {N}}}\) satisfy (3.14) for all k big enough. However, the improved convergence follows from Remark 3.3 by using cylinders instead of balls in the proof of [12, Lemma 2.4]. Finally, the fact that the sequence \((\pi (z_k))_{k\in {\mathbb {N}}}\) satisfies property (3.14) for k big enough follows as in Step 3 of the proof of [6, Proposition 3.1]. Indeed, since \(z\in X_0\), it is enough to fix \(C'(z)\in (0,1-C(z))\) and to show

$$\begin{aligned} \left| {\int _\Omega \left( \frac{n}{2}e_1(x,t)+gAx_n\right) (\rho _k(x,t) -\rho (x,t))\,dx}\right| \le C'(z)\delta (t) \end{aligned}$$

for all \(t\in [0,T]\) and all k sufficiently large. Since by construction \(\rho _k=\rho \) outside a compact subset of the mixing zone \(\mathscr {U}\), hence outside a set contained in \([t_0,t_1]\times \Omega \) for some \(0<t_0<t_1<T\), it is enough to show

$$\begin{aligned} \forall t\in [t_0,t_1]:\quad \left| {\int _\Omega f(x,t)(\rho _k(x,t) -\rho (x,t))\,dx}\right| \le C'(z)\delta _0, \end{aligned}$$

where \(f(x,t):=\frac{n}{2}e_1(x,t)+gAx_n\) and \(\delta _0:=\inf \left\{ \,{\delta (t)>0:t\in [t_0,t_1]}\,\right\} >0\). But the latter inequality holds true for big enough k due to the uniform continuity of the map \([0,T]\ni t\mapsto f(\cdot ,t)\in L^2(\Omega )\), the uniform bound on \(\left\| {\rho _k(\cdot ,t)}\right\| _{L^2(\Omega )}\) and the convergence \(\rho _k\rightarrow \rho \) in \({\mathcal {C}}^0([0,T];L^2_w(\Omega ))\). \(\square \)

Proof of Theorem 2.5

Having Lemma 3.13 at hand we can prove as in [12] or [21] that \(J^{-1}(0)\) is contained in the set of continuity points of I, where I and J were defined in (3.15), (3.16). Since I is Baire-1, this shows that \(J^{-1}(0)\) is residual in \((X,d_X)\). Observe also that if \(\pi (z)\in J^{-1}(0)\), then \((\rho ,v)\) is a weak solution of (1.1), (1.2) satisfying properties a) and b) of Theorem 2.5.

Concerning property c) of Theorem 2.5, approximation by elements from \(X_0\) with respect to \(d_X\) shows that any element \(\pi (z)\) from X satisfies

$$\begin{aligned} \left| {\int _\Omega \left( \frac{n}{2}e_1(x,t)+gAx_n\right) (\hat{\rho }(x,t) -\rho (x,t))\,dx}\right| \le \delta (t) \end{aligned}$$

for all \(t\in [0,T]\).

Finally, property d) of Theorem 2.5 is a consequence of [6, Corollary 3.1]. This finishes the proof of Theorem 2.5. \(\square \)

4 Subsolutions

Let us turn to the construction of subsolutions on the n-dimensional box \(\Omega :=(0,1)^{n-1}\times (-L,L)\), \(L>0\) with Rayleigh–Taylor initial data (1.3). Let \(T>0\) and \(\mathscr {D}:=\Omega \times (0,T)\). Neglecting the admissibility, recall from Definition 2.3 that a subsolution \(z=(\rho ,v,m,\sigma ,p)\) is a weak solution of the linear system (2.3) on \(\mathscr {D}\) with boundary data (2.4) which is continuous on an open subset \(\mathscr {U}\subset \mathscr {D}\) satisfying

$$\begin{aligned}&\rho \in (-1,1),\quad \frac{\left| {m\pm v}\right| ^2}{n(1\pm \rho )^2}<e[\pm 1],\nonumber \\&\lambda _{\mathrm{max}}\left( \frac{v\otimes v-\rho (m\otimes v+v\otimes m) +m\otimes m}{1-\rho ^2}-\sigma \right) <e[\rho ] \end{aligned}$$
(4.1)

there, where \(e:\mathscr {D}\times {\mathbb {R}}\rightarrow {\mathbb {R}}\), \((x,t,r)\mapsto e(x,t)[r]\) is continuous on \(\mathscr {U}\) and affine with respect to r. Outside of \(\mathscr {U}\) the conditions \(\rho \in \left\{ \,{\pm 1}\,\right\} \), \(v\otimes v-\sigma =e[\rho ]{{\,\mathrm{id}\,}}_{{\mathbb {R}}^n}\), \(m=\rho v\) are required to hold almost everywhere.

Due to the heuristic argument in Sect. 2.2, we consider \(e=e_\varepsilon \) to be of the form

$$\begin{aligned} e_\varepsilon (x,t)[r]=\tilde{e}_\varepsilon (x,t)-\varepsilon gA x_n r \end{aligned}$$

with \(\tilde{e}_\varepsilon :\mathscr {D}\rightarrow {\mathbb {R}}\) continuous on \(\mathscr {U}\) and \(\varepsilon \in \left[ 0,\frac{2}{n}\right] \), such that Theorem 2.5 will produce turbulent solutions to (1.1), (1.2), (1.3) with local energy given by

$$\begin{aligned} {\mathcal {E}}_{sol}(x,t)=\frac{n}{2}\tilde{e}_\varepsilon (x,t) +\left( 1-\frac{n}{2}\varepsilon \right) \rho _{sol}(x,t)gAx_n. \end{aligned}$$

4.1 Self-similar subsolutions

In this section we prove Lemma 2.8. Recall the definitions of \({\mathcal {F}}\) and \({\mathcal {A}}\) from (2.13), respectively (2.14).

Proof of Lemma 2.8

For \(f\in {\mathcal {F}}\) define \(F:[-1,1]\rightarrow {\mathbb {R}}\),

$$\begin{aligned} F(y):=\int _{-1}^yf'(s)s\,ds. \end{aligned}$$

For any choice of a profile \(f\in {\mathcal {F}}\) and a growth rate \(a\in {\mathcal {A}}\) one can check that \(z=(\rho ,v,m,\sigma ,p):\mathscr {D}\rightarrow Z\) defined by \(v\equiv 0\), \(\rho (x,t)=1\), \(m(x,t)=0\) for \(x_n\ge a(t)\), \(\rho (x,t)=-1\), \(m(x,t)=0\) for \(x_n\le -a(t)\) and

$$\begin{aligned} \rho (x,t)=f\left( \frac{x_n}{a(t)}\right) ,\quad m(x,t) =\dot{a}(t)F\left( \frac{x_n}{a(t)}\right) e_n \end{aligned}$$

for \(x_n\in (-a(t),a(t))\), as well as \(\sigma (x,t)=0\) for \(\left| {x_n}\right| \ge a(t)\),

$$\begin{aligned} \sigma (x,t)&=\frac{\left( m(x,t)\otimes m(x,t)\right) ^\circ }{1-\rho (x,t)^2} \quad {\text { for }} \left| {x_n}\right| <a(t),\nonumber \\ p(x,t)&=-\sigma _{nn}(x,t)-gA\int _{-L}^{x_n}\rho (\tilde{x}_n,t) \,d\tilde{x}_n \end{aligned}$$
(4.2)

are continuous on \(\overline{\mathscr {D}}{\setminus } \big ({\mathbb {R}}^{n-1}\times \{0\}\times \{0\}\big )\), piecewise \({\mathcal {C}}^1\), and satisfy (2.3), (1.3), and also (2.4) as long as \(a(t)\le L\) for all \(t\in (0,T)\). The continuity of m is a consequence of the symmetry of f, while the continuity of \(\sigma \) follows by an expansion at the points \(x_n=\pm a(t)\) and the condition \(f'(\pm 1)> 0\).

Once the construction of the subsolution is finished the set

$$\begin{aligned} \mathscr {U}:=\left\{ \,{(x,t)\in \mathscr {D}:x_n\in (-a(t),a(t))}\,\right\} \end{aligned}$$

will be the mixing zone. Concerning the pointwise constraints we define

$$\begin{aligned} \tilde{e}_\varepsilon (x,t)&:=\max \left\{ \,{\frac{m_n(x,t)^2}{n(1+\rho (x,t))^2} +\varepsilon gAx_n,\frac{m_n(x,t)^2}{n(1-\rho (x,t))^2}-\varepsilon gAx_n}\,\right\} \nonumber \\&\quad +\left( 1-\rho (x,t)^2\right) \delta (x,t) \end{aligned}$$
(4.3)

for \((x,t)\in \mathscr {U}\) and \(\tilde{e}_\varepsilon (x,t) =\varepsilon gA \left| {x_n}\right| \) for \((x,t)\in \overline{\mathscr {D}} {\setminus }\mathscr {U}\). Here \(\delta :\mathscr {D}\rightarrow (0,+\infty )\) is a continuous, even, positive and typically small function guaranteeing the inequalities (4.1) to hold in a strict sense. Indeed the first three conditions in (4.1) hold by definition of \(\rho \) and \(\tilde{e}_\varepsilon \). For the last inequality we have

$$\begin{aligned} \lambda _{\mathrm{max}}\left( \frac{m\otimes m}{1-\rho ^2}-\sigma \right)&=\frac{\left| {m}\right| ^2}{n(1-\rho ^2)}=\frac{1+\rho }{2} \frac{\left| {m}\right| ^2}{n(1+\rho )^2}+\frac{1-\rho }{2} \frac{\left| {m}\right| ^2}{n(1-\rho )^2}\\&< \frac{1+\rho }{2}e_\varepsilon [+1]+\frac{1-\rho }{2}e_\varepsilon [-1] =e_\varepsilon [\rho ]. \end{aligned}$$

Outside of \(\mathscr {U}\) it is clear that \(\rho =1\) on \(\left\{ \,{x_n\ge a(t)}\,\right\} \), \(\rho =-1\) on \(\left\{ \,{x_n\le -a(t)}\,\right\} \), \(m=0=\rho v\) and \(v\otimes v-\sigma =0=\tilde{e}_\varepsilon -\varepsilon gA \left| {x_n}\right| =e_\varepsilon [\rho ]\). This concludes the proof of Lemma 2.8. \(\square \)
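The eigenvalue identity used for the last inequality in (4.1) can be checked in exact arithmetic. The sketch below is only an illustration; the dimension and the sample values of \(\rho \) and \(m=m_ne_n\) are assumptions, and \(v\equiv 0\) is used as in the construction above.

```python
from fractions import Fraction as Fr

n = 3                               # sample dimension (assumption)
rho = Fr(1, 4)                      # sample density value in (-1, 1)
m = [Fr(0), Fr(0), Fr(2, 5)]        # sample momentum m = m_n e_n
m_sq = sum(mi * mi for mi in m)     # |m|^2
w = 1 - rho**2

def delta(i, j):
    return 1 if i == j else 0

# sigma = (m (x) m)^circ / (1 - rho^2), the trace-free part as in (4.2)
sigma = [[(m[i] * m[j] - m_sq / n * delta(i, j)) / w for j in range(n)]
         for i in range(n)]

# the matrix inside lambda_max in (4.1), with v = 0
M = [[m[i] * m[j] / w - sigma[i][j] for j in range(n)] for i in range(n)]

lam = m_sq / (n * w)                # claimed value of lambda_max
convex = ((1 + rho) / 2 * m_sq / (n * (1 + rho)**2)
          + (1 - rho) / 2 * m_sq / (n * (1 - rho)**2))
```

Here M turns out to be a multiple of the identity, so its largest eigenvalue is the common diagonal entry, and this entry agrees with the convex combination of the two quantities bounded by \(e_\varepsilon [\pm 1]\).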

4.2 Admissibility and maximal initial energy dissipation

Instead of investigating all admissible subsolutions emanating from Sect. 4.1, we will focus on the one that is selected by asking for maximal initial energy dissipation.

For \((f,a,\varepsilon )\in {\mathcal {F}}\times {\mathcal {A}}\times \left[ 0,\frac{2}{n}\right] \) observe that the total energy at time \(t>0\) of the induced subsolution can be chosen arbitrarily close to \(E_{f,a,\varepsilon }(t)\) defined in (2.15), which for admissibility has to be less than the initial energy \(E(0)=\int _\Omega gA\left| {x_n}\right| \,dx\). In fact \(E_{f,a,\varepsilon }(t)\) can be obtained from \(\int _\Omega \frac{n}{2}\tilde{e}_\varepsilon (x,t)+\left( 1-\frac{n}{2}\varepsilon \right) gAx_n\rho (x,t)\,dx\) with \(\tilde{e}_\varepsilon \) defined in (4.3) by letting \(\delta \rightarrow 0\) in \(L^\infty ((0,T);L^1(\Omega ))\).

Using this, the definitions of \(\rho \), m from Sect. 4.1, the transformation \(x_n=a(t)y\) and the symmetry of f one sees that the difference of the energies can be computed by the following integrals

$$\begin{aligned}&E_{f,a,\varepsilon }(t)-E(0)\\&\quad =2\int _0^1\max \left\{ \,{\frac{a(t)\dot{a}(t)^2F(y)^2}{2(1+f(y))^2} +\frac{n}{2}\varepsilon gAa(t)^2y,\frac{a(t)\dot{a}(t)^2 F(y)^2}{2(1-f(y))^2}-\frac{n}{2}\varepsilon gAa(t)^2y}\,\right\} \,dy\\&\qquad +a(t)^2gA\int _0^1(2-n\varepsilon )yf(y)-2y\,dy. \end{aligned}$$

Concerning the well-definedness observe again that for all \(f\in {\mathcal {F}}\) the quotient \(\frac{F(y)}{1-f(y)}\) has a finite limit as \(y\rightarrow 1\).

For a given profile \(f\in {\mathcal {F}}\) and a growth rate \(a\in {\mathcal {A}}\) one can simply check the admissibility of the induced self-similar subsolution by hand via the above formula.

Example 4.1

If \(T\le \sqrt{\frac{3L}{gA}}\), the choices \(f(y)=y\), \(a(t)=\frac{1}{3}gAt^2\) and \(\varepsilon =\frac{2}{3n}\) give rise to a subsolution on \(\Omega \times (0,T)\) with

$$\begin{aligned} E_{f,a,\varepsilon }(t)-E(0)=-\frac{g^3A^3}{81}t^4. \end{aligned}$$

In particular this implies that the subsolution is admissible for small \(\delta (x,t)\).
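The value in Example 4.1 can be reproduced numerically from the integral formula for \(E_{f,a,\varepsilon }(t)-E(0)\) above. The following Python sketch uses a midpoint rule; the constants g, A, n and the time t are sample values chosen only for illustration, and the function name is hypothetical.

```python
import math

def energy_difference(t, g=9.81, A=0.5, n=3, steps=20000):
    """E_{f,a,eps}(t) - E(0) for f(y) = y, a(t) = g*A*t**2/3, eps = 2/(3n)."""
    eps = 2.0 / (3.0 * n)
    a = g * A * t**2 / 3.0           # growth rate a(t)
    a_dot = 2.0 * g * A * t / 3.0    # its time derivative

    def F(y):
        # F(y) = int_{-1}^{y} f'(s) s ds for f(y) = y
        return 0.5 * (y * y - 1.0)

    def g_plus(y):
        return (a * a_dot**2 * F(y)**2 / (2 * (1 + y)**2)
                + 0.5 * n * eps * g * A * a**2 * y)

    def g_minus(y):
        return (a * a_dot**2 * F(y)**2 / (2 * (1 - y)**2)
                - 0.5 * n * eps * g * A * a**2 * y)

    h = 1.0 / steps
    total = 0.0
    for k in range(steps):           # midpoint rule on (0, 1); y = 1 is never hit
        y = (k + 0.5) * h
        kinetic = 2.0 * max(g_plus(y), g_minus(y))
        potential = a**2 * g * A * ((2.0 - n * eps) * y * y - 2.0 * y)
        total += (kinetic + potential) * h
    return total

t = 0.3                              # sample time (assumption)
numeric = energy_difference(t)
closed_form = -(9.81 * 0.5)**3 * t**4 / 81.0
```

Up to quadrature error, `numeric` should agree with `closed_form`, i.e. with \(-\frac{g^3A^3}{81}t^4\).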

Remark 4.2

The released potential energy of the subsolution above at time \(t\in [0,T)\) is given by

$$\begin{aligned} \int _{\Omega }gAx_n\rho (x,t)\,dx-E(0)=-\frac{g^3A^3}{27}t^4. \end{aligned}$$

Therefore the ratio between dissipated and released energy is \(\frac{1}{3}\).
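The ratio in Remark 4.2 can also be checked by direct quadrature of the potential energy. In the sketch below the constants g, A, L and the time t are sample values (assumptions); the cross-section \((0,1)^{n-1}\) has unit measure, so only the \(x_n\)-integral remains.

```python
import math

g, A, L = 9.81, 0.5, 1.0                  # sample constants (assumptions)
t = 0.5 * math.sqrt(3 * L / (g * A))      # a time with a(t) <= L
a = g * A * t**2 / 3.0                    # half-width of the mixing zone

def rho(x_n):
    # profile f(y) = y inside the mixing zone, +1 above and -1 below it
    return max(-1.0, min(1.0, x_n / a))

steps = 40000
h = 2 * L / steps
released = 0.0
for k in range(steps):                    # midpoint rule on (-L, L)
    x_n = -L + (k + 0.5) * h
    released += g * A * (x_n * rho(x_n) - abs(x_n)) * h

dissipated = -(g * A)**3 * t**4 / 81.0    # value taken from Example 4.1
ratio = dissipated / released
```

The computed `released` matches \(-\frac{g^3A^3}{27}t^4\) up to quadrature error, and `ratio` is close to \(\frac{1}{3}\).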

Besides being a simple example, these choices for \(f,a,\varepsilon \) turn out to maximize the initial energy dissipation.

Recall the functionals \(J_k\), \(k=0,\ldots ,4\) from (2.16). Since \(a(0)=0\), there clearly holds \(J_0(f,a,\varepsilon )=0\). We are now in position to prove Theorem 2.9.

Proof of Theorem 2.9

In the formula for the energy difference let us abbreviate the two terms among which the maximum is taken, i.e., set

$$\begin{aligned} G^+_{f,a,\varepsilon }(y,t)&:=\frac{a(t)\dot{a}(t)^2F(y)^2}{2(1+f(y))^2} +\frac{n}{2}\varepsilon gAa(t)^2y,\nonumber \\ G^-_{f,a,\varepsilon }(y,t)&:=\frac{a(t)\dot{a}(t)^2F(y)^2}{2(1-f(y))^2} -\frac{n}{2}\varepsilon gAa(t)^2y. \end{aligned}$$
(4.4)

Estimating the maximum from below by the convex combination

$$\begin{aligned} \max \left\{ \,{G^+_{f,a,\varepsilon }(y,t),G^-_{f,a,\varepsilon }(y,t)}\,\right\}&\ge \frac{1+f(y)}{2}G^+_{f,a,\varepsilon }(y,t) +\frac{1-f(y)}{2}G^-_{f,a,\varepsilon }(y,t)\nonumber \\&=\frac{a(t)\dot{a}(t)^2F(y)^2}{2(1-f(y)^2)}+\frac{n}{2} f(y)\varepsilon gA a(t)^2y \end{aligned}$$
(4.5)

yields

$$\begin{aligned} J_k(f,a,\varepsilon ) \ge \lim _{t\rightarrow 0} \left( a(t)\dot{a}(t)^2\int _0^1\frac{F(y)^2}{1-f(y)^2} \,dy-2a(t)^2gA\int _0^1(1-f(y))y\,dy\right) t^{-k}. \end{aligned}$$

Observe that

$$\begin{aligned} I_1(f):=\int _0^1\frac{F(y)^2}{1-f(y)^2}\,dy>0,\quad I_2(f) :=\int _0^1(1-f(y))y\,dy>0, \end{aligned}$$
(4.6)

such that the required admissibility implies

$$\begin{aligned} 0\ge J_1(f,a,\varepsilon )\ge \dot{a}(0)^3I_1(f)\ge 0, \end{aligned}$$

and therefore \(\dot{a}(0)=0\), \(J_1(f,a,\varepsilon )=0\). Since now \(a(t)=\frac{1}{2}\ddot{a}(0)t^2+o(t^2)\) as \(t\rightarrow 0\) the admissibility also implies \(J_2(f,a,\varepsilon )=J_3(f,a, \varepsilon )=0\). This proves the first part of the Theorem.

The lowest order for which the initial energy dissipation rate is not necessarily vanishing is 4. There holds

$$\begin{aligned} J_4(f,a,\varepsilon )\ge \frac{1}{2}\ddot{a}(0)^3I_1(f) -\frac{1}{2}\ddot{a}(0)^2gAI_2(f)=:\tilde{J}(f,\ddot{a}(0)). \end{aligned}$$
(4.7)

In Lemma 4.3 below we will show that the functional \(\tilde{J}:{\mathcal {F}}\times [0,+\infty )\rightarrow {\mathbb {R}}\) has a unique global minimum in \(f(y)=y\) and \(\ddot{a}(0)=\frac{2}{3}gA\).

It follows that for any \((f,a,\varepsilon )\in {\mathcal {F}}\times {\mathcal {A}}\times \left[ 0,\frac{2}{n}\right] \) leading to an admissible subsolution there holds

$$\begin{aligned} J_4(f,a,\varepsilon )\ge \tilde{J}\left( {{\,\mathrm{id}\,}},\frac{2}{3}gA\right) =-\frac{g^3A^3}{81}. \end{aligned}$$

Note that \(I_1({{\,\mathrm{id}\,}})=\frac{1}{6}\), \(I_2({{\,\mathrm{id}\,}})=\frac{1}{6}\). It remains to check that this lower bound is achieved for \(f(y)=y\), any \(a\in {\mathcal {A}}\) with \(a(t)=\frac{1}{3}gAt^2+o(t^2)\) and \(\varepsilon =\frac{2}{3n}\). This is a consequence of the fact that for this choice the two limits

$$\begin{aligned} \lim _{t\rightarrow 0}\frac{G^\pm _{f,a,\varepsilon }(y,t)}{t^4} =\frac{\ddot{a}(0)^3F(y)^2}{4(1\pm f(y))^2}\pm \frac{n}{8} \varepsilon gA\ddot{a}(0)^2y=\frac{g^3A^3}{54}(1+y^2), \end{aligned}$$

with \(G^\pm _{f,a,\varepsilon }\) defined in (4.4), coincide. Therefore instead of an inequality we actually have equality when dividing (4.5) by \(t^4\) and passing to the limit \(t\rightarrow 0\). Thus we also have equality in (4.7), which means

$$\begin{aligned} J_4\left( {{\,\mathrm{id}\,}},\frac{1}{3}gAt^2+o(t^2),\frac{2}{3n}\right) =\tilde{J}\left( {{\,\mathrm{id}\,}},\frac{2}{3}gA\right) =-\frac{g^3A^3}{81}. \end{aligned}$$

The uniqueness of the minimizer follows from the uniqueness of the minimizer of \(\tilde{J}\) and the fact that for \(f(y)=y\), \(a(t)=\frac{1}{3}gAt^2+o(t^2)\), any choice of \(\varepsilon \ne \frac{2}{3n}\) leads to a strict inequality when estimating the maximum by the convex combination in the limit \(t\rightarrow 0\) of \(\frac{(4.5)}{t^4}\). \(\square \)
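The coincidence of the two limits of \(G^\pm _{f,a,\varepsilon }/t^4\) for \(\varepsilon =\frac{2}{3n}\), and its failure for other \(\varepsilon \), can be verified in exact arithmetic. The sketch below works in units with \(gA=1\) (an assumption made only to keep the numbers rational); the function name and the sample points are hypothetical.

```python
from fractions import Fraction as Fr

def limits(y, n, eps, a_dd=Fr(2, 3)):
    """Limits of G^+/t^4 and G^-/t^4 as t -> 0 for f = id, in units gA = 1.

    a_dd stands for the second derivative of a at t = 0.
    """
    F = (y * y - 1) / 2                       # F(y) for f(y) = y
    plus = a_dd**3 * F**2 / (4 * (1 + y)**2) + Fr(n, 8) * eps * a_dd**2 * y
    minus = a_dd**3 * F**2 / (4 * (1 - y)**2) - Fr(n, 8) * eps * a_dd**2 * y
    return plus, minus

n = 3                                         # sample dimension (assumption)
samples = [Fr(k, 10) for k in range(10)]      # points y in [0, 1)

optimal = [limits(y, n, Fr(2, 3 * n)) for y in samples]
perturbed = limits(Fr(1, 2), n, Fr(1, 3 * n)) # some eps different from 2/(3n)
```

For \(\varepsilon =\frac{2}{3n}\) both entries of each pair in `optimal` equal \(\frac{1}{54}(1+y^2)\), whereas the two entries of `perturbed` differ, which is the source of the strict inequality.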

Lemma 4.3

The functional \(\tilde{J}:{\mathcal {F}}\times [0,+\infty )\rightarrow {\mathbb {R}}\),

$$\begin{aligned} \tilde{J}(f,c)=\frac{1}{2}c^3I_1(f)-\frac{1}{2}c^2gAI_2(f) \end{aligned}$$

with \(I_{1,2}(f)\) defined in (4.6) has a unique global minimum in \(\left( {{\,\mathrm{id}\,}},\frac{2}{3}gA\right) \).

Proof

First of all observe that for fixed \(f\in {\mathcal {F}}\) the function \(\tilde{J}(f,\cdot ):[0,+\infty )\rightarrow {\mathbb {R}}\) has a unique minimum in \(c_0(f)=\frac{2}{3}gA\frac{I_2(f)}{I_1(f)}\). Therefore

$$\begin{aligned} \tilde{J}(f,c)\ge \tilde{J}(f,c_0(f))=-\frac{2}{27}g^3A^3 \frac{I_2(f)^3}{I_1(f)^2} \end{aligned}$$

and it remains to show

$$\begin{aligned} 6I_2(f)^3< I_1(f)^2 \end{aligned}$$
(4.8)

for any \(f\in {\mathcal {F}}{\setminus } \{{{\,\mathrm{id}\,}}\}\). Note that for \(f={{\,\mathrm{id}\,}}\) there holds equality, since \(I_1({{\,\mathrm{id}\,}})=\frac{1}{6}\), \(I_2({{\,\mathrm{id}\,}})=\frac{1}{6}\).
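The values of \(c_0({{\,\mathrm{id}\,}})\) and of the minimum of \(\tilde{J}({{\,\mathrm{id}\,}},\cdot )\) can be confirmed with rational arithmetic. The sketch uses units with \(gA=1\) (an assumption for convenience) and a finite grid of test values for c, which is only a sanity check and no substitute for the argument.

```python
from fractions import Fraction as Fr

I1 = Fr(1, 6)    # I_1(id)
I2 = Fr(1, 6)    # I_2(id)

def J_tilde(c, I1, I2):
    # J~(f, c) = c^3 I_1(f)/2 - c^2 gA I_2(f)/2, in units with gA = 1
    return Fr(1, 2) * c**3 * I1 - Fr(1, 2) * c**2 * I2

c0 = Fr(2, 3) * I2 / I1                  # critical point of c -> J~(id, c)
minimum = J_tilde(c0, I1, I2)
grid = [Fr(k, 100) for k in range(301)]  # sample values of c in [0, 3]
```

One finds `c0 == 2/3` and `minimum == -1/81`, no grid value falls below `minimum`, and \(6I_2({{\,\mathrm{id}\,}})^3=I_1({{\,\mathrm{id}\,}})^2\), i.e. equality in (4.8).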

Let us rewrite

$$\begin{aligned} I_2(f)=\int _0^1(1-f(y))y\,dy=\int _0^1f'(y) \frac{1}{2}y^2\,dy=\frac{1}{2}\int _0^1yF'(y) \,dy=-\frac{1}{2}\int _0^1F(y)\,dy. \end{aligned}$$

Since \(I_2(f)>0\), inequality (4.8) is equivalent to \(6I_2(f)^4<I_1(f)^2I_2(f)\). Now

$$\begin{aligned} 6I_2(f)^4=\frac{3}{8}\left( \int _0^1\frac{F(y)}{\sqrt{1-f(y)^2}}\sqrt{1-f(y)^2}\,dy\right) ^4 \le \frac{3}{8}I_1(f)^2\left( \int _0^11-f(y)^2\,dy\right) ^2. \end{aligned}$$

Since also \(I_1(f)\) is positive, we see that (4.8) holds true provided

$$\begin{aligned} \hat{J}(f):=-\int _0^1F(y)\,dy-\frac{3}{4} \left( \int _0^11-f(y)^2\,dy\right) ^2>0. \end{aligned}$$
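The chain of reductions above can be tested on a concrete profile. The sketch below uses the hypothetical sample \(f(y)=\frac{1}{2}(y+y^3)\), which is odd with \(f(1)=1\); whether it meets every requirement in the definition (2.13) of \({\mathcal {F}}\) is not checked here. All integrals are approximated by a midpoint rule.

```python
import math

def f(y):
    return 0.5 * (y + y**3)            # hypothetical sample profile

def F(y):
    # closed form of int_{-1}^{y} f'(s) s ds for this particular f
    return y**2 / 4 + 3 * y**4 / 8 - 5 / 8

def integrate(func, steps=20000):
    # midpoint rule on (0, 1); avoids y = 1, where 1 - f(y)^2 vanishes
    h = 1.0 / steps
    return sum(func((k + 0.5) * h) for k in range(steps)) * h

I1 = integrate(lambda y: F(y)**2 / (1 - f(y)**2))
I2 = integrate(lambda y: (1 - f(y)) * y)
I2_alt = -0.5 * integrate(F)           # rewritten form of I_2(f)
mass = integrate(lambda y: 1 - f(y)**2)
J_hat = -integrate(F) - 0.75 * mass**2
```

For this sample both expressions for \(I_2(f)\) agree, the Cauchy–Schwarz bound on \(6I_2(f)^4\) holds, \(\hat{J}(f)\) is positive, and consequently \(6I_2(f)^3<I_1(f)^2\).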

In order to prove \(\hat{J}(f)>0\) for \(f\in {\mathcal {F}}{\setminus }\{{{\,\mathrm{id}\,}}\}\) we write \(f={{\,\mathrm{id}\,}}+\varphi \) with \(\varphi \ne 0\), such that

$$\begin{aligned} F(y)=\int _{-1}^y(1+\varphi '(s))s\,ds=\frac{1}{2}(y^2-1) +y\varphi (y)-\int _{-1}^y\varphi (s)\,ds \end{aligned}$$

and

$$\begin{aligned} \hat{J}({{\,\mathrm{id}\,}}+\varphi )&=-\int _0^1\frac{1}{2}(y^2-1) +y\varphi (y)-\int _{-1}^y\varphi (s)\,ds\,dy\\&\quad -\frac{3}{4}\left( \int _0^11-y^2-2y\varphi (y) -\varphi (y)^2\,dy\right) ^2\\&=\int _0^1\varphi (y)^2\,dy-\frac{3}{4} \left( \int _0^12y\varphi (y)+\varphi (y)^2\,dy\right) ^2. \end{aligned}$$

Thus in terms of f and the \(L^2(0,1)\) inner product and norm we have

$$\begin{aligned} \hat{J}(f)=\left\| {f-{{\,\mathrm{id}\,}}}\right\| _{L^2(0,1)}^2-\frac{3}{4} \left\langle {f-{{\,\mathrm{id}\,}},f+{{\,\mathrm{id}\,}}}\right\rangle _{L^2(0,1)}^2. \end{aligned}$$
(4.9)

The next (and last for this subsection) lemma implies that \(\hat{J}(f)>0\) for all \(f\in {\mathcal {F}}{\setminus }\{{{\,\mathrm{id}\,}}\}\), which allows us to conclude the proof of Lemma 4.3. \(\square \)

Lemma 4.4

Let \({\mathcal {F}}_0:=\left\{ \,{f\in L^2(0,1):\left| {f}\right| <1 {\text { a.e.}}}\,\right\} \). The functional \(\hat{J}\) defined in (4.9) satisfies \(\hat{J}(f)>0\) for all \(f\in {\mathcal {F}}_0{\setminus }\{{{\,\mathrm{id}\,}}\}\).

Proof

We set \(\overline{{\mathcal {F}}}_0:=\left\{ \,{f\in L^2(0,1):\left| {f}\right| \le 1 {\text { a.e.}}}\,\right\} \), which is the closure of \({\mathcal {F}}_0\) with respect to \(\left\| {\cdot }\right\| _{L^2(0,1)}\), and observe that \(\hat{J}(f)\ge -\frac{4}{3}\) for \(f\in \overline{{\mathcal {F}}}_0\). Now let \((f_n)_{n\in {\mathbb {N}}}\subset \overline{{\mathcal {F}}}_0\) be a minimizing sequence for \(\hat{J}\). Since \(\overline{{\mathcal {F}}}_0\) is bounded, closed and convex there exists \(f_*\in \overline{{\mathcal {F}}}_0\) with \(f_n\rightharpoonup f_*\) in \(L^2(0,1)\) along a subsequence. By the weak lower semicontinuity of the norm and since

$$\begin{aligned} \hat{J}(f)=h\left( \left\| {f}\right\| _{L^2(0,1)}^2\right) -2\left\langle {{{\,\mathrm{id}\,}},f}\right\rangle _{L^2(0,1)} \end{aligned}$$
(4.10)

with \(h:[0,1]\rightarrow {\mathbb {R}}\),

$$\begin{aligned} h(x)=x+\frac{1}{3}-\frac{3}{4}\left( x-\frac{1}{3}\right) ^2, \quad h'(x)=\frac{3}{2}(1-x)\ge 0, \end{aligned}$$

there holds

$$\begin{aligned} \inf _{\overline{{\mathcal {F}}}_0}\hat{J}&=\liminf _{n\rightarrow +\infty } \hat{J}(f_n)=h\left( \liminf _{n\rightarrow +\infty } \left\| {f_n}\right\| _{L^2(0,1)}^2\right) -\liminf _{n\rightarrow +\infty } 2\left\langle {{{\,\mathrm{id}\,}},f_n}\right\rangle _{L^2(0,1)}\\&\ge h\left( \left\| {f_*}\right\| ^2_{L^2(0,1)}\right) -2\left\langle {{{\,\mathrm{id}\,}},f_*}\right\rangle _{L^2(0,1)} =\hat{J}(f_*). \end{aligned}$$

Thus the minimum of \(\hat{J}:\overline{{\mathcal {F}}}_0\rightarrow {\mathbb {R}}\) is achieved at \(f_*\).

Without loss of generality we can assume \(f_*\ge 0\) and \(f_*\) to be nondecreasing, otherwise we replace \(f_*\) by the monotone increasing rearrangement of \(\left| {f_*}\right| \), which only decreases \(\hat{J}\), cf. (4.10).

Now there are two cases to consider: \(f_*\in {\mathcal {F}}_0\) and \(f_*\in \overline{{\mathcal {F}}}_0{\setminus }{\mathcal {F}}_0\). In the first case \(f_*\in {\mathcal {F}}_0\) one can check that, due to the monotonicity of \(f_*\), for any \(\phi \in {\mathcal {C}}^0_c((0,1))\), there exists \(\varepsilon _0>0\) such that \(f_*+\varepsilon \phi \in {\mathcal {F}}_0\) for all \(\varepsilon \in (0,\varepsilon _0]\). Since \(f_*\) is a minimizer, one gets that the directional derivative \(\langle D\hat{J}(f_*),\phi \rangle \) must vanish.

It is clear that \(\hat{J}:L^2(0,1)\rightarrow {\mathbb {R}}\) is smooth and a quick computation shows that the gradient is given by

$$\begin{aligned} \nabla \hat{J}(f)=(2-3S(f))f-2{{\,\mathrm{id}\,}}, \end{aligned}$$

where

$$\begin{aligned} S(f):=\left\langle {f-{{\,\mathrm{id}\,}},f+{{\,\mathrm{id}\,}}}\right\rangle _{L^2(0,1)}=\left\| {f}\right\| _{L^2(0,1)}^2-\frac{1}{3}. \end{aligned}$$

Since we have for any \(\phi \in {\mathcal {C}}^0_c((0,1))\) that \(\langle D\hat{J}(f_*),\phi \rangle =\left\langle {\nabla \hat{J}(f_*),\phi }\right\rangle _{L^2(0,1)}=0\), it follows that there holds \(S(f_*)\ne \frac{2}{3}\) and

$$\begin{aligned} f_*=\frac{2}{2-3S(f_*)}{{\,\mathrm{id}\,}}. \end{aligned}$$

Plugging this identity into the definition of S(f) one obtains that

$$\begin{aligned} S(f_*)=\frac{4}{(2-3S(f_*))^2}\left\| {{{\,\mathrm{id}\,}}}\right\| ^2_{L^2(0,1)}-\frac{1}{3} \end{aligned}$$

or equivalently \(S(f_*)\in \left\{ \,{0,1}\,\right\} \). Thus, one has either \(f_*={{\,\mathrm{id}\,}}\) or \(f_*=-2{{\,\mathrm{id}\,}}\), but since \(f_*\in {\mathcal {F}}_0\), one must have \(f_*={{\,\mathrm{id}\,}}\) and \(\hat{J}_{|\overline{{\mathcal {F}}}_0}\ge \hat{J}({{\,\mathrm{id}\,}})=0\).
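The claim \(S(f_*)\in \{0,1\}\) amounts to solving a cubic equation, which can be double-checked exactly. In the sketch below the helper names are hypothetical; clearing denominators in the fixed-point equation gives \((3S+1)(2-3S)^2-4=0\), and this polynomial is compared with \(27S^2(S-1)\) at more than four points, which suffices for two cubics to coincide.

```python
from fractions import Fraction as Fr

NORM_ID_SQ = Fr(1, 3)                  # ||id||^2 in L^2(0,1)

def residual(S):
    # S - (4/(2-3S)^2 * ||id||^2 - 1/3); vanishes at solutions of the equation
    return S - (4 / (2 - 3 * S)**2 * NORM_ID_SQ - Fr(1, 3))

def cleared(S):
    # the same equation with denominators cleared
    return (3 * S + 1) * (2 - 3 * S)**2 - 4

def factored(S):
    return 27 * S**2 * (S - 1)

samples = [Fr(k, 7) for k in range(-5, 6)]     # none of these equals 2/3
coefficient = {S: 2 / (2 - 3 * S) for S in (Fr(0), Fr(1))}
```

The entries of `coefficient` are 1 and -2, corresponding to the two candidates \(f_*={{\,\mathrm{id}\,}}\) and \(f_*=-2{{\,\mathrm{id}\,}}\).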

If we assume that \({{\,\mathrm{id}\,}}\) is not minimizing \(\hat{J}_{|\overline{{\mathcal {F}}}_0}\), then any minimizer \(f_*\) lies in \(\overline{{\mathcal {F}}}_0{\setminus }{\mathcal {F}}_0\) and satisfies \(\hat{J}(f_*)<0\). Then the monotonicity and sign of \(f_*\) together with \(f_*\in \overline{{\mathcal {F}}}_0{\setminus }{\mathcal {F}}_0\) imply that there exist \(f_0\in {\mathcal {F}}_0\) and \(a\in [0,1)\), such that

$$\begin{aligned} f_*(y)=f_0\left( \frac{y}{a}\right) {\mathcal {X}}_{(0,a)}(y)+{\mathcal {X}}_{(a,1)}(y) \end{aligned}$$

for a.e. \(y\in (0,1)\). Here \({\mathcal {X}}\) denotes the indicator function and for \(a=0\) this expression is understood as \(f_*={\mathcal {X}}_{(0,1)}\). In a straightforward way one sees that

$$\begin{aligned}&\left\| {f_*}\right\| _{L^2(0,1)}^2=a\left\| {f_0}\right\| _{L^2(0,1)}^2+1-a,\\&\left\langle {{{\,\mathrm{id}\,}},f_*}\right\rangle _{L^2(0,1)}=a^2\left\langle {{{\,\mathrm{id}\,}},f_0}\right\rangle _{L^2(0,1)}+\frac{1}{2}(1-a^2), \end{aligned}$$

such that

$$\begin{aligned} \hat{J}(f_*)=a^2\hat{J}(f_0). \end{aligned}$$

Since by assumption \(\hat{J}(f_*)<0\), this equality implies \(a\in (0,1)\) and \(\hat{J}(f_0)<\hat{J}(f_*)\), which tells us that \(f_*\) cannot be a minimizer of \(\hat{J}_{|\overline{{\mathcal {F}}}_0}\). Due to this contradiction we conclude that the infimum of \(\hat{J}_{|\overline{{\mathcal {F}}}_0}\) is achieved at \(f_*={{\,\mathrm{id}\,}}\).
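The scaling identity \(\hat{J}(f_*)=a^2\hat{J}(f_0)\) is purely algebraic and holds regardless of the sign of \(\hat{J}(f_0)\), so it can be tested on any sample. The sketch below uses the hypothetical choices \(f_0(y)=\frac{y}{2}\) and \(a=\frac{1}{2}\) and evaluates \(\hat{J}\) via (4.9) with a midpoint rule; the number of steps is chosen so that the kink at \(y=a\) lies on a cell boundary.

```python
import math

def integrate(func, steps=20000):
    # midpoint rule on (0, 1)
    h = 1.0 / steps
    return sum(func((k + 0.5) * h) for k in range(steps)) * h

def J_hat(f):
    # formula (4.9): ||f - id||^2 - (3/4) <f - id, f + id>^2
    dist_sq = integrate(lambda y: (f(y) - y)**2)
    S = integrate(lambda y: (f(y) - y) * (f(y) + y))
    return dist_sq - 0.75 * S**2

a = 0.5                                # sample value in (0, 1) (assumption)

def f0(y):
    return 0.5 * y                     # sample element of F_0 (assumption)

def f_star(y):
    # rescaled copy of f0 on (0, a), saturated at 1 on (a, 1)
    return f0(y / a) if y < a else 1.0

lhs = J_hat(f_star)
rhs = a**2 * J_hat(f0)
```

Here \(\hat{J}(f_0)=\frac{7}{192}\), and `lhs` and `rhs` agree up to quadrature error.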

Finally the strict inequality \(\hat{J}(f)>0\) for \(f\in {\mathcal {F}}_0{\setminus }\{{{\,\mathrm{id}\,}}\}\) follows from the fact that \(\tilde{f}={{\,\mathrm{id}\,}}\) is the only function lying in \({\mathcal {F}}_0\) which satisfies \(\langle D\hat{J}(\tilde{f}),\phi \rangle =0\) for any \(\phi \in {\mathcal {C}}^0_c((0,1))\). \(\square \)

4.3 Beyond small-time behaviour

While the subsolution constructed in the previous subsection focused on minimizing the initial energy dissipation, one could also be interested in the long-time behaviour of such subsolutions. In particular, how can the subsolution be continued after a reaches L, i.e. once the mixing zone touches the upper boundary? There are two long-time states of interest: the one where both the density and the momentum vanish everywhere (so that there are no longer two fluids of different density, but only one completely mixed fluid), and the configuration \(-\rho _0\), where the higher density fluid occupies the lower half of the domain and the lower density fluid the upper half (i.e. gravity demixes the two fluids in the long term). We will show that both of these configurations can be achieved.

4.3.1 Converging towards the fully mixed, isotropic state

Proof of Proposition 2.10

We claim that one may extend (in an admissible way) the subsolution given in Example 4.1 from \(\Omega \times \left( 0,\sqrt{\frac{3L}{gA}}\right) \) to \(\mathscr {D}:=\Omega \times (0,+\infty )\) simply by considering for \((x,t)\in (0,1)^{n-1}\times (-L,L)\times \left( \sqrt{\frac{3L}{gA}},+\infty \right) \) the following:

$$\begin{aligned} \rho (x,t)=\frac{3x_n}{gAt^2},\quad m(x,t) =\frac{3}{gAt^3}(x_n^2-L^2)e_n,\quad \varepsilon =\varepsilon (t)=\frac{2}{3n} \sqrt{\frac{3L}{gA}}\frac{1}{t}, \end{aligned}$$

\(v\equiv 0\), \(\sigma ,p,\tilde{e}_\varepsilon \) as in (4.2), (4.3), as well as the mixing zone

$$\begin{aligned} \mathscr {U}:=\left\{ \,{(x,t)\in \mathscr {D}:|x_n|<\frac{gAt^2}{3}}\,\right\} . \end{aligned}$$

Indeed, one observes through a straightforward calculation that for this choice, the maximum in (4.3) is always achieved for the first term if \(x_n\ge 0\) and \(t\ge \sqrt{\frac{3L}{gA}}\), i.e.

$$\begin{aligned}&\frac{m_n(x,t)^2}{n(1+\rho (x,t))^2}+\varepsilon (t) gAx_n \ge \frac{m_n(x,t)^2}{n(1-\rho (x,t))^2}-\varepsilon (t) gAx_n \\&\quad \Leftrightarrow 2n\varepsilon (t) gA x_n \ge \frac{9}{g^2A^2t^6} \frac{4\rho (x,t)(x_n^2-L^2)^2}{\left( 1-\rho (x,t)^2\right) ^2} =\frac{27}{g^3A^3t^8} \frac{4L^4x_n(1-(x_n/L)^2)^2}{\left( 1-\rho (x,t)^2\right) ^2}, \end{aligned}$$

which follows if \(\frac{n}{2}\varepsilon (t) \ge \frac{1}{3} \left( \frac{3L}{gA t^2}\right) ^4,\) by observing that \(\left| \frac{1-(x_n/L)^2}{1-\rho (x,t)^2}\right| \le 1\) (indeed, \(|\rho (x,t)|\le |x_n|/L\) for \(t\ge \sqrt{\frac{3L}{gA}}\)). Plugging in the value for \(\varepsilon (t)\), this is equivalent to \(1\ge \left( \sqrt{\frac{3L}{gA}}\frac{1}{t}\right) ^7\), which holds for \(t\ge \sqrt{\frac{3L}{gA}}\).

Hence we have

$$\begin{aligned} \tilde{e}_\varepsilon (x,t)=\frac{1}{n} \frac{(x_n^2-L^2)^2}{t^2(\frac{gA}{3}t^2+x_n)^2} +\varepsilon (t)gA x_n +(1-\rho (x,t)^2)\delta (x,t) \quad {\text { for }} x_n\ge 0,\ t\ge \sqrt{\frac{3L}{gA}}. \end{aligned}$$

Then, recalling (2.15) and using the parity of \(x_n\mapsto \tilde{e}_\varepsilon (x,t)\) as well as \(x_n\mapsto x_n\rho (x,t)\), for \(t>\sqrt{\frac{3L}{gA}}\) one obtains that

$$\begin{aligned} E_{f,a,\varepsilon }(t)&= 2\int _0^L \frac{n}{2} \left( \tilde{e}_\varepsilon (x,t)-(1-\rho (x,t)^2) \delta (x,t)\right) +\left( 1-\frac{n}{2}\varepsilon \right) gA x_n\rho \,dx_n\\&= \int _0^L\frac{(y^2-L^2)^2}{t^2(\frac{gA}{3}t^2+y)^2}+\frac{2}{3} \sqrt{\frac{3L}{gA}}\frac{1}{t}gA y +2 \frac{3}{t^2} \left( 1-\frac{1}{3}\sqrt{\frac{3L}{gA}}\frac{1}{t}\right) y^2\, dy\\&=\frac{1}{t^2}\left( \int _0^L \frac{(y^2-L^2)^2}{(\frac{gA}{3}t^2+y)^2} \, dy +2L^3\left( 1-\frac{1}{3}\sqrt{\frac{3L}{gA}} \frac{1}{t}\right) \right) +\frac{gA L^2}{3} \sqrt{\frac{3L}{gA}}\frac{1}{t}, \end{aligned}$$

which is decreasing with respect to t: the integral term and the last term are manifestly decreasing in t, while for the remaining term one has

$$\begin{aligned} \frac{d}{dt}\left( \frac{1}{t^2}\left( 1-\frac{1}{3}\sqrt{\frac{3L}{gA}} \frac{1}{t}\right) \right) =-\frac{2}{t^3}\left( 1-\frac{1}{2} \sqrt{\frac{3L}{gA}}\frac{1}{t}\right) <0,\quad {\text { for }} t \ge \sqrt{\frac{3L}{gA}}. \end{aligned}$$

Therefore, the admissibility follows.

To conclude the proof of Proposition 2.10, observe that the limit of the subsolution as \(t\rightarrow +\infty \) is identically zero, and \(\delta \) can be chosen such that the energy of the system also decays to zero in the limit at \(+\infty \). \(\square \)

Remark 4.5

Since the kinetic energy of the solutions associated with the constructed subsolution goes to 0 as \(t\rightarrow +\infty \), any turbulent motion, in fact any motion, will vanish as \(t\rightarrow +\infty \). Note that one could have made the same construction while keeping \(\varepsilon =\frac{2}{3n}\) constant and still have obtained an admissible subsolution. However, the associated energy as \(t\rightarrow +\infty \) would not vanish, which would imply that there is still some turbulence at infinite time.

4.3.2 Demixing in finite time

Let us now construct an example of a different admissible continuation past the time when the mixing zone touches the upper boundary, one where first the density profile is rotated by 180 degrees, and then the mixing zone shrinks until the stable configuration \(-\rho _0\) is reached.

Proof of Proposition 2.11

We will do this in two steps.

Step 1: Rotation.

Denote \(T_0:=\sqrt{\frac{3L}{gA}}\). As before, on \(\left[ 0,T_0\right] \) we consider the subsolution given in Example 4.1. We claim that there exist \(\tilde{T}>T_0\) and a non-increasing, continuously differentiable function \(r:\left[ T_0,\tilde{T}\right] \rightarrow \left[ -\frac{1}{L},\frac{1}{L}\right] \) satisfying \(r\left( T_0\right) =\frac{1}{L}\), \(r(\tilde{T})=-\frac{1}{L}\), \(\dot{r}\left( T_0\right) =-2\sqrt{\frac{gA}{3L^3}}\), such that setting

$$\begin{aligned} \rho (x,t)=r(t)x_n,\quad m(x,t)=-\frac{\dot{r}(t)}{2}(x_n^2-L^2) e_n, \end{aligned}$$

as well as \(v\equiv 0\), \(\varepsilon =\frac{2}{3n},\) and \(\sigma ,p,\tilde{e}_\varepsilon \) as in (4.2), (4.3), for \((x,t)\in (0,1)^{n-1}\times (-L,L)\times (T_0,\tilde{T})\subset \mathscr {U}\), yields a subsolution which is continuous, piecewise \({\mathcal {C}}^1\) and admissible on \(\Omega \times (0,\tilde{T}]\).

Indeed, the continuity at \(t=T_0\) follows from the definitions of \(r(T_0)\) and \(\dot{r}(T_0)\). To check the admissibility, one needs to treat the maximum in (4.3). Once again, through simple calculations one obtains for \(x_n\ge 0\), \(t\in [T_0,\tilde{T}]\) that if \(r(t)\dot{r}(t)^2\le \frac{4gA}{3L^4}\), then the maximum is realized by the first term, i.e.

$$\begin{aligned} \tilde{e}_\varepsilon (x,t)=\frac{1}{n}\left( \frac{\dot{r}(t)^2 (x_n^2-L^2)^2}{4(1+r(t)x_n)^2}+\frac{2}{3}gA x_n \right) +(1-\rho (x,t)^2)\delta (x,t). \end{aligned}$$
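The sufficient condition on \(r\dot{r}^2\) can be traced as follows. This is a sketch of the omitted computation, comparing the same two terms as in the proof of Proposition 2.10. For \(x_n\in [0,L)\) and \(n\varepsilon =\frac{2}{3}\), the first term dominates if and only if

$$\begin{aligned} 2n\varepsilon gAx_n\ge \frac{4\rho (x,t)\,m_n(x,t)^2}{\left( 1-\rho (x,t)^2\right) ^2} \quad \Leftrightarrow \quad \frac{4}{3}gAx_n\ge r(t)\dot{r}(t)^2x_n \frac{(L^2-x_n^2)^2}{\left( 1-r(t)^2x_n^2\right) ^2}. \end{aligned}$$

Since \(|r(t)|\le \frac{1}{L}\) gives \(0\le L^2-x_n^2\le L^2\left( 1-r(t)^2x_n^2\right) \), the right-hand side is at most \(r(t)\dot{r}(t)^2L^4x_n\) when \(r(t)\ge 0\) and is nonpositive when \(r(t)<0\). Hence \(r(t)\dot{r}(t)^2\le \frac{4gA}{3L^4}\) is enough.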

Using once more the parity of \(x_n\mapsto \tilde{e}_\varepsilon (x,t)\) and \(x_n\mapsto x_n\rho (x,t)\), one obtains that in this case the corrected total energy at time \(t\in [T_0,\tilde{T}]\) reads

$$\begin{aligned} E_{r}(t)&:=\int _\Omega \frac{n}{2}(\tilde{e}_\varepsilon (x,t) -(1-\rho (x,t)^2)\delta (x,t))+\left( 1-\frac{n}{2}\varepsilon \right) gA x_n\rho \,dx\nonumber \\&=\frac{\dot{r}(t)^2}{4}I(r(t)) +\frac{4}{9}gAL^3r(t)+\frac{1}{3}gAL^2, \end{aligned}$$
(4.11)

where

$$\begin{aligned} I(r):=\int _0^L\frac{(y^2-L^2)^2}{(1+ry)^2} \, dy. \end{aligned}$$

Let us now construct a function r satisfying the properties stated above.

Observe that \(I\in {\mathcal {C}}^1\left( \left( -\frac{1}{L}, +\infty \right) \right) \cap {\mathcal {C}}^0\left( \left[ -\frac{1}{L}, +\infty \right) \right) \) with \(I\left( -\frac{1}{L}\right) =\frac{7}{3}L^5\). Moreover, I is clearly positive and monotone decreasing on the interval \(\left[ -\frac{1}{L},+\infty \right) \). Let \(r:[T_0,T_{\max })\rightarrow {\mathbb {R}}\) be the unique solution of the initial value problem

$$\begin{aligned} \dot{r}(t)=-2\sqrt{\frac{gAL^2}{9I(r(t))}},\quad r(t) \in \left( -\frac{1}{L},+\infty \right) ,\quad r(T_0)=\frac{1}{L}, \end{aligned}$$
(4.12)

where \(T_{\max }\) denotes the maximal time of existence of the solution.

We claim that \(T_{\max }<+\infty \) and that r extends continuously to \([T_0,T_{\max }]\) with \(r(T_{\max })=-\frac{1}{L}\). Assume to the contrary that \(T_{\max }=+\infty \). Then \(r(t)>-\frac{1}{L}\) for all \(t\ge T_0\). But integrating (4.12) over \((T_0,t)\) and using that I is decreasing, one obtains the contradiction

$$\begin{aligned} -\frac{2}{L}< r(t)-\frac{1}{L}=-2\int _{T_0}^t \sqrt{\frac{gAL^2}{9I(r(s))}} \, ds \le -2\sqrt{\frac{gAL^2}{9I(-\frac{1}{L})}}(t-T_0)\rightarrow -\infty \end{aligned}$$

as \(t\rightarrow +\infty \). Hence \(T_{\max }<+\infty \) and then necessarily \(\lim _{t\rightarrow T_{\max }}r(t)=-\frac{1}{L}\), because the orbit \(r([T_0,T_{\max }))\) is bounded from above due to the monotonicity of r. We therefore set \(\tilde{T}:=T_{\max }\).

Next, due to \(I\left( \frac{1}{L}\right) =\frac{1}{3}L^5\) and (4.12), one sees that \(\dot{r}\left( T_0\right) =-2\sqrt{\frac{gA}{3L^3}}\).
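The two boundary values of I can be computed explicitly. The following sketch relies only on the factorization \((y^2-L^2)^2=(L-y)^2(L+y)^2\), which cancels the denominator \(\left( 1\pm \frac{y}{L}\right) ^2=\frac{(L\pm y)^2}{L^2}\):

$$\begin{aligned} I\left( \frac{1}{L}\right)&=\int _0^LL^2(L-y)^2\, dy=\frac{1}{3}L^5,\qquad I\left( -\frac{1}{L}\right) =\int _0^LL^2(L+y)^2\, dy=\frac{L^2}{3}\left( (2L)^3-L^3\right) =\frac{7}{3}L^5,\\ \dot{r}(T_0)&=-2\sqrt{\frac{gAL^2}{9}\cdot \frac{3}{L^5}}=-2\sqrt{\frac{gA}{3L^3}},\qquad \lim _{t\rightarrow \tilde{T}}\dot{r}(t)=-2\sqrt{\frac{gAL^2}{9}\cdot \frac{3}{7L^5}}=-2\sqrt{\frac{gA}{21L^3}}. \end{aligned}$$

The last value is the slope with which the rotation ends, obtained by inserting \(r=-\frac{1}{L}\) into the right-hand side of (4.12).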

Finally, let us show that the associated corrected total energy function \(E_r\) is decreasing, to conclude the admissibility on \([T_0,\tilde{T}]\) of our subsolution. For this, we first show that one has \(r(t)\dot{r}(t)^2\le \frac{4gA}{3L^4}\), so that in \(\tilde{e}_\varepsilon \) indeed the first term under the maximum is selected for \(x_n\ge 0\) and thus (4.11) holds. Once again, this follows from (4.12) by using the monotonicity of I and r:

$$\begin{aligned} r(t)\dot{r}(t)^2\le \frac{1}{L}\frac{4}{I(\frac{1}{L})} \frac{1}{9}gAL^2=\frac{12}{L^6}\frac{1}{9}gAL^2=\frac{4gA}{3L^4}. \end{aligned}$$

Since the corrected total energy function \(E_r\) is then given by formula (4.11), we may plug (4.12) into (4.11) to further obtain that

$$\begin{aligned} E_r(t)=\frac{1}{9}gAL^2+\frac{4}{9}gAL^3r(t)+\frac{1}{3}g AL^2=\frac{4}{9}gAL^2\left( 1+Lr(t)\right) , \end{aligned}$$

which is clearly decreasing since r is decreasing. This concludes the construction for the rotation of the profile.
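As a quantitative complement, which is only a sketch and not needed for the proof, one can bound the duration of the rotation. For \(r\in \left[ -\frac{1}{L},\frac{1}{L}\right] \) the monotonicity of I gives \(\frac{1}{3}L^5\le I(r)\le \frac{7}{3}L^5\), so (4.12) yields two-sided bounds on \(|\dot{r}|\). Since r decreases by a total of \(\frac{2}{L}\) on \([T_0,\tilde{T}]\), one obtains

$$\begin{aligned} 2\sqrt{\frac{gA}{21L^3}}\le |\dot{r}(t)|\le 2\sqrt{\frac{gA}{3L^3}} \quad \Rightarrow \quad \sqrt{\frac{3L}{gA}}\le \tilde{T}-T_0\le \sqrt{\frac{21L}{gA}}. \end{aligned}$$

Moreover, the closed form of \(E_r\) gives the endpoint values \(E_r(T_0)=\frac{8}{9}gAL^2\) and \(E_r(\tilde{T})=0\).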

Step 2: Shrinking of the mixing zone.

We will now further extend the subsolution constructed above past the time \(\tilde{T}\). Let

$$\begin{aligned} T_{end}:=\tilde{T}+\sqrt{\frac{21L}{gA}}, \end{aligned}$$

and set \(\mathscr {D}:=\Omega \times (0,T_{end})\), \(\mathscr {U}:=\left\{ \,{(x,t)\in \mathscr {D}:x_n\in (-a(t),a(t))}\,\right\} \), with

$$\begin{aligned} a(t)= \left\{ \begin{array}{ll} \frac{gAt^2}{3}, &{} 0\le t \le T_0 \\ L, &{} T_0\le t\le \tilde{T}\\ \frac{gA(t-T_{end})^2}{21}, &{} \tilde{T}\le t\le T_{end} \\ \end{array}\right. . \end{aligned}$$

On \([0,T_0]\) our subsolution will coincide with the one from Example 4.1, on \([T_0,\tilde{T}]\) with the one constructed in Step 1, and on \([\tilde{T}, T_{end}]\) it will be of the form

$$\begin{aligned} \rho (x,t)=-\frac{x_n}{a(t)},\quad m(x,t)=\frac{\dot{a}(t)}{2} \left( 1-\frac{x_n^2}{a(t)^2}\right) e_n, \end{aligned}$$

\(v\equiv 0\), \(\varepsilon =\frac{2}{3n},\) \(\sigma ,p,\tilde{e}_\varepsilon \) as in (4.2), (4.3), for \((x,t)\in \mathscr {U}\). Outside the mixing zone we consider \(\rho =-\rho _0\), \(v\equiv 0\) and \(\tilde{e}_\varepsilon (x,t)=-\varepsilon gA \left| {x_n}\right| \).

One can check through straightforward calculations that this choice makes \(\rho \) and m (and hence the whole subsolution) continuous at \(t=\tilde{T}\).
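A sketch of this check: it uses \(I\left( -\frac{1}{L}\right) =\frac{7}{3}L^5\) in (4.12), which gives \(\dot{r}(\tilde{T})=-2\sqrt{\frac{gA}{21L^3}}\), together with \(\tilde{T}-T_{end}=-\sqrt{\frac{21L}{gA}}\). One finds

$$\begin{aligned} a(\tilde{T})&=\frac{gA}{21}\cdot \frac{21L}{gA}=L,\qquad \dot{a}(\tilde{T})=\frac{2gA}{21}(\tilde{T}-T_{end})=-2\sqrt{\frac{gAL}{21}},\\ -\frac{\dot{r}(\tilde{T})}{2}(x_n^2-L^2)&=-\sqrt{\frac{gA}{21L^3}}\,L^2\left( 1-\frac{x_n^2}{L^2}\right) =-\sqrt{\frac{gAL}{21}}\left( 1-\frac{x_n^2}{L^2}\right) =\frac{\dot{a}(\tilde{T})}{2}\left( 1-\frac{x_n^2}{a(\tilde{T})^2}\right) . \end{aligned}$$

The densities agree as well, since \(r(\tilde{T})x_n=-\frac{x_n}{L}=-\frac{x_n}{a(\tilde{T})}\).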

Clearly at \(T_{end}\), this subsolution reaches the stable configuration \(\rho =-\rho _0\), \(v\equiv 0\) with no mixing. All that remains to be checked is the admissibility on \([\tilde{T},T_{end}]\).

Once more, one may easily evaluate the maximum in (4.3) to obtain that for \(x_n\ge 0\) one has

$$\begin{aligned} \tilde{e}_\varepsilon (x,t)=\frac{m_n(x,t)^2}{n(1+\rho (x,t))^2} +\varepsilon gAx_n+\left( 1-\rho (x,t)^2\right) \delta (x,t). \end{aligned}$$

On the other hand, using once more the parity of \(x_n\mapsto \tilde{e}_\varepsilon (x,t)\), plugging in the formulas for \(\rho \) and m, and using the change of variables \(y=\frac{x_n}{a(t)}\), we have

$$\begin{aligned}&\int _\Omega \frac{n}{2}(\tilde{e}_\varepsilon (x,t)-(1-\rho (x,t)^2) \delta (x,t))+\left( 1-\frac{n}{2}\varepsilon \right) gA x_n\rho \,dx\\&\quad =2\int _0^L \frac{n}{2}\left( \tilde{e}_\varepsilon (x,t) -(1-\rho (x,t)^2)\delta (x,t)\right) +\left( 1-\frac{n}{2}\varepsilon \right) gA x_n\rho \,dx_n \\&\quad =\int _0^{a(t)}\frac{m_n(x,t)^2}{(1+\rho (x,t))^2}+\frac{2}{3} gAx_n -\frac{4}{3}gA\frac{x_n^2}{a(t)}\, dx_n+2\int _{a(t)}^L \left( -\frac{1}{3}-\frac{2}{3} \right) gA x_n \, dx_n\\&\quad =a(t)\int _0^1\frac{\dot{a}(t)^2}{4}(1+y)^2 +\frac{2}{3}gAa(t)y-\frac{4}{3}gAa(t)y^2 \, dy-gA(L^2-a(t)^2)\\&\quad =\frac{7}{12}a(t)\dot{a}(t)^2+\frac{8}{9}gAa(t)^2-gAL^2, \end{aligned}$$

which is clearly decreasing on \([\tilde{T},T_{end}]\) since both a and \(\left| {\dot{a}}\right| \) are decreasing. This concludes the proof of Proposition 2.11. \(\square \)
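The monotonicity in Step 2 can also be read off from a closed form. As a sketch: on \([\tilde{T},T_{end}]\) the choice \(a(t)=\frac{gA(t-T_{end})^2}{21}\) gives \(\dot{a}(t)^2=\frac{4gA}{21}a(t)\), so the corrected total energy computed above becomes

$$\begin{aligned} \frac{7}{12}a(t)\dot{a}(t)^2+\frac{8}{9}gAa(t)^2-gAL^2 =\frac{1}{9}gAa(t)^2+\frac{8}{9}gAa(t)^2-gAL^2 =gA\left( a(t)^2-L^2\right) . \end{aligned}$$

This expression vanishes at \(t=\tilde{T}\), matching the value \(E_r(\tilde{T})=\frac{4}{9}gAL^2\left( 1+Lr(\tilde{T})\right) =0\) from Step 1, and decreases to \(-gAL^2\) at \(t=T_{end}\).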