1 Introduction

This paper is concerned with the global existence analysis of a degenerate diffusion system governing the evolution of the particle density \(\rho (x,t)\) and temperature \(\theta (x,t)\):

$$\begin{aligned} \partial _t\rho = \Delta (\rho \theta ), \quad \partial _t E = \Delta \bigg (\theta + \frac{5}{2}\rho \theta ^2\bigg ) \quad \text{ in } \Omega ,\ t>0, \end{aligned}$$
(1)

where \(E=\theta + \frac{3}{2}\rho \theta \) is the energy density, supplemented by no-flux boundary and initial conditions,

$$\begin{aligned} \nabla (\rho \theta )\cdot \nu =\nabla \bigg (\theta +\frac{5}{2}\rho \theta ^2\bigg )\cdot \nu =0&\quad \text{ on } \partial \Omega ,\ t>0, \end{aligned}$$
(2)
$$\begin{aligned} \rho (0)=\rho ^0,\ E(0)=E^0:=\theta ^0+\frac{3}{2}\rho ^0\theta ^0&\quad \text{ in } \Omega , \end{aligned}$$
(3)

and \(\Omega \subset {{\mathbb {R}}}^3\) is a bounded domain. The equations describe a rarefied gas that exchanges heat with the background, coupled through the energy exchange. They can be formally derived from a collisional kinetic equation, coupled to a heat equation for the background temperature governed by a Fourier law, and they are written in dimensionless form. We refer to Sect. 2 for modeling details.

A major difficulty of system (1) is the derivation of suitable a priori estimates. This issue will be tackled by exploiting the entropy structure of the system. This means that Eq. (1) can be written in the cross-diffusion form

$$\begin{aligned} \partial _t \mathbf {u} = {\text {div}}({B}\nabla \mathbf {q}), \end{aligned}$$
(4)

where

$$\begin{aligned} \mathbf {u} = \begin{pmatrix} \rho \\ E \end{pmatrix}, \quad \mathbf {q} = \begin{pmatrix} \log (\rho /\theta ^{3/2})+\frac{5}{2} \\ -1/\theta \end{pmatrix}, \quad {B} = \begin{pmatrix} \rho \theta &{} \tfrac{5}{2}\rho \theta ^2 \\ \tfrac{5}{2}\rho \theta ^2 &{} \theta ^2(1+\tfrac{35}{4}\rho \theta ) \end{pmatrix}. \end{aligned}$$

The so-called Onsager matrix B is symmetric and positive semidefinite. However, B fails to be positive definite when \(\rho =0\) or \(\theta =0\) (it becomes singular there), showing that (4) is of degenerate type. The Gibbs free energy

$$\begin{aligned} G = \rho \theta \log \frac{\rho }{\theta ^{3/2}} + \frac{3}{2}\rho \theta - \theta (\log \theta -1), \end{aligned}$$
(5)

defines the

  • chemical potential \(\mu =\partial G/\partial \rho =\theta (\log (\rho /\theta ^{3/2})+\frac{5}{2})\),

  • the (mathematical) entropy \(h=\partial G/\partial \theta =\rho \log (\rho /\theta ^{3/2}) -\log \theta \), and

  • the energy density \(E=G-\theta \partial G/\partial \theta =(1+\frac{3}{2}\rho )\theta \).
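
These identities can be checked directly from (5). As a short verification, valid for \(\rho ,\theta >0\), we expand \(G=\rho \theta \log \rho -\frac{3}{2}\rho \theta \log \theta +\frac{3}{2}\rho \theta -\theta \log \theta +\theta \) and differentiate:

$$\begin{aligned} \frac{\partial G}{\partial \rho }&= \theta \log \rho - \frac{3}{2}\theta \log \theta + \theta + \frac{3}{2}\theta = \theta \bigg (\log \frac{\rho }{\theta ^{3/2}}+\frac{5}{2}\bigg ), \\ \frac{\partial G}{\partial \theta }&= \rho \log \rho - \frac{3}{2}\rho \log \theta - \frac{3}{2}\rho + \frac{3}{2}\rho - \log \theta - 1 + 1 = \rho \log \frac{\rho }{\theta ^{3/2}} - \log \theta , \\ G - \theta \frac{\partial G}{\partial \theta }&= \frac{3}{2}\rho \theta - \theta (\log \theta -1) + \theta \log \theta = \bigg (1+\frac{3}{2}\rho \bigg )\theta . \end{aligned}$$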

We reveal the formal gradient-flow structure for (4) by defining the thermo-chemical potential \(\phi =\partial h/\partial \rho =\mu /\theta \) and the negative inverse temperature \(\partial h/\partial E=-1/\theta \) (interpreting h as a function of \((\rho ,E)\)) such that

$$\begin{aligned} \partial _t(\rho ,E)^T - {\text {div}}({B}\nabla \mathrm {D}h) = 0, \end{aligned}$$

where \(\mathrm {D}h\) is the vector with components \(\partial h/\partial \rho \) and \(\partial h/\partial E\). Furthermore, the entropy h is a Lyapunov functional along solutions to (4):

$$\begin{aligned} \frac{d}{dt}\int \nolimits _\Omega h dx = \int \nolimits _\Omega \bigg (\frac{\partial h}{\partial \rho }\partial _t\rho + \frac{\partial h}{\partial E}\partial _t E\bigg )dx = -\int \nolimits _\Omega \nabla (\mathrm {D}h)^T {B}\nabla \mathrm {D}h dx \le 0, \end{aligned}$$

since B is positive semidefinite. In particular, we obtain a priori estimates for \(\nabla (\mathrm {D}h)^T {B}\nabla \mathrm {D}h\) in \(L^1(\Omega )\), from which we conclude gradient estimates for \(\sqrt{\rho \theta }\) and \(\log \theta \) in \(L^2(\Omega )\) (see below).
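
The definiteness of the Onsager matrix in (4) can be read off from its trace and determinant. The following elementary computation, for \(\rho ,\theta \ge 0\), is included only as a check:

$$\begin{aligned} {\text {tr}}\,{B}&= \rho \theta + \theta ^2\bigg (1+\frac{35}{4}\rho \theta \bigg ) \ge 0, \\ \det {B}&= \rho \theta ^3\bigg (1+\frac{35}{4}\rho \theta \bigg ) - \frac{25}{4}\rho ^2\theta ^4 = \rho \theta ^3\bigg (1+\frac{5}{2}\rho \theta \bigg ) \ge 0. \end{aligned}$$

Both quantities are strictly positive if \(\rho >0\) and \(\theta >0\), so the matrix is positive definite in this case, while the determinant vanishes exactly when \(\rho =0\) or \(\theta =0\).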

Still, this approach is not sufficient. Indeed, because of the degeneracy at \(\theta =0\), we cannot expect to achieve any control on the gradient of \(\rho \), so the entropy bounds alone do not provide the compactness needed to pass to the limit in the approximate equations. Our idea, detailed below, is to apply well-known tools from mathematical fluid dynamics like \(H^{-1}\) estimates and compensated compactness. The originality of this work consists in the combination of these tools and entropy methods, which allows us to treat non-standard degeneracies. We remark that degenerate mobilities were also treated in Cahn–Hilliard equations; see, e.g., [1, 11].

1.1 State of the art

Equations (1) belong to the class of energy-transport models which have been investigated particularly in semiconductor theory [14]. The first energy-transport model for semiconductors was presented by Stratton [17]. First existence results were concerned with models with very particular diffusion coefficients (being not of the form (1)) [2, 3] or with uniformly positive definite diffusion matrices [8]. Existence results for physically more realistic diffusion coefficients were shown in [6], but only for situations close to equilibrium. A degenerate energy-transport system with a simplified temperature equation was analyzed in [15]. Energy-transport models appear not only in semiconductor theory. For instance, they have been used to model self-gravitating particle clouds [4] and the dynamics in optical lattices [5].

In [19], the global existence of weak solutions to the model

$$\begin{aligned} \partial _t\rho = \Delta (\rho \theta ), \quad \partial _t(\rho \theta ) = \frac{5}{3}\Delta (\rho \theta ^2) \end{aligned}$$
(6)

in a bounded domain \(\Omega \) with no-flux boundary conditions was proved. At first glance, Eqs. (1) look simpler than (6) because of the additional diffusion in the energy equation. However, the ideas in [19] cannot be easily applied to (1). Indeed, the key idea in [19] was to introduce the variables \(u=\rho \theta \) and \(v=\rho \theta ^2\) and to apply the Stampacchia truncation method to a time-discretized version of

$$\begin{aligned} \partial _t\bigg (\frac{u^2}{v}\bigg ) = \Delta u, \quad \partial _t u = \frac{5}{3}\Delta v. \end{aligned}$$
(7)
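
That (7) is system (6) written in the variables \(u=\rho \theta \), \(v=\rho \theta ^2\) follows, for \(\rho ,\theta >0\), from the elementary identities

$$\begin{aligned} \frac{u^2}{v} = \frac{\rho ^2\theta ^2}{\rho \theta ^2} = \rho , \qquad \partial _t u = \partial _t(\rho \theta ) = \frac{5}{3}\Delta (\rho \theta ^2) = \frac{5}{3}\Delta v. \end{aligned}$$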

The functionals \(\int \nolimits _\Omega \rho ^2\theta ^b dx\) turn out to be Lyapunov functionals along solutions to (7) for suitable values of \(b\in {{\mathbb {R}}}\), leading to uniform gradient estimates. However, the additional term in the energy equation of (1) complicates the derivation of a priori estimates. Thus, the proof in [19] seems to be rather specific to system (6) and is not generalizable. Our idea is to treat (1) by combining entropy methods and tools from mathematical fluid dynamics, which may also be applied to other cross-diffusion systems.

1.2 Mathematical key ideas

As explained before, the first key idea is to exploit, in contrast to [19], the entropy structure of (1). Indeed, recalling the mathematical entropy density

$$\begin{aligned} h(\rho ,\theta ) = \rho \log \frac{\rho }{\theta ^{3/2}}-\log \theta \quad \text{ for } \rho ,\,\theta >0, \end{aligned}$$
(8)

a formal computation (which is made rigorous for an approximate scheme; see (23)) gives the entropy dissipation equation

$$\begin{aligned} \frac{d}{dt}\int \nolimits _\Omega h(\rho ,\theta )dx + \int \nolimits _\Omega \bigg (4\big |\nabla \sqrt{\rho \theta }\big |^2 + |\nabla \log \theta |^2\bigg (1+\frac{5}{2}\rho \theta \bigg )\bigg )dx = 0 \,, \end{aligned}$$

which provides \(H^1(\Omega )\) estimates for \(\sqrt{\rho \theta }\) and \(\log \theta \). Moreover, this estimate implies that \(\theta >0\) a.e. (but not \(\rho >0\)).
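
The dissipation term can be recovered from the quadratic form of the Onsager matrix in (4). The following is a formal check for smooth positive \(\rho \), \(\theta \); the abbreviations \(a=\nabla \log \rho \) and \(b=\nabla \log \theta \) are introduced only for this computation. Since \(\nabla (\partial h/\partial \rho )=a-\frac{3}{2}b\) and \(\nabla (\partial h/\partial E)=\nabla (-1/\theta )=b/\theta \), we find that

$$\begin{aligned} \nabla (\mathrm {D}h)^T {B}\nabla \mathrm {D}h&= \rho \theta \bigg |a-\frac{3}{2}b\bigg |^2 + 5\rho \theta \bigg (a-\frac{3}{2}b\bigg )\cdot b + \bigg (1+\frac{35}{4}\rho \theta \bigg )|b|^2 \\&= \rho \theta \bigg (|a|^2 + 2a\cdot b + \frac{7}{2}|b|^2\bigg ) + |b|^2 \\&= \rho \theta |a+b|^2 + \bigg (1+\frac{5}{2}\rho \theta \bigg )|b|^2 \\&= 4\big |\nabla \sqrt{\rho \theta }\big |^2 + \bigg (1+\frac{5}{2}\rho \theta \bigg )|\nabla \log \theta |^2, \end{aligned}$$

where the last step uses \(\nabla \sqrt{\rho \theta }=\frac{1}{2}\sqrt{\rho \theta }\,(a+b)\).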

Clearly, the entropy estimates are not sufficient to pass to the de-regularization limit in the approximate scheme. Further bounds are derived from the \(H^{-1}(\Omega )\) method, i.e., we use basically \((-\Delta )^{-1}\rho \) and \((-\Delta )^{-1}E\), respectively, as test functions in the weak formulation of (1) (second key idea). This method gives estimates for

$$\begin{aligned} \int \nolimits _\Omega \rho ^2\theta dx \qquad \text{ and }\qquad \int \nolimits _\Omega \bigg (\theta +\frac{5}{2}\rho \theta ^2\bigg ) \bigg (\theta +\frac{3}{2}\rho \theta \bigg )dx \,. \end{aligned}$$
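
To indicate where the first integral comes from, we sketch the formal computation for smooth solutions. The mean value \({\bar{\rho }}\) and the potential \(\Psi \) are notation introduced only for this sketch, and the rigorous argument is carried out on the approximate level. Let \({\bar{\rho }}={\text {meas}}(\Omega )^{-1}\int \nolimits _\Omega \rho ^0 dx\) and let \(\Psi (t)\) solve \(-\Delta \Psi =\rho -{\bar{\rho }}\) in \(\Omega \), \(\nabla \Psi \cdot \nu =0\) on \(\partial \Omega \), \(\int \nolimits _\Omega \Psi dx=0\). Using mass conservation, the first equation in (1), and the boundary conditions (2), two integrations by parts give

$$\begin{aligned} \frac{1}{2}\frac{d}{dt}\int \nolimits _\Omega |\nabla \Psi |^2 dx = \int \nolimits _\Omega \Psi \partial _t\rho dx = \int \nolimits _\Omega \Psi \Delta (\rho \theta ) dx = \int \nolimits _\Omega \rho \theta \Delta \Psi dx = -\int \nolimits _\Omega \rho ^2\theta dx + {\bar{\rho }}\int \nolimits _\Omega \rho \theta dx. \end{aligned}$$

Since \(\rho \theta \le \frac{2}{3}E\) and the total energy is conserved, the last term is bounded, so that an integration in time controls \(\int \nolimits _0^T\int \nolimits _\Omega \rho ^2\theta dxdt\), provided that \(\nabla \Psi (0)\in L^2(\Omega )\). An analogous computation for the energy equation produces the second integral.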

Combining these bounds with those coming from the entropy inequality and the conservation laws leads to estimates for \(\nabla (\rho \theta )=2\sqrt{\rho \theta }\nabla \sqrt{\rho \theta }\), \(\nabla \theta =\theta \nabla \log \theta \) and consequently for E in \(W^{1,1}(\Omega )\). Moreover, \(\partial _t E\) is bounded in some dual Sobolev space. This allows us to apply the Aubin–Lions lemma to E. Unfortunately, we do not obtain gradient estimates for \(\rho \).

To overcome this issue, we use tools from mathematical fluid dynamics (third key idea). Let \((\rho _\delta ,\theta _\delta )\) be approximate solutions to (1) (in a sense made precise in Sect. 3). First, we write the mass balance equation in the renormalized form

$$\begin{aligned} \partial _t f(\rho _\delta ) - {\text {div}}(f'(\rho _\delta )\nabla (\rho _\delta \theta _\delta )) = -f''(\rho _\delta )\nabla \rho _\delta \cdot \nabla (\rho _\delta \theta _\delta ) \end{aligned}$$

in the sense of distributions for smooth functions f with bounded derivatives. Let g be another smooth function with bounded derivatives and introduce the vectors

$$\begin{aligned} U_\delta = \big (f(\rho _\delta ),-f'(\rho _\delta ) \nabla (\rho _\delta \theta _\delta )\big ), \quad V_\delta = \big (g(\theta _\delta ),0,0,0\big ). \end{aligned}$$

We deduce from the properties of f and g and the a priori estimates that \({\text {div}}_{(t,x)}U_\delta \) and \({\text {curl}}_{(t,x)}V_\delta \) are uniformly bounded in \(L^1(\Omega \times (0,T))\) and hence relatively compact in \(W^{-1,r}(\Omega \times (0,T))\) for some \(r>1\). The div-curl lemma implies that \(\overline{U_\delta \cdot V_\delta }=\overline{U_\delta }\cdot \overline{V_\delta }\) a.e., where the bar denotes the weak limit of the corresponding sequence. Thus, \(\overline{f(\rho _\delta )g(\theta _\delta )}=\overline{f(\rho _\delta )}\; \overline{g(\theta _\delta )}\) a.e. A truncation procedure yields that \(\overline{\rho _\delta \theta _\delta }=\rho \theta \), where \(\rho \) and \(\theta \) are the weak limits of \((\rho _\delta )\) and \((\theta _\delta )\), respectively. As \((E_\delta )\) converges strongly, by the Aubin–Lions lemma, we are able to prove that \(\theta _\delta \rightarrow \theta \) and eventually \(\rho _\delta \rightarrow \rho \) a.e. These limits allow us to identify the weak limits and to pass to the limit \(\delta \rightarrow 0\) in the approximate equations. The approximate scheme contains additional terms which need to be treated carefully, so that our arguments are more technical than presented here. In fact, we need three approximation levels; see Sect. 3 for details.

1.3 Main result

Our main result is as follows:

Theorem 1

(Existence of weak solutions). Let \(\Omega \subset {{\mathbb {R}}}^3\) be a bounded domain with \(\partial \Omega \in C^{1,1}\). Let \(\rho ^0\), \(\theta ^0\in L^1(\Omega )\) satisfy \(\rho ^0\ge 0\), \(\theta ^0\ge 0\) in \(\Omega \) and \(\rho ^0\theta ^0\), \(h(\rho ^0,\theta ^0)\in L^1(\Omega )\), where h is defined in (8). Let \(T>0\) and \(\Omega _T=\Omega \times (0,T)\). Then there exist \(\rho \), \(\theta \in L^\infty (0,T;L^1(\Omega ))\) such that

$$\begin{aligned}&\rho \log \rho \in L^\infty (0,T; L^1(\Omega )), \quad E = \theta + \frac{3}{2} \rho \theta \in L^\infty (0,T; L^1(\Omega ))\cap L^2(\Omega _T), \\&\sqrt{\rho \theta },\,\log \theta \in L^2(0,T; H^1(\Omega )), \quad \rho \theta ^2\in L^{3/2}(\Omega _T), \\&\partial _t\rho \in L^{4/3}(0,T;W^{1,4}(\Omega )'), \quad \partial _t E\in L^{6/5}(0,T;W^{2,4}(\Omega )'); \end{aligned}$$

it holds that \(\rho \ge 0\) and \(\theta >0\) a.e. in \(\Omega _T\); \((\rho ,\theta )\) is a weak solution to (1)–(3) in the sense

$$\begin{aligned} \int \nolimits _0^T\langle \partial _t\rho ,\psi _1\rangle dt + \int \nolimits _0^T\int \nolimits _\Omega \nabla (\rho \theta )\cdot \nabla \psi _1 dxdt&= 0, \end{aligned}$$
(9)
$$\begin{aligned} \int \nolimits _0^T\langle \partial _t E,\psi _2\rangle dt -\int \nolimits _0^T\int \nolimits _\Omega \bigg (\theta +\frac{5}{2}\rho \theta ^2\bigg ) \Delta \psi _2 dxdt&=0 , \end{aligned}$$
(10)

for any test functions \(\psi _1\in L^4(0,T; W^{1,4}(\Omega ))\), \(\psi _2\in L^{6}(0,T; W^{2,4}(\Omega ))\); and the initial data (3) is satisfied in the sense of \(W^{1,4}(\Omega )'\) and \(W^{2,4}(\Omega )'\), respectively. Moreover, the total mass and energy are preserved:

$$\begin{aligned} \int \nolimits _\Omega \rho (t)dx = \int \nolimits _\Omega \rho ^0 dx, \quad \int \nolimits _\Omega E(t)dx = \int \nolimits _\Omega E^0 dx\quad \text{ for } t\ge 0. \end{aligned}$$

In the theorem, we denote by \(X'\) the dual space of the Banach space X.

The paper is organized as follows. Equations (1) are formally derived from a relaxation-time kinetic model in Sect. 2, while the proof of Theorem 1 is presented in Sect. 3.

2 Formal derivation from a kinetic model

We consider a gas which is rarefied enough such that collisions between gas particles can be neglected, but there are thermalizing collisions at a fixed rate with a nonmoving background. This is modeled by sampling post-collisional velocities from a Maxwellian distribution with zero mean velocity and with the background temperature, which is determined from the assumptions of energy conservation as well as heat transport in the background governed by the Fourier law. These assumptions lead to the equations

$$\begin{aligned} \varepsilon ^2 \partial _t f_\varepsilon + \varepsilon v\cdot \nabla f_\varepsilon&= \rho _\varepsilon M(\theta _\varepsilon ) - f_\varepsilon , \end{aligned}$$
(11)
$$\begin{aligned} \varepsilon ^2 (\partial _t \theta _\varepsilon - \Delta \theta _\varepsilon )&= \frac{1}{2}\int \nolimits _{{{\mathbb {R}}}^3} |v|^2 (f_\varepsilon -\rho _\varepsilon M(\theta _\varepsilon ))dv, \end{aligned}$$
(12)

which are written in dimensionless form with a diffusive macroscopic scaling with the scaled Knudsen number \(0<\varepsilon \ll 1\). The gas is described by the distribution function \(f_\varepsilon (x,v,t)\) with the velocity \(v\in {{\mathbb {R}}}^3\), and the temperature of the background is \(\theta _\varepsilon (x,t)\). The gradient and Laplace operators are meant with respect to the position variable x, and the Maxwellian is given by

$$\begin{aligned} M(\theta ;v) = \frac{1}{(2\pi \theta )^{3/2}} \exp \bigg (-\frac{|v|^2}{2\theta }\bigg ). \end{aligned}$$
(13)

Finally, the position density of the gas is defined by

$$\begin{aligned} \rho _\varepsilon (x,t) = \int \nolimits _{{{\mathbb {R}}}^3} f_\varepsilon (x,v,t)dv. \end{aligned}$$

The right-hand side of the heat equation (12) has been chosen such that the sum of the kinetic energy of the gas and the thermal energy of the background is conserved. In [12], the energy-transport system (1) has been derived formally from (11)–(12) in the macroscopic limit \(\varepsilon \rightarrow 0\). We repeat the argument here for completeness.

In the computations, the moments of the Maxwellian up to order 4 will be needed:

$$\begin{aligned} \begin{aligned}&\int \nolimits _{{{\mathbb {R}}}^3} M(\theta ;v)dv = 1,\quad \int \nolimits _{{{\mathbb {R}}}^3} vM(\theta ;v)dv = \int \nolimits _{{{\mathbb {R}}}^3} v|v|^2M(\theta ;v)dv = 0, \\&\int \nolimits _{{{\mathbb {R}}}^3} v_i v_j M(\theta ;v)dv = \theta \delta _{ij},\quad \int \nolimits _{{{\mathbb {R}}}^3} v_i v_j|v|^2 M(\theta ;v)dv = 5\theta ^2\delta _{ij}, \end{aligned} \end{aligned}$$
(14)

where \(v_i\), \(v_j\) denote the components of v (\(i,j=1,2,3\)). From (11)–(12), the local conservation laws for mass and energy,

$$\begin{aligned}&\partial _t \rho _\varepsilon + {\text {div}}\bigg (\frac{1}{\varepsilon }\int \nolimits _{{{\mathbb {R}}}^3} vf_\varepsilon dv\bigg ) = 0,\\&\partial _t \bigg (\theta _\varepsilon + \frac{1}{2}\int \nolimits _{{{\mathbb {R}}}^3} |v|^2 f_\varepsilon dv\bigg ) + {\text {div}}\bigg ( \frac{1}{2\varepsilon }\int \nolimits _{{{\mathbb {R}}}^3} v|v|^2 f_\varepsilon dv - \nabla \theta _\varepsilon \bigg ) = 0, \end{aligned}$$

can be derived by integration of (11) with respect to v and, respectively, by integration of (11) against \(|v|^2/2\) and adding to (12).

In a formal convergence analysis, we assume \(f_\varepsilon \rightarrow f\), \(\rho _\varepsilon \rightarrow \rho \), and \(\theta _\varepsilon \rightarrow \theta \) as \(\varepsilon \rightarrow 0\) and deduce from (11) that \(f=\rho M(\theta )\). With (14), we obtain for the kinetic energy density

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \frac{1}{2}\int \nolimits _{{{\mathbb {R}}}^3} |v|^2 f_\varepsilon dv = \frac{3}{2}\rho \theta . \end{aligned}$$
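
The moments in (14) used here and below reduce to one-dimensional Gaussian moments. Under \(M(\theta ;\cdot )\), the components \(v_1\), \(v_2\), \(v_3\) are independent and centered, with second moment \(\theta \) and fourth moment \(3\theta ^2\). As a short check:

$$\begin{aligned} \int \nolimits _{{{\mathbb {R}}}^3} |v|^2 M(\theta ;v)dv&= \sum _{k=1}^3\int \nolimits _{{{\mathbb {R}}}^3} v_k^2 M(\theta ;v)dv = 3\theta , \\ \int \nolimits _{{{\mathbb {R}}}^3} v_iv_j|v|^2 M(\theta ;v)dv&= \delta _{ij}\bigg (\int \nolimits _{{{\mathbb {R}}}^3} v_i^4 M(\theta ;v)dv + \sum _{k\ne i}\int \nolimits _{{{\mathbb {R}}}^3} v_i^2v_k^2 M(\theta ;v)dv\bigg ) = \delta _{ij}(3\theta ^2 + 2\theta ^2) = 5\theta ^2\delta _{ij}. \end{aligned}$$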

The limit of the mass flux is obtained by multiplication of (11) by \(v/\varepsilon \), integration with respect to v, and passing to the limit, using again (14):

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \bigg (\frac{1}{\varepsilon }\int \nolimits _{{{\mathbb {R}}}^3} vf_\varepsilon dv\bigg )&= -\int \nolimits _{{{\mathbb {R}}}^3} v (v\cdot \nabla (\rho M(\theta ;v)))dv \\&= -{\text {div}}\bigg ( \rho \int \nolimits _{{{\mathbb {R}}}^3} v\otimes v M(\theta ;v)dv\bigg ) = - \nabla (\rho \theta ). \end{aligned}$$

Analogously, we compute the flux of the kinetic energy,

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \bigg (\frac{1}{2\varepsilon }\int \nolimits _{{{\mathbb {R}}}^3} v_i|v|^2 f_\varepsilon dv\bigg )&= -\frac{1}{2}\int \nolimits _{{{\mathbb {R}}}^3} v_i|v|^2 (v\cdot \nabla (\rho M(\theta ;v)))dv \\&= - \frac{1}{2}\sum _{j=1}^3\frac{\partial }{\partial x_j}\bigg ( \rho \int \nolimits _{{{\mathbb {R}}}^3} v_iv_j |v|^2 M(\theta ;v)dv\bigg ) = - \frac{5}{2}\frac{\partial }{\partial x_i}(\rho \theta ^2) \end{aligned}$$

for \(i=1,2,3\). Using these results in the limits of the conservation laws leads to (1).
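
Explicitly, inserting the limits of the kinetic energy density, the mass flux, and the kinetic energy flux into the two conservation laws gives, at the formal level,

$$\begin{aligned} \partial _t\rho - {\text {div}}\nabla (\rho \theta ) = 0, \qquad \partial _t\bigg (\theta + \frac{3}{2}\rho \theta \bigg ) - {\text {div}}\bigg (\frac{5}{2}\nabla (\rho \theta ^2) + \nabla \theta \bigg ) = 0, \end{aligned}$$

which is (1) with \(E=\theta +\frac{3}{2}\rho \theta \).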

3 Proof of Theorem 1

We approximate Eqs. (1) in the following way. The time derivative is replaced by the implicit Euler discretization with parameter \(\tau >0\). This is needed to avoid issues related to the time regularity. A higher-order \(H^2(\Omega )\) regularization for \(\phi =\partial h/\partial \rho \) in the mass balance equation with parameter \(\varepsilon >0\) gives \(H^2(\Omega )\) regularity and compactness in \(W^{1,4}(\Omega )\). Furthermore, \(H^2(\Omega )\) and \(W^{1,4}(\Omega )\) regularizations for \(\log \theta \) with the same parameter are added to the energy balance equation. The \(W^{1,4}(\Omega )\) regularization is needed to derive estimates when using both \(\log \theta \) and \(-1/\theta \) as test functions in (1). In addition, we add an \(H^1(\Omega )\) regularization for \(\phi \) in the mass balance equation with parameter \(\delta >0\), which removes the degeneracy of the diffusion matrix B in (4). Finally, we add the artificial heat flux \(\Delta \theta ^{3}\) in the energy density equation with the same parameter \(\delta \) to obtain gradient estimates for the temperature, and we add the term \(\theta ^{-N}\log \theta \) for some \(N>0\) to achieve an estimate for \(\theta ^{-(N+1)}\).

After having proved the existence of solutions to the approximate problem and some a priori estimates coming from the entropy inequality, we perform the limits \(\varepsilon \rightarrow 0\), \(\tau \rightarrow 0\), and \(\delta \rightarrow 0\) (in this order).

3.1 Solution of the approximate problem

We wish to solve a system which approximates (1) and is formulated in the variables \(\phi \) and \(w=\log \theta \), similarly as in (4). We interpret \(\rho \) and \(E=\theta (1+\frac{3}{2}\rho )\) as functions of \((\phi ,w)\), i.e.

$$\begin{aligned} \rho (\phi ,w) = \exp \bigg (\phi +\frac{3}{2}w-\frac{5}{2}\bigg ), \quad E(\phi ,w) = \bigg (1+\frac{3}{2}\rho (\phi ,w)\bigg )\exp (w). \end{aligned}$$

In this notation, the diffusion coefficients \(M_{ij}\), i.e. the entries of the diffusion matrix in (4), become

$$\begin{aligned} M_{11} = \rho e^w, \quad M_{12} = \frac{5}{2}\rho e^{2w}, \quad M_{22} = e^{2w}\bigg (1+\frac{35}{4}\rho e^w\bigg ). \end{aligned}$$
(15)

Let \(T>0\) and let the approximation parameters \(\tau >0\) (such that \(T/\tau \in {{\mathbb {N}}}\)), \(\varepsilon >0\), and \(\delta >0\) be given. Furthermore, let \(0<N<5\) be a number needed for the approximation \(\theta ^{-N}\log \theta \) in the energy balance equation.

We wish to find \((\phi ^k,w^k)\in H^2(\Omega ;{{\mathbb {R}}}^2)\) such that, with \(\rho ^k=\rho (\phi ^k,w^k)\), \(E^k=E(\phi ^k,w^k)\),

$$\begin{aligned} 0&= \frac{1}{\tau }\int \nolimits _\Omega (\rho ^k-\rho ^{k-1})\psi _1 dx + \int \nolimits _\Omega (M_{11}^k\nabla \phi ^k + M_{12}^ke^{-w^k}\nabla w^k)\cdot \nabla \psi _1 dx \nonumber \\&\quad + \varepsilon \int \nolimits _\Omega D^2\phi ^k:D^2\psi _1 dx + \delta \int \nolimits _\Omega (\nabla \phi ^k\cdot \nabla \psi _1 + \phi ^k\psi _1) dx, \end{aligned}$$
(16)
$$\begin{aligned} 0&= \frac{1}{\tau }\int \nolimits _\Omega (E^k-E^{k-1})\psi _2 dx + \int \nolimits _\Omega (M_{12}^k\nabla \phi ^k + M_{22}^ke^{-w^k}\nabla w^k)\cdot \nabla \psi _2 dx \nonumber \\&\quad + \varepsilon \int \nolimits _\Omega e^{w^k}\big (D^2 w^k:D^2\psi _2 + |\nabla w^k|^2\nabla w^k\cdot \nabla \psi _2\big )dx + \varepsilon \int \nolimits _\Omega (1+e^{w^k})w^k\psi _2 dx \nonumber \\&\quad + \delta \int \nolimits _\Omega e^{3w^k}\nabla w^k\cdot \nabla \psi _2 dx + \delta \int \nolimits _\Omega e^{-N w^k}w^k\psi _2 dx \end{aligned}$$
(17)

for all \((\psi _1,\psi _2)\in H^2(\Omega ;{{\mathbb {R}}}^2)\), and \(M_{ij}^k\) are given by (15) with \((\rho ,w)\) replaced by \((\rho ^k,w^k)\). The existence of solutions to (16)–(17) is shown in two steps.

Step 1: solution of the linearized approximated problem. In the following, we drop the superindex k. Let \(({{\widetilde{\phi }}},{\widetilde{w}})\in W^{1,4}(\Omega ;{{\mathbb {R}}}^2)\) be given and set \({{\widetilde{\rho }}}=\rho ({{\widetilde{\phi }}},{\widetilde{w}})\), \({\widetilde{E}}=E({\widetilde{\phi }},{\widetilde{w}})\). We wish to find \((\phi ,w)\in H^2(\Omega ;{{\mathbb {R}}}^2)\) such that

$$\begin{aligned} a_1(\phi ,\psi _1) = \sigma F_1(\psi _1), \quad a_2(w,\psi _2) = \sigma F_2(\psi _2) \end{aligned}$$
(18)

for all \((\psi _1,\psi _2)\in H^2(\Omega ;{{\mathbb {R}}}^2)\), where \(\sigma \in [0,1]\) and

$$\begin{aligned} a_1(\phi ,\psi _1)&= \varepsilon \int \nolimits _\Omega D^2\phi :D^2\psi _1 dx + \delta \int \nolimits _\Omega (\nabla \phi \cdot \nabla \psi _1 + \phi \psi _1) dx, \\ a_2(w,\psi _2)&= \varepsilon \int \nolimits _\Omega e^{{\widetilde{w}}}\big (D^2 w:D^2\psi _2 + |\nabla {\widetilde{w}}|^2\nabla w\cdot \nabla \psi _2\big )dx + \varepsilon \int \nolimits _\Omega (1+e^{{\widetilde{w}}})w\psi _2 dx \\&\quad + \delta \int \nolimits _\Omega e^{3{\widetilde{w}}}\nabla w \cdot \nabla \psi _2 dx + \delta \int \nolimits _\Omega e^{-N \widetilde{w}}w\psi _2 dx, \\ F_1(\psi _1)&= -\frac{1}{\tau }\int \nolimits _\Omega ({\widetilde{\rho }}-\rho ^{k-1})\psi _1 dx - \int \nolimits _\Omega ({\widetilde{M}}_{11}\nabla {\widetilde{\phi }} + {\widetilde{M}}_{12} e^{-{\widetilde{w}}}\nabla {\widetilde{w}})\cdot \nabla \psi _1 dx, \\ F_2(\psi _2)&= -\frac{1}{\tau }\int \nolimits _\Omega ({\widetilde{E}}-E^{k-1})\psi _2 dx - \int \nolimits _\Omega ({\widetilde{M}}_{12}\nabla {\widetilde{\phi }} + {\widetilde{M}}_{22} e^{-{\widetilde{w}}}\nabla {\widetilde{w}})\cdot \nabla \psi _2 dx, \end{aligned}$$

where \(\widetilde{M}_{ij}\) is given by (15) with \((\rho ,w)\) replaced by \(({\widetilde{\rho }},{\widetilde{w}})\). The bilinear forms \(a_1\) and \(a_2\) are coercive on \(H^2(\Omega )\) since, by the generalized Poincaré inequality [18, Chap. 2, Sect. 1.4],

$$\begin{aligned} a_1(\phi ,\phi )&= \varepsilon \int \nolimits _\Omega |D^2\phi |^2 dx + \delta \int \nolimits _\Omega (|\nabla \phi |^2 + \phi ^2) dx \ge \min \{\varepsilon ,\delta \}\Vert \phi \Vert _{H^2(\Omega )}^2, \\ a_2(w,w)&\ge \varepsilon \int \nolimits _\Omega (C|D^2 w|^2 + w^2)dx \ge \varepsilon C\Vert w\Vert _{H^2(\Omega )}^2 \end{aligned}$$

for some constant \(C>0\). The linear forms \(F_1\) and \(F_2\) are continuous on \(H^2(\Omega )\) since, by the continuous embedding \(W^{1,4}(\Omega )\hookrightarrow L^\infty (\Omega )\), \({\widetilde{\phi }}\) and \({\widetilde{w}}\) are \(L^\infty (\Omega )\) functions such that \({\widetilde{\rho }}\), \({\widetilde{E}}\in L^\infty (\Omega )\) too. The Lax–Milgram lemma implies the existence of a unique solution \((\phi ,w)\) to (18) such that \(\rho =\rho (\phi ,w)>0\) and \(E=E(\phi ,w)>0\). This defines the fixed-point operator \(S:W^{1,4}(\Omega ;{{\mathbb {R}}}^2)\times [0,1]\rightarrow W^{1,4}(\Omega ;{{\mathbb {R}}}^2)\), \(S({\widetilde{\phi }},{\widetilde{w}},\sigma )=(\phi ,w)\), where \((\phi ,w)\) solves (18).

Step 2: solution of the approximate problem. We wish to apply the Leray–Schauder fixed-point theorem. It holds that \(S({\widetilde{\phi }},{\widetilde{w}},0)=0\). Standard arguments show that \(S:W^{1,4}(\Omega ;{{\mathbb {R}}}^2)\times [0,1]\rightarrow H^2(\Omega ;{{\mathbb {R}}}^2)\) is continuous. Since \(H^2(\Omega ;{{\mathbb {R}}}^2)\) is compactly embedded into \(W^{1,4}(\Omega ;{{\mathbb {R}}}^2)\), \(S:W^{1,4}(\Omega ;{{\mathbb {R}}}^2)\times [0,1]\rightarrow W^{1,4}(\Omega ;{{\mathbb {R}}}^2)\) is compact. It remains to show that there exists a uniform bound in \(W^{1,4}(\Omega ;{{\mathbb {R}}}^2)\) for all fixed points.

Let \(\sigma \in (0,1]\) and let \((\phi ,w)\) be a fixed point of \(S(\cdot ,\cdot ,\sigma )\). By (18), it is a solution to (16)–(17) with \(\phi =\phi ^k\), \(w=w^k\), \(\rho =\rho ^k\), and \(E=E^k\), where the first two integrals in each equation are multiplied by \(\sigma \). We use the test functions \(\psi _1=\phi \) and \(\psi _2=1-e^{-w}\) in (16) and (17), respectively, and add both equations. (We use \(1-e^{-w}\) instead of \(-e^{-w}\) as a test function in order to be able to treat the term \(\varepsilon \int \nolimits _\Omega (1+e^w)w\psi _2 dx\) and to obtain the entropy and energy balance in one single equation.) Then

$$\begin{aligned} 0&= \frac{\sigma }{\tau }\int \nolimits _\Omega \big ((\rho -\rho ^{k-1})\phi + (E-E^{k-1})(1-e^{-w})\big )dx \nonumber \\&\quad + \sigma \int \nolimits _\Omega \big (M_{11}|\nabla \phi |^2 + 2M_{12}e^{-w}\nabla \phi \cdot \nabla w + M_{22}e^{-2w}|\nabla w|^2\big )dx \nonumber \\&\quad + \varepsilon \int \nolimits _\Omega e^w\big (D^2w:D^2(-e^{-w}) + e^{-w}|\nabla w|^4\big )dx + \delta \int \nolimits _\Omega e^{2w} |\nabla w|^2 dx \nonumber \\&\quad + \varepsilon \int \nolimits _\Omega |D^2\phi |^2 dx + \delta \int \nolimits _\Omega (|\nabla \phi |^2+\phi ^2) dx + \varepsilon \int \nolimits _\Omega (1+e^w)w(1-e^{-w})dx\nonumber \\&\quad + \delta \int \nolimits _\Omega e^{-(N+1) w}w(e^{w}-1) dx \nonumber \\&=: I_1+\cdots +I_8. \end{aligned}$$
(19)

To estimate the first integral \(I_1\), we use the entropy density (8), formulated in terms of the variables \((\rho ,E)\),

$$\begin{aligned} h(\rho ,\theta ) = {\widetilde{h}}(\rho ,E) = \rho \log \rho - \bigg (1+\frac{3}{2}\rho \bigg )\log \frac{E}{1+\frac{3}{2}\rho }. \end{aligned}$$

The function \({\widetilde{h}}\) in the variables \((\rho ,E)\) is convex, since its Hessian,

$$\begin{aligned} D^2{\widetilde{h}}(\rho ,E) = \begin{pmatrix} \frac{1}{\rho } + \frac{9}{4}(1+\frac{3}{2}\rho )^{-1} &{} -\frac{3}{2}\frac{1}{E} \\ -\frac{3}{2}\frac{1}{E} &{} (1+\frac{3}{2}\rho )E^{-2} \end{pmatrix} \end{aligned}$$

has positive diagonal entries and its determinant equals \((1+\frac{3}{2}\rho )/(\rho E^{2})\), which is positive. This implies that

$$\begin{aligned} {\widetilde{h}}(\rho _1,E_1)-{\widetilde{h}}(\rho _2,E_2) \le D{\widetilde{h}}(\rho _1,E_1)\cdot \begin{pmatrix} \rho _1-\rho _2 \\ E_1-E_2 \end{pmatrix} = (\rho _1-\rho _2)\phi + (E_1-E_2)(-e^{-w}) \end{aligned}$$

for any \((\rho _1,E_1)\), \((\rho _2,E_2)>0\), where \(\phi \) and w are evaluated at \((\rho _1,E_1)\), and consequently,

$$\begin{aligned} I_1 \ge \frac{\sigma }{\tau }\int \nolimits _\Omega \big ({\widetilde{h}}(\rho ,E) -{\widetilde{h}}(\rho ^{k-1},E^{k-1})\big )dx + \frac{\sigma }{\tau }\int \nolimits _\Omega (E-E^{k-1})dx. \end{aligned}$$
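
The derivatives of \({\widetilde{h}}\) used above can be verified as follows. This is a short check based on (8), written as \({\widetilde{h}}=\rho \log \rho -(1+\frac{3}{2}\rho )\log \theta \) with \(\theta =E/(1+\frac{3}{2}\rho )\):

$$\begin{aligned} \frac{\partial {\widetilde{h}}}{\partial \rho }&= \log \rho + 1 - \frac{3}{2}\log E + \frac{3}{2}\log \bigg (1+\frac{3}{2}\rho \bigg ) + \frac{3}{2} = \log \frac{\rho }{\theta ^{3/2}} + \frac{5}{2} = \phi , \\ \frac{\partial {\widetilde{h}}}{\partial E}&= -\frac{1+\frac{3}{2}\rho }{E} = -\frac{1}{\theta } = -e^{-w}, \\ \det D^2{\widetilde{h}}&= \bigg (\frac{1}{\rho } + \frac{9}{4}\frac{1}{1+\frac{3}{2}\rho }\bigg )\frac{1+\frac{3}{2}\rho }{E^2} - \frac{9}{4E^2} = \frac{1+\frac{3}{2}\rho }{\rho E^2}. \end{aligned}$$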

The second integral \(I_2\) is nonnegative since

$$\begin{aligned} M_{11}&|\nabla \phi |^2 + 2M_{12}e^{-w}\nabla \phi \cdot \nabla w + M_{22}e^{-2w}|\nabla w|^2 \\&= \rho e^w|\nabla \phi |^2 + 5\rho e^w\nabla \phi \cdot \nabla w + \bigg (1+\frac{35}{4}\rho e^w\bigg )|\nabla w|^2 \\&= \rho e^w\bigg (\frac{1}{8}|\nabla \phi |^2 + \frac{7}{8}\bigg |\nabla \phi +\frac{20}{7}\nabla w\bigg |^2 + \frac{45}{28}|\nabla w|^2\bigg ) + |\nabla w|^2 \\&\ge \frac{1}{8}\rho e^w|\nabla \phi |^2 + \big (1+\rho e^w\big )|\nabla w|^2. \end{aligned}$$
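
The completion of the square can be verified by expanding the middle term:

$$\begin{aligned} \frac{7}{8}\bigg |\nabla \phi +\frac{20}{7}\nabla w\bigg |^2 = \frac{7}{8}|\nabla \phi |^2 + 5\nabla \phi \cdot \nabla w + \frac{50}{7}|\nabla w|^2, \qquad \frac{50}{7}+\frac{45}{28} = \frac{245}{28} = \frac{35}{4}. \end{aligned}$$

The final inequality discards the nonnegative square and uses \(\frac{45}{28}\ge 1\).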

The integrals \(I_3\), \(I_7\), and \(I_8\) are estimated according to

$$\begin{aligned} I_3&= \frac{\varepsilon }{2}\int \nolimits _\Omega \big (|D^2 w|^2 + |D^2 w - \nabla w\otimes \nabla w|^2 + |\nabla w|^4\big )dx \ge \frac{\varepsilon }{2}\int \nolimits _\Omega \big (|D^2 w|^2 + |\nabla w|^4\big )dx, \\ I_7&= 2\varepsilon \int \nolimits _\Omega w\sinh (w)dx \ge \varepsilon \int \nolimits _\Omega w^2 dx, \\ I_8&= \delta \int \nolimits _\Omega e^{-(N+1)w}w(e^w-1)dx \ge \delta \int \nolimits _\Omega e^{-(N+1)w}1_{\{w<-2\}}dx \\&= \delta \int \nolimits _\Omega e^{-(N+1)w}dx - \delta \int \nolimits _{\{w\ge -2\}}e^{-(N+1)w}dx \\&\ge \delta \int \nolimits _\Omega e^{-(N+1)w}dx - \delta e^{2(N+1)}{\text {meas}}(\Omega ). \end{aligned}$$
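
These estimates rest on three pointwise facts, which we record as a short check. First, \(D^2(-e^{-w})=e^{-w}(D^2w-\nabla w\otimes \nabla w)\) and \(|\nabla w\otimes \nabla w|^2=|\nabla w|^4\), so that the integrand of \(I_3\) satisfies

$$\begin{aligned} e^w\big (D^2w:D^2(-e^{-w}) + e^{-w}|\nabla w|^4\big )&= |D^2w|^2 - D^2w:(\nabla w\otimes \nabla w) + |\nabla w|^4 \\&= \frac{1}{2}\big (|D^2 w|^2 + |D^2 w - \nabla w\otimes \nabla w|^2 + |\nabla w|^4\big ). \end{aligned}$$

Second, \((1+e^w)(1-e^{-w})=e^w-e^{-w}=2\sinh (w)\) and \(w\sinh (w)\ge w^2\) for all \(w\in {{\mathbb {R}}}\). Third, \(w(e^w-1)\ge 0\) for all \(w\in {{\mathbb {R}}}\), and

$$\begin{aligned} w(e^w-1) = |w|(1-e^{w}) \ge 2(1-e^{-2}) \ge 1 \quad \text{ for } w\le -2. \end{aligned}$$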

Therefore, we obtain from (19)

$$\begin{aligned} \frac{\sigma }{\tau }&\int \nolimits _\Omega \big ({\widetilde{h}}(\rho ,E)+E\big )dx + \sigma \int \nolimits _\Omega \bigg \{\frac{1}{8} \rho e^w|\nabla \phi |^2 + \big (1+\rho e^w\big )|\nabla w|^2\bigg \} dx \nonumber \\&\quad + \varepsilon \int \nolimits _\Omega |D^2\phi |^2 dx + \frac{\varepsilon }{2}\int \nolimits _\Omega (|D^2 w|^2+|\nabla w|^4+w^2)dx \nonumber \\&\quad + \delta \int \nolimits _\Omega e^{2w}|\nabla w|^2 dx + \delta \int \nolimits _\Omega (|\nabla \phi |^2+\phi ^2) dx + \delta \int \nolimits _\Omega e^{-(N+1) w}dx\nonumber \\&\le \frac{\sigma }{\tau }\int \nolimits _\Omega \big ({\widetilde{h}}(\rho ^{k-1},E^{k-1}) +E^{k-1}\big )dx + C\delta , \end{aligned}$$
(20)

where \(C>0\) is here and in the following a generic constant independent of \(\tau \), \(\varepsilon \), and \(\delta \). This gives a uniform \(H^2(\Omega )\) estimate for \(\phi \) and w, independent of \(\sigma \) (but depending on \(\varepsilon \) and \(\delta \)), and hence the desired uniform estimate for \((\phi ,w)\) in \(W^{1,4}(\Omega ;{{\mathbb {R}}}^2)\). By the Leray–Schauder fixed-point theorem, there exists a solution \((\phi ^k,w^k):=(\phi ,w)\in H^2(\Omega ;{{\mathbb {R}}}^2)\) to (16)–(17) with \(\sigma =1\), \(\rho ^k=\rho (\phi ^k,w^k)\), and \(E^k=E(\phi ^k,w^k)\). Moreover, this solution satisfies (20) with \(\sigma =1\).

We reformulate Eqs. (16)–(17) by inserting definition (15) of the diffusion coefficients and computing (we drop the superindex k)

$$\begin{aligned} M_{11}\nabla \phi + M_{12}e^{-w}\nabla w&= \rho \theta \nabla \bigg (\log \rho -\frac{3}{2}\log \theta \bigg ) + \frac{5}{2}\rho \nabla \theta = \nabla (\rho \theta ), \\ M_{12}\nabla \phi + M_{22}e^{-w}\nabla w&= \frac{5}{2}\rho \theta ^2\nabla \bigg (\log \rho -\frac{3}{2}\log \theta \bigg ) + \bigg (1+\frac{35}{4}\rho \theta \bigg )\nabla \theta \\&= \nabla \bigg (\theta + \frac{5}{2}\rho \theta ^2\bigg ). \end{aligned}$$

Therefore, \((\phi ^k,\rho ^k,\theta ^k,w^k)\) solves

$$\begin{aligned} 0&= \frac{1}{\tau }\int \nolimits _\Omega (\rho ^k-\rho ^{k-1})\psi _1 dx + \int \nolimits _\Omega \nabla (\rho ^k\theta ^k)\cdot \nabla \psi _1 dx + \varepsilon \int \nolimits _\Omega D^2\phi ^k:D^2\psi _1 dx \nonumber \\&\quad + \delta \int \nolimits _\Omega (\nabla \phi ^k\cdot \nabla \psi _1 + \phi ^k\psi _1) dx, \end{aligned}$$
(21)
$$\begin{aligned} 0&= \frac{1}{\tau }\int \nolimits _\Omega (E^k-E^{k-1})\psi _2 dx + \int \nolimits _\Omega \nabla \bigg (\theta ^k+\frac{5}{2}\rho ^k(\theta ^k)^2\bigg )\cdot \nabla \psi _2 dx \nonumber \\&\quad + \delta \int \nolimits _\Omega e^{3w^k}\nabla w^k\cdot \nabla \psi _2 dx + \varepsilon \int \nolimits _\Omega (1+e^{w^k})w^k\psi _2 dx \nonumber \\&\quad + \varepsilon \int \nolimits _\Omega e^{w^k}\big (D^2 w^k:D^2\psi _2 + |\nabla w^k|^2\nabla w^k\cdot \nabla \psi _2\big )dx +\delta \int \nolimits _\Omega e^{-N w^k}w^k\psi _2 dx \end{aligned}$$
(22)

for test functions \(\psi _1\), \(\psi _2\in H^2(\Omega )\).

3.2 Uniform estimates

Set \(\theta ^{k-1}=\exp (w^{k-1})\) and \(\theta ^k=\exp (w^k)\). In the following, we again drop the superindex k to simplify the notation. We reformulate inequality (20) to obtain gradient estimates for expressions depending on \(\rho \) and \(\theta \). To this end, we estimate the integrand of the second integral in (20):

$$\begin{aligned} \frac{1}{8}&\rho e^w|\nabla \phi |^2 + \big (1+\rho e^w\big )|\nabla w|^2 \\&= |\nabla \log \theta |^2 + \frac{1}{8}\rho \theta \bigg |\frac{\nabla \rho }{\rho } -\frac{3}{2}\frac{\nabla \theta }{\theta }\bigg |^2 + \rho \frac{|\nabla \theta |^2}{\theta } \\&= |\nabla \log \theta |^2 + \frac{1}{8}\rho \theta \bigg (\frac{|\nabla \rho |^2}{\rho ^2} - 3\frac{\nabla \rho }{\rho }\cdot \frac{\nabla \theta }{\theta } + \frac{41}{4}\frac{|\nabla \theta |^2}{\theta ^2} \bigg ) \\&\ge |\nabla \log \theta |^2 + \frac{1}{16}\rho \theta \bigg (\frac{|\nabla \rho |^2}{\rho ^2} + \frac{|\nabla \theta |^2}{\theta ^2}\bigg ) \\&\ge |\nabla \log \theta |^2 + \frac{1}{32}\rho \theta \bigg (\frac{|\nabla \rho |^2}{\rho ^2} + \frac{|\nabla \theta |^2}{\theta ^2}\bigg ) \\&\quad + \frac{1}{32}\big |\sqrt{\theta }\nabla \sqrt{\rho } + \sqrt{\rho }\nabla \sqrt{\theta }\big |^2 + \frac{1}{32}\big |\sqrt{\theta }\nabla \sqrt{\rho } - \sqrt{\rho }\nabla \sqrt{\theta }\big |^2 \\&\ge |\nabla \log \theta |^2 + \frac{1}{8}\theta |\nabla \sqrt{\rho }|^2 + \frac{1}{32}\big |\nabla \sqrt{\rho \theta }\big |^2. \end{aligned}$$
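The first inequality in this chain is an elementary statement about a quadratic form. As a sketch, writing \(a=\nabla \rho /\rho \) and \(b=\nabla \theta /\theta \) (a shorthand introduced only for this check), it can be verified by completing the square:

$$\begin{aligned} |a|^2 - 3a\cdot b + \frac{41}{4}|b|^2 - \frac{1}{2}\big (|a|^2+|b|^2\big )&= \frac{1}{2}|a|^2 - 3a\cdot b + \frac{39}{4}|b|^2 \\&= \frac{1}{2}|a-3b|^2 + \frac{21}{4}|b|^2 \ge 0. \end{aligned}$$

Multiplying by \(\frac{1}{8}\rho \theta \ge 0\) gives the factor \(\frac{1}{16}\) in front of \(\rho \theta (|a|^2+|b|^2)\).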

We infer from (20) with \(\sigma =1\) the reformulated discrete entropy inequality

$$\begin{aligned} \frac{1}{\tau }\int \nolimits _\Omega&\big ({\widetilde{h}}(\rho ,E)+E\big )dx + \int \nolimits _\Omega |\nabla \log \theta |^2 dx + \frac{1}{8}\int \nolimits _\Omega \theta |\nabla \sqrt{\rho }|^2 dx + \frac{1}{64}\int \nolimits _\Omega |\nabla \sqrt{\rho \theta }|^2 dx \nonumber \\&\quad + \varepsilon \int \nolimits _\Omega |D^2\phi |^2 dx + \frac{\varepsilon }{2}\int \nolimits _\Omega (|D^2 w|^2 + |\nabla w|^4 + w^2)dx \nonumber \\&\quad + \delta \int \nolimits _\Omega (|\nabla \phi |^2+ \phi ^2) dx + \delta \int \nolimits _\Omega |\nabla e^{w}|^2 dx + \delta \int \nolimits _\Omega e^{-(N+1) w}dx \nonumber \\&\le \frac{1}{\tau }\int \nolimits _\Omega \big ({\widetilde{h}}(\rho ^{k-1},E^{k-1})+E^{k-1}\big )dx + \delta C. \end{aligned}$$
(23)

There exists \(c\in (0,1)\) such that \(x-\log x\ge c(x+|\log x|)\) for all \(x>0\). Therefore,

$$\begin{aligned} {\widetilde{h}}(\rho ,E) + E&= \rho \log \rho - \frac{3}{2}\rho \log \theta - \log \theta + \bigg (1+\frac{3}{2}\rho \bigg )\theta \nonumber \\&= \rho \log \rho + \bigg (1+\frac{3}{2}\rho \bigg )(\theta -\log \theta ) \ge \rho \log \rho + c(1+\rho )(\theta +|\log \theta |). \end{aligned}$$

This provides the following uniform estimates independent of \((\delta ,\varepsilon ,\tau )\):

$$\begin{aligned} \Vert \rho \log \rho \Vert _{L^1(\Omega )} + \Vert \theta \Vert _{L^1(\Omega )} + \Vert \rho \theta \Vert _{L^1(\Omega )} + \Vert \log \theta \Vert _{L^1(\Omega )} \le C. \end{aligned}$$
(24)
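The elementary inequality \(x-\log x\ge c(x+|\log x|)\) used above admits an explicit constant. A possible argument, given here only as a sketch, distinguishes the cases \(0<x\le 1\) and \(x>1\) and uses \(\log x\le x/e\) for \(x>0\):

$$\begin{aligned} 0<x\le 1:&\quad x-\log x - c(x+|\log x|) = (1-c)(x-\log x) \ge 0 \quad \text{ if } c\le 1, \\ x>1:&\quad x-\log x - c(x+\log x) = (1-c)x - (1+c)\log x \ge \bigg (1-c-\frac{1+c}{e}\bigg )x \ge 0 \quad \text{ if } c\le \frac{e-1}{e+1}. \end{aligned}$$

Hence, one admissible choice is \(c=(e-1)/(e+1)\approx 0.46\).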

3.3 Limit \(\varepsilon \rightarrow 0\)

Let \(\phi _\varepsilon =\phi ^k\), \(w_\varepsilon =w^k\) be a solution to (16)–(17). We set \(\rho _\varepsilon =\rho (\phi _\varepsilon ,w_\varepsilon )\), \(E_\varepsilon =E(\phi _\varepsilon ,w_\varepsilon )\), and \(\theta _\varepsilon =\exp (w_\varepsilon )\), so that \(\phi _\varepsilon =\log (\rho _\varepsilon /\theta _\varepsilon ^{3/2})+5/2\). We deduce from (23) and (24) the following bounds, which are independent of \(\varepsilon \) and \(\delta \) (but not of \(\tau \)):

$$\begin{aligned} \Vert \rho _\varepsilon \log \rho _\varepsilon \Vert _{L^1(\Omega )} + \Vert \theta _\varepsilon \Vert _{L^1(\Omega )} + \Vert \rho _\varepsilon \theta _\varepsilon \Vert _{L^1(\Omega )} + \Vert \log \theta _\varepsilon \Vert _{L^1(\Omega )}&\le C, \\ \Vert \sqrt{\theta _\varepsilon }\nabla \sqrt{\rho _\varepsilon }\Vert _{L^2(\Omega )} + \Vert \nabla \sqrt{\rho _\varepsilon \theta _\varepsilon }\Vert _{L^2(\Omega )} + \Vert \nabla \log \theta _\varepsilon \Vert _{L^2(\Omega )}&\le C(\tau ), \\ \sqrt{\varepsilon }\Vert \phi _\varepsilon \Vert _{H^2(\Omega )} + \sqrt{\varepsilon }\Vert w_\varepsilon \Vert _{H^2(\Omega )}&\le C(\tau ), \\ \sqrt{\delta }\Vert \phi _\varepsilon \Vert _{H^1(\Omega )} + \sqrt{\delta }\Vert \nabla \theta _\varepsilon \Vert _{L^2(\Omega )} + \delta \Vert \theta _\varepsilon ^{-(N+1)}\Vert _{L^1(\Omega )}&\le C(\tau ). \end{aligned}$$

These bounds allow us to derive further estimates. By the Poincaré–Wirtinger inequality, we have

$$\begin{aligned} \Vert \theta _\varepsilon \Vert _{L^2(\Omega )}&\le C\Vert \nabla \theta _\varepsilon \Vert _{L^2(\Omega )} + \Vert \theta _\varepsilon \Vert _{L^1(\Omega )} \le C(\tau )\delta ^{-1/2}, \\ \Vert \log \theta _\varepsilon \Vert _{L^2(\Omega )}&\le C\Vert \nabla \log \theta _\varepsilon \Vert _{L^2(\Omega )} + \Vert \log \theta _\varepsilon \Vert _{L^1(\Omega )} \le C(\tau ). \end{aligned}$$

This gives \(\varepsilon \)-uniform bounds for \(\theta _\varepsilon \) and \(\log \theta _\varepsilon \) in \(H^1(\Omega )\):

$$\begin{aligned} \Vert \theta _\varepsilon \Vert _{H^1(\Omega )} \le C(\tau )\delta ^{-1/2}, \quad \Vert \log \theta _\varepsilon \Vert _{H^1(\Omega )} \le C(\tau ). \end{aligned}$$

The \(L^1(\Omega )\) bound for \(\rho _\varepsilon \theta _\varepsilon \) and the \(L^2(\Omega )\) bound for \(\nabla \sqrt{\rho _\varepsilon \theta _\varepsilon }\) imply that

$$\begin{aligned} \Vert \sqrt{\rho _\varepsilon \theta _\varepsilon }\Vert _{H^1(\Omega )} \le C(\tau ). \end{aligned}$$

These estimates provide a uniform bound for the energy. Indeed, we deduce from the Sobolev embedding \(H^1(\Omega )\hookrightarrow L^6(\Omega )\) that \(\nabla (\rho _\varepsilon \theta _\varepsilon )=2\sqrt{\rho _\varepsilon \theta _\varepsilon }\nabla \sqrt{\rho _\varepsilon \theta _\varepsilon }\) is uniformly bounded in \(L^{3/2}(\Omega )\). This shows that \((E_\varepsilon )\) is bounded in \(W^{1,3/2}(\Omega )\).
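The exponent count behind this claim can be sketched as follows, using Hölder's inequality with \(\frac{1}{6}+\frac{1}{2}=\frac{2}{3}\) and the \(H^1(\Omega )\) bound for \(\sqrt{\rho _\varepsilon \theta _\varepsilon }\):

$$\begin{aligned} \Vert \nabla (\rho _\varepsilon \theta _\varepsilon )\Vert _{L^{3/2}(\Omega )}&\le 2\Vert \sqrt{\rho _\varepsilon \theta _\varepsilon }\Vert _{L^6(\Omega )} \Vert \nabla \sqrt{\rho _\varepsilon \theta _\varepsilon }\Vert _{L^2(\Omega )} \le C(\tau ), \\ \Vert \rho _\varepsilon \theta _\varepsilon \Vert _{L^3(\Omega )}&= \Vert \sqrt{\rho _\varepsilon \theta _\varepsilon }\Vert _{L^6(\Omega )}^2 \le C(\tau ). \end{aligned}$$

Together with the \(H^1(\Omega )\) bound for \(\theta _\varepsilon \), which depends on \(\delta \), this yields the bound for \(E_\varepsilon =\theta _\varepsilon +\frac{3}{2}\rho _\varepsilon \theta _\varepsilon \).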

We know that \((\log \theta _\varepsilon )\) and \((\phi _\varepsilon )\) are bounded in \(H^1(\Omega )\). Consequently, \(\log \rho _\varepsilon =\phi _\varepsilon + \frac{3}{2}\log \theta _\varepsilon -\frac{5}{2}\) is bounded in \(H^1(\Omega )\) too, i.e.

$$\begin{aligned} \sqrt{\delta }\Vert \log \rho _\varepsilon \Vert _{H^1(\Omega )} \le C. \end{aligned}$$

The previous uniform bounds are sufficient to perform the limit \(\varepsilon \rightarrow 0\). There exist subsequences which are not relabeled such that, as \(\varepsilon \rightarrow 0\),

$$\begin{aligned} \phi _\varepsilon \rightarrow \phi&\quad \text{ strongly } \text{ in } L^{p}(\Omega ) \text{ and } \text{ weakly } \text{ in } H^1(\Omega ), \\ \log \rho _\varepsilon \rightarrow Y&\quad \text{ strongly } \text{ in } L^{p}(\Omega ) \text{ and } \text{ weakly } \text{ in } H^{1}(\Omega ), \\ \theta _\varepsilon \rightarrow \theta&\quad \text{ strongly } \text{ in } L^{p}(\Omega ) \text{ and } \text{ weakly } \text{ in } H^1(\Omega ), \\ \log \theta _\varepsilon \rightarrow Z&\quad \text{ strongly } \text{ in } L^{p}(\Omega ) \text{ and } \text{ weakly } \text{ in } H^1(\Omega ), \\ \varepsilon \phi _\varepsilon ,\ \varepsilon w_\varepsilon \rightarrow 0&\quad \text{ strongly } \text{ in } H^2(\Omega ),\\ \varepsilon ^{1/3}\nabla w_\varepsilon \rightarrow 0&\quad \text{ strongly } \text{ in } L^4(\Omega ), \end{aligned}$$

where \(1<p<6\) and Y, Z are functions in \(H^1(\Omega )\). Up to a subsequence, we have \(\log \rho _\varepsilon \rightarrow Y\) and \(\log \theta _\varepsilon \rightarrow Z\) a.e. in \(\Omega \). Thus, \(\rho _\varepsilon \rightarrow e^Y=:\rho \) and \(\theta _\varepsilon \rightarrow e^Z=:\theta \) a.e. in \(\Omega \). In particular, \(\rho >0\) and \(\theta >0\) a.e. in \(\Omega \). It follows from

$$\begin{aligned} \int \nolimits _{\{\rho _\varepsilon \ge R\}}\rho _\varepsilon dx \le \frac{1}{\log R}\int \nolimits _{\{\rho _\varepsilon \ge R\}}\rho _\varepsilon \log \rho _\varepsilon dx \le \frac{C}{\log R} \end{aligned}$$

for any \(R>1\) that \((\rho _\varepsilon )\) is equi-integrable. Vitali’s convergence theorem implies that \(\rho _\varepsilon \rightarrow \rho \) strongly in \(L^1(\Omega )\). Furthermore, possibly for a subsequence, \(\sqrt{\rho _\varepsilon \theta _\varepsilon }\rightarrow \sqrt{\rho \theta }\) a.e. in \(\Omega \). The \(H^1(\Omega )\) bound for \((\sqrt{\rho _\varepsilon \theta _\varepsilon })\) then yields

$$\begin{aligned} \sqrt{\rho _\varepsilon \theta _\varepsilon }\rightarrow \sqrt{\rho \theta } \quad \text{ strongly } \text{ in } L^{p}(\Omega ) \text{ and } \text{ weakly } \text{ in } H^1(\Omega ) \text{ and } L^6(\Omega ), \end{aligned}$$

where \(1<p<6\). Furthermore, we have

$$\begin{aligned} E_\varepsilon = \bigg (1+\frac{3}{2}\rho _\varepsilon \bigg )\theta _\varepsilon \rightharpoonup E:=\bigg (1+\frac{3}{2}\rho \bigg )\theta&\quad \text{ weakly } \text{ in } L^3(\Omega ), \\ \nabla (\rho _\varepsilon \theta _\varepsilon )=2\sqrt{\rho _\varepsilon \theta _\varepsilon }\nabla \sqrt{\rho _\varepsilon \theta _\varepsilon } \rightharpoonup \nabla (\rho \theta )&\quad \text{ weakly } \text{ in } L^{3/2}(\Omega ), \\ \nabla (\rho _\varepsilon \theta _\varepsilon ^2) = \rho _\varepsilon \theta _\varepsilon \nabla \theta _\varepsilon + \theta _\varepsilon \nabla (\rho _\varepsilon \theta _\varepsilon ) \rightharpoonup \nabla (\rho \theta ^2)&\quad \text{ weakly } \text{ in } L^{6/5}(\Omega ). \end{aligned}$$
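The integrability exponent \(6/5\) in the last convergence can be checked by a Hölder count. As a sketch, assuming the bounds \(\rho _\varepsilon \theta _\varepsilon \in L^3(\Omega )\), \(\theta _\varepsilon \in H^1(\Omega )\hookrightarrow L^6(\Omega )\), and \(\nabla (\rho _\varepsilon \theta _\varepsilon )\in L^{3/2}(\Omega )\) from above,

$$\begin{aligned} \Vert \rho _\varepsilon \theta _\varepsilon \nabla \theta _\varepsilon \Vert _{L^{6/5}(\Omega )}&\le \Vert \rho _\varepsilon \theta _\varepsilon \Vert _{L^3(\Omega )}\Vert \nabla \theta _\varepsilon \Vert _{L^2(\Omega )}, \qquad \frac{1}{3}+\frac{1}{2}=\frac{5}{6}, \\ \Vert \theta _\varepsilon \nabla (\rho _\varepsilon \theta _\varepsilon )\Vert _{L^{6/5}(\Omega )}&\le \Vert \theta _\varepsilon \Vert _{L^6(\Omega )}\Vert \nabla (\rho _\varepsilon \theta _\varepsilon )\Vert _{L^{3/2}(\Omega )}, \qquad \frac{1}{6}+\frac{2}{3}=\frac{5}{6}. \end{aligned}$$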

We deduce from the strong convergence of \((\phi _\varepsilon )\), \((\rho _\varepsilon )\), and \((\theta _\varepsilon )\) as well as from the a.e. positivity of \(\rho \) and \(\theta \) that \(\phi =\log \rho -\frac{3}{2}\log \theta +\frac{5}{2}\) a.e. in \(\Omega \).

The uniform bounds for \(w_\varepsilon \) are sufficient to pass to the limit \(\varepsilon \rightarrow 0\) in the \(\varepsilon \)-terms,

$$\begin{aligned} \varepsilon D^2\phi _\varepsilon \rightarrow 0&\quad \text{ strongly } \text{ in } L^2(\Omega ), \\ \varepsilon \theta _\varepsilon D^2 w_\varepsilon \rightarrow 0&\quad \text{ strongly } \text{ in } L^1(\Omega ), \\ \varepsilon \theta _\varepsilon |\nabla w_\varepsilon |^2\nabla w_\varepsilon \rightarrow 0&\quad \text{ strongly } \text{ in } L^1(\Omega ), \\ \varepsilon (1+\theta _\varepsilon )w_\varepsilon \rightarrow 0&\quad \text{ strongly } \text{ in } L^2(\Omega ), \end{aligned}$$

as well as in the \(\delta \)-terms. The most difficult term is \(\delta \int \nolimits _\Omega e^{-N w_\varepsilon }w_\varepsilon \psi _2 dx\). It follows from \(\sqrt{\theta }|\log \theta |^{(2N+1)/(2N)}\le C\) for \(\theta \le 1\) and \(\theta ^{-(N+1)}\sqrt{\theta }|\log \theta |^{(2N+1)/(2N)}\le C\) for \(\theta >1\) as well as from (24) that

$$\begin{aligned} \delta \Vert e^{-Nw_\varepsilon }w_\varepsilon \Vert _{L^{(2N+1)/(2N)}(\Omega )}^{(2N+1)/(2N)}&= \delta \int \nolimits _\Omega \theta _\varepsilon ^{-(N+1)}\sqrt{\theta _\varepsilon } |\log \theta _\varepsilon |^{(2N+1)/(2N)}dx \nonumber \\&\le \delta \int \nolimits _\Omega \theta _\varepsilon ^{-(N+1)}dx + \delta C\le C(\tau ). \end{aligned}$$
(25)

Since \(\delta e^{-Nw_\varepsilon }w_\varepsilon \rightarrow \delta \theta ^{-N}\log \theta \) a.e. in \(\Omega \), we conclude that this limit also holds strongly in \(L^1(\Omega )\). Therefore, we can perform the limit \(\varepsilon \rightarrow 0\) in (21)–(22) (now writing the superindex k) leading to

$$\begin{aligned} 0&= \frac{1}{\tau }\int \nolimits _\Omega (\rho ^k-\rho ^{k-1})\psi _1 dx + \int \nolimits _\Omega \nabla (\rho ^k\theta ^k)\cdot \nabla \psi _1 dx + \delta \int \nolimits _\Omega (\nabla \phi ^k\cdot \nabla \psi _1 + \phi ^k\psi _1)dx, \end{aligned}$$
(26)
$$\begin{aligned} 0&= \frac{1}{\tau }\int \nolimits _\Omega (E^k-E^{k-1})\psi _2 dx + \int \nolimits _\Omega \nabla \bigg (\theta ^k + \frac{5}{2}\rho ^k(\theta ^k)^2\bigg )\cdot \nabla \psi _2 dx \nonumber \\&\quad + \delta \int \nolimits _\Omega (\theta ^k)^2\nabla \theta ^k\cdot \nabla \psi _2 dx + \delta \int \nolimits _\Omega (\theta ^k)^{-N}\log (\theta ^k)\psi _2 dx \end{aligned}$$
(27)

for any test functions \(\psi _1\in W^{1,3}(\Omega )\), \(\psi _2\in W^{1,6}(\Omega )\).

3.4 Limit \(\tau \rightarrow 0\)

We introduce the piecewise constant functions in time \(\rho _\tau (x,t) =\rho ^k(x)\), \(\theta _\tau (x,t)=\theta ^k(x)\), \(\phi _\tau (x,t)=\phi ^k(x)\), and \(E_\tau (x,t)=E^k(x)\) for \(x\in \Omega \), \(t\in ((k-1)\tau ,k\tau ]\). Furthermore, let \((\pi _\tau u)(x,t) =u^{k-1}(x)\) for \(x\in \Omega \), \(t\in ((k-1)\tau ,k\tau ]\) be the shift operator for piecewise constant functions u. We reformulate (26)–(27):

$$\begin{aligned} 0&= \frac{1}{\tau }\int \nolimits _0^T\int _\Omega (\rho _\tau -\pi _\tau \rho _\tau )\psi _1 dx dt + \int \nolimits _0^T\int _\Omega \nabla (\rho _\tau \theta _\tau )\cdot \nabla \psi _1 dx dt \nonumber \\&\quad + \delta \int \nolimits _0^T\int _\Omega (\nabla \phi _\tau \cdot \nabla \psi _1 + \phi _\tau \psi _1) dx dt, \end{aligned}$$
(28)
$$\begin{aligned} 0&= \frac{1}{\tau }\int \nolimits _0^T\int _\Omega (E_\tau -\pi _\tau E_\tau )\psi _2 dx dt + \int \nolimits _0^T\int _\Omega \nabla \bigg (\theta _\tau +\frac{5}{2}\rho _\tau \theta _\tau ^2\bigg ) \cdot \nabla \psi _2 dx dt \nonumber \\&\quad + \delta \int \nolimits _0^T\int _\Omega \theta _\tau ^2\nabla \theta _\tau \cdot \nabla \psi _2 dx dt + \delta \int \nolimits _0^T\int _\Omega \theta _\tau ^{-N}\log (\theta _\tau )\psi _2 dx dt \end{aligned}$$
(29)

for piecewise constant test functions in time \(\psi _1\), \(\psi _2\in L^2(0,T;W^{1,6}(\Omega ))\). By density [16, Prop. 1.36], these formulations hold for all test functions in \(L^2(0,T;W^{1,6}(\Omega ))\). We collect the uniform estimates from the discrete entropy inequality (23):

$$\begin{aligned} \Vert \rho _\tau \log \rho _\tau \Vert _{L^\infty (0,T;L^{1}(\Omega ))} + \Vert \theta _\tau \Vert _{L^\infty (0,T;L^1(\Omega ))}&\le C, \end{aligned}$$
(30)
$$\begin{aligned} \Vert \rho _\tau \theta _\tau \Vert _{L^\infty (0,T;L^1(\Omega ))} + \Vert \sqrt{\theta _\tau }\nabla \sqrt{\rho _\tau }\Vert _{L^2(\Omega _T)} + \Vert \sqrt{\rho _\tau \theta _\tau }\Vert _{L^2(0,T;H^1(\Omega ))}&\le C, \end{aligned}$$
(31)
$$\begin{aligned} \Vert \log \theta _\tau \Vert _{L^2(0,T; H^1(\Omega ))} + \Vert \log \theta _\tau \Vert _{L^\infty (0,T;L^1(\Omega ))}&\le C, \end{aligned}$$
(32)
$$\begin{aligned} \sqrt{\delta }\Vert \phi _\tau \Vert _{L^2(0,T;H^1(\Omega ))} + \sqrt{\delta }\Vert \nabla \theta _\tau \Vert _{L^2(\Omega _T)} + \delta \Vert \theta _\tau ^{-(N+1)}\Vert _{L^1(\Omega _T)}&\le C, \end{aligned}$$
(33)

where the constant \(C>0\) does not depend on \(\tau \) or \(\delta \). In the following, we show some additional estimates for \((\rho _\tau ,\theta _\tau )\).

Lemma 2

(Mass and energy control). It holds for any \(t\in (0,T)\) that

$$\begin{aligned} \bigg |\int \nolimits _\Omega \rho _\tau (t)dx - \int \nolimits _\Omega \rho ^0 dx\bigg |\le C\delta ^{1/2}, \quad \bigg |\int \nolimits _\Omega E_\tau (t)dx - \int \nolimits _\Omega E^0 dx\bigg | \le C\delta ^{1/(2N+1)}. \end{aligned}$$

Proof

Using \(\psi _1=1\) in (26) and summing over \(k=1,\ldots ,n\) gives

$$\begin{aligned} \int \nolimits _\Omega (\rho ^n-\rho ^0)dx = \sum _{k=1}^n\int \nolimits _\Omega (\rho ^k-\rho ^{k-1})dx = -\tau \delta \sum _{k=1}^n\int \nolimits _\Omega \phi ^k dx, \end{aligned}$$

where \(n\le N\). We infer from bound (33) for \((\phi _\tau )\) that

$$\begin{aligned} \bigg |\int \nolimits _\Omega (\rho _\tau (t)-\rho ^0)dx\bigg | = \delta \bigg |\int \nolimits _0^t\int _\Omega \phi _\tau dxdt\bigg | \le \delta ^{1/2}C. \end{aligned}$$

The second statement follows after choosing \(\psi _2=1\) in (27) and using (25). \(\square \)
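The exponent \(1/(2N+1)\) in the energy estimate can be traced as follows. This is only a sketch, assuming the time-integrated analogue of (25), i.e. \(\delta \Vert \theta _\tau ^{-N}\log \theta _\tau \Vert _{L^q(\Omega _T)}^q\le C\) with the shorthand \(q:=(2N+1)/(2N)\), which follows from (33) in the same way as (25):

$$\begin{aligned} \bigg |\int \nolimits _\Omega (E_\tau (t)-E^0)dx\bigg |&\le \delta \Vert \theta _\tau ^{-N}\log \theta _\tau \Vert _{L^1(\Omega _T)} \le C\delta \Vert \theta _\tau ^{-N}\log \theta _\tau \Vert _{L^{q}(\Omega _T)} \\&\le C\delta \cdot \delta ^{-1/q} = C\delta ^{1-2N/(2N+1)} = C\delta ^{1/(2N+1)}. \end{aligned}$$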

Lemma 3

(Higher integrability). It holds that

$$\begin{aligned} \Vert \theta _\tau \Vert _{L^2(\Omega _T)} + \Vert \rho _\tau ^\alpha \theta _\tau ^\beta \Vert _{L^1(\Omega _T)} + \delta ^{1/4}\Vert \theta _\tau \Vert _{L^4(\Omega _T)}&\le C, \end{aligned}$$

where \((\alpha ,\beta )\in \{(1,2),(1,3),(\frac{3}{2},3),(2,1),(2,2),(2,3)\}\).

Proof

The proof is based on the \(H^{-1}(\Omega )\) method, i.e., we use test functions of the type \((-\Delta )^{-1}\rho _\tau \) and \((-\Delta )^{-1}E_\tau \). More precisely, let \(\Psi _1\), \(\Psi _2\in L^\infty (0,T;H^1(\Omega ))\) be the unique solutions to, respectively,

$$\begin{aligned} \begin{aligned} -\Delta \Psi _1&= \rho _\tau - \fint _\Omega \rho _\tau dx\quad \text{ on } \Omega , \quad \nabla \Psi _1\cdot \nu =0\quad \text{ on } \partial \Omega ,\quad \int \nolimits _\Omega \Psi _1 dx = 0,\\ -\Delta \Psi _2&= E_\tau - \fint _\Omega E_\tau dx \quad \text{ on } \Omega ,\quad \nabla \Psi _2\cdot \nu =0\quad \text{ on } \partial \Omega , \quad \int \nolimits _\Omega \Psi _2 dx = 0, \end{aligned} \end{aligned}$$
(34)

where \(\fint _\Omega udx = {\text {meas}}(\Omega )^{-1}\int \nolimits _\Omega udx\).

Step 1: Uniform bounds for \(\Psi _2\). We use the test function \(\Psi _2\) in the weak formulation of the second equation in (34) and take into account the energy control. Then

$$\begin{aligned} \Vert \nabla \Psi _2\Vert _{L^2(\Omega )}^2 \le C\big (1+\Vert E_\tau \Vert _{L^{6/5}(\Omega )}\big ) \Vert \Psi _2\Vert _{L^6(\Omega )}. \end{aligned}$$

It follows from Sobolev’s embedding and the Poincaré–Wirtinger inequality that

$$\begin{aligned} \Vert \nabla \Psi _2\Vert _{L^2(\Omega )}^2 \le C\big (1+\Vert E_\tau \Vert _{L^{6/5}(\Omega )}\big ) \Vert \nabla \Psi _2\Vert _{L^2(\Omega )} \end{aligned}$$

and so

$$\begin{aligned} \Vert \Psi _2\Vert _{H^1(\Omega )}\le C\big (1+\Vert E_\tau \Vert _{L^{6/5}(\Omega )}\big ). \end{aligned}$$

We proceed by bootstrapping this result. Elliptic regularity for

$$\begin{aligned} -\Delta \Psi _2 + \Psi _2 = E_\tau - \fint _\Omega E_\tau dx + \Psi _2 \quad \text{ in } \Omega \end{aligned}$$

gives (here, we need the boundary regularity \(\partial \Omega \in C^{1,1}\))

$$\begin{aligned} \Vert \Psi _2\Vert _{W^{2,6/5}(\Omega )} \le C\big (1+\Vert E_\tau \Vert _{L^{6/5}(\Omega )} + \Vert \Psi _2\Vert _{L^{6/5}(\Omega )}\big ) \le C\big (1+\Vert E_\tau \Vert _{L^{6/5}(\Omega )}\big ). \end{aligned}$$

Since \((E_\tau )\) is bounded in \(L^{\infty }(0,T;L^1(\Omega ))\), an interpolation shows that

$$\begin{aligned} \Vert E_\tau \Vert _{L^6(0,T;L^{6/5}(\Omega ))}^6&\le \int \nolimits _0^T\Vert E_\tau \Vert _{L^2(\Omega )}^2\Vert E_\tau \Vert _{L^1(\Omega )}^4 dt \\&\le \Vert E_\tau \Vert _{L^\infty (0,T;L^1(\Omega ))}^4\int \nolimits _0^T\Vert E_\tau \Vert _{L^2(\Omega )}^2 dt. \end{aligned}$$
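The exponents in this interpolation can be checked directly. As a sketch, the interpolation parameter \(a\) (a symbol introduced only here) is determined by \(\frac{5}{6}=\frac{a}{1}+\frac{1-a}{2}\), which gives \(a=\frac{2}{3}\), so that

$$\begin{aligned} \Vert E_\tau \Vert _{L^{6/5}(\Omega )} \le \Vert E_\tau \Vert _{L^1(\Omega )}^{2/3}\Vert E_\tau \Vert _{L^2(\Omega )}^{1/3}, \quad \text{ and } \text{ hence } \quad \Vert E_\tau \Vert _{L^{6/5}(\Omega )}^6 \le \Vert E_\tau \Vert _{L^1(\Omega )}^{4}\Vert E_\tau \Vert _{L^2(\Omega )}^{2}. \end{aligned}$$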

We deduce from the embedding \(L^6(0,T;W^{2,6/5}(\Omega ))\hookrightarrow L^6(\Omega _T)\) that

$$\begin{aligned} \Vert \Psi _2\Vert _{L^6(\Omega _T)} \le C\big (1+\Vert E_\tau \Vert _{L^2(\Omega _T)}^{1/3}\big ). \end{aligned}$$
(35)

Step 2: Test functions \(\Psi _1\) and \(\Psi _2\). We choose \(\Psi _1\) and \(\Psi _2\) as test functions in (28) and (29), respectively:

$$\begin{aligned} 0&= \frac{1}{\tau }\int \nolimits _0^T\int _\Omega (\rho _\tau - \pi _\tau \rho _\tau )\Psi _1 dx dt + \int \nolimits _0^T\int _\Omega \rho _\tau \theta _\tau \bigg (\rho _\tau - \fint _\Omega \rho _\tau dx\bigg )dxdt \nonumber \\&\quad + \delta \int \nolimits _0^T\int _\Omega \phi _\tau \bigg (\rho _\tau - \fint _\Omega \rho _\tau dx + \Psi _1\bigg ) dxdt, \end{aligned}$$
(36)
$$\begin{aligned} 0&= \frac{1}{\tau }\int \nolimits _0^T\int _\Omega (E_\tau - \pi _\tau E_\tau )\Psi _2 dx dt + \int \nolimits _0^T\int _\Omega \bigg (\theta _\tau +\frac{5}{2}\rho _\tau \theta _\tau ^2\bigg ) \bigg (E_\tau - \fint _\Omega E_\tau dx \bigg )dxdt \nonumber \\&\quad + \frac{\delta }{3}\int \nolimits _0^T\int _\Omega \theta _\tau ^3 \bigg (E_\tau - \fint _\Omega E_\tau dx \bigg ) dxdt + \delta \int \nolimits _0^T\int _\Omega \theta _\tau ^{-N}\log (\theta _\tau )\Psi _2 dx dt. \end{aligned}$$
(37)

We estimate the first integral in (36). Since \(\Psi _1\) has zero spatial average and \(\nabla \Psi _1\cdot \nu =0\) on \(\partial \Omega \), it follows from (34) that

$$\begin{aligned} \frac{1}{\tau }\int \nolimits _0^T\int _\Omega (\rho _\tau -\pi _\tau \rho _\tau )\Psi _1 dx dt&= \frac{1}{\tau }\int \nolimits _0^T\int _\Omega (\text{ id } - \pi _\tau ) \bigg (\rho _\tau -\fint _\Omega \rho _\tau dx\bigg )\Psi _1 dx dt \\&\quad + \frac{1}{\tau }\int \nolimits _0^T(\text{ id } - \pi _\tau ) \bigg (\fint _\Omega \rho _\tau dx\bigg )\bigg (\int \nolimits _\Omega \Psi _1 dx\bigg ) dt \nonumber \\&= \frac{1}{\tau }\int \nolimits _0^T\int _\Omega \nabla \big ((\text{ id } - \pi _\tau )\Psi _1\big ) \cdot \nabla \Psi _1 dx dt. \end{aligned}$$

The function \(\Psi _1\) is piecewise constant in time. We write \(\Psi _1(x,t)=\Psi _1^k(x)\) for \(x\in \Omega \), \(t\in ((k-1)\tau ,k\tau ]\). Then, using Young’s inequality,

$$\begin{aligned} \frac{1}{\tau }\int \nolimits _0^T&\int _\Omega \nabla \big ((\text{ id } - \pi _\tau )\Psi _1\big ) \cdot \nabla \Psi _1 dx dt = \sum _{k=1}^N\int \nolimits _\Omega \nabla (\Psi _1^k-\Psi _1^{k-1})\cdot \nabla \Psi _1^k dx \\&\ge \frac{1}{2}\sum _{k=1}^N\int \nolimits _\Omega \big (|\nabla \Psi _1^k|^2-|\nabla \Psi _1^{k-1}|^2\big )dx = \frac{1}{2}\int \nolimits _\Omega \big (|\nabla \Psi _1^N|^2-|\nabla \Psi _1^0|^2\big )dx. \end{aligned}$$
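The inequality in this step can alternatively be read off from an algebraic identity. As a sketch, for vectors \(a=\nabla \Psi _1^k\) and \(b=\nabla \Psi _1^{k-1}\) (shorthand used only here),

$$\begin{aligned} (a-b)\cdot a = \frac{1}{2}\big (|a|^2-|b|^2\big ) + \frac{1}{2}|a-b|^2 \ge \frac{1}{2}\big (|a|^2-|b|^2\big ), \end{aligned}$$

and the sum over k telescopes.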

We conclude that

$$\begin{aligned} \frac{1}{\tau }\int \nolimits _0^T\int _\Omega (\rho _\tau -\pi _\tau \rho _\tau )\Psi _1 dx dt \ge \frac{1}{2}\int \nolimits _\Omega |\nabla \Psi _1(T)|^2 dx - \frac{1}{2}\int \nolimits _\Omega |\nabla \Psi _1(0)|^2 dx. \end{aligned}$$

In a similar way, we have

$$\begin{aligned} \frac{1}{\tau }\int \nolimits _0^T\int _\Omega (E_\tau -\pi _\tau E_\tau )\Psi _2 dx dt \ge \frac{1}{2}\int \nolimits _\Omega |\nabla \Psi _2(T)|^2 dx - \frac{1}{2}\int \nolimits _\Omega |\nabla \Psi _2(0)|^2 dx. \end{aligned}$$

Inserting these inequalities into (36) and (37), respectively, and adding both inequalities, we find that

$$\begin{aligned} \frac{1}{2}&\int \nolimits _\Omega |\nabla \Psi _1(T)|^2 dx + \frac{1}{2}\int \nolimits _\Omega |\nabla \Psi _2(T)|^2 dx + \frac{\delta }{3}\int \nolimits _0^T\int _\Omega \theta _\tau ^3 E_\tau dxdt \nonumber \\&\quad + \int \nolimits _0^T\int _\Omega \rho _\tau ^2\theta _\tau dxdt + \int \nolimits _0^T\int _\Omega \bigg (\theta _\tau +\frac{5}{2}\rho _\tau \theta _\tau ^2\bigg ) \bigg (\theta _\tau +\frac{3}{2}\rho _\tau \theta _\tau \bigg ) dxdt \nonumber \\&\le \frac{1}{2}\int \nolimits _\Omega |\nabla \Psi _1(0)|^2 dx + \frac{1}{2}\int \nolimits _\Omega |\nabla \Psi _2(0)|^2 dx - \delta \int \nolimits _0^T\int _\Omega \phi _\tau \bigg (\rho _\tau - \fint _\Omega \rho _\tau dx +\Psi _1\bigg ) dxdt \nonumber \\&\quad + \frac{3}{2}\int \nolimits _0^T\bigg (\int \nolimits _\Omega \rho _\tau \theta _\tau dx\bigg ) \bigg (\fint _\Omega \rho _\tau dx \bigg ) dt + \int \nolimits _0^T \int _\Omega \bigg (\theta _\tau +\frac{5}{2}\rho _\tau \theta _\tau ^2 \bigg )dx\bigg (\fint _\Omega E_\tau dx \bigg ) dt \nonumber \\&\quad + \frac{\delta }{3}\int \nolimits _0^T\int _\Omega \theta _\tau ^3 \bigg (\fint _\Omega E_\tau dx \bigg ) dxdt - \delta \int \nolimits _0^T\int _\Omega \theta _\tau ^{-N}\log (\theta _\tau )\Psi _2 dx dt \nonumber \\&=: J_1+\cdots +J_7. \end{aligned}$$
(38)

We start with the last integral. It follows from (35) that

$$\begin{aligned} J_7 \le \delta \Vert \theta _\tau ^{-N}\log \theta _\tau \Vert _{L^{6/5}(\Omega _T)} \Vert \Psi _2\Vert _{L^6(\Omega _T)} \le \delta C\Vert \theta _\tau ^{-N}\log \theta _\tau \Vert _{L^{6/5}(\Omega _T)} \big (1+\Vert E_\tau \Vert _{L^2(\Omega _T)}^{1/3}\big ). \end{aligned}$$

The first norm is estimated according to

$$\begin{aligned} \Vert \theta _\tau ^{-N}\log \theta _\tau \Vert _{L^{6/5}(\Omega _T)}^{6/5}&= \int \nolimits _0^T\int _\Omega \theta _\tau ^{-6N/5}|\log \theta _\tau |^{6/5}dxdt \\&\le C + \int \nolimits _0^T\int _{\Omega \cap \{\theta _\tau (t)<1\}} \theta _\tau ^{-6N/5}|\log \theta _\tau |^{6/5}dxdt \\&\le C + C\int \nolimits _0^T\int _{\Omega \cap \{\theta _\tau (t)<1\}} \theta _\tau ^{-(N+1)}dxdt, \end{aligned}$$

where the last inequality follows from the condition \(N<5\) (and hence \(6N/5<N+1\)). Because of (33), this leads to

$$\begin{aligned} \delta \Vert \theta _\tau ^{-N}\log \theta _\tau \Vert _{L^{6/5}(\Omega _T)} \le C\delta ^{1/6}. \end{aligned}$$
(39)
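The power \(\delta ^{1/6}\) arises as follows. As a sketch, assuming \(\delta \le 1\), the previous estimate and (33) give

$$\begin{aligned} \Vert \theta _\tau ^{-N}\log \theta _\tau \Vert _{L^{6/5}(\Omega _T)}^{6/5} \le C + C\delta ^{-1} \le C\delta ^{-1}, \quad \text{ so } \text{ that } \quad \delta \Vert \theta _\tau ^{-N}\log \theta _\tau \Vert _{L^{6/5}(\Omega _T)} \le C\delta \cdot \delta ^{-5/6} = C\delta ^{1/6}. \end{aligned}$$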

Therefore, we infer that

$$\begin{aligned} J_7 \le \delta ^{1/6}C\big (1+\Vert E_\tau \Vert _{L^2(\Omega _T)}^{1/3}\big ). \end{aligned}$$

Since \(E_\tau =\theta _\tau +\frac{3}{2}\rho _\tau \theta _\tau \), the right-hand side can be controlled (for sufficiently small \(\delta >0\)) by the last two integrals on the left-hand side of (38).

Next, we consider the following term appearing in \(J_3\):

$$\begin{aligned} -\delta \int \nolimits _0^T\int _\Omega \phi _\tau \rho _\tau dx dt&= -\delta \int \nolimits _0^T\int _\Omega \bigg (\log (\rho _\tau \theta _\tau ^{-3/2}) + \frac{5}{2}\bigg )\rho _\tau dx dt\\&\le -\delta \int \nolimits _0^T\int _\Omega \theta _\tau ^{3/2}\cdot \rho _\tau \theta _\tau ^{-3/2} \log (\rho _\tau \theta _\tau ^{-3/2}) dx dt + C \\&\le \delta C\int \nolimits _0^T\int _\Omega \theta _\tau ^{3/2} dx dt + C, \end{aligned}$$

where the last inequality follows from the fact that \(z\mapsto z\log z\) is bounded from below. Furthermore, we deduce from Lemma 2, bound (33) for \(\phi _\tau \), and the Poincaré–Wirtinger inequality that

$$\begin{aligned} \delta \int \nolimits _0^T\int _\Omega \phi _\tau \bigg (\fint _\Omega \rho _\tau dx\bigg )dxdt&\le \delta \Vert \phi _\tau \Vert _{L^1(\Omega _T)}\Vert \rho _\tau \Vert _{L^\infty (0,T;L^1(\Omega ))} \le C, \\ \delta \int \nolimits _0^T\int _\Omega \phi _\tau \Psi _1 dxdt&\le \frac{\delta }{2}\int \nolimits _0^T\int _\Omega \phi _\tau ^2 dxdt + \frac{\delta }{2}\int \nolimits _0^T\int _\Omega \Psi _1^2 dxdt \\&\le C + \delta C\int \nolimits _0^T\int _\Omega |\nabla \Psi _1|^2 dxdt. \end{aligned}$$

This shows that

$$\begin{aligned} J_3 \le C + \delta C \int \nolimits _0^T\int _\Omega \theta _\tau ^{3/2} dx dt + \delta C\int \nolimits _0^T\int _\Omega |\nabla \Psi _1|^2 dxdt. \end{aligned}$$

The first integral on the right-hand side can be controlled by the last integral on the left-hand side of (38). The last integral on the right-hand side is controlled after applying Gronwall’s inequality. The integrals \(J_4\), \(J_5\), and \(J_6\) can be controlled by the expressions on the left-hand side of (38). We conclude that

$$\begin{aligned} \int \nolimits _\Omega&\big (|\nabla \Psi _1(T)|^2 + |\nabla \Psi _2(T)|^2\big )dx \\&{}+ \int \nolimits _0^T\int _\Omega \big (\theta _\tau ^2 + \delta \theta _\tau ^4 + \rho _\tau \theta _\tau ^2(1+\theta _\tau ) + \rho _\tau ^2\theta _\tau (1+\theta _\tau ^2) \big )dxdt \le C\exp (\delta CT). \end{aligned}$$

We deduce from this estimate and Young’s inequality that

$$\begin{aligned} \Vert \rho _\tau \theta _\tau \Vert _{L^2(\Omega _T)}^2&\le \frac{1}{2}\int \nolimits _0^T\int _\Omega \rho _\tau ^2(\theta _\tau +\theta _\tau ^3) dxdt \le C, \\ \Vert \rho _\tau \theta _\tau ^2\Vert _{L^{3/2}(\Omega _T)}^{3/2}&\le \frac{1}{2}\int \nolimits _0^T\int _\Omega (\rho _\tau +\rho _\tau ^2)\theta _\tau ^3 dxdt \le C. \end{aligned}$$
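The pointwise form of Young's inequality behind these two bounds can be sketched as follows, splitting each integrand into a product of two factors:

$$\begin{aligned} \rho _\tau ^2\theta _\tau ^2&= \big (\rho _\tau \theta _\tau ^{1/2}\big )\big (\rho _\tau \theta _\tau ^{3/2}\big ) \le \frac{1}{2}\big (\rho _\tau ^2\theta _\tau + \rho _\tau ^2\theta _\tau ^3\big ), \\ \rho _\tau ^{3/2}\theta _\tau ^3&= \big (\rho _\tau ^{1/2}\theta _\tau ^{3/2}\big )\big (\rho _\tau \theta _\tau ^{3/2}\big ) \le \frac{1}{2}\big (\rho _\tau \theta _\tau ^3 + \rho _\tau ^2\theta _\tau ^3\big ). \end{aligned}$$

These are the cases \((\alpha ,\beta )=(2,2)\) and \((\alpha ,\beta )=(\frac{3}{2},3)\) of the lemma.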

This proves the lemma. \(\square \)

Step 3: Strong convergence of \((\rho _\tau )\) and \((\theta _\tau )\). First, we prove a gradient bound for the particle density.

Lemma 4

(Gradient estimate). There exist \(N\in (0,5)\), \(m\in (\frac{1}{2},1)\), and \(\alpha \in (\frac{2}{3},1)\) such that

$$\begin{aligned} \Vert \rho _\tau ^m\Vert _{L^p(0,T;W^{1,q}(\Omega ))} \le C(\delta ), \end{aligned}$$

where \(C(\delta )>0\) does not depend on \(\tau \), \(p\ge 1/m\), and \(3q/(3-q)>1/m\) (or equivalently, \(q>3/(3m+1)\)). Moreover, with a constant \(C>0\) independent of \(\tau \) and \(\delta \),

$$\begin{aligned} \Vert E_\tau \Vert _{L^1(0,T;W^{1,1}(\Omega ))} \le C. \end{aligned}$$

The condition \(q>3/(3m+1)\) guarantees that \(W^{1,q}(\Omega )\hookrightarrow L^{1/m}(\Omega )\). This is needed below for the application of the nonlinear Aubin–Lions lemma.

Proof

It follows from Lemma 3 that \((\rho _\tau \theta _\tau ^{1/2})\) is bounded in \(L^2(\Omega _T)\), while estimate (33) implies that \((\theta _\tau ^{-1/2})\) is bounded in \(L^{2(N+1)}(\Omega _T)\). Consequently, \(\rho _\tau = \rho _\tau \theta _\tau ^{1/2}\theta _\tau ^{-1/2}\) is uniformly bounded in \(L^{r}(\Omega _T)\), where \(r:=2(N+1)/(N+2)>1\). Together with the \(L^\infty (0,T;L^1(\Omega ))\) bound for \((\rho _\tau )\), an interpolation with \(1/c=(1-\alpha )/1 + \alpha /r\) and \(b\ge 1\) gives

$$\begin{aligned} \Vert \rho _\tau \Vert _{L^b(0,T;L^c(\Omega ))}^b \le \Vert \rho _\tau \Vert _{L^\infty (0,T;L^1(\Omega ))}^{(1-\alpha )b} \int \nolimits _0^T\Vert \rho _\tau \Vert _{L^r(\Omega )}^{\alpha b}dt \le C\int \nolimits _0^T\Vert \rho _\tau \Vert _{L^r(\Omega )}^{\alpha b}dt. \end{aligned}$$

A simple computation shows that \(c=r/(\alpha +(1-\alpha )r)\). We choose \(b=r/\alpha \) and use the \(L^r(\Omega _T)\) bound for \((\rho _\tau )\):

$$\begin{aligned} \Vert \rho _\tau \Vert _{L^{r/\alpha }(0,T;L^{r/(\alpha +(1-\alpha )r)}(\Omega ))} \le C \quad \text{ for } r=\frac{2(N+1)}{N+2},\ \alpha \in (0,1). \end{aligned}$$

Let \(\frac{1}{2}<m<1\). Then

$$\begin{aligned} \Vert \rho _\tau ^m\Vert _{L^{r/(\alpha m)}(0,T;L^{r/(m(\alpha +(1-\alpha )r))}(\Omega ))}\le C. \end{aligned}$$

We know from (32) and (33) that \(\nabla \log \rho _\tau = \nabla \phi _\tau + \frac{3}{2}\nabla \log \theta _\tau \) is uniformly bounded in \(L^2(\Omega _T)\) (but not uniformly in \(\delta \)). It follows that \(\nabla \rho _\tau ^m = m\rho _\tau ^m\nabla \log \rho _\tau \) is uniformly bounded in \(L^p(0,T;L^q(\Omega ))\), where \(p,q\ge 1\) satisfy

$$\begin{aligned} \frac{1}{p} = \frac{1}{2} + \frac{\alpha m}{r}, \quad \frac{1}{q} = \frac{1}{2} + \frac{m}{r}(\alpha +(1-\alpha )r). \end{aligned}$$
(40)

We deduce from the Poincaré–Wirtinger inequality and the \(L^\infty (0,T;L^1(\Omega ))\) bound for \((\rho _\tau )\) that

$$\begin{aligned} \Vert \rho _\tau ^m\Vert _{L^p(0,T;L^q(\Omega ))} \le C\Vert \nabla \rho _\tau ^m\Vert _{L^p(0,T;L^q(\Omega ))} + C\Vert \rho _\tau ^m\Vert _{L^p(0,T;L^1(\Omega ))} \le C(\delta ). \end{aligned}$$

We claim that there exist \(N\in (0,5)\), \(m\in (\frac{1}{2},1)\), and \(\alpha \in (0,1)\) such that

$$\begin{aligned} p\ge \frac{1}{m}, \quad \frac{3q}{3-q} > \frac{1}{m}, \end{aligned}$$

where p and q are given by (40). A straightforward computation shows that these inequalities are equivalent to

$$\begin{aligned} r\ge \frac{2\alpha m}{2m-1}, \quad \frac{r}{r-1} < 6\alpha m. \end{aligned}$$
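For the reader's convenience, the computation can be sketched as follows. Inserting (40) and using \(m>\frac{1}{2}\), \(r>1\), and \(q<3\), we have

$$\begin{aligned} p\ge \frac{1}{m}&\iff \frac{1}{2}+\frac{\alpha m}{r}\le m \iff \frac{\alpha m}{r}\le \frac{2m-1}{2} \iff r\ge \frac{2\alpha m}{2m-1}, \\ \frac{3q}{3-q}>\frac{1}{m}&\iff \frac{1}{q}< m+\frac{1}{3} \iff \frac{1}{2}+\frac{\alpha m}{r}+(1-\alpha )m < m+\frac{1}{3} \\&\iff \alpha m\bigg (1-\frac{1}{r}\bigg )>\frac{1}{6} \iff \frac{r}{r-1}<6\alpha m. \end{aligned}$$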

We choose \(r=2\alpha m/(2m-1)\) (recall that \(m>1/2\)) such that the first inequality is satisfied. With this choice, the second inequality is equivalent to \(m<1/(3(1-\alpha ))\). Since we want \(m<1\), we need to choose \(\alpha >2/3\). Then \(\frac{1}{2}<m<1<1/(3(1-\alpha ))\). By definition of r,

$$\begin{aligned} \frac{2(N+1)}{N+2} = r = \frac{2\alpha m}{2m-1}. \end{aligned}$$
(41)

Thus, it remains to prove that \(N\in (0,5)\) can be chosen such that this identity holds for some \(\alpha >\frac{2}{3}\) and \(m\in (\frac{1}{2},1)\). Equation (41) is equivalent to

$$\begin{aligned} N = -\frac{2\alpha m-2m+1}{\alpha m-2m+1}, \end{aligned}$$

and the requirement \(N<5\) gives \(m>6/(12-7\alpha )\). The right-hand side is smaller than one if \(\alpha <\frac{6}{7}\). This is compatible with the previous constraint \(\alpha >\frac{2}{3}\) and proves the claim.
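The formula for N and the resulting lower bound on m can be checked by elementary algebra. As a sketch, solving (41) for N and inserting \(r=2\alpha m/(2m-1)\) gives

$$\begin{aligned} N = \frac{2(r-1)}{2-r}, \quad r-1 = \frac{2\alpha m-2m+1}{2m-1}, \quad 2-r = \frac{2(2m-1-\alpha m)}{2m-1}, \quad \text{ hence } \quad N = \frac{2\alpha m-2m+1}{2m-1-\alpha m}. \end{aligned}$$

Since \(r<2\) for any \(N>0\), the denominator \(2m-1-\alpha m\) is positive, and therefore

$$\begin{aligned} N<5 \iff 2\alpha m-2m+1 < 5(2m-1-\alpha m) \iff (12-7\alpha )m>6 \iff m>\frac{6}{12-7\alpha }. \end{aligned}$$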

To finish the proof of the lemma, we observe that (31) and Lemma 3 imply that

$$\begin{aligned} \Vert \nabla (\rho _\tau \theta _\tau )\Vert _{L^{4/3}(\Omega _T)} \le 2\Vert \sqrt{\rho _\tau \theta _\tau }\Vert _{L^4(\Omega _T)} \Vert \nabla \sqrt{\rho _\tau \theta _\tau }\Vert _{L^2(\Omega _T)} \le C. \end{aligned}$$
(42)

Moreover, we deduce from (32) and Lemma 3 that

$$\begin{aligned} \Vert \nabla \theta _\tau \Vert _{L^1(\Omega _T)} \le \Vert \theta _\tau \Vert _{L^2(\Omega _T)}\Vert \nabla \log \theta _\tau \Vert _{L^2(\Omega _T)} \le C. \end{aligned}$$

Thus, \((E_\tau )\) is bounded in \(L^1(0,T;W^{1,1}(\Omega ))\), and the proof is finished. \(\square \)

Lemma 5

(Bounds for the discrete time derivative). There exists a constant \(C>0\) which does not depend on \(\tau \) such that

$$\begin{aligned} \tau ^{-1}\Vert \rho _\tau -\pi _\tau \rho _\tau \Vert _{L^{4/3}(0,T;W^{1,4}(\Omega )')} \le C, \quad \tau ^{-1}\Vert E_\tau -\pi _\tau E_\tau \Vert _{L^{6/5}(0,T;W^{2,4}(\Omega )')} \le C. \end{aligned}$$

Proof

We infer from (42) and (33) that

$$\begin{aligned} \tau ^{-1}\Vert \rho _\tau -\pi _\tau \rho _\tau \Vert _{L^{4/3}(0,T;W^{1,4}(\Omega )')}&= \sup _{\Vert \psi _1\Vert _{L^4(0,T;W^{1,4}(\Omega ))}=1} \bigg |\tau ^{-1}\int \nolimits _0^T\int \nolimits _\Omega (\rho _\tau -\pi _\tau \rho _\tau )\psi _1 dxdt\bigg | \\&\le \frac{3}{2}\Vert \nabla (\rho _\tau \theta _\tau )\Vert _{L^{4/3}(\Omega _T)} + \delta \Vert \phi _\tau \Vert _{L^{4/3}(\Omega _T)} \le C. \end{aligned}$$

Furthermore,

$$\begin{aligned} \tau ^{-1}\Vert E_\tau -\pi _\tau E_\tau \Vert _{L^{6/5}(0,T;W^{2,4}(\Omega )')}&\le \Vert \theta _\tau \Vert _{L^2(\Omega _T)} + \frac{15}{4}\Vert \rho _\tau \theta _\tau ^2\Vert _{L^{3/2}(\Omega _T)} \\&\quad + \frac{\delta }{3}\Vert \theta _\tau ^3\Vert _{L^{4/3}(\Omega _T)} + \delta \Vert \theta _\tau ^{-N}\log \theta _\tau \Vert _{L^{6/5}(\Omega _T)}. \end{aligned}$$

Taking into account Lemma 3, the first three terms on the right-hand side are uniformly bounded. Since \(N<5\), the last term can be estimated from above by \(\delta \Vert \theta _\tau ^{-(N+1)}\Vert _{L^1(\Omega _T)}^{5/6}\) which is bounded because of (33). This finishes the proof. \(\square \)
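
The role of the condition \(N<5\) in the last estimate can be made explicit. The following is a sketch; the auxiliary exponent \(\varepsilon \) and the splitting into \(\{\theta _\tau \le 1\}\) and \(\{\theta _\tau >1\}\) are our own bookkeeping, and the bound is understood up to an additive constant. Set \(\varepsilon =(5-N)/6>0\), so that \(\frac{6}{5}(N+\varepsilon )=N+1\), and use \(|\log s|\le C_\varepsilon s^{-\varepsilon }\) for \(0<s\le 1\) as well as the boundedness of \(s^{-N}\log s\) for \(s>1\) (recall that \(N>0\)):

$$\begin{aligned} \big |\theta _\tau ^{-N}\log \theta _\tau \big |^{6/5} \le C\theta _\tau ^{-\frac{6}{5}(N+\varepsilon )} = C\theta _\tau ^{-(N+1)} \ \text{ on } \{\theta _\tau \le 1\}, \qquad \big |\theta _\tau ^{-N}\log \theta _\tau \big |^{6/5} \le C \ \text{ on } \{\theta _\tau >1\}. \end{aligned}$$

Integrating over \(\Omega _T\) gives \(\Vert \theta _\tau ^{-N}\log \theta _\tau \Vert _{L^{6/5}(\Omega _T)}^{6/5}\le C\Vert \theta _\tau ^{-(N+1)}\Vert _{L^1(\Omega _T)}+C\).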

Lemmas 4 and 5 allow us to apply the Aubin–Lions lemma in the version of [7, Theorem 3]. This is possible since \(p\ge 1/m\) and \(W^{1,q}(\Omega )\hookrightarrow L^{1/m}(\Omega )\) (the last fact is a consequence of \(q>3/(3m+1)\)). We infer the existence of a subsequence which is not relabeled such that, as \(\tau \rightarrow 0\),

$$\begin{aligned} \rho _\tau \rightarrow \rho \quad \text{ strongly } \text{ in } L^1(\Omega _T). \end{aligned}$$

Concerning \((E_\tau )\), Lemmas 4 and 5 allow us to apply the Aubin–Lions lemma in the version of [9] (or Theorem 3 in [7] with \(m=1\)) to obtain a subsequence of \((E_\tau )\) (not relabeled) such that, as \(\tau \rightarrow 0\),

$$\begin{aligned} E_\tau \rightarrow E\quad \text{ strongly } \text{ in } L^1(\Omega _T). \end{aligned}$$

In fact, because of the \(L^2(\Omega _T)\) bound for \((E_\tau )\) from Lemma 3, this convergence holds in \(L^\eta (\Omega _T)\) for any \(\eta <2\). Up to subsequences, we know that \(\rho _\tau \rightarrow \rho \) and \(E_\tau \rightarrow E\) a.e. in \(\Omega _T\). Thus,

$$\begin{aligned} \theta _\tau = \frac{E_\tau }{1+3\rho _\tau /2} \rightarrow \frac{E}{1+3\rho /2} =: \theta \quad \text{ a.e. } \text{ in } \Omega _T. \end{aligned}$$

In particular, \(E=\theta +\frac{3}{2}\rho \theta \). The bound for \((\theta _\tau )\) in \(L^4(\Omega _T)\) (not uniform in \(\delta \)) shows that the previous convergence holds in \(L^\eta (\Omega _T)\) for any \(\eta <4\). We deduce from the \(L^2(\Omega _T)\) bounds for \(\log \theta _\tau \) and \(\log \rho _\tau =\phi _\tau +\frac{3}{2}\log \theta _\tau -\frac{5}{2}\) that \(\log \theta \) and \(\log \rho \) are integrable and thus, \(\rho >0\), \(\theta >0\) a.e. in \(\Omega _T\). Furthermore, \(\phi _\tau \rightarrow \log \rho -\frac{3}{2}\log \theta +\frac{5}{2}=:\phi \) a.e. in \(\Omega _T\) and, because of (33), weakly in \(L^2(0,T;H^1(\Omega ))\).

The previous bounds and the strong convergences of \((\rho _\tau )\) and \((\theta _\tau )\) allow us to pass to the limit \(\tau \rightarrow 0\) in (28)–(29). For this, we observe that, by (42),

$$\begin{aligned} \nabla (\rho _\tau \theta _\tau )\rightharpoonup \nabla (\rho \theta )\quad \text{ weakly } \text{ in } L^{4/3}(\Omega _T). \end{aligned}$$

Furthermore, by Lemma 3,

$$\begin{aligned} \rho _\tau \theta _\tau ^2\rightarrow \rho \theta ^2\quad \text{ strongly } \text{ in } L^\eta (\Omega _T),\ \eta <\frac{3}{2}. \end{aligned}$$

The strong convergence of \((\theta _\tau )\) to \(\theta \), the uniform bounds on \((\theta _\tau )\), and the a.e. positivity of \(\theta \) imply that

$$\begin{aligned} \theta _\tau ^3\rightarrow \theta ^3,\quad \theta _\tau ^{-N}\log \theta _\tau \rightarrow \theta ^{-N}\log \theta \quad \text{ strongly } \text{ in } L^1(\Omega _T). \end{aligned}$$

Finally, by Lemma 5,

$$\begin{aligned} \tau ^{-1}(\rho _\tau -\pi _\tau \rho _\tau )\rightharpoonup \partial _t\rho&\quad \text{ weakly } \text{ in } L^{4/3}(0,T;W^{1,4}(\Omega )'), \\ \tau ^{-1}(E_\tau -\pi _\tau E_\tau )\rightharpoonup \partial _t E&\quad \text{ weakly } \text{ in } L^{6/5}(0,T;W^{2,4}(\Omega )'). \end{aligned}$$

Then (28)–(29) become in the limit \(\tau \rightarrow 0\),

$$\begin{aligned} 0&= \int \nolimits _0^T\langle \partial _t\rho ,\psi _1\rangle dt + \int \nolimits _0^T\int \nolimits _\Omega \nabla (\rho \theta )\cdot \nabla \psi _1 dxdt + \delta \int \nolimits _0^T\int \nolimits _\Omega (\nabla \phi \cdot \nabla \psi _1+\phi \psi _1)dxdt, \end{aligned}$$
(43)
$$\begin{aligned} 0&= \int \nolimits _0^T\langle \partial _t E,\psi _2\rangle dt - \int \nolimits _0^T\int \nolimits _\Omega \bigg (\theta +\frac{5}{2}\rho \theta ^2\bigg )\Delta \psi _2 dxdt \nonumber \\&\quad - \frac{\delta }{3}\int \nolimits _0^T\int \nolimits _\Omega \theta ^3\Delta \psi _2 dxdt + \delta \int \nolimits _0^T\int \nolimits _\Omega \theta ^{-N}\log (\theta )\psi _2 dxdt \end{aligned}$$
(44)

for any test functions \(\psi _1\), \(\psi _2\in C_0^2(\Omega _T)\).

3.5 Limit \(\delta \rightarrow 0\)

In this subsection, we need some tools from mathematical fluid dynamics, in particular the concept of renormalized solutions and the div-curl lemma. In the following, we denote by \(\overline{u_\delta }\) the weak or distributional limit of a sequence \((u_\delta )\) whenever it exists. Let \((\rho _\delta , E_\delta )\) be a weak solution to (43)–(44) and set \(\phi _\delta =\log (\rho _\delta /\theta _\delta ^{3/2})+\frac{5}{2}\), \(E_\delta =\theta _\delta +\frac{3}{2}\rho _\delta \theta _\delta \).

Step 1: Renormalized mass balance equation. We compute the renormalized form of (43). Let \(f\in C^2([0,\infty )) \cap L^\infty (0,\infty )\) satisfy \(|f'(s)|\le C(1+s)^{-1}\) and \(|f''(s)|\le C(1+s)^{-2}\) for \(s\ge 0\). Furthermore, let \(\xi \in C_0^\infty (\Omega _T)\). Choosing \(\psi _1=f'(\rho _\delta )\xi \) in (43), we find that

$$\begin{aligned} \int \nolimits _0^T&\langle \partial _t f(\rho _\delta ), \xi \rangle dt + \int \nolimits _0^T\int \nolimits _\Omega \big ( f'(\rho _\delta )\nabla (\rho _\delta \theta _\delta ) + \delta f'(\rho _\delta )\nabla \phi _\delta \big ) \cdot \nabla \xi dx dt\\&= - \int \nolimits _0^T\int \nolimits _\Omega \big (\delta f'(\rho _\delta ) \phi _\delta + f''(\rho _\delta )\nabla \rho _\delta \cdot \nabla (\rho _\delta \theta _\delta ) + \delta f''(\rho _\delta )\nabla \rho _\delta \cdot \nabla \phi _\delta \big )\xi dx dt. \end{aligned}$$

This computation can be made rigorous (such that \(\psi _1\) is an admissible test function) by using renormalization techniques; see, e.g., [13, Section 10.18]. The previous equation can be rewritten as

$$\begin{aligned} -\partial _t&f(\rho _\delta ) + {\text {div}}\big (f'(\rho _\delta )\nabla (\rho _\delta \theta _\delta ) + \delta f'(\rho _\delta )\nabla \phi _\delta \big ) \nonumber \\&= \delta f'(\rho _\delta ) \phi _\delta + f''(\rho _\delta )\nabla \rho _\delta \cdot \nabla (\rho _\delta \theta _\delta ) + \delta f''(\rho _\delta )\nabla \rho _\delta \cdot \nabla \phi _\delta \quad \text{ in } {\mathcal {D}}'(\Omega _T). \end{aligned}$$
(45)
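
At the formal level, the terms in (45) come from the product and chain rules applied to the test function \(\psi _1=f'(\rho _\delta )\xi \). As a sketch, assuming enough smoothness to differentiate:

$$\begin{aligned} \nabla \big (f'(\rho _\delta )\xi \big )&= f'(\rho _\delta )\nabla \xi + f''(\rho _\delta )\xi \nabla \rho _\delta , \qquad f'(\rho _\delta )\partial _t\rho _\delta = \partial _t f(\rho _\delta ), \\ \nabla (\rho _\delta \theta _\delta )\cdot \nabla \big (f'(\rho _\delta )\xi \big )&= f'(\rho _\delta )\nabla (\rho _\delta \theta _\delta )\cdot \nabla \xi + f''(\rho _\delta )\nabla \rho _\delta \cdot \nabla (\rho _\delta \theta _\delta )\,\xi . \end{aligned}$$

The \(\delta \)-terms are obtained in the same way with \(\phi _\delta \) in place of \(\rho _\delta \theta _\delta \).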

Step 2: Application of the div-curl lemma. We apply the div-curl lemma to the vector fields

$$\begin{aligned} U_\delta = \big ( f(\rho _\delta ), -f'(\rho _\delta ) \nabla (\rho _\delta \theta _\delta ) - \delta f'(\rho _\delta )\nabla \phi _\delta \big ),\quad V_\delta = \big ( g(\theta _\delta ) , 0,0,0 \big ), \end{aligned}$$

where f is as before and \(g\in C^1([0,\infty ))\cap L^\infty (0,\infty )\) satisfies \(|g'(s)|\le C(1+s)^{-1}\) for \(s>0\). We know from (31) that \((\nabla \sqrt{\rho _\delta \theta _\delta })\) and \((\sqrt{\delta }\nabla \phi _\delta )\) are bounded in \(L^2(\Omega _T)\) and from Lemma 3 that \((\sqrt{\rho _\delta \theta _\delta })\) is bounded in \(L^4(\Omega _T)\). Consequently,

$$\begin{aligned} f'(\rho _\delta )\nabla (\rho _\delta \theta _\delta ) + \delta f'(\rho _\delta )\nabla \phi _\delta = 2f'(\rho _\delta )\sqrt{\rho _\delta \theta _\delta } \nabla \sqrt{\rho _\delta \theta _\delta }+\delta f'(\rho _\delta )\nabla \phi _\delta \end{aligned}$$

is uniformly bounded in \(L^{4/3}(\Omega _T)\). Thus, \((U_\delta )\) is bounded in \(L^{4/3}(\Omega _T)\). Because of the properties of g, \((V_\delta )\) is trivially bounded in \(L^\infty (\Omega _T)\).
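
The \(L^{4/3}(\Omega _T)\) bound is an instance of Hölder's inequality with \(\frac{1}{4}+\frac{1}{2}=\frac{3}{4}\), combined with the boundedness of \(f'\) and of the domain \(\Omega _T\). A sketch for \(0<\delta <1\), where C depends only on the measure of \(\Omega _T\):

$$\begin{aligned} \Vert f'(\rho _\delta )\nabla (\rho _\delta \theta _\delta )\Vert _{L^{4/3}(\Omega _T)}&\le 2\Vert f'\Vert _{L^\infty (0,\infty )} \Vert \sqrt{\rho _\delta \theta _\delta }\Vert _{L^4(\Omega _T)} \Vert \nabla \sqrt{\rho _\delta \theta _\delta }\Vert _{L^2(\Omega _T)}, \\ \Vert \delta f'(\rho _\delta )\nabla \phi _\delta \Vert _{L^{4/3}(\Omega _T)}&\le C\sqrt{\delta }\,\Vert f'\Vert _{L^\infty (0,\infty )} \Vert \sqrt{\delta }\nabla \phi _\delta \Vert _{L^2(\Omega _T)}. \end{aligned}$$

The first component \(f(\rho _\delta )\) of \(U_\delta \) is bounded since \(f\in L^\infty (0,\infty )\).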

The left-hand side of (45) equals \(-{\text {div}}_{(t,x)}U_\delta \). We wish to bound the right-hand side of (45). For this, we observe that, thanks to (33), the first term \(\delta f'(\rho _\delta )\phi _\delta \) is uniformly bounded in \(L^2(\Omega _T)\). We rewrite the second term as

$$\begin{aligned} f''(\rho _\delta )\nabla \rho _\delta \cdot \nabla (\rho _\delta \theta _\delta ) = 4\rho _\delta f''(\rho _\delta )\sqrt{\theta _\delta }\nabla \sqrt{\rho _\delta } \cdot \nabla \sqrt{\rho _\delta \theta _\delta }. \end{aligned}$$
(46)
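
Identity (46) follows from the elementary relations for square roots. As a short check, assuming \(\rho _\delta >0\) and \(\theta _\delta >0\):

$$\begin{aligned} \nabla \rho _\delta = 2\sqrt{\rho _\delta }\nabla \sqrt{\rho _\delta }, \quad \nabla (\rho _\delta \theta _\delta ) = 2\sqrt{\rho _\delta \theta _\delta }\nabla \sqrt{\rho _\delta \theta _\delta }, \quad \text{ so } \quad \nabla \rho _\delta \cdot \nabla (\rho _\delta \theta _\delta ) = 4\rho _\delta \sqrt{\theta _\delta }\,\nabla \sqrt{\rho _\delta }\cdot \nabla \sqrt{\rho _\delta \theta _\delta }. \end{aligned}$$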

Since \(\rho _\delta |f''(\rho _\delta )|\le C\rho _\delta /(1+\rho _\delta )^2\le C\) and \((\sqrt{\theta _\delta }\nabla \sqrt{\rho _\delta })\), \((\nabla \sqrt{\rho _\delta \theta _\delta })\) are bounded in \(L^2(\Omega _T)\) by (31), expression (46) is bounded in \(L^1(\Omega _T)\). In order to bound the last term in (45), we observe that, by (32) and (33),

$$\begin{aligned} \sqrt{\delta }\nabla \log \rho _\delta = \sqrt{\delta }\nabla \phi _\delta + \frac{3}{2}\sqrt{\delta }\nabla \log \theta _\delta \end{aligned}$$

is uniformly bounded in \(L^2(\Omega _T)\). Then

$$\begin{aligned} \delta f''(\rho _\delta )\nabla \rho _\delta \cdot \nabla \phi _\delta = f''(\rho _\delta )\rho _\delta (\sqrt{\delta }\nabla \log \rho _\delta )\cdot (\sqrt{\delta }\nabla \phi _\delta ) \end{aligned}$$

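is uniformly bounded in \(L^1(\Omega _T)\). We infer that the right-hand side of (45) and consequently also \(-{\text {div}}_{(t,x)}U_\delta \) are uniformly bounded in \(L^1(\Omega _T)\). Since \(\Omega _T\subset {{\mathbb {R}}}^4\) is bounded, \(L^1(\Omega _T)\) is compactly embedded into \(W^{-1,r}(\Omega _T)\) for \(1<r<4/3\) (by duality with the compact Sobolev embedding \(W_0^{1,r'}(\Omega _T)\hookrightarrow C^0(\overline{\Omega }_T)\) for \(r'>4\)). It follows that \({\text {div}}_{(t,x)}U_\delta \) is relatively compact in \(W^{-1,r}(\Omega _T)\) for some \(r>1\).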

It follows from the uniform \(L^2(\Omega _T)\) bound for \((\nabla \log \theta _\delta )\) (see (32)) and \(\theta _\delta |g'(\theta _\delta )|\le C\theta _\delta /(1+\theta _\delta )\le C\) that

$$\begin{aligned} {\text {curl}}_{(t,x)}V_\delta = g'(\theta _\delta )\begin{pmatrix} 0 &{} (\nabla \theta _\delta )^T \\ \nabla \theta _\delta &{} 0 \end{pmatrix} = \theta _\delta g'(\theta _\delta )\begin{pmatrix} 0 &{} (\nabla \log \theta _\delta )^T \\ \nabla \log \theta _\delta &{} 0 \end{pmatrix} \end{aligned}$$

is uniformly bounded in \(L^2(\Omega _T)\). By Sobolev’s embedding, this expression is relatively compact in \(W^{-1,r}(\Omega _T;{{\mathbb {R}}}^{4\times 4})\) for some \(r>1\).

The div-curl lemma [13, Theorem 10.21] implies that \(\overline{U_\delta \cdot V_\delta }=\overline{U_\delta }\cdot \overline{V_\delta }\) a.e. in \(\Omega _T\), which means that

$$\begin{aligned} \overline{f(\rho _\delta )g(\theta _\delta )} = \overline{f(\rho _\delta )}\, \overline{g(\theta _\delta )}\quad \text{ a.e. } \text{ in } \Omega _T \end{aligned}$$
(47)

for all \(f\in C^2([0,\infty ))\cap L^\infty (0,\infty )\) and \(g\in C^1([0,\infty )) \cap L^\infty (0,\infty )\) satisfying \(|f'(s)|\le C(1+s)^{-1}\), \(|f''(s)|\le C(1+s)^{-2}\), and \(|g'(s)|\le C(1+s)^{-1}\) for \(s>0\).
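
The exponents entering the div-curl lemma can be recorded explicitly. In a common \(L^p\)–\(L^q\) formulation of the lemma (we assume here that [13, Theorem 10.21] is of this type), the two fields converge weakly in \(L^p\) and \(L^q\), respectively, with \(1/p+1/q=1/s<1\), and the scalar product converges weakly in \(L^s\). With the bounds for \((U_\delta )\) and \((V_\delta )\) derived above, this sketch reads:

$$\begin{aligned} p=\frac{4}{3}, \quad q=\infty , \quad \frac{1}{p}+\frac{1}{q} = \frac{3}{4} < 1, \quad U_\delta \cdot V_\delta = f(\rho _\delta )g(\theta _\delta ) \rightharpoonup \overline{f(\rho _\delta )}\,\overline{g(\theta _\delta )} \quad \text{ weakly } \text{ in } L^{4/3}(\Omega _T). \end{aligned}$$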

Step 3: Proof of \(\overline{\rho _\delta \theta _\delta }=\rho \theta \). We wish to relax the assumptions on the functions f and g. To this end, we introduce the truncation function \(T_1\in C^2([0,\infty ))\) by \(T_1(s)=s\) for \(0\le s<1\), \(T_1(s)=2\) for \(s>3\), and \(T_1\) is nondecreasing and concave in \([0,\infty )\). Then we define \(T_k(s)=k T_1(s/k)\) for \(s>0\) and \(k\in {{\mathbb {N}}}\). It is possible to choose \(f=T_k\) in (47). Together with Fatou’s lemma and the boundedness of g, we infer that

$$\begin{aligned}&\big \Vert \overline{\rho _\delta g(\theta _\delta )} - \overline{\rho _\delta }~\overline{g(\theta _\delta )} \big \Vert _{L^1(\Omega _T)}\\&= \big \Vert \overline{(\rho _\delta - T_k(\rho _\delta )) g(\theta _\delta )} - \overline{(\rho _\delta - T_k(\rho _\delta ))}~\overline{g(\theta _\delta )} \big \Vert _{L^1(\Omega _T)}\\&\le C\sup _{0<\delta <1}\int \nolimits _{\Omega _T}|T_k(\rho _\delta )-\rho _\delta | dx dt. \end{aligned}$$

Furthermore, we deduce from (30) that

$$\begin{aligned} \int \nolimits _{\Omega _T}|T_k(\rho _\delta )-\rho _\delta | dx dt \le C\int \nolimits _{\{\rho _\delta \ge k\}}\rho _\delta dx dt \le \frac{C}{\log k}\int \nolimits _{\{\rho _\delta \ge k\}}\rho _\delta \log \rho _\delta dx dt \le \frac{C}{\log k}, \end{aligned}$$

such that we obtain for any \(k\ge 2\),

$$\begin{aligned} \big \Vert \overline{\rho _\delta g(\theta _\delta )} - \overline{\rho _\delta }~\overline{g(\theta _\delta )} \big \Vert _{L^1(\Omega _T)} \le \frac{C}{\log k}. \end{aligned}$$
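
A concrete truncation with the properties required of \(T_1\) can be written down explicitly. The proof only uses the listed properties; the polynomial transition on \([1,3]\) chosen below, as well as the names `T1`, `Tk`, and `tail_gap`, are illustrative assumptions. The slope \(h(u)=1-3u^2+2u^3\), \(u=(s-1)/2\), decreases from 1 to 0 with vanishing derivative at both ends, which makes `T1` nondecreasing, concave, and of class \(C^2\).

```python
def T1(s: float) -> float:
    """One possible C^2 truncation: T1(s) = s on [0,1], T1(s) = 2 on [3,inf)."""
    if s <= 1.0:
        return s
    if s >= 3.0:
        return 2.0
    u = (s - 1.0) / 2.0
    # 1 + integral of the slope h(u) = 1 - 3u^2 + 2u^3 over [1, s]
    return 1.0 + 2.0 * (u - u**3 + 0.5 * u**4)


def Tk(s: float, k: int) -> float:
    """Rescaled truncation T_k(s) = k * T_1(s/k)."""
    return k * T1(s / k)


def tail_gap(s: float, k: int) -> float:
    """|T_k(s) - s|: vanishes for s <= k and is at most s otherwise."""
    return abs(Tk(s, k) - s)
```

The function `tail_gap` mirrors the pointwise bound \(|T_k(s)-s|\le s\,\mathbf{1}_{\{s\ge k\}}\) behind the first inequality of the tail estimate.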

Then the limit \(k\rightarrow \infty \) implies that

$$\begin{aligned} \overline{\rho _\delta g(\theta _\delta )} = \rho \,\overline{g(\theta _\delta )} \quad \text{ a.e. } \text{ in } \Omega _T \end{aligned}$$
(48)

for any \(g\in C^1([0,\infty ))\cap L^\infty (0,\infty )\) satisfying \(|g'(s)|\le C(1+s)^{-1}\) for \(s>0\). We choose \(g=T_k\) which leads to

$$\begin{aligned} \overline{\rho _\delta \theta _\delta } - \rho \theta = \overline{\rho _\delta (\theta _\delta -T_k(\theta _\delta ))} - \rho (\theta -\overline{T_k(\theta _\delta )}). \end{aligned}$$
(49)

We claim that both terms on the right-hand side converge to zero as \(k\rightarrow \infty \). Indeed, it follows from Fatou’s lemma and the \(L^1(\Omega _T)\) bound for \((\rho _\delta \theta _\delta ^2)\) from Lemma 3 that

$$\begin{aligned} \bigg \Vert \frac{\overline{\rho _\delta (\theta _\delta -T_k(\theta _\delta ))}}{1+\rho } \bigg \Vert _{L^1(\Omega _T)}&\le \sup _{0<\delta<1}\int \nolimits _{\Omega _T}\rho _\delta |\theta _\delta - T_k(\theta _\delta )|dx dt\\&\le C\sup _{0<\delta<1}\int \nolimits _{\{\theta _\delta>k\}}\rho _\delta \theta _\delta dx dt \\&\le \frac{C}{k}\sup _{0<\delta <1}\int \nolimits _{\{\theta _\delta >k\}} \rho _\delta \theta _\delta ^2 dx dt\le \frac{C}{k}, \end{aligned}$$

while we deduce from Fatou’s lemma and the \(L^2(\Omega _T)\) bound for \((\theta _\delta )\), again from Lemma 3, that

$$\begin{aligned} \bigg \Vert \frac{\rho (\theta - \overline{T_k(\theta _\delta )})}{1+\rho } \bigg \Vert _{L^1(\Omega _T)}&\le \sup _{0<\delta<1}\int \nolimits _{\Omega _T}|\theta _\delta - T_k(\theta _\delta )|dx dt \le C\sup _{0<\delta<1}\int \nolimits _{\{\theta _\delta>k\}}\theta _\delta dx dt \\&\le \frac{C}{k}\sup _{0<\delta <1}\int \nolimits _{\{\theta _\delta >k\}}\theta _\delta ^2 dx dt \le \frac{C}{k}. \end{aligned}$$

We infer from (49) that for any \(k\ge 1\),

$$\begin{aligned} \bigg \Vert \frac{\overline{\rho _\delta \theta _\delta }-\rho \theta }{1+\rho } \bigg \Vert _{L^1(\Omega _T)} \le \frac{C}{k}, \end{aligned}$$

which implies, in the limit \(k\rightarrow \infty \), that

$$\begin{aligned} \overline{\rho _\delta \theta _\delta } = \rho \theta \quad \text{ a.e. } \text{ in } \Omega _T. \end{aligned}$$
(50)

Step 4: Pointwise convergence of \((\theta _\delta )\). We prove via the Aubin–Lions lemma that \(E_\delta =\theta _\delta +\frac{3}{2}\rho _\delta \theta _\delta \) is strongly convergent. We know from Lemma 4 that \((E_\delta )\) is bounded in \(L^1(0,T;W^{1,1}(\Omega ))\). For the time derivative of \(E_\delta \), we estimate (44) for \(\psi _2\in C_0^\infty (\Omega _T)\):

$$\begin{aligned} \bigg |\int \nolimits _0^T\langle \partial _t E_\delta ,\psi _2\rangle dt\bigg |&\le \bigg |\int \nolimits _0^T\int \nolimits _\Omega \bigg (\theta _\delta + \frac{5}{2}\rho _\delta \theta _\delta ^2 + \frac{\delta }{3}\theta _\delta ^3\bigg ) \Delta \psi _2 dxdt\bigg | \\&\quad + \bigg |\delta \int \nolimits _0^T\int \nolimits _\Omega \theta _\delta ^{-N} \log (\theta _\delta )\psi _2 dxdt\bigg | \\&\le C\big (\Vert \theta _\delta \Vert _{L^2(\Omega _T)} + \Vert \rho _\delta \theta _\delta ^2\Vert _{L^{3/2}(\Omega _T)}\\&\quad + \delta \Vert \theta _\delta ^3\Vert _{L^{4/3}(\Omega _T)}\big ) \Vert \Delta \psi _2\Vert _{L^4(\Omega _T)} \\&\quad + C\big (1+\delta \Vert \theta _\delta ^{-N}\log \theta _\delta \Vert _{L^{6/5}(\Omega _T)}\big )\Vert \psi _2\Vert _{L^6(\Omega _T)}. \end{aligned}$$

Taking into account estimate (39) and again using Lemma 3, we infer that

$$\begin{aligned} \Vert \partial _t E_\delta \Vert _{L^{6/5}(0,T;W^{2,4}(\Omega )')} \le C. \end{aligned}$$

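We apply the Aubin–Lions lemma to \((E_\delta )\) to obtain the existence of a subsequence which is not relabeled such that, as \(\delta \rightarrow 0\), \((E_\delta )\) converges strongly in \(L^\eta (\Omega _T)\) for \(\eta <2\). Since \(((1+\theta _\delta )^{-1})\) is bounded in \(L^\infty (\Omega _T)\) and therefore, up to a subsequence, converges weakly in \(L^\eta (\Omega _T)\) for any \(\eta <\infty \), the product of the strongly and the weakly convergent sequence converges to the product of the limits, and we find that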

$$\begin{aligned} \overline{\bigg (\theta _\delta + \frac{3}{2}\rho _\delta \theta _\delta \bigg ) (1+\theta _\delta )^{-1}} = \overline{\bigg (\theta _\delta + \frac{3}{2}\rho _\delta \theta _\delta \bigg )}\,\overline{(1+\theta _\delta )^{-1}} \quad \text{ a.e. } \text{ in } \Omega _T. \end{aligned}$$
(51)

We choose \(g(s)=s(1+s)^{-1}\) in (48) and recall (50):

$$\begin{aligned} \overline{\rho _\delta \theta _\delta (1+\theta _\delta )^{-1}} = \rho \,\overline{\theta _\delta (1+\theta _\delta )^{-1}}, \quad \overline{\rho _\delta \theta _\delta } = \rho \theta \quad \text{ a.e. } \text{ in } \Omega _T. \end{aligned}$$

Using these expressions, we deduce from (51) that

$$\begin{aligned} \bigg (1+\frac{3}{2}\rho \bigg )&\overline{\theta _\delta (1+\theta _\delta )^{-1}} = \overline{\theta _\delta (1+\theta _\delta )^{-1} +\frac{3}{2}\rho _\delta \theta _\delta (1+\theta _\delta )^{-1}} \\&= \overline{\bigg (\theta _\delta + \frac{3}{2}\rho _\delta \theta _\delta \bigg )}\;\overline{(1+\theta _\delta )^{-1}} = \overline{\theta _\delta }\;\overline{(1+\theta _\delta )^{-1}} + \frac{3}{2}\overline{\rho _\delta \theta _\delta }\;\overline{(1+\theta _\delta )^{-1}} \\&= \bigg (1+\frac{3}{2}\rho \bigg )\theta \overline{(1+\theta _\delta )^{-1}} \quad \text{ a.e. } \text{ in } \Omega _T. \end{aligned}$$

This means that

$$\begin{aligned} \overline{\theta _\delta (1+\theta _\delta )^{-1}} = \theta \overline{(1+\theta _\delta )^{-1}}\quad \text{ a.e. } \text{ in } \Omega _T. \end{aligned}$$

We apply [13, Theorem 10.19] to the strictly decreasing function \(s\mapsto (1+s)^{-1}\) for \(s\ge 0\) to conclude that

$$\begin{aligned} \overline{(1+\theta _\delta )^{-1}} = (1+\theta )^{-1}\quad \text{ a.e. } \text{ in } \Omega _T. \end{aligned}$$

The strict convexity of \(s\mapsto (1+s)^{-1}\) then implies, by [13, Theorem 10.20], that \(\theta _\delta \rightarrow \theta \) a.e. in \(\Omega _T\). We deduce from the \(L^2(\Omega _T)\) bound for \((\theta _\delta )\) from Lemma 3 that this convergence is in fact strong in \(L^1(\Omega _T)\).

Step 5: Limit \(\delta \rightarrow 0\) in equations (43)–(44). We know from (42) that \((\nabla (\rho _\delta \theta _\delta ))\) is bounded in \(L^{4/3}(\Omega _T)\). Thus, up to a subsequence, \(\nabla (\rho _\delta \theta _\delta )\rightharpoonup \zeta _1\) weakly in \(L^{4/3}(\Omega _T)\) for some \(\zeta _1\in L^{4/3}(\Omega _T)\). Since \(\rho _\delta \theta _\delta \rightharpoonup \rho \theta \) weakly in \(L^1(\Omega _T)\), by (50), we infer that \(\zeta _1=\nabla (\rho \theta )\), i.e.

$$\begin{aligned} \nabla (\rho _\delta \theta _\delta )\rightharpoonup \nabla (\rho \theta ) \quad \text{ weakly } \text{ in } L^{4/3}(\Omega _T). \end{aligned}$$
(52)

We know from Lemma 3 that \((\rho _\delta \theta _\delta ^2)\) is bounded in \(L^{3/2}(\Omega _T)\), so that up to a subsequence, \(\rho _\delta \theta _\delta ^2\rightharpoonup \zeta _2\) weakly in \(L^{3/2}(\Omega _T)\) for some \(\zeta _2\in L^{3/2}(\Omega _T)\). We deduce from the strong convergence of \((\theta _\delta )\) and the boundedness of \(s\mapsto (1+s^2)^{-1}\) that \((1+\theta _\delta ^2)^{-1}\rightarrow (1+\theta ^2)^{-1}\) strongly in \(L^\eta (\Omega _T)\) for any \(\eta <\infty \). Therefore,

$$\begin{aligned} \frac{\rho _\delta \theta _\delta ^2}{1+\theta _\delta ^2} \rightharpoonup \frac{\zeta _2}{1+\theta ^2}\quad \text{ weakly } \text{ in } L^1(\Omega _T). \end{aligned}$$

An application of (48) with \(g(s)=s^2(1+s^2)^{-1}\) together with the strong convergence of \((\theta _\delta )\) leads to

$$\begin{aligned} \frac{\rho _\delta \theta _\delta ^2}{1+\theta _\delta ^2} \rightharpoonup \frac{\rho \theta ^2}{1+\theta ^2}\quad \text{ weakly } \text{ in } L^1(\Omega _T). \end{aligned}$$

Hence, \(\zeta _2=\rho \theta ^2\) a.e. in \(\Omega _T\) and

$$\begin{aligned} \rho _\delta \theta _\delta ^2\rightharpoonup \rho \theta ^2\quad \text{ weakly } \text{ in } L^{3/2}(\Omega _T). \end{aligned}$$
(53)

Furthermore, it follows from (33), Lemma 3, and (39) that

$$\begin{aligned} \delta \phi _\delta \rightarrow 0&\quad \text{ strongly } \text{ in } L^2(0,T;H^1(\Omega )), \nonumber \\ \delta \theta _\delta ^3\rightarrow 0&\quad \text{ strongly } \text{ in } L^{4/3}(\Omega _T), \nonumber \\ \delta \theta _\delta ^{-N}\log \theta _\delta \rightarrow 0&\quad \text{ strongly } \text{ in } L^{6/5}(\Omega _T). \end{aligned}$$
(54)

For any \(\psi _1\in L^4(0,T;W^{1,4}(\Omega ))\), we have

$$\begin{aligned} \bigg |\int \nolimits _0^T\langle \partial _t\rho _\delta ,\psi _1\rangle dt\bigg |&\le \frac{3}{2}\Vert \nabla (\rho _\delta \theta _\delta )\Vert _{L^{4/3}(\Omega _T)} \Vert \nabla \psi _1\Vert _{L^4(\Omega _T)} \\&\quad + \delta \Vert \phi _\delta \Vert _{L^2(0,T;H^1(\Omega ))} \Vert \psi _1\Vert _{L^2(0,T;H^1(\Omega ))}\le C\Vert \psi _1\Vert _{L^4(0,T;W^{1,4}(\Omega ))}. \end{aligned}$$

Hence, up to subsequences,

$$\begin{aligned} \begin{aligned} \partial _t\rho _\delta \rightharpoonup \partial _t\rho&\quad \text{ weakly } \text{ in } L^{4/3}(0,T;W^{1,4}(\Omega )'), \\ \partial _t E_\delta \rightharpoonup \partial _t E&\quad \text{ weakly } \text{ in } L^{6/5}(0,T;W^{2,4}(\Omega )'). \end{aligned} \end{aligned}$$
(55)

We deduce from the bound for \((\log \theta _\delta )\) in \(L^\infty (0,T;L^1(\Omega ))\) that \(\theta >0\) a.e. in \(\Omega _T\).

We claim that \((\rho _\delta )\) also converges strongly. Indeed, the a.e. convergence of \((E_\delta )\) and \((\theta _\delta )\), together with \(\theta >0\) a.e., implies that \(\rho _\delta =\frac{2}{3}(E_\delta /\theta _\delta -1)\rightarrow \rho \) a.e. in \(\Omega _T\). The \(L^\infty (0,T;L^1(\Omega ))\) bound for \((\rho _\delta \log \rho _\delta )\) from (30) shows that \((\rho _\delta )\) is equi-integrable, and together with its a.e. convergence, we conclude from the de la Vallée–Poussin theorem [10, Chap. 8, Sect. 1.7, Corollary 1.3] that

$$\begin{aligned} \rho _\delta \rightarrow \rho \quad \text{ strongly } \text{ in } L^1(\Omega _T). \end{aligned}$$
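
The equi-integrability can be quantified exactly as in Step 3. As a sketch, for \(k\ge 2\) and with C denoting a bound for \((\rho _\delta \log \rho _\delta )\) in \(L^1(\Omega _T)\) obtained from (30):

$$\begin{aligned} \sup _{0<\delta <1}\int \nolimits _{\{\rho _\delta \ge k\}}\rho _\delta dx dt \le \frac{1}{\log k}\sup _{0<\delta <1}\int \nolimits _{\{\rho _\delta \ge k\}}\rho _\delta \log \rho _\delta dx dt \le \frac{C}{\log k} \rightarrow 0 \quad \text{ as } k\rightarrow \infty . \end{aligned}$$

This uniform smallness of the tails, combined with the a.e. convergence, is what yields the strong convergence in \(L^1(\Omega _T)\).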

The positivity of \(\rho _\delta \) implies that \(\rho \ge 0\) a.e. in \(\Omega _T\). Note, however, that we cannot conclude that \(\rho >0\) a.e., since the control on \(\phi _\delta \) is now lost.

Convergences (52)–(55) allow us to perform the limit \(\delta \rightarrow 0\) in (43)–(44) showing that \((\rho ,\theta )\) solves (9)–(10). Theorem 1 is proved.