1 Introduction

Considering two species \(X_{1}\) and \(X_{2}\) that diffuse in a bounded medium \(\Omega \subset \mathbb {R}^{d}\) and react linearly \(X_{1}\leftrightharpoons X_{2}\), the evolution of their concentrations \(c=(c_{1},c_{2})\) can be described by the linear reaction-diffusion system

$$\begin{aligned} \dot{c_{1}}&=\delta _{1}\Delta c_{1}-(\tilde{\alpha }c_{1}-\tilde{\beta }c_{2})\nonumber \\ \dot{c_{2}}&=\delta _{2}\Delta c_{2}+(\tilde{\alpha }c_{1}-\tilde{\beta }c_{2}) \end{aligned}$$
(1.1)

complemented with no-flux boundary conditions and initial conditions, where \(\delta _{1},\delta _{2}>0\) are the diffusion coefficients of the species \(X_{1}\) and \(X_{2}\), respectively, and \(\tilde{\alpha },\tilde{\beta }>0\) are the rates of the linear reaction \(X_{1}\leftrightharpoons X_{2}\). The aim of the paper is to investigate system (1.1) when the reaction is much faster than the diffusion. To do this, we introduce a small parameter \(\varepsilon >0\) and assume that the reaction rates are given by \(\tilde{\alpha }=\tfrac{1}{\varepsilon }\sqrt{\tfrac{\alpha }{\beta }}\), \(\tilde{\beta }=\tfrac{1}{\varepsilon }\sqrt{\tfrac{\beta }{\alpha }}\). Then, system (1.1) can be rewritten as the \(\varepsilon \)-dependent reaction-diffusion system

$$\begin{aligned} \dot{c_{1}^{\varepsilon }}&=\delta _{1}\Delta c_{1}^{\varepsilon }-\frac{1}{\varepsilon }\left( \sqrt{\tfrac{\alpha }{\beta }}c_{1}^{\varepsilon }-\sqrt{\tfrac{\beta }{\alpha }}c_{2}^{\varepsilon }\right) \nonumber \\ \dot{c_{2}^{\varepsilon }}&=\delta _{2}\Delta c_{2}^{\varepsilon }+\frac{1}{\varepsilon }\left( \sqrt{\tfrac{\alpha }{\beta }}c_{1}^{\varepsilon }-\sqrt{\tfrac{\beta }{\alpha }}c_{2}^{\varepsilon }\right) \ . \end{aligned}$$
(1.2)

Reaction systems and reaction-diffusion systems with slow and fast time scales have attracted considerable attention in recent decades [7, 8, 9, 10, 13, 17, 22, 30, 34, 36, 38]. Bothe and Hilhorst proved a fast-reaction limit \(\varepsilon \rightarrow 0\) for (1.2) in the following form.

Theorem 1.1

([6]) Let \(\Omega \subset \mathbb {R}^{d}\) be a domain with Lipschitz boundary. Let \(c_{1}^{\varepsilon }\) and \(c_{2}^{\varepsilon }\) be weak solutions of (1.2) with no-flux boundary conditions \(\nabla c_{i}^{\varepsilon }\cdot \nu =0\) on \(\partial \Omega \). Then \(c_{1}^{\varepsilon }\rightarrow c_{1}\) and \(c_{2}^{\varepsilon }\rightarrow c_{2}\) in \(\mathrm {L}^{2}([0,T]\times \Omega )\) as \(\varepsilon \rightarrow 0\), and we have \(\frac{c_{1}}{\beta }=\frac{c_{2}}{\alpha }\). Moreover, the coarse-grained concentration \(\hat{c}=c_{1}+c_{2}\) solves the diffusion equation \(\dot{\hat{c}}=\hat{\delta }\Delta \hat{c}\) with a new mixed diffusion coefficient \(\hat{\delta }=\frac{\beta \delta _{1}+\alpha \delta _{2}}{\alpha +\beta }\).

Essentially, the proof uses the free energy as a Lyapunov function to derive \(\varepsilon \)-uniform bounds on the concentrations \(c_{i}^{\varepsilon }\) and their gradients \(\nabla c_{i}^{\varepsilon }\), which are then used to prove convergence towards the slow manifold \(\left\{ c\in [0,\infty [^{2}\ |\ \alpha c_{1}=\beta c_{2}\right\} \). This proof also works for nonlinear reactions once \(\varepsilon \)-uniform \(\mathrm {L}^{\infty }\)-estimates are established (see [6]). On the linear slow manifold, one easily verifies that the coarse-grained concentration \(\hat{c}:=\frac{\alpha +\beta }{\beta }c_{1}=\frac{\alpha +\beta }{\alpha }c_{2}=c_{1}+c_{2}\) solves \(\dot{\hat{c}}=\hat{\delta }\Delta \hat{c}\), where \(\hat{\delta }=\frac{\beta \delta _{1}+\alpha \delta _{2}}{\alpha +\beta }\) is the effective mixed diffusion coefficient.
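The value of the mixed diffusion coefficient can be made plausible mode by mode. The following minimal Python sketch is not part of the proof in [6]. It assumes that (1.2) is restricted to a single eigenmode of the Laplacian with eigenvalue \(-k^{2}\), so that the system reduces to a \(2\times 2\) linear ODE, and it checks that the least negative eigenvalue approaches \(-\hat{\delta }k^{2}\) as \(\varepsilon \rightarrow 0\). The function names, the variable `k2` and the sample parameter values are illustrative assumptions.

```python
import numpy as np

def mixed_diffusion(delta1, delta2, alpha, beta):
    """Effective coefficient (beta*delta1 + alpha*delta2) / (alpha + beta)."""
    return (beta * delta1 + alpha * delta2) / (alpha + beta)

def slow_eigenvalue(eps, k2, delta1, delta2, alpha, beta):
    """Least negative eigenvalue of system (1.2) restricted to one Laplace
    eigenmode with eigenvalue -k2 (an illustrative assumption)."""
    a = np.sqrt(alpha / beta)   # rate X1 -> X2, scaled by 1/eps below
    b = np.sqrt(beta / alpha)   # rate X2 -> X1, scaled by 1/eps below
    A = np.array([[-delta1 * k2 - a / eps, b / eps],
                  [a / eps, -delta2 * k2 - b / eps]])
    return float(np.max(np.linalg.eigvals(A).real))

# Sample parameters (hypothetical values, chosen only for illustration).
delta1, delta2, alpha, beta, k2 = 1.0, 3.0, 2.0, 1.0, 1.0
target = -mixed_diffusion(delta1, delta2, alpha, beta) * k2
for eps in (1e-1, 1e-2, 1e-4):
    lam = slow_eigenvalue(eps, k2, delta1, delta2, alpha, beta)
    print(f"eps={eps:g}: slow eigenvalue {lam:.6f}, limit {target:.6f}")
```

The second eigenvalue of the same matrix is of order \(-1/\varepsilon \) and corresponds to the fast equilibration towards the slow manifold.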

In this work, we are not primarily interested in convergence of solutions of system (1.2). Instead, we perform the fast-reaction limit on the level of the underlying variational structure, which then implies convergence of solutions as a byproduct. Our starting point is that reaction-diffusion systems such as (1.2) can be written as a gradient flow equation induced by a gradient system \((Q,\mathcal {E},\mathcal {R}_{\varepsilon }^{*})\), where the state space Q is the space of probability measures \(Q=\mathrm {Prob}(\Omega \times \left\{ 1,2\right\} )\) and the driving functional is the free energy \(\mathcal{E}(\mu )=\int _{\Omega }\sum _{j=1}^{2}E_{B}\left( \frac{c_{j}}{w_{j}}\right) w_{j}\mathrm {d}x\) for measures \(\mu =c\ \mathrm {d}x\), with the Boltzmann function \(E_{B}(r)=r\log r-r+1\) and the (in general space dependent) stationary measure \(w=(w_{1},w_{2})^{\mathrm {T}}\). The dissipation potential \(\mathcal {R}_{\varepsilon }^{*}\) that determines the geometry of the underlying space is given by two parts \(\mathcal {R}_{\varepsilon }^{*}=\mathcal {R}_{\mathrm {diff}}^{*}+\mathcal {R}_{\mathrm {react},\varepsilon }^{*}\) describing the diffusion and reaction separately. Starting with the pioneering work of Otto and coauthors [23, 37] it is known that many diffusion-type problems can be understood as gradient systems driven by the free energy in the space of probability measures equipped with the Wasserstein distance. The corresponding dissipation potential \(\mathcal {R}_{\mathrm {diff}}^{*}\) is quadratic and given by

$$\begin{aligned} \mathcal {R}_{\mathrm {diff}}^{*}(\mu ,\xi )=\frac{1}{2}\int _{\Omega }\sum _{i=1}^{2}\delta _{i}\left| \nabla \xi _{i}\right| ^{2}\ \mathrm {d}\mu _{i}. \end{aligned}$$

Later Mielke [28] proposed a quadratic gradient structure with the same driving functional also for reaction-diffusion systems with reversible reactions satisfying detailed balance. Geometric properties of that gradient structure were investigated in [20, 24]. Here, we do not use that gradient structure but a different one, the so-called cosh-type gradient structure, in which the reaction part is given by

$$\begin{aligned} \mathcal {R}_{\mathrm {react},\varepsilon }^{*}(\mu ,\xi )=\frac{1}{\varepsilon }\int _{\Omega }\mathsf {C}^{*}(\xi _{1}(x)-\xi _{2}(x))\ \sqrt{\mathrm {d}\mu _{1}\mathrm {d}\mu _{2}}, \end{aligned}$$

with \(\mathsf {C}^{*}(r)=4(\mathrm {cosh}(r/2)-1)\). Setting \(\mathcal {R}_{\varepsilon }^{*}=\mathcal {R}_{\mathrm {diff}}^{*}+\mathcal {R}_{\mathrm {react},\varepsilon }^{*}\), the reaction-diffusion system (1.2) can now be formally written as a gradient flow equation

$$\begin{aligned} \dot{\mu }=\partial _{\xi }\mathcal {R}_{\varepsilon }^{*}(\mu ,-\mathrm {D}\mathcal {E}(\mu )). \end{aligned}$$

Although there are many gradient structures for (1.2) (see e.g. [30, Sect. 4]) and the cosh-type gradient structure entails several technical difficulties, such as defining a nonlinear kinetic relation and not inducing a metric on Q, it nevertheless has several significant features. Historically, it has its origin in [27] where, following thermodynamic considerations, chemical reactions are written in exponential terms. In recent years, the cosh-gradient structure has been derived via a large-deviation principle [32, 33], and it was shown that it is stable under limit processes [25] that are similar to our approach. Moreover, it does not explicitly depend on the stationary measure w, which allows for a rigorous distinction between the energetic and the dissipative part [30]. This is physically reasonable because a change of the energy by an external field should not influence the geometric structure of the underlying space. We refer also to [40], where our choice of gradient structure has been formally derived for (1.2).

The goal of the paper is to construct an effective gradient system \((Q,\mathcal {E},\mathcal {R}_{\mathrm {eff}}^{*})\) and perform the limit \((Q,\mathcal {E},\mathcal {R}_{\varepsilon }^{*})\rightarrow (Q,\mathcal {E},\mathcal {R}_{\mathrm {eff}}^{*})\) as \(\varepsilon \rightarrow 0\). For this, we use the notion of convergence of gradient systems in the sense of the energy-dissipation principle, shortly called EDP-convergence. EDP-convergence was introduced in [14] and further developed in [30, 31] and is based on the dissipation functional

$$\begin{aligned} \mathfrak {D}_{\varepsilon }^{\eta }(\mu )=\int _{0}^{T}\mathcal {R}_{\varepsilon }(\mu ,\dot{\mu })+\mathcal {R}_{\varepsilon }^{*}(\mu ,\eta -\mathrm {D}\mathcal {E}(\mu ))\ \mathrm {d}t, \end{aligned}$$

which, for solutions \(\mu \) of the gradient flow equation, describes the total dissipation between the initial energy \(\mathcal {E}(\mu (0))\) and the final energy \(\mathcal {E}(\mu (T))\), and which can be defined for general trajectories \(\mu \in \mathrm {L}^{1}([0,T],Q)\). Here, the primal dissipation potential \(\mathcal {R}_\varepsilon \) is defined as the Legendre transform of \(\mathcal {R}^*_\varepsilon \) with respect to the second variable. The notion of EDP-convergence with tilting requires \(\Gamma \)-convergence of the energies \(\mathcal {E}_{\varepsilon }\xrightarrow {{\Gamma }}\mathcal {E}_{0}\) and of the dissipation functionals \(\mathfrak {D}_{\varepsilon }^{\eta }\xrightarrow {{\Gamma }}\mathfrak {D}_{0}^{\eta }\) in suitable topologies, such that for all tilts \(\eta \) the limit \(\mathfrak {D}_{0}^{\eta }\) has the form

$$\begin{aligned} \mathfrak {D}_{0}^{\eta }(\mu )=\int _{0}^{T}\mathcal {R}_{\mathrm {eff}}(\mu ,\dot{\mu })+\mathcal {R}_{\mathrm {eff}}^{*}(\mu ,\eta -\mathrm {D}\mathcal {E}_{0}(\mu ))\ \mathrm {d}t, \end{aligned}$$

see Sect. 2.2 for a precise definition. Importantly, the effective dissipation potential \(\mathcal {R}_{\mathrm {eff}}\) in the \(\Gamma \)-limit is independent of the tilts, hence allowing for more general energy functionals and stationary measures. In our situation, the tilts \(\eta \) correspond to an external potential \(V=(V_{1},V_{2})\) added to the energy \(\mathcal {E}\). On the level of the PDE, the original reaction-diffusion system (1.2) is extended to a reaction-drift-diffusion system of the form

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t}\begin{pmatrix}c_{1}\\ c_{2} \end{pmatrix}=\mathrm {div}\left( \begin{pmatrix}\delta _{1}\nabla c_{1}\\ \delta _{2}\nabla c_{2} \end{pmatrix}+\begin{pmatrix}\delta _{1}c_{1}\nabla V_{1}\\ \delta _{2}c_{2}\nabla V_{2} \end{pmatrix}\right) +\frac{1}{\varepsilon }\begin{pmatrix}-\sqrt{\frac{\alpha }{\beta }}\mathrm {e}^{\frac{V_{1}-V_{2}}{2}} &{} \sqrt{\frac{\beta }{\alpha }}\mathrm {e}^{\frac{V_{2}-V_{1}}{2}}\\ \sqrt{\frac{\alpha }{\beta }}\mathrm {e}^{\frac{V_{1}-V_{2}}{2}} &{} -\sqrt{\frac{\beta }{\alpha }}\mathrm {e}^{\frac{V_{2}-V_{1}}{2}} \end{pmatrix}\begin{pmatrix}c_{1}\\ c_{2} \end{pmatrix}. \end{aligned}$$
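Two structural properties of the tilted reaction matrix can be checked directly at a fixed point \(x\): its columns sum to zero, so the reaction conserves \(c_{1}+c_{2}\), and it annihilates the vector \((\beta \mathrm {e}^{-V_{1}},\alpha \mathrm {e}^{-V_{2}})\), which spans the tilted slow manifold. The following Python sketch illustrates this under the assumption of fixed sample values for \(\alpha ,\beta ,V_{1},V_{2}\); the function name `tilted_reaction_matrix` is ours.

```python
import numpy as np

def tilted_reaction_matrix(alpha, beta, V1, V2):
    """Reaction matrix of the tilted system at one point x (without 1/eps)."""
    fwd = np.sqrt(alpha / beta) * np.exp((V1 - V2) / 2.0)   # X1 -> X2
    bwd = np.sqrt(beta / alpha) * np.exp((V2 - V1) / 2.0)   # X2 -> X1
    return np.array([[-fwd, bwd], [fwd, -bwd]])

# Sample values (hypothetical, for illustration only).
alpha, beta, V1, V2 = 2.0, 1.0, 0.4, -0.3
K = tilted_reaction_matrix(alpha, beta, V1, V2)
equilibrium = np.array([beta * np.exp(-V1), alpha * np.exp(-V2)])

print(K.sum(axis=0))      # column sums: the reaction conserves c1 + c2
print(K @ equilibrium)    # the reaction vanishes on the tilted slow manifold
```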

The main result of the paper is Theorem 4.2 which asserts tilt EDP-convergence of \((Q,\mathcal {E},\mathcal {R}_{\varepsilon }^{*})\) to \((Q,\mathcal {E},\mathcal {R}_{\mathrm {eff}}^{*})\) as \(\varepsilon \rightarrow 0\) where the effective dissipation potential is given by

$$\begin{aligned} \mathcal {R}_{\mathrm {eff}}^{*}&=\mathcal {R}_{\mathrm {diff}}^{*}+\chi _{\left\{ \xi _{1}=\xi _{2}\right\} }, \end{aligned}$$

where \(\chi _{A}\) is the characteristic function of convex analysis taking values zero in A and infinity otherwise. The effective dissipation potential describes diffusion but restricts the chemical potential \(\xi =(\xi _{1},\xi _{2})\) to a linear submanifold. The induced gradient flow equation of the gradient system \((Q,\mathcal {E},\mathcal {R}_{\mathrm {eff}}^{*})\) is then given by a system of drift-diffusion equations on a linear submanifold with a space and time dependent Lagrange multiplier \(\lambda \)

$$\begin{aligned} \begin{array}{cc} \dot{c_{1}} &{} =\mathrm {div}\left( \delta _{1}\nabla c_{1}+\delta _{1}c_{1}\nabla V_{1}\right) -\lambda \\ \dot{c_{2}} &{} =\mathrm {div}\left( \delta _{2}\nabla c_{2}+\delta _{2}c_{2}\nabla V_{2}\right) +\lambda \end{array}\quad ,\quad \frac{c_{1}}{\beta \mathrm {e}^{-V_{1}}}=\frac{c_{2}}{\alpha \mathrm {e}^{-V_{2}}}\ . \end{aligned}$$
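The indicator term in \(\mathcal {R}_{\mathrm {eff}}^{*}\) can be illustrated pointwise. The Python sketch below evaluates the integrand of \(\mathcal {R}_{\mathrm {react},\varepsilon }^{*}\) at a single point for decreasing \(\varepsilon \); it assumes a fixed value of \(\sqrt{c_{1}c_{2}}\) and sample values of \(\xi \), and the function name is hypothetical. For \(\xi _{1}\ne \xi _{2}\) the values grow without bound, whereas for \(\xi _{1}=\xi _{2}\) they vanish for every \(\varepsilon \).

```python
import math

def react_dual_density(eps, xi1, xi2, sqrt_c1c2=1.0):
    """Integrand of R*_react,eps at one point:
    (1/eps) * C*(xi1 - xi2) * sqrt(c1*c2), with C*(r) = 4(cosh(r/2) - 1)."""
    return sqrt_c1c2 / eps * 4.0 * (math.cosh((xi1 - xi2) / 2.0) - 1.0)

eps_values = [1.0, 0.1, 0.01, 0.001]
off_diagonal = [react_dual_density(e, 0.5, 0.2) for e in eps_values]
on_diagonal = [react_dual_density(e, 0.5, 0.5) for e in eps_values]

print(off_diagonal)   # monotonically increasing as eps decreases
print(on_diagonal)    # identically zero on {xi1 = xi2}
```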

Moreover, as an immediate consequence of Theorem 4.2, we obtain that the effective gradient flow equation can be equivalently described as a drift-diffusion equation of the coarse-grained concentration \(\hat{c}\), see Proposition 4.4. Introducing the mixed diffusion coefficient \(\hat{\delta }^{V}=\frac{\delta _{1}\beta \mathrm {e}^{-V_{1}}+\delta _{2}\alpha \mathrm {e}^{-V_{2}}}{\beta \mathrm {e}^{-V_{1}}+\alpha \mathrm {e}^{-V_{2}}}\) and the mixed potential \(\hat{V}=-\log (\frac{\beta }{\alpha +\beta }\mathrm {e}^{-V_{1}}+\frac{\alpha }{\alpha +\beta }\mathrm {e}^{-V_{2}})\), we obtain

$$\begin{aligned} \dot{\hat{c}} =\mathrm {div}\left( \hat{\delta }^{V}\nabla \hat{c}+\hat{\delta }^{V}\hat{c}\nabla \hat{V}\right) , \end{aligned}$$

which is in accordance with [6] in the potential-free case \(V=\mathrm {const}\). Moreover, we obtain a natural coarse-grained gradient structure \((\hat{Q},\hat{\mathcal {E}},\hat{\mathcal {R}^{*}})\), where \(\hat{Q}=\mathrm {Prob}(\Omega )\) is the coarse-grained state space and \(\hat{\mathcal {E}},\hat{\mathcal {R}^{*}}\) are the coarse-grained energy functional and dissipation potential, respectively. This coarse-grained gradient structure \((\hat{Q},\hat{\mathcal {E}},\hat{\mathcal {R}^{*}})\) contains the same information as the effective gradient structure \((Q,\mathcal {E},\mathcal {R}_{\mathrm {eff}}^{*})\), although defined on a smaller state space, see Proposition 4.4.
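The passage from the two-species system to the coarse-grained equation can be checked numerically in one space dimension. Adding the two constrained equations eliminates \(\lambda \), and on the slow manifold the summed expression under the divergence should coincide with \(\hat{\delta }^{V}(\nabla \hat{c}+\hat{c}\nabla \hat{V})\). The following Python sketch verifies this at one point with central finite differences; the potentials `V1`, `V2`, the density `c_hat` and all parameter values are illustrative assumptions, not data from the paper.

```python
import math

alpha, beta, delta1, delta2 = 2.0, 1.0, 1.0, 3.0   # sample parameters

def V1(x):
    return math.sin(x)            # hypothetical potential for species 1

def V2(x):
    return 0.5 * x * x            # hypothetical potential for species 2

def c_hat(x):
    return 1.0 + 0.3 * math.cos(x)   # hypothetical coarse-grained density

def m1(x):
    return beta * math.exp(-V1(x))

def m2(x):
    return alpha * math.exp(-V2(x))

def c1(x):
    return c_hat(x) * m1(x) / (m1(x) + m2(x))   # slow-manifold splitting

def c2(x):
    return c_hat(x) * m2(x) / (m1(x) + m2(x))

def delta_hat(x):
    return (delta1 * m1(x) + delta2 * m2(x)) / (m1(x) + m2(x))

def V_hat(x):
    return -math.log((m1(x) + m2(x)) / (alpha + beta))

def d(f, x, h=1e-5):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def two_species_sum(x):
    """Sum over i of delta_i * (c_i' + c_i * V_i')."""
    return (delta1 * (d(c1, x) + c1(x) * d(V1, x))
            + delta2 * (d(c2, x) + c2(x) * d(V2, x)))

def coarse_grained(x):
    """delta_hat * (c_hat' + c_hat * V_hat')."""
    return delta_hat(x) * (d(c_hat, x) + c_hat(x) * d(V_hat, x))

x0 = 0.7
print(two_species_sum(x0), coarse_grained(x0))
```

The agreement reflects the identity \(\nabla c_{i}+c_{i}\nabla V_{i}=m_{i}\nabla (\hat{c}/(m_{1}+m_{2}))\) with \(m_{1}=\beta \mathrm {e}^{-V_{1}}\) and \(m_{2}=\alpha \mathrm {e}^{-V_{2}}\), which holds on the slow manifold.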

The result on tilt EDP-convergence is an immediate consequence of the \(\Gamma \)-convergence of the dissipation functional \(\mathfrak {D}_{\varepsilon }^{\eta }\) (Theorem 5.12). The primal dissipation potential \(\mathcal {R}_{\varepsilon }\) is given by an infimal sum consisting of diffusion fluxes and reaction fluxes coupled via a generalized continuity equation, see Sect. 3.3. Theorem 5.12 rests on the following observations. The dual potentials \(\mathcal {R}_{\varepsilon }^{*}\) converge monotonically to a singular limit \(\mathcal {R}_{\mathrm {eff}}^{*}\), while the primal dissipation potentials \(\mathcal {R}_{\varepsilon }\) degenerate. It is not possible to control the rates \(\dot{\mu }_{1}\) and \(\dot{\mu }_{2}\) separately by \(\mathcal {R}_{\varepsilon }\), since the reaction flux between both species may become unbounded. Instead, \(\mathcal {R}_{\varepsilon }\) provides compactness for the sum (or slow variable) \(\mu _{1}+\mu _{2}\), and one proves convergence towards the slow manifold where equilibration takes place, i.e. \(\alpha c_{1}^{\varepsilon }-\beta c_{2}^{\varepsilon }\rightarrow 0\). These two complementary pieces of information provide strong convergence of the densities \(c^{\varepsilon }\) in \(\mathrm {L}^{1}([0,T]\times \Omega )\). This procedure has already been applied successfully to linear and nonlinear reaction systems [30, 34] and is here applied to a space-dependent evolution system. A posteriori we conclude, using results from [1], that the limit measure \(\mu ^{0}\) indeed has an absolutely continuous representative. The construction of the recovery sequence relies on the fact that the limit dissipation functional can be equivalently written as a functional of coarse-grained variables. Only the reaction flux, which is present for \(\varepsilon >0\) and hidden for \(\varepsilon =0\), has to be reconstructed from the diffusion fluxes. Since the dissipation functional also takes into account fluctuations, which, in contrast to the solution of the linear reaction-diffusion system (1.2), may be neither strictly positive nor smooth, the construction of a recovery sequence is completed by a suitable approximation argument.

Let us finally mention that the same results can also be established for reaction-diffusion systems involving more than two species. Applying the coarse-graining and reconstruction machinery developed in [30], a similar \(\Gamma \)-convergence result for the dissipation functional can be proved. For notational convenience we restrict ourselves to the two-species situation and briefly discuss the multi-species case in Sect. 6. We refer also to [42], where coarse-graining and reconstruction for the concentrations as well as the fluxes is developed.

2 Gradient structures

2.1 Gradient systems and the energy-dissipation principle

Let us briefly recall what we mean by a gradient system. Following [29], we call a triple \((Q,\mathcal {E},\mathcal {R})\) a gradient system if

  (1) Q is a closed convex subset of a Banach space X;

  (2) \(\mathcal {E}:Q\rightarrow \mathbb {R}_{\infty }:=\mathbb {R}\cup \{\infty \}\) is a functional (such as the free energy);

  (3) \(\mathcal {R}:Q\times X\rightarrow \mathbb {R}_{\infty }\) is a dissipation potential, which means that for any \(u\in Q\) the functional \(\mathcal {R}(u,\cdot ):X\rightarrow \mathbb {R}_{\infty }\) is lower semicontinuous (lsc), nonnegative and convex, and it satisfies \(\mathcal {R}(u,0)=0\).

We define the dual dissipation potential \(\mathcal {R}^{*}:Q\times X^{*}\rightarrow [0,\infty ]\) using the Legendre transform via

$$\begin{aligned} \mathcal {R}^{*}(u,\xi )=(\mathcal {R}(u,\cdot ))^{*}(\xi )=\sup _{v\in X}\left\{ \langle v,\xi \rangle -\mathcal {R}(u,v)\right\} \ . \end{aligned}$$

The gradient system is uniquely described by \((Q,\mathcal {E},\mathcal {R})\) or, equivalently, by \((Q,\mathcal {E},\mathcal {R}^{*})\); in this paper we use the second representation.

For a sufficiently smooth energy functional \(\mathcal {E}\), its Gateaux differential \(\mathrm {D}\mathcal {E}(u)\in X^*\) describes the (negative) potential force of the system. The dynamics of a gradient system can be formulated in different ways as an equation in \(X^{*}\) (the dual Banach space of X), in \(\mathbb {R}\) or in X, respectively:

  (1) Force balance in \(X^{*}\): \(0\in \partial _{\dot{u}}\mathcal {R}(u,\dot{u})+\mathrm {D}\mathcal {E}(u)\in X^{*}\),

  (2) Power balance in \(\mathbb {R}\): \(\mathcal {R}(u,\dot{u})+\mathcal {R}^{*}(u,-\mathrm {D}\mathcal {E}(u))=-\langle \mathrm {D}\mathcal {E}(u),\dot{u}\rangle \),

  (3) Rate equation in X: \(\dot{u}\in \partial _{\xi }\mathcal {R}^{*}(u,-\mathrm {D}\mathcal {E}(u))\in X\).

(Here, \(\partial \) denotes the subdifferential of convex analysis.) Equations (1) and (3) are called the gradient flow equations associated with \((Q,\mathcal {E},\mathcal {R}^{*})\). The equivalence of the three formulations relies on the following fact: Let X be a reflexive Banach space and \(\Psi :X\rightarrow \mathbb {R}_{\infty }\) be proper, convex and lsc. Then for every \(\xi \in X^{*}\) and \(v\in X\) the following three statements, the so-called Legendre-Fenchel equivalences, are equivalent:

$$\begin{aligned} v\in \partial \Psi ^{*}(\xi ) \quad \Leftrightarrow \quad \Psi (v)+\Psi ^{*}(\xi )=\langle \xi ,v\rangle \quad \Leftrightarrow \quad \xi \in \partial \Psi (v). \end{aligned}$$
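For the scalar cosh pair used in this paper the Legendre-Fenchel equivalences can be checked by hand. The following Python sketch takes \(\Psi ^{*}=\mathsf {C}^{*}\) with \(\mathsf {C}^{*}(r)=4(\cosh (r/2)-1)\). The closed form of the primal function `C` is our own computation of the Legendre transform (the supremum of \(sr-\mathsf {C}^{*}(r)\) is attained at \(r=2\,\mathrm {arsinh}(s/2)\)) and is not quoted from the text above. The sketch evaluates the gap \(\Psi (v)+\Psi ^{*}(\xi )-\xi v\), which is nonnegative and vanishes exactly when \(v=(\Psi ^{*})'(\xi )\).

```python
import math

def C_star(r):
    """Dual cosh potential C*(r) = 4(cosh(r/2) - 1)."""
    return 4.0 * (math.cosh(r / 2.0) - 1.0)

def C(s):
    """Legendre transform of C_star in closed form (our computation):
    the supremum of s*r - C_star(r) is attained at r = 2*asinh(s/2)."""
    return 2.0 * s * math.asinh(s / 2.0) - 2.0 * math.sqrt(4.0 + s * s) + 4.0

def fenchel_gap(v, xi):
    """Psi(v) + Psi*(xi) - xi*v for the pair Psi = C, Psi* = C_star."""
    return C(v) + C_star(xi) - xi * v

xi = 1.3                                  # sample force (hypothetical value)
v_contact = 2.0 * math.sinh(xi / 2.0)     # v = (C*)'(xi)

print(fenchel_gap(v_contact, xi))         # vanishes up to rounding
print(fenchel_gap(v_contact + 0.5, xi))   # strictly positive off contact
print(2.0 * math.asinh(v_contact / 2.0))  # C'(v_contact), recovers xi
```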

Especially the second dynamic formulation, the power balance (2), is interesting for us. Integrating the power balance (2) in time from 0 to T and using the chain rule for the time-derivative of \(t\mapsto \mathcal {E}(u(t))\), we get another equivalent formulation of the dynamics of the gradient system, which is called Energy-Dissipation-Balance:

$$\begin{aligned} \mathrm {(EDB)}~~~~~~~\mathcal {E}(u(T))+\int _{0}^{T}\left[ \mathcal {R}(u,\dot{u})+\mathcal {R}^{*}(u,-\mathrm {D}\mathcal {E}(u))\right] \mathrm {d}t=\mathcal {E}(u(0)). \end{aligned}$$
(2.1)

Equation (EDB) compares the energy of the system at time \(t=0\) with its energy at time \(t=T\); the difference is given by the total dissipation from \(t=0\) to \(t=T\). This gives rise to another definition: We define the De Giorgi dissipation functional as

$$\begin{aligned} \mathfrak {D}(u)=\int _{0}^{T}\left[ \mathcal {R}(u,\dot{u})+\mathcal {R}^{*}(u,-\mathrm {D}\mathcal {E}(u))\right] \mathrm {d}t, \end{aligned}$$

for \(u\in \mathrm {W}^{1,1}([0,T],Q)\) and extend it by infinity otherwise. The following energy-dissipation principle provides the definition of solutions of the gradient flow equation, see e.g. [1, Prop. 1.4.1], [2, Def. 1.1], [31, Thm 2.5]. In particular, it is the starting point for our further analysis.

Definition 2.1

We say \(u\in \mathrm {W}^{1,1}([0,T],Q)\) is a solution of the gradient flow equation (1) or (3) induced by the gradient system \((Q,\mathcal {E},\mathcal {R}^{*})\), if \(\mathcal {E}(u(0))<\infty \) and the energy-dissipation balance holds.
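The energy-dissipation balance can be tested numerically in a simple special case. The Python sketch below assumes the space-homogeneous version of (1.2), i.e. a two-state reaction ODE without diffusion and with \(c_{1}+c_{2}=1\), equipped with the cosh-type potential \(\mathcal {R}^{*}(c,\xi )=\tfrac{1}{\varepsilon }\sqrt{c_{1}c_{2}}\,\mathsf {C}^{*}(\xi _{1}-\xi _{2})\). The closed form of the primal function `C`, the initial value and the parameter values are our assumptions. The sketch integrates \(\mathcal {R}+\mathcal {R}^{*}\) along the exact solution with the trapezoidal rule and compares the result with the energy drop.

```python
import numpy as np

alpha, beta, eps, T = 2.0, 1.0, 1.0, 2.0               # sample parameters
a, b = np.sqrt(alpha / beta), np.sqrt(beta / alpha)    # rates in (1.2)
w = np.array([beta, alpha]) / (alpha + beta)           # stationary measure

def C_star(r):
    return 4.0 * (np.cosh(r / 2.0) - 1.0)

def C(s):
    # Legendre transform of C_star (closed form, our computation)
    return 2.0 * s * np.arcsinh(s / 2.0) - 2.0 * np.sqrt(4.0 + s**2) + 4.0

def energy(c1, c2):
    # sum_j E_B(c_j / w_j) w_j with E_B(r) = r log r - r + 1
    return sum(c * np.log(c / wj) - c + wj for c, wj in ((c1, w[0]), (c2, w[1])))

# Exact solution of the homogeneous system with c1(0) = 0.8, c2(0) = 0.2.
t = np.linspace(0.0, T, 20001)
c1 = w[0] + (0.8 - w[0]) * np.exp(-(a + b) * t / eps)
c2 = 1.0 - c1

s = -(a * c1 - b * c2) / eps                       # rate of c1; c2 has rate -s
kappa = np.sqrt(c1 * c2) / eps                     # prefactor of the cosh potential
xi_diff = -np.log(c1 / w[0]) + np.log(c2 / w[1])   # xi_1 - xi_2 with xi = -DE

integrand = kappa * C(s / kappa) + kappa * C_star(xi_diff)   # R + R*
dissipation = float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t)))
edb_defect = energy(c1[-1], c2[-1]) + dissipation - energy(c1[0], c2[0])
print(f"EDB defect: {edb_defect:.2e}")
```

Along the solution the integrand coincides pointwise with \(s\,(\xi _{1}-\xi _{2})\), which is the power balance (2) in this reduced setting.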

2.2 Definition of EDP-convergence

The definition of EDP-convergence for gradient systems relies on the notion of \(\Gamma \)-convergence for functionals (cf. [12]). If Y is a Banach space and \(I_{\varepsilon }:Y\rightarrow \mathbb {R}_{\infty }\) we write \(I_{\varepsilon }\xrightarrow {{\Gamma }}I_{0}\) and \(I_{\varepsilon }{\mathop {\rightharpoonup }\limits ^{{\Gamma }}}I_{0}\) for \(\Gamma \)-convergence in the strong and weak topology, respectively. If both hold, this is called Mosco-convergence and written as \(I_{\varepsilon }\xrightarrow {\text {M}}I_{0}\).

For families of gradient systems \((X,\mathcal {E}_{\varepsilon },\mathcal {R}_{\varepsilon })\), three different levels of EDP-convergence are introduced and discussed in [14, 31]: simple EDP-convergence, EDP-convergence with tilting and contact EDP-convergence with tilting. EDP-convergence with tilting is the strongest notion implying the other two. Here we will only use the first two notions. For all three notions the choice of weak or strong topology is still to be decided according to the specific problem.

Definition 2.2

(Simple EDP-convergence) A family of gradient structures \((Q,\mathcal {E}_{\varepsilon },\mathcal {R}_{\varepsilon })\) is said to EDP-converge to the gradient system \((Q,\mathcal {E}_{\mathrm {0}},\mathcal {R}_{\mathrm {eff}})\) if the following conditions hold:

  1. \(\mathcal {E}_{\varepsilon }\xrightarrow {{\Gamma }}\mathcal {E}_{0}\) on \(Q\subset X\);

  2. \(\mathfrak {D}_{\varepsilon }\) strongly \(\Gamma \)-converges to \(\mathfrak {D}_{0}\) on \(\mathrm {L}^{1}([0,T],Q)\) conditioned to bounded energies (we write \(\mathfrak {D}_{\varepsilon }{\mathop {\rightarrow }\limits ^{{\Gamma _\mathrm {E}}}}\mathfrak {D}_{0}\)), i.e. we have

    (a) (Liminf-estimate) For all strongly converging families \(u_{\varepsilon }\rightarrow u\) in \(\mathrm {L}^{1}([0,T],Q)\) which satisfy \(\sup _{\varepsilon >0}\mathrm {ess\,sup}_{t\in [0,T]}\mathcal {E}_{\varepsilon }(u_{\varepsilon }(t))<\infty \), we have \(\liminf _{\varepsilon \rightarrow 0^{+}}\mathfrak {D}_{\varepsilon }(u_{\varepsilon })\ge \mathfrak {D}_{0}(u)\),

    (b) (Limsup-estimate) For all \(\widetilde{u}\in \mathrm {L}^{1}([0,T],Q)\) there exists a strongly converging family \(u_{\varepsilon }\rightarrow \widetilde{u}\) in \(\mathrm {L}^{1}([0,T],Q)\) with \(\sup _{\varepsilon >0}\mathrm {ess\,sup}_{t\in [0,T]}\mathcal {E}_{\varepsilon }(u_{\varepsilon }(t))<\infty \) such that we have \(\limsup _{\varepsilon \rightarrow 0^{+}}\mathfrak {D}_{\varepsilon }(u_{\varepsilon })\le \mathfrak {D}_{0}(\widetilde{u})\).

  3. There is an effective dissipation potential \(\mathcal {R}_{\mathrm {eff}}:Q\times X\rightarrow \mathbb {R}_{\infty }\) such that \(\mathfrak {D}_{0}\) takes the form of a dual sum, namely \(\mathfrak {D}_{0}(u)=\int _{0}^{T}\{\mathcal {R}_{\mathrm {eff}}(u,\dot{u}){+}\mathcal {R}_{\mathrm {eff}}^{*}(u,-\mathrm {D}\mathcal {E}_{0}(u))\}\mathrm {d}t\).

Similarly, one can also use weak \(\Gamma \)- or Mosco-convergence conditioned to bounded energy, which we will then write as \(\mathfrak {D}_{\varepsilon }{\mathop {\rightharpoonup }\limits ^{{\Gamma _\mathrm {E}}}}\mathfrak {D}_{0}\) and \(\mathfrak {D}_{\varepsilon }{\mathop {\longrightarrow }\limits ^{{\mathrm {M}_\mathrm {E}}}}\mathfrak {D}_{0}\). In fact, for our fast-slow reaction systems we are going to prove \(\mathfrak {D}_{\varepsilon }{\mathop {\longrightarrow }\limits ^{{\mathrm {M}_\mathrm {E}}}}\mathfrak {D}_{0}\).

A general feature of EDP-convergence is that, under suitable conditions, sufficiently smooth solutions u of the gradient flow equation \(\dot{u}=\partial _{\xi }\mathcal {R}_{\mathrm {eff}}^{*}(u,{-}\mathrm {D}\mathcal {E}_{0}(u))\) of the effective gradient system \((X,\mathcal {E}_{0},\mathcal {R}_{\mathrm {eff}})\) are indeed limits of solutions \(u^{\varepsilon }\) of the gradient flow equation \(\dot{u}^\varepsilon =\partial _{\xi }\mathcal {R}_{\varepsilon }^{*}(u^\varepsilon ,{-}\mathrm {D}\mathcal {E}_{\varepsilon }(u^\varepsilon ))\), see e.g. [11, Thm. 11.3], [30, Lem. 3.4] and [31, Lem. 2.8].

A strengthening of simple EDP-convergence is the so-called EDP-convergence with tilting. This notion involves the tilted energy functionals \(\mathcal {E}_{\varepsilon }^{\eta }:Q\ni u\mapsto \mathcal {E}_{\varepsilon }(u)-\langle \eta ,u\rangle \), where the tilt \(\eta \) (also called external loading) varies through the whole dual space \(X^{*}\).

Definition 2.3

(EDP-convergence with tilting, cf. [31, Def. 2.14]) A family of gradient structures \((Q,\mathcal {E}_{\varepsilon },\mathcal {R}_{\varepsilon })\) is said to EDP-converge with tilting to the gradient system \((Q,\mathcal {E}_{0},\mathcal {R}_{\mathrm {eff}})\), if for all tilts \(\eta \in X^{*}\) we have that \((Q,\mathcal {E}_{\varepsilon }^{\eta },\mathcal {R}_{\varepsilon })\) EDP-converges to \((Q,\mathcal {E}_{0}^{\eta },\mathcal {R}_{\mathrm {eff}})\).

Clearly, we have that \(\mathcal {E}_{\varepsilon }\xrightarrow {{\Gamma }}\mathcal {E}_{0}\) implies \(\mathcal {E}_{\varepsilon }^{\eta }\xrightarrow {{\Gamma }}\mathcal {E}_{0}^{\eta }\) for all \(\eta \in X^{*}\) (and similarly for weak \(\Gamma \)-convergence), since the linear tilt \(u\mapsto -\langle \eta ,u\rangle \) is weakly continuous. The main and nontrivial assumption is that additionally

$$\begin{aligned} \mathfrak {D}_{\varepsilon }^{\eta }:u\mapsto \int _{0}^{T}\!\!\big \{\mathcal {R}_{\varepsilon }(u,\dot{u})+\mathcal {R}_{\varepsilon }^{*}(u,\eta {-}\mathrm {D}\mathcal {E}_{\varepsilon }(u))\big \}\mathrm {d}t \end{aligned}$$

\(\Gamma \)-converges in \(\mathrm {L}^{1}([0,T],Q)\) to \(\mathfrak {D}_{0}^{\eta }\) for all \(\eta \in X^{*}\) and that this limit \(\mathfrak {D}_{0}^{\eta }\) is given in \(\mathcal {R}\oplus \mathcal {R}^{*}\)-form with \(\mathcal {R}_{\mathrm {eff}}\) via

$$\begin{aligned} \mathfrak {D}_{0}^{\eta }(u)=\int _{0}^{T}\!\!\big \{\mathcal {R}_{\mathrm {eff}}(u,\dot{u})+\mathcal {R}_{\mathrm {eff}}^{*}(u,\eta {-}\mathrm {D}\mathcal {E}_{0}(u))\big \}\mathrm {d}t. \end{aligned}$$

The main point is that \(\mathcal {R}_{\mathrm {eff}}\) remains independent of \(\eta \in X^{*}\). We refer to [31] for a discussion of this and the other two notions of EDP-convergence.

3 Gradient system of reaction-diffusion systems

Now, we present the gradient system \((Q,\mathcal {E},\mathcal {R}_{\varepsilon }^{*})\), which induces the reaction-diffusion system (1.2), and discuss its properties. In Sect. 3.2 we formally derive the gradient flow equation of the gradient system including general tilts of the energy. In Sect. 3.3 we compute the primal dissipation potential \(\mathcal {R}_{\varepsilon }\), which is only implicitly given via an infimal convolution, and the total dissipation functional \(\mathfrak {D}_{\varepsilon }^{\eta }\), which will be the main object of interest in Sect. 4. In this section, the computations are essentially formal; the precise functional analytic setting is presented in Sect. 4, which also includes the \(\Gamma \)-convergence and EDP-convergence result.

3.1 Gradient structure for the linear reaction system

The gradient system \((Q,\mathcal {E},\mathcal {R}_{\varepsilon }^{*})\) for the linear fast-slow reaction-diffusion system (1.2) is defined as follows: The state space is the space of probability measures on \(\Omega \times \left\{ 1,2\right\} \)

$$\begin{aligned} Q&:=\mathrm {Prob}(\Omega \times \left\{ 1,2\right\} )=\{\mu =(\mu _{1},\mu _{2}):\mu _{i}\in \mathcal{M}(\Omega ),\ \mu _{i}\ge 0,~\sum _{i=1}^{2}\mu _{i}(\Omega )=1\}, \end{aligned}$$

where we assume that \(\Omega \subset \mathbb {R}^{d}\) is a compact domain with normalized mass \(|\Omega |=1\). The driving functional \(\mathcal {E}:Q\rightarrow \mathbb {R}_{\infty }:=\mathbb {R}\cup \{\infty \}\) is the free energy of the reaction-diffusion system. It is finite for measures \(\mu =(\mu _{1},\mu _{2})\) with Lebesgue density \(c=(c_{1},c_{2})\) only and has the form

$$\begin{aligned} \mathcal {E}(\mu ):={\left\{ \begin{array}{ll} \int _{\Omega }\sum _{j=1}^{2}E_{B}\left( \frac{c_{j}}{w_{j}}\right) w_{j}\mathrm {d}x, &{} \mathrm {~if~}\mu =c\cdot \mathrm {d}x\\ \infty , &{} \mathrm {otherwise}. \end{array}\right. } \end{aligned}$$
(3.1)

where the Boltzmann function is defined as \(E_{B}(r)=r\log r-r+1\) and the positive stationary measure is given by \(w=\frac{1}{\alpha +\beta }(\beta ,\alpha )^{\mathrm {T}}\). Note that the stationary measure w as well as the energy \(\mathcal{E}\) is \(\varepsilon \)-independent. As usual in convex analysis, the differential \(\mathrm {D}\mathcal {E}\) of the energy \(\mathcal{E}\) is only defined in the domain of \(\mathcal {E}\), i.e. for measures with Lebesgue density. Here, all calculations are on a formal level, assuming that the densities are smooth and that the functionals can be extended to a linear space and are Gateaux differentiable. For a mathematically rigorous theory of subdifferentials in the space of probability measures we refer to Section 10 of [1]. For measures \(\mu \) with Lebesgue density c we have

$$\begin{aligned} \mathrm {D}\mathcal {E}(\mu )=\left( \log c_{j}-\log w_{j}\right) _{j=1,2}=\left( \log \frac{c_{j}}{w_{j}}\right) _{j=1,2}\ . \end{aligned}$$

As the equation splits into a diffusion and a reaction part, so does the dual dissipation potential. We define

$$\begin{aligned} \mathcal {R}_{\varepsilon }^{*}(\mu ,\xi ):=\mathcal {R}_{\mathrm {diff}}^{*}(\mu ,\xi )+\mathcal {R}_{\mathrm {react},\varepsilon }^{*}(\mu ,\xi ), \end{aligned}$$

with

$$\begin{aligned} \mathcal {R}_{\mathrm {diff}}^{*}(\mu ,\xi )&:=\frac{1}{2}\int _{\Omega }\sum _{j=1}^{2}\delta _{j}|\nabla \xi _{j}(x)|^{2}\mathrm {d}\mu _{j},\\ \mathcal {R}_{\mathrm {react},\varepsilon }^{*}(\mu ,\xi )&:=\frac{1}{\varepsilon }\int _{\Omega }\mathsf {C}^{*}(\xi _{1}(x)-\xi _{2}(x))\ \mathrm {d}\sqrt{\mu _{1}\mu _{2}}, \end{aligned}$$

where we use the cosh-function \(\mathsf {C}^{*}(x)=4\left( \cosh (x/2)-1\right) \) and for measures \(\mu \) with Lebesgue density c we have \(\mathrm {d}\sqrt{\mu _{1}\mu _{2}}:=\sqrt{c_{1}c_{2}}\mathrm {d}x\).

The diffusion part \(\mathcal{R}_{\mathrm {diff}}^{*}\) induces the Wasserstein distance on Q. The \(\varepsilon \)-dependent reaction part \(\mathcal{R}_{\mathrm {react},\varepsilon }^{*}\) forces the evolution close to a linear submanifold given by

$$\begin{aligned} \mathcal {R}_{\mathrm {react},\varepsilon }^{*}(\mu ,-\mathrm {D}\mathcal {E}(\mu ))=0\quad \Leftrightarrow \quad \alpha c_{1}-\beta c_{2}=0\,. \end{aligned}$$

Note that, since \(\mathcal{R}_{\mathrm {react},\varepsilon }^{*}\) is not 2-homogeneous, it does not define a metric on Q. We refer to [39], which treats similar and general dissipation potentials and understands them as generalized transport costs on discrete spaces. Note also that \(\mathcal{R}_{\varepsilon }^{*}\) does not depend on the stationary measure w explicitly, as highlighted in [30].
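That the reaction part of the dual dissipation potential generates the linear reaction term of (1.2) can be checked pointwise. The Python sketch below evaluates \(\tfrac{1}{\varepsilon }\sqrt{c_{1}c_{2}}\,(\mathsf {C}^{*})'(\xi _{1}-\xi _{2})\) at \(\xi =-\mathrm {D}\mathcal {E}(\mu )\) and compares it with the reaction term of the first equation in (1.2). It is a formal check at a single point \(x\); the function names and the sample values of \(c_{1},c_{2},\alpha ,\beta ,\varepsilon \) are illustrative assumptions.

```python
import math

def dC_star(r):
    """Derivative of C*(r) = 4(cosh(r/2) - 1), i.e. 2 sinh(r/2)."""
    return 2.0 * math.sinh(r / 2.0)

def reaction_rate_from_potential(c1, c2, alpha, beta, eps):
    """Derivative of R*_react,eps with respect to xi_1 at xi = -DE(mu),
    evaluated pointwise in x."""
    w1, w2 = beta / (alpha + beta), alpha / (alpha + beta)
    xi1, xi2 = -math.log(c1 / w1), -math.log(c2 / w2)
    return math.sqrt(c1 * c2) / eps * dC_star(xi1 - xi2)

def reaction_rate_from_pde(c1, c2, alpha, beta, eps):
    """Reaction term of the first equation in (1.2)."""
    return -(math.sqrt(alpha / beta) * c1 - math.sqrt(beta / alpha) * c2) / eps

args = (0.7, 0.3, 2.0, 1.0, 0.1)   # c1, c2, alpha, beta, eps (sample values)
print(reaction_rate_from_potential(*args), reaction_rate_from_pde(*args))
```

The explicit dependence on \(w\) enters only through \(\xi =-\mathrm {D}\mathcal {E}(\mu )\), in line with the remark that \(\mathcal {R}_{\varepsilon }^{*}\) itself does not depend on the stationary measure.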

3.2 The tilted gradient flow equation

To exploit the full information of the dissipation potential, we consider general tilted energies. Considering two free energies \(\mathcal {E},{\widetilde{\mathcal {E}}}\) of the form (3.1) with different positive stationary measures \(w,\widetilde{w}\) (which may be space dependent in general), one easily sees that

$$\begin{aligned} \widetilde{\mathcal{E}}(\mu )=\mathcal{E}(\mu )+\sum _{i=1}^{2}\int _{\Omega }\log \left( \frac{w_{i}}{\widetilde{w}_{i}}\right) \mathrm {d}\mu _i. \end{aligned}$$

Hence, changing the underlying stationary measure corresponds to a linear tilt of the energy by a two component potential \(V=(V_{1},V_{2})\) where \(V_{i}=\log \left( \frac{w_{i}}{\widetilde{w}_{i}}\right) \). On the other hand, introducing a tilted energy

$$\begin{aligned} \mathcal{E}^{V}(\mu ):=\mathcal{E}(\mu )+\sum _{i=1}^{2}\int _{\Omega }V_{i}\mathrm {d}\mu _{i}, \end{aligned}$$

where \(V\in \mathrm {C}^{1}(\Omega ,\mathbb {R}^{2})\) is a two component smooth potential, its stationary measure \(w^{V}\) on the space \(Q=\mathrm {Prob}(\Omega \times \left\{ 1,2\right\} )\) has the form

$$\begin{aligned} w_{i}^{V}=\frac{1}{Z}w_{i}\mathrm {e}^{-V_{i}},\ \mathrm {where}\ Z:=\sum _{i=1}^{2}\int _{\Omega }w_{i}\mathrm {e}^{-V_{i}}\mathrm {d}x\ . \end{aligned}$$
(3.2)

We introduce \(\eta _{i}:=\mathrm {e}^{-V_{i}}\) and clearly, we have \(\eta _{i}>0\) on \(\Omega \subset \mathbb {R}^{d}\).

Next, we formally compute the tilted gradient flow equation \(\dot{\mu }=\partial _{\xi }\mathcal{R}_{\varepsilon }^{*}(\mu ,-\mathrm {D}\mathcal {E}^{V}(\mu ))\), which is induced by the gradient system \((Q,\mathcal {E}^{V},\mathcal {R}_{\varepsilon }^{*}=\mathcal {R}_{\mathrm {diff}}^{*}+\mathcal {R}_{\mathrm {react},\varepsilon }^{*})\). First, we observe that \(\mathcal{E}(\mu )<\infty \) if and only if \(\mathcal{E}^{V}(\mu )<\infty \). Inserting \(\xi _{i}=\left( -\mathrm {D}\mathcal {E}^{V}(\mu )\right) _{i}\) into \(\partial _{\xi }\mathcal{R}_{\mathrm {diff}}^{*}(\mu ,\xi )\), we see that

$$\begin{aligned} \partial _{\xi }\mathcal {R}_{\mathrm {diff}}^{*}(\mu ,\cdot )|_{\xi =-\mathrm {D}\mathcal {E}^{V}(\mu )}=-\left( \mathrm {div}\left( \delta _{i}c_{i}\nabla (-\log (c_{i}/w_{i})-V_{i})\right) \right) _{i=1,2}=\mathrm {div}\left( \delta _{i}\nabla c_{i}+\delta _{i}c_{i}\nabla V_{i}\right) _{i=1,2}, \end{aligned}$$

which is the right-hand side of a system of two uncoupled drift-diffusion equations or Fokker-Planck equations for the concentrations \(c_{i}\), where the fluxes are given by a diffusion part \(-\delta _{i}\nabla c_{i}\) and a drift part \(-\delta _{i}c_{i}\nabla V_{i}\).

For the reaction part of the dual dissipation potential, we insert \(\xi _{i}=\left( -\mathrm {D}\mathcal {E}^{V}(\mu )\right) _{i}\) into \(\partial _{\xi }\mathcal{R}_{\mathrm {react},\varepsilon }^{*}(\mu ,\xi )\). One readily verifies the identity \(\left( \mathsf {C}^{*}\right) '(\log p-\log q)=\frac{p-q}{\sqrt{pq}}\) for the cosh-function and obtains the linear term

$$\begin{aligned} \sqrt{c_{1}c_{2}}\left( \mathsf {C}^{*}\right) '(\xi _{1}(x)-\xi _{2}(x))|_{\xi =-\mathrm {D}\mathcal {E}^{V}(\mu )}&=\sqrt{c_{1}c_{2}}\frac{\frac{c_{2}}{w_{2}\eta _{2}}-\frac{c_{1}}{w_{1}\eta _{1}}}{\sqrt{\frac{c_{1}}{w_{1}\eta _{1}}\frac{c_{2}}{w_{2}\eta _{2}}}}=\sqrt{w_{1}\eta _{1}w_{2}\eta _{2}}\left( \frac{c_{2}}{w_{2}\eta _{2}}-\frac{c_{1}}{w_{1}\eta _{1}}\right) . \end{aligned}$$
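For completeness, the identity for the derivative of the cosh-function can be checked in one line (a formal verification for \(p,q>0\), using only the definition \(\mathsf {C}^{*}(x)=4(\cosh (x/2)-1)\)):

$$\begin{aligned} \left( \mathsf {C}^{*}\right) '(x)&=2\sinh (x/2),\\ \left( \mathsf {C}^{*}\right) '(\log p-\log q)&=\mathrm {e}^{\frac{1}{2}\log (p/q)}-\mathrm {e}^{-\frac{1}{2}\log (p/q)}=\sqrt{p/q}-\sqrt{q/p}=\frac{p-q}{\sqrt{pq}}. \end{aligned}$$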

Hence, we get a tilted Markov generator of the form

$$\begin{aligned} \partial _{\xi }\mathcal{R}_{\mathrm {react},\varepsilon }^{*}(\mu ,-\mathrm {D}\mathcal {E}^{V}(\mu ))=\frac{1}{\varepsilon }\begin{pmatrix}-\sqrt{\frac{\alpha }{\beta }\frac{\eta _{2}}{\eta _{1}}} &{} \sqrt{\frac{\beta }{\alpha }\frac{\eta _{1}}{\eta _{2}}}\\ \sqrt{\frac{\alpha }{\beta }\frac{\eta _{2}}{\eta _{1}}} &{} -\sqrt{\frac{\beta }{\alpha }\frac{\eta _{1}}{\eta _{2}}} \end{pmatrix}c=\frac{1}{\varepsilon }\begin{pmatrix}-\sqrt{\frac{\alpha }{\beta }}\mathrm {e}^{\frac{V_{1}-V_{2}}{2}} &{} \sqrt{\frac{\beta }{\alpha }}\mathrm {e}^{\frac{V_{2}-V_{1}}{2}}\\ \sqrt{\frac{\alpha }{\beta }}\mathrm {e}^{\frac{V_{1}-V_{2}}{2}} &{} -\sqrt{\frac{\beta }{\alpha }}\mathrm {e}^{\frac{V_{2}-V_{1}}{2}} \end{pmatrix}c, \end{aligned}$$

which has the space dependent stationary measure (3.2).
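The stationarity can be checked directly. The following sketch assumes, consistently with the slow manifold \(\alpha c_{1}=\beta c_{2}\), that the untilted stationary measure satisfies \(w_{1}:w_{2}=\beta :\alpha \), so that \(w^{V}\) is proportional to \((\beta \eta _{1},\alpha \eta _{2})\). Applying the first row of the tilted generator to this vector gives

$$\begin{aligned} -\sqrt{\tfrac{\alpha }{\beta }\tfrac{\eta _{2}}{\eta _{1}}}\,\beta \eta _{1}+\sqrt{\tfrac{\beta }{\alpha }\tfrac{\eta _{1}}{\eta _{2}}}\,\alpha \eta _{2}=-\sqrt{\alpha \beta \eta _{1}\eta _{2}}+\sqrt{\alpha \beta \eta _{1}\eta _{2}}=0, \end{aligned}$$

and the second row is the negative of the first, so it vanishes as well.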

Summarizing, the tilted evolution equation for the density c has the form

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t}\begin{pmatrix}c_{1}\\ c_{2} \end{pmatrix}=\mathrm {div}\left( \begin{pmatrix}\delta _{1}\nabla c_{1}\\ \delta _{2}\nabla c_{2} \end{pmatrix}+\begin{pmatrix}\delta _{1}c_{1}\nabla V_{1}\\ \delta _{2}c_{2}\nabla V_{2} \end{pmatrix}\right) +\frac{1}{\varepsilon }\begin{pmatrix}-\sqrt{\frac{\alpha }{\beta }}\mathrm {e}^{\frac{V_{1}-V_{2}}{2}} &{} \sqrt{\frac{\beta }{\alpha }}\mathrm {e}^{\frac{V_{2}-V_{1}}{2}}\\ \sqrt{\frac{\alpha }{\beta }}\mathrm {e}^{\frac{V_{1}-V_{2}}{2}} &{} -\sqrt{\frac{\beta }{\alpha }}\mathrm {e}^{\frac{V_{2}-V_{1}}{2}} \end{pmatrix}\begin{pmatrix}c_{1}\\ c_{2} \end{pmatrix}, \end{aligned}$$
(3.3)

which is a linear reaction-drift-diffusion system with space-dependent reaction rates. In the special case without external forcing, \(V=\mathrm {const}\), we recover the linear reaction-diffusion system (1.2). Note that the product of the off-diagonal entries of the reaction matrix (without the prefactor \(1/\varepsilon \)) is identically one. In particular, not every linear reaction-drift-diffusion system with space-dependent reaction rates for two species can be written in the form (3.3), i.e. is induced by the gradient system \((Q,\mathcal {E}^{V},\mathcal {R}_{\varepsilon }^{*}=\mathcal {R}_{\mathrm {diff}}^{*}+\mathcal {R}_{\mathrm {react},\varepsilon }^{*})\).

3.3 The dissipation functional

The dissipation functional \(\mathfrak {D}_{\varepsilon }\) consists of two parts: the velocity part given by the primal dissipation potential \(\mathcal {R}_{\varepsilon }\) and the slope part (sometimes also called Fisher information) \(\mathcal {R}_{\varepsilon }^{*}(\mu ,-\mathrm {D}\mathcal {E}(\mu ))\). Again, all computations are formal and we always assume that the measure \(\mu \) has a Lebesgue density c. The precise functional analytic setting is presented in Sect. 4.

The primal dissipation potential \(\mathcal {R}_{\varepsilon }\), given by the Legendre transform of the dual dissipation potential \(\mathcal {R}_{\varepsilon }^{*}=\mathcal {R}_{\mathrm {diff}}^{*}+\mathcal {R}_{\mathrm {react},\varepsilon }^{*}\) with respect to the second variable, can be computed via inf-convolution of \(\mathcal {R}_{\mathrm {diff}}\) and \(\mathcal {R}_{\mathrm {react},\varepsilon }\). First, we compute both primal dissipation potentials separately.

The primal dissipation potential of the diffusion part can be computed by the Legendre transform of \(\mathcal {R}^*_\mathrm {diff}\)

$$\begin{aligned} \mathcal {R}_\mathrm {diff} (\mu ,v)=\sup _{\xi =(\xi _1,\xi _2)} \left\{ \sum _{j=1}^2\int _\Omega \xi _jv_j-\frac{1}{2}\delta _jc_j|\nabla \xi _j|^2\ \mathrm {d}x\right\} . \end{aligned}$$

The force \(\xi ^*=(\xi _1^*,\xi _2^*)\) at which the supremum is attained is uniquely determined up to a constant and satisfies the following elliptic equation (in the weak sense)

$$\begin{aligned} -\mathrm {div}(\delta _jc_j\nabla \xi _j^*)=v_j\quad \mathrm {in}~\Omega , \quad (\delta _jc_j\nabla \xi _j^*)\cdot \nu =0 \quad \mathrm {on}~\partial \Omega . \end{aligned}$$
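The weak form of this equation already determines the value of the supremum. As a formal sketch (assuming enough regularity to use \(\xi _{j}^{*}\) itself as a test function), testing with \(\phi \) and then choosing \(\phi =\xi _{j}^{*}\) gives

$$\begin{aligned} \int _{\Omega }\phi \,v_{j}\ \mathrm {d}x&=\int _{\Omega }\delta _{j}c_{j}\nabla \xi _{j}^{*}\cdot \nabla \phi \ \mathrm {d}x\quad \text {for all test functions }\phi ,\\ \int _{\Omega }\xi _{j}^{*}v_{j}\ \mathrm {d}x&=\int _{\Omega }\delta _{j}c_{j}|\nabla \xi _{j}^{*}|^{2}\ \mathrm {d}x, \end{aligned}$$

where the boundary term vanishes because of the no-flux condition. Hence the quadratic term in the supremum equals exactly half of the linear term at the maximizer.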

Introducing the (uniquely determined) diffusion flux \(J_j=\delta _jc_j\nabla \xi _j^*\), we have the explicit form of \(\mathcal {R}_\mathrm {diff}\), namely,

$$\begin{aligned} \mathcal {R}_\mathrm {diff} (\mu ,v) = \sum _{j=1}^2\int _\Omega \xi _j^*v_j-\frac{1}{2}\delta _jc_j|\nabla \xi _j^*|^2\ \mathrm {d}x = \frac{1}{2} \sum _{j=1}^2\int _\Omega \xi _j^*v_j\ \mathrm {d}x = \frac{1}{2} \sum _{j=1}^2\int _\Omega \frac{|J_j|^2}{\delta _j c_j}\ \mathrm {d}x, \end{aligned}$$

for positive concentrations \(c_j>0\). To capture also general non-negative concentrations \(c_j\ge 0\), we introduce the following notation: For a convex, lsc. function \(\mathsf {F}:X\rightarrow [0,\infty ]\) on a reflexive and separable Banach space X with Legendre dual \(\mathsf {F}^{*}\), we define the function \(\widetilde{\mathsf {F}}:[0,\infty [\times X\rightarrow [0,\infty ]\) by

$$\begin{aligned} \widetilde{\mathsf {F}}(a,x):=\left( a\,\mathsf {F}^{*}(\cdot )\right) ^{*}(x)={\left\{ \begin{array}{ll} a\,\mathsf {F}\left( \frac{1}{a}\,x\right) &{} \mathrm {for\ }a>0\ ,\\ \chi _{0}(x) &{} \mathrm {for\ }a=0\ . \end{array}\right. } \end{aligned}$$
(3.4)
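As a simple instance of (3.4) (a formal computation for illustration), take \(\mathsf {F}(x)=\frac{1}{2}|x|^{2}\) on \(\mathbb {R}^{d}\), for which \(\mathsf {F}^{*}=\mathsf {F}\). Then, for \(a>0\),

$$\begin{aligned} \left( a\,\mathsf {F}^{*}\right) ^{*}(x)=\sup _{y\in \mathbb {R}^{d}}\left\{ x\cdot y-\frac{a}{2}|y|^{2}\right\} =\frac{|x|^{2}}{2a}=a\,\mathsf {F}\left( \frac{1}{a}\,x\right) , \end{aligned}$$

with the supremum attained at \(y=x/a\). For \(a=0\) the supremum of \(x\cdot y\) over \(y\) is \(0\) if \(x=0\) and \(+\infty \) otherwise, which is \(\chi _{0}(x)\).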

In particular, introducing the quadratic function \(\mathsf {Q}(x)=\frac{1}{2}|x|^{2}\) on \(\mathbb {R}^{d}\), the primal dissipation potential of the diffusion part \(\mathcal {R}_{\mathrm {diff}}^{*}\) is given by

$$\begin{aligned} \mathcal {R}_{\mathrm {diff}}(\mu ,v)=\sum _{j=1}^{2}\int _{\Omega }\widetilde{\mathsf {Q}}\left( \delta _{j}c_{j},J_{j}\right) \mathrm {d}x, \end{aligned}$$

where \(J_{j}\) is, by definition, the unique solution of the equation \(v_{j}+\mathrm {div}J_{j}=0\) with \(J_j\cdot \nu =0\) on \(\partial \Omega \).

A direct computation shows that the primal dissipation potential of the reaction part is

$$\begin{aligned} \mathcal {R}_{\mathrm {react},\varepsilon }(\mu ,b)={\left\{ \begin{array}{ll} \int _{\Omega }\widetilde{\mathsf {C}}\left( \frac{\sqrt{c_{1}c_{2}}}{\varepsilon },b_{2}\right) \mathrm {d}x, &{} \mathrm {for\ }b_{1}+b_{2}=0,\\ \infty &{} \mathrm {for\ }b_{1}+b_{2}\ne 0 \end{array}\right. }\ , \end{aligned}$$

where \(\mathsf {C}=\left( \mathsf {C}^{*}\right) ^{*}\) is the Legendre transform of the cosh-function \(\mathsf {C}^{*}(x)=4\left( \cosh (x/2)-1\right) \). In the following, we use the inequality

$$\begin{aligned} \frac{1}{2}|r|\cdot \log (|r|+1)\le \mathsf {C}(r)\le 2|r|\cdot \log (|r|+1). \end{aligned}$$
(3.5)

In particular, this inequality implies that the Orlicz class for \(A\subset \mathbb {R}^{d}\) given by all functions \(u\in \mathrm {L}^{1}(A)\) such that \(\int _{A}\mathsf {C}(u)\mathrm {d}x<\infty \) is, in fact, a Banach space with the norm \(\Vert u\Vert _{\mathsf {C}}=\sup _{\int _{A}\mathsf {C}^{*}(v)\le 1}\left| \int _{A}uv\ \mathrm {d}x\right| \). In the following the Orlicz space is denoted by \(\mathrm {L}^{\mathsf {C}}(A)\).
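For orientation, the Legendre transform \(\mathsf {C}\) can be written in closed form. The following is a formal computation from the definition of \(\mathsf {C}^{*}\) and is added only as an illustration:

$$\begin{aligned} \mathsf {C}(r)&=\sup _{x\in \mathbb {R}}\left\{ rx-4\left( \cosh (x/2)-1\right) \right\} ,\qquad r=2\sinh (x/2)\ \Leftrightarrow \ x=2\,\mathrm {arsinh}(r/2),\\ \mathsf {C}(r)&=2r\,\mathrm {arsinh}(r/2)-2\sqrt{r^{2}+4}+4. \end{aligned}$$

Hence \(\mathsf {C}(r)\approx \frac{1}{2}r^{2}\) for small \(|r|\) and \(\mathsf {C}(r)\approx 2|r|\log |r|\) for large \(|r|\), in line with the superlinear growth expressed by the bounds above.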

Importantly, the functions \(\widetilde{\mathsf {Q}}\), \(\widetilde{\mathsf {C}}\) as well as the functionals \(\mathcal {R}_{\mathrm {diff}},\mathcal {R}_{\mathrm {react},\varepsilon }\) are convex on their domains of definition.

The primal dissipation potential \(\mathcal {R}_{\varepsilon }\) is the inf-convolution of \(\mathcal{R}_{\mathrm {diff}}\) and \(\mathcal{R}_{\mathrm {react},\varepsilon }\), and is given by

$$\begin{aligned} \mathcal {R}_{\varepsilon }&(\mu ,v)=\inf _{v=u_{1}+u_{2}}\left\{ \mathcal {R}_{\mathrm {diff}}(\mu ,u_{1})+\mathcal {R}_{\mathrm {react},\varepsilon }(\mu ,u_{2})\right\} \\ =\inf _{J,b}&\left\{ \sum _{j=1}^{2}\int _{\Omega }\widetilde{\mathsf {Q}}\left( \delta _{j}c_{j},J_{j}\right) \mathrm {d}x+\int _{\Omega }\widetilde{\mathsf {C}}\left( \tfrac{\sqrt{c_{1}c_{2}}}{\varepsilon },b_{2}(x)\right) \mathrm {d}x:\left\{ \begin{array}{c} v_{1}=-\mathrm {div}J_{1}+b_{1}\\ v_{2}=-\mathrm {div}J_{2}+b_{2}\\ b_{1}+b_{2}=0 \end{array}\right\} \right\} . \end{aligned}$$

In time-integrated form we get for \(v=\dot{\mu }\) that

$$\begin{aligned} \int _{0}^{T}\mathcal {R}_{\varepsilon }&(\mu ,\dot{\mu })\ \mathrm {d}t=\inf _{v=v_{1}+v_{2}}\int _{0}^{T}\left\{ \mathcal {R}_{\mathrm {diff}}(\mu ,v_{1})+\mathcal {R}_{\mathrm {react},\varepsilon }(\mu ,v_{2})\right\} \mathrm {d}t\\ =\inf _{J,b}&\left\{ \int _{0}^{T}\left\{ \sum _{j=1}^{2}\int _{\Omega }\widetilde{\mathsf {Q}}\left( \delta _{j}c_{j},J_{j}\right) \mathrm {d}x+\int _{\Omega }\widetilde{\mathsf {C}}\left( \tfrac{\sqrt{c_{1}c_{2}}}{\varepsilon },b_{2}(x)\right) \mathrm {d}x\right\} \mathrm {d}t:\ (c,J,b)\in \mathrm {(gCE)}\right\} , \end{aligned}$$

where we introduce the notation of a (linear) generalized continuity equation

$$\begin{aligned} (c,J,b)\in \mathrm {(gCE)}\ \ \Leftrightarrow \ \ \left\{ b_{1}+b_{2}=0\ \mathrm {and}\ \left\{ \begin{array}{c} \dot{c}_{1}=-\mathrm {div}J_{1}+b_{1}\\ \dot{c}_{2}=-\mathrm {div}J_{2}+b_{2} \end{array}\right\} \ \mathrm {and}\ J_j\cdot \nu =0 \ \mathrm {on}\ \partial \Omega \right\} . \end{aligned}$$

Without the reaction part, \(\int _{0}^{T}\!\mathcal{R}_{\varepsilon }\mathrm {d}t\) is the dynamic formulation à la Benamou-Brenier of the Wasserstein distance in Q [4], which can be equivalently written in the form

$$\begin{aligned} \mathcal{W}_{2}(\mu _{0},\mu _{1})^{2}=\inf \left\{ \int _{0}^{1}\int _{\Omega }\sum _{j=1}^{2}\delta _{j}|v_{j}|^{2}\mathrm {d}\mu _{j}:\dot{\mu }_{j}+\mathrm {div}(\mu _{j}v_{j})=0,\ \mu _{j,0}=\mu _{0},\ \mu _{j,1}=\mu _{1}\right\} \end{aligned}$$

expressed in terms of transport velocities \(v_{j}=J_{j}/c_{j}\). The Wasserstein distance can be interpreted as a cost in transporting mass from one measure \(\mu _{0}\) to \(\mu _{1}\). In our situation \(\int _{0}^{T}\!\mathcal{R}_{\varepsilon }\mathrm {d}t\) is jointly convex in c, J and b and corresponds to a modified cost function which also takes the reaction fluxes into account. The optimal diffusion fluxes \(J_{j}\) and reaction fluxes \(b_{j}\) have to satisfy the generalized continuity equation. Note that \(\int _{0}^{T}\mathcal{R}_{\varepsilon }\mathrm {d}t\) does not induce a metric on Q since the reaction part is not quadratic.

Next, we compute the tilted slope part \(\mathcal {R}_{\varepsilon }^{*}(\mu ,-\mathrm {D}\mathcal {E}^{V}(\mu ))\). To do this, we introduce the relative densities \(\rho ^{V}\) of \(\mu \) w.r.t. the stationary measure \(w^{V}\mathrm {d}x\), i.e. \(\rho _{j}^{V}=\frac{\mathrm {d}\mu _j}{w_{j}^{V}\mathrm {d}x}=\frac{c_{j}}{w_{j}^{V}}\), where by (3.2) the stationary measure is \(w_{j}^{V}=\frac{1}{Z}w_{j}\mathrm {e}^{-V_{j}}\). Since \(V\in \mathrm {C}^{1}(\Omega ,\mathbb {R}^{2})\) and \(\Omega \subset \mathbb {R}^{d}\) is compact, \(\mu \) is absolutely continuous w.r.t. the Lebesgue measure \(\mathrm {d}x\) if and only if it is absolutely continuous w.r.t. the stationary measure \(w^{V}\mathrm {d}x\). Inserting \(\xi =-\mathrm {D}\mathcal {E}^{V}(\mu )=-(\log (c_{i}/w_{i})+V_{i})_{i=1,2}\) into the dual dissipation potential \(\mathcal {R}_{\varepsilon }^{*}\), we get for the diffusive part

$$\begin{aligned} \mathcal {R}_{\mathrm {diff}}^{*}(\mu ,-\mathrm {D}\mathcal {E}^{V}(\mu ))=\frac{1}{2}\int _{\Omega }\sum _{j=1}^{2}\delta _{j}c_{j}|\nabla \left( \log c_{j}/w_{j}+V_{j}\right) |^{2}\mathrm {d}x. \end{aligned}$$

Using \(w_{j}^{V}=\frac{1}{Z}w_{j}\mathrm {e}^{-V_{j}}\), a short calculation shows \(\delta _{j}c_{j}\left| \nabla \left( \log c_{j}/w_{j}+V_{j}\right) \right| ^{2}=\delta _{j}w_{j}^{V}\frac{\left| \nabla \rho _{j}^{V}\right| ^{2}}{\rho _{j}^{V}}.\) Hence, we have

$$\begin{aligned} \mathcal {R}_{\mathrm {diff}}^{*}(\mu ,-\mathrm {D}\mathcal {E}^{V}(\mu ))=\frac{1}{2}\int _{\Omega }\sum _{j=1}^{2}\delta _{j}w_{j}^{V}\frac{\left| \nabla \rho _{j}^{V}\right| ^{2}}{\rho _{j}^{V}}\mathrm {d}x. \end{aligned}$$
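The short calculation behind this formula can be sketched as follows (formally, for positive densities). Since \(\log \rho _{j}^{V}=\log (c_{j}/w_{j})+V_{j}+\log Z\) and \(Z\) is a constant, the gradients of \(\log \rho _{j}^{V}\) and of \(\log (c_{j}/w_{j})+V_{j}\) coincide, and with \(c_{j}=w_{j}^{V}\rho _{j}^{V}\) we obtain

$$\begin{aligned} \delta _{j}c_{j}\left| \nabla \log \rho _{j}^{V}\right| ^{2}=\delta _{j}w_{j}^{V}\rho _{j}^{V}\frac{\left| \nabla \rho _{j}^{V}\right| ^{2}}{\left( \rho _{j}^{V}\right) ^{2}}=\delta _{j}w_{j}^{V}\frac{\left| \nabla \rho _{j}^{V}\right| ^{2}}{\rho _{j}^{V}}=4\,\delta _{j}w_{j}^{V}\left| \nabla \sqrt{\rho _{j}^{V}}\right| ^{2}. \end{aligned}$$

The last expression is the usual Fisher information form and remains meaningful where \(\rho _{j}^{V}\) vanishes.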

For the reaction part, we use the identity \(\mathsf {C}^{*}(\log p-\log q)=2\frac{\left( \sqrt{p}-\sqrt{q}\right) ^{2}}{\sqrt{pq}}\) and get

$$\begin{aligned} \mathcal {R}_{\mathrm {react},\varepsilon }^{*}(\mu ,-\mathrm {D}\mathcal {E}^{V}(\mu ))&= 2 \int _{\Omega }\frac{1}{\varepsilon }\sqrt{c_{1}c_{2}}\frac{\left( \sqrt{c_{1}/(\eta _{1}w_{1})}-\sqrt{c_{2}/(\eta _{2}w_{2})}\right) ^{2}}{\sqrt{c_{1}c_{2}/(\eta _{1}w_{1}\eta _{2}w_{2})}}\,\mathrm {d}x\\&=\frac{2}{\varepsilon }\int _{\Omega }\sqrt{w_{1}^{V}w_{2}^{V}}\left( \sqrt{\rho _{1}^{V}}-\sqrt{\rho _{2}^{V}}\right) ^{2}\mathrm {d}x. \end{aligned}$$
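The identity for the cosh-function used in this step follows from a direct computation (a formal check for \(p,q>0\)):

$$\begin{aligned} \mathsf {C}^{*}(\log p-\log q)&=4\left( \cosh \left( \tfrac{1}{2}\log (p/q)\right) -1\right) =4\left( \frac{\sqrt{p/q}+\sqrt{q/p}}{2}-1\right) \\&=2\,\frac{p+q-2\sqrt{pq}}{\sqrt{pq}}=2\,\frac{\left( \sqrt{p}-\sqrt{q}\right) ^{2}}{\sqrt{pq}}. \end{aligned}$$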

4 EDP-convergence result

In this section we state the EDP-convergence result for the gradient systems \((Q,\mathcal {E},\mathcal {R}_{\varepsilon }^{*})\) to \((Q,\mathcal {E},\mathcal {R}_{\mathrm {eff}}^{*})\) and discuss the properties of the effective gradient system \((Q,\mathcal {E},\mathcal {R}_{\mathrm {eff}}^{*})\). Since the energy \(\mathcal {E}\) is \(\varepsilon \)-independent, the major challenge is to prove \(\Gamma \)-convergence of the dissipation functional \(\mathfrak {D}_{\varepsilon }^{V}\), which is a functional defined on the space of trajectories in the state space Q. To be mathematically precise, we first fix the functional analytic setting.

The state space \(Q=\mathrm {Prob}(\Omega \times \left\{ 1,2\right\} )\) is equipped with the p-Wasserstein distance \(d_{\mathcal {W}_{p}}\), where in our situation either \(p=1\) or \(p=2\). Recall that for any compact Euclidean subspace \(E\subset \mathbb {R}^{k}\) the p-Wasserstein distance is defined on the space of probability measures \(\mathrm {Prob}(E)\) by

$$\begin{aligned} {\mathcal {W}}_p(\mu ^{1},\mu ^{2})^{p}=\min _{\gamma \in \Gamma (\mu ^{1},\mu ^{2})}\int _{E\times E}|x-y|^{p}\,\mathrm {d}\gamma (x,y), \end{aligned}$$

where \(\Gamma (\mu ^{1},\mu ^{2})\) is the set of all transport plans with marginals \(\mu ^{1}\) and \(\mu ^{2}\) (see e.g. [1]). The p-Wasserstein distance \({\mathcal {W}_{p}}\) metrizes the weak*-topology of measures, i.e. convergence tested against continuous functions on E. In the following we will consider either \(E=\Omega \) or \(E=\Omega \times \left\{ 1,2\right\} \).

To define the topology in the space of trajectories on Q, we start very coarse, where we understand the trajectories on Q as measures in space and time. We denote the space of trajectories by \(\mathrm {L}_{w}^{\infty }([0,T],Q)\) equipped with the weak*-measurability. The weak*-convergence is defined as usual by

$$\begin{aligned} \mu ^{\varepsilon }(\cdot )\rightarrow \mu ^{0}(\cdot )\ \Leftrightarrow \ \ \forall i\in \left\{ 1,2\right\} ,\,\forall \phi \in \mathrm {C}^{\infty }(\Omega \times [0,T]):\ \int _{0}^{T}\!\!\!\int _{\Omega }\phi \mathrm {d}\mu _{i}^{\varepsilon }\mathrm {d}t\rightarrow \int _{0}^{T}\!\!\!\int _{\Omega }\phi \mathrm {d}\mu _{i}^{0}\mathrm {d}t. \end{aligned}$$
(4.1)

A finer topology, which allows us to prove compactness and to evaluate the effective dissipation functional, is then given by the a priori bounds

$$\begin{aligned} \sup _{\varepsilon \in ]0,1]}\mathfrak {D}_{\varepsilon }^{V}(\mu ^{\varepsilon })\le C,\quad \sup _{\varepsilon \in ]0,1]}\underset{t\in [0,T]}{\mathrm {ess\ sup}}\ \mathcal {E}(\mu ^{\varepsilon }(t))\le C. \end{aligned}$$
(4.2)

In fact, as presented in Sect. 5.1, these bounds ensure that the measures \(\mu ^{\varepsilon }\) have Lebesgue densities \(c^{\varepsilon }\) which converge strongly in \(\mathrm {L}^{1}([0,T]\times \Omega ,\mathbb {R}_{\ge 0}^{2})\). Moreover, the limiting coarse-grained measure \(\hat{\mu }^{0}=\mu _{1}^{0}+\mu _{2}^{0}\) has a representative which is absolutely continuous in time with values in \(\left( \mathrm {Prob}(\Omega ),{\mathcal{W}_2}\right) \), i.e. there is a function \(m\in \mathrm {L}^{1}([0,T])\) such that for all \(t,s\in [0,T]\) with \(s\le t\) we have

$$\begin{aligned} {\mathcal{W}_{2}}\left( \hat{\mu }^{0}(s),\hat{\mu }^{0}(t)\right) \le \int _{s}^{t}m(r)~\mathrm {d}r. \end{aligned}$$

Each component \(\mu _{i}^{0}\), \(i=1,2\), is not a trajectory in the space of probability measures, but in the space of non-negative Radon measures. Proposition 5.11 shows that \(\mu _{i}^{0}\) is absolutely continuous in time with values in \((\mathcal{M}_{+}(\Omega ),{\mathcal{W}_1})\), exploiting the dual formulation of the 1-Wasserstein distance (see e.g. [15]). This compactness result is comparable to the result of Bothe and Hilhorst [6], where strong convergence of solutions \(c=(c_{1},c_{2})\) is also proved. In particular, similar to the space-independent situation in [30, 34, 41], one cannot guarantee that \(\mu ^{\varepsilon }(t)\rightarrow \mu ^{0}(t)\) in Q for all times \(t\in [0,T]\), as jumps in time cannot be excluded. Instead, the limit \(\mu ^{0}=c^{0}\,\mathrm {d}x\) has an absolutely continuous representative.

4.1 Main theorem

Let us state our main EDP-convergence result. To do this, we define for \(V\in \mathrm {C}^{1}(\Omega ,\mathbb {R}^{2})\) the total dissipation functional on \(\mathrm {L}_{w}^{\infty }([0,T],Q)\) as

$$\begin{aligned} \mathfrak {D}_{\varepsilon }^{V}(\mu )&={\left\{ \begin{array}{ll} \int _{0}^{T}\left\{ \mathcal {R}_{\varepsilon }(\mu ,\dot{\mu })+\mathcal {R}_{\varepsilon }^{*}(\mu ,-\!\mathrm {D}\mathcal {E}^{V}(\mu ))\right\} \mathrm {d}t, &{} \mu \in \mathrm {AC}([0,T],Q),\mu =c\,\mathrm {d}x\ \mathrm {a.e.\,in}\,[0,T]\\ \infty &{} \mathrm {otherwise}. \end{array}\right. } \end{aligned}$$

If \(\mu =c\,\mathrm {d}x\) a.e. in [0, T], then the dissipation functional is given by

$$\begin{aligned}&\int _{0}^{T}\mathcal {R}_{\varepsilon }(\mu ,\dot{\mu })+\mathcal {R}_{\varepsilon }^{*}(\mu ,-V-\mathrm {D}\mathcal {E}(\mu ))\mathrm {d}t\nonumber \\&\quad =\inf _{(c,J,b)\in \mathrm {(gCE)}}\left\{ \int _{0}^{T}\left\{ \int _{\Omega }\sum _{j=1}^{2}\widetilde{\mathsf {Q}}(\delta _{j}c_{j},J_{j})\mathrm {d}x+\int _{\Omega }\widetilde{\mathsf {C}}\left( \frac{\sqrt{c_{1}c_{2}}}{\varepsilon },b_{2}(x)\right) \mathrm {d}x\right\} \ \mathrm {d}t\right\} +\nonumber \\&\qquad +\int _{0}^{T}\left\{ \frac{1}{2}\int _{\Omega }\sum _{j=1}^{2}\delta _{j}w_{j}^{V}\frac{\left| \nabla \rho _{j}^{V}\right| ^{2}}{\rho _{j}^{V}}\mathrm {d}x+\frac{2}{\varepsilon }\int _{\Omega }\sqrt{w_{1}^{V}w_{2}^{V}}\left( \sqrt{\rho _{1}^{V}}-\sqrt{\rho _{2}^{V}}\right) ^{2}\mathrm {d}x\ \right\} \ \mathrm {d}t, \end{aligned}$$
(4.3)

where the infimum is taken over all Borel fluxes \(J_{j}\in \mathcal{M}([0,T]\times \Omega ,\mathbb {R}^{d}),b_{j}\in \mathcal{M}([0,T]\times \Omega ,\mathbb {R})\) which satisfy the generalized continuity equation (gCE) in the sense of distributions, i.e.

$$\begin{aligned} \forall j&=1,2\ \forall \phi \in \mathrm {C}^{\infty }_c(]0,T[\times \Omega ) :\int _{0}^{T}\!\!\int _{\Omega }\left( \dot{\phi }c_{j}+\nabla \phi \cdot J_{j}\right) \mathrm {d}x\,\mathrm {d}t=-\int _{0}^{T}\!\!\int _{\Omega }b_{j}\phi \,\mathrm {d}x\,\mathrm {d}t,\\&\ J_j\cdot \nu =0\mathrm {\ \ on}\ \partial \Omega . \end{aligned}$$

Strictly speaking, the functions \(\widetilde{\mathsf {Q}}\) and \(\widetilde{\mathsf {C}}\) are not defined for measures \(J_{j}\),\(b_{j}\) and hence, the formula for the dissipation functional (4.3) is a priori not correct. In fact, introducing the related functional as in Lemma 5.2, the dissipation functional can be expressed via the Lebesgue densities of \(J_{j},b_{j}\). By the a priori bounds (4.2), these densities are in \(\mathrm {L}^{1}([0,T]\times \Omega )\) as Lemma 5.1 shows. For notational convenience, we identify the measures with their Lebesgue densities and stick to the above expression (4.3). Only the Lebesgue density of \(\mu \) is denoted by a different letter (namely c). Only in Lemma 5.7, we, in fact, show compactness for the sequence of measures \(J_{i}^{\varepsilon }\in \mathcal {M}(\Omega )\), which are the diffusion fluxes that provide the minimum of \(\mathfrak {D}_\varepsilon ^V(\mu ^\varepsilon )\) in (4.3) (see Sect. 5.1).

The main result is the \(\Gamma \)-convergence of \(\mathfrak {D}_{\varepsilon }^{V}\) to the effective dissipation functional \(\mathfrak {D}_{0}^{V}\) which is defined by

$$\begin{aligned} \mathfrak {D}_{0}^{V}(\mu )={\left\{ \begin{array}{ll} \int _{0}^{T}\mathcal {R}_{\mathrm {eff}}(\mu ,\dot{\mu })+\mathcal {R}_{\mathrm {eff}}^{*}(\mu ,-\mathrm {D}\mathcal {E}^{V}(\mu ))\mathrm {d}t, &{} \mathrm {if~}\mu \in \mathrm {AC}([0,T],Q),\mu =c\,\mathrm {d}x\ \mathrm {a.e.\,in}\,[0,T]\\ ~~\infty &{} \mathrm {otherwise}. \end{array}\right. } \end{aligned}$$

where

$$\begin{aligned} \mathcal {R}_{\mathrm {eff}}^{*}(\mu ,\xi )=\mathcal {R}_{\mathrm {diff}}^{*}(\mu ,\xi )+\chi _{\left\{ \xi _{1}=\xi _{2}\right\} }(\xi ),\ \ \mathcal {R}_{\mathrm {eff}}(\mu ,v)=\left( \mathcal {R}_{\mathrm {eff}}^{*}(\mu ,\cdot )\right) ^{*}(v). \end{aligned}$$
(4.4)

Theorem 4.1

Let \(V\in \mathrm {C}^{1}(\Omega ,\mathbb {R}^{2})\). On \(\mathrm {L}_{w}^{\infty }([0,T],Q)\), we have \(\Gamma \)-convergence conditioned to bounded energies of \(\mathfrak {D}_{\varepsilon }^{V}\), i.e. \(\mathfrak {D}_{\varepsilon }^{V}\xrightarrow {\mathrm{{M}}_{E}}\mathfrak {D}_{0}^{V}\) where

$$\begin{aligned} \mathfrak {D}_{0}^{V}(\mu )={\left\{ \begin{array}{ll} \int _{0}^{T}\mathcal {R}_{\mathrm {eff}}(\mu ,\dot{\mu })+\mathcal {R}_{\mathrm {eff}}^{*}(\mu ,-V-\mathrm {D}\mathcal {E}(\mu ))\mathrm {d}t, &{} \mu \in \mathrm {AC}([0,T],Q),\mu =c\,\mathrm {d}x\ \mathrm {a.e.}\ [0,T]\\ ~~\infty &{} \mathrm {otherwise} \end{array}\right. } \end{aligned}$$
(4.5)

with

$$\begin{aligned} \mathcal {R}_{\mathrm {eff}}^{*}(\mu ,\xi )&=\mathcal {R}_{\mathrm {diff}}^{*}(\mu ,\xi )+\chi _{\left\{ \xi _{1}=\xi _{2}\right\} }(\xi ),\\ \mathcal {R}_{\mathrm {eff}}(\mu ,v)&=\inf _{u+\tilde{u}=v}\left\{ \mathcal {R}_{\mathrm {diff}}(\mu ,\tilde{u})+\chi _{0}(u_{1}+u_{2})\right\} \\&=\inf \left\{ \sum _{j=1}^{2}\int _{\Omega }\widetilde{\mathsf {Q}}(\delta _{j}c_{j},J_{j})\mathrm {d}x:u_{1}+u_{2}=0,\left\{ \begin{array}{c} v_{1}=-\mathrm {div}J_{1}+u_{1}\\ v_{2}=-\mathrm {div}J_{2}+u_{2} \end{array}\right\} \right\} \ . \end{aligned}$$

The theorem states that the limit dissipation functional is again of \(\mathcal {R}\oplus \mathcal {R}^{*}\)-form with an effective dissipation potential \(\mathcal {R}_{\mathrm {eff}}^{*}\). The effective dissipation potential \(\mathcal {R}_{\mathrm {eff}}^{*}\) consists again of two terms describing the diffusion and a coupling, which forces the chemical potential \(-\mathrm {D}\mathcal {E}^{V}\) to equilibration providing the microscopic equilibria of the densities \(\rho ^{V}\) and defining the slow manifold of the evolution

$$\begin{aligned} \rho ^V_1=\rho ^V_2 \quad \Leftrightarrow \quad \frac{c_1}{\beta \mathrm {e}^{-V_1}} = \frac{c_2}{\alpha \mathrm {e}^{-V_2}}\ . \end{aligned}$$

Since the effective dissipation potential \(\mathcal {R}_{\mathrm {eff}}^{*}\) is independent of the tilt \(V\), Theorem 4.1 immediately implies that the family of gradient systems \((Q,\mathcal {E},\mathcal {R}_{\varepsilon }^{*})\) EDP-converges with tilting to the effective gradient system \((Q,\mathcal {E},\mathcal {R}_{\mathrm {eff}}^{*})\).

Theorem 4.2

Let \(\mathcal {R}_{\mathrm {eff}}^{*}\) be defined by (4.4). Then the gradient system \((Q,\mathcal {E},\mathcal {R}_{\varepsilon }^{*})\) EDP-converges with tilting to \((Q,\mathcal {E},\mathcal {R}_{\mathrm {eff}}^{*})\).

In Sect. 5, we present the detailed proof of the \(\Gamma \)-convergence result. In the remainder of this section, we discuss the effective gradient system and its induced gradient flow equation.

4.2 Effective gradient flow equation

Similar to the space-independent situation in [30, 34] the limit gradient system can be understood in two different but mathematically equivalent ways: as a gradient system on Q, and also in the coarse-grained space of slow variables \(\hat{Q}\). On Q, the effective gradient flow equation is given with Lagrange multipliers ensuring the projection on the slow manifold. On the coarse-grained space \(\hat{Q}\), we obtain a scalar diffusion equation with a new mixed diffusion coefficient. Throughout the section the potential \(V\in \mathrm {C}^{1}(\Omega ,\mathbb {R}^{2})\) is fixed.

4.2.1 Gradient flow equation with Lagrange multipliers

For brevity, the calculations in this section are rather formal. The effective dissipation potential \(\mathcal {R}_{\mathrm {eff}}^{*}=\mathcal {R}_{\mathrm {diff}}^{*}+\chi _{\left\{ \xi _{1}=\xi _{2}\right\} }\) consists of two parts: the first describes the dissipation of the evolution, and the second provides the linear constraint of being on the slow manifold together with the corresponding Lagrange multiplier. The evolution equation is given by

$$\begin{aligned} \dot{\mu }\in \partial _{\xi }\mathcal {R}_{\mathrm {eff}}^{*}(\mu ,-V-\mathrm {D}\mathcal {E}(\mu ))=\partial _{\xi }\left\{ \mathcal {R}_{\mathrm {diff}}^{*}(\mu ,-V-\mathrm {D}\mathcal {E}(\mu ))+\chi _{\left\{ \xi _{1}=\xi _{2}\right\} }(-V-\mathrm {D}\mathcal {E}(\mu ))\right\} . \end{aligned}$$

Following [16], the subdifferential of a sum is given by the sum of the subdifferentials, if one term is continuous. For the second term, the subdifferential of the characteristic function is only defined in its domain, i.e. if

$$\begin{aligned} -V_{1}-\mathrm {D}\mathcal {E}(\mu )_{1}=-V_{2}-\mathrm {D}\mathcal {E}(\mu )_{2}\quad \Leftrightarrow \quad \frac{c_{1}}{\beta \mathrm {e}^{-V_{1}}}=\frac{c_{2}}{\alpha \mathrm {e}^{-V_{2}}}, \end{aligned}$$

hence, defining the linear slow manifold. Moreover, on its domain we have for the sub-differential that \(\partial \chi _{\left\{ \xi _{1}=\xi _{2}\right\} }=\mathcal{M}(\Omega )\begin{pmatrix}1\\ -1 \end{pmatrix}\). Hence, we conclude that

$$\begin{aligned} \dot{\mu }&\in \partial _{\xi }\left\{ \mathcal {R}_{\mathrm {diff}}^{*}(\mu ,-V-\mathrm {D}\mathcal {E}(\mu ))\right\} +\mathcal{M}(\Omega )\begin{pmatrix}1\\ -1 \end{pmatrix},\quad \frac{c_{1}}{\beta \mathrm {e}^{-V_{1}}}=\frac{c_{2}}{\alpha \mathrm {e}^{-V_{2}}}\ , \end{aligned}$$

which implies the gradient flow equation on the slow manifold with a Lagrange multiplier \(\lambda =(\lambda _{1},\lambda _{2})\) for the densities of the form

$$\begin{aligned} {\left\{ \begin{array}{ll} \dot{c_{1}}=\mathrm {div}\left\{ \delta _{1}\nabla c_{1}+\delta _{1}c_{1}\nabla V_{1}\right\} +\lambda _{1}\\ \dot{c_{2}}=\mathrm {div}\left\{ \delta _{2}\nabla c_{2}+\delta _{2}c_{2}\nabla V_{2}\right\} +\lambda _{2} \end{array}\right. },\quad \lambda _{1}+\lambda _{2}=0,\quad \frac{c_{1}}{\beta \mathrm {e}^{-V_{1}}}=\frac{c_{2}}{\alpha \mathrm {e}^{-V_{2}}}\ . \end{aligned}$$
(4.6)
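To illustrate how the Lagrange multipliers are eliminated, the two equations in (4.6) can be added. The following is a formal sketch under the assumption that the untilted weights \(w_{j}\) are constant in space, so that \(\nabla w_{j}^{V}=-w_{j}^{V}\nabla V_{j}\); here \(\rho :=c_{1}/w_{1}^{V}=c_{2}/w_{2}^{V}\) denotes the common relative density on the slow manifold (this symbol is introduced only for this sketch). Using \(c_{j}=\rho \,w_{j}^{V}\) and \(\lambda _{1}+\lambda _{2}=0\), we get

$$\begin{aligned} \delta _{j}\left( \nabla c_{j}+c_{j}\nabla V_{j}\right)&=\delta _{j}\left( w_{j}^{V}\nabla \rho +\rho \nabla w_{j}^{V}+\rho \,w_{j}^{V}\nabla V_{j}\right) =\delta _{j}w_{j}^{V}\nabla \rho ,\\ \frac{\mathrm {d}}{\mathrm {d}t}\left( c_{1}+c_{2}\right)&=\mathrm {div}\left( \left( \delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}\right) \nabla \rho \right) . \end{aligned}$$

This is a scalar equation for the total concentration \(c_{1}+c_{2}=\rho \,(w_{1}^{V}+w_{2}^{V})\), in which the two diffusion coefficients enter only through a weighted mean.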

4.2.2 Coarse-grained gradient structure and its gradient flow equation

We introduce the coarse-grained probability measure \(\hat{\mu }=\mu _{1}+\mu _{2}\) on \(\Omega \) and the corresponding concentration \(\hat{c}:=c_{1}+c_{2}\). Moreover, we define the equilibrated densities \(\hat{\rho }^{V}=\rho _{1}^{V}=\rho _{2}^{V}\) and the coarse-grained stationary measure \(\hat{w}^{V}=w_{1}^{V}+w_{2}^{V}\), for which we get \(\hat{c}=\rho _{1}^{V}w_{1}^{V}+\rho _{2}^{V}w_{2}^{V}=\hat{\rho }^{V}(w_{1}^{V}+w_{2}^{V})=\hat{\rho }^{V}\hat{w}^{V}\). We also introduce the coarse-grained diffusion coefficient \(\hat{\delta }^{V}=\frac{\delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}}{w_{1}^{V}+w_{2}^{V}}\).

With this notation, we may define the coarse-grained gradient structure \((\hat{Q},\hat{\mathcal {E}},\hat{\mathcal {R}}^{*})\). On the state space \(\hat{Q}=\mathrm {Prob}(\Omega )\), we define

$$\begin{aligned} \hat{\mathcal {R}}^{*}(\hat{\mu },\hat{\xi })&:=\mathcal {R}_{\mathrm {eff}}^{*}\left( \left( \frac{w_{1}^{V}}{w_{1}^{V}+w_{2}^{V}}\hat{\mu },\frac{w_{2}^{V}}{w_{1}^{V}+w_{2}^{V}}\hat{\mu }\right) ,(\hat{\xi },\hat{\xi })\right) =\frac{1}{2}\int _{\Omega }\hat{\delta }^{V}|\nabla \hat{\xi }|^{2}\mathrm {d}\hat{\mu }\ ,\nonumber \\ \hat{\mathcal {E}}(\hat{\mu })&:=\mathcal {E}^{V}\left( \frac{w_{1}^{V}}{w_{1}^{V}+w_{2}^{V}}\hat{\mu },\frac{w_{2}^{V}}{w_{1}^{V}+w_{2}^{V}}\hat{\mu }\right) . \end{aligned}$$
(4.7)
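The explicit form of \(\hat{\mathcal {R}}^{*}\) in (4.7) follows by inserting the definitions (a formal one-line check): the constraint term vanishes because both components of the force equal \(\hat{\xi }\), and the diffusion part gives

$$\begin{aligned} \mathcal {R}_{\mathrm {diff}}^{*}\left( \left( \tfrac{w_{1}^{V}}{\hat{w}^{V}}\hat{\mu },\tfrac{w_{2}^{V}}{\hat{w}^{V}}\hat{\mu }\right) ,(\hat{\xi },\hat{\xi })\right) =\frac{1}{2}\int _{\Omega }\sum _{j=1}^{2}\delta _{j}\frac{w_{j}^{V}}{w_{1}^{V}+w_{2}^{V}}|\nabla \hat{\xi }|^{2}\mathrm {d}\hat{\mu }=\frac{1}{2}\int _{\Omega }\hat{\delta }^{V}|\nabla \hat{\xi }|^{2}\mathrm {d}\hat{\mu }. \end{aligned}$$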

We introduce the coarse-grained potential \(\hat{V}=-\log (w_{1}^{V}+w_{2}^{V})-\log Z=-\log (\hat{w}^{V})-\log Z=-\log \left( w_{1}\mathrm {e}^{-V_{1}}+w_{2}\mathrm {e}^{-V_{2}}\right) \), whose exponential is the weighted arithmetic mean of the exponentials \(\mathrm {e}^{-V_{1}}\) and \(\mathrm {e}^{-V_{2}}\), i.e. \(\mathrm {e}^{-\hat{V}}=w_{1}\mathrm {e}^{-V_{1}}+w_{2}\mathrm {e}^{-V_{2}}\) (we used \(w_{1}+w_{2}=1\)). An easy calculation shows that the energy has the explicit form

$$\begin{aligned} \hat{\mathcal {E}}(\hat{\mu })=\int _{\Omega }\left( {\log }\hat{\mu }+\hat{V}\right) \mathrm {d}{\hat{\mu }}. \end{aligned}$$

The coarse-grained dissipation functional is defined by

$$\begin{aligned} \hat{\mathfrak {D}}(\hat{\mu })=\int _{0}^{T}\left\{ \hat{\mathcal {R}}(\hat{\mu },\dot{\hat{\mu }})+\hat{\mathcal {R}}^{*}(\hat{\mu },-\mathrm {D}\hat{\mathcal {E}}(\hat{\mu }))\right\} \mathrm {d}t, \end{aligned}$$

which incorporates the tilt via the coarse-grained variables. Note that the coarse-grained dissipation potential \(\hat{\mathcal {R}}^{*}\) depends explicitly on the tilt V via the diffusion coefficient \(\hat{\delta }^{V}\). This does not contradict tilt-EDP convergence (Theorem 4.2), because in the original variables the effective dissipation potential (4.5) is indeed independent of the tilts. The tilt dependence of \(\hat{\mathcal {R}}^{*}\) originates from the energy-dependent and tilt-dependent slow manifold.

To relate the dissipation functional \(\mathfrak {D}_{0}^{V}\) to the coarse-grained dissipation functional \(\hat{\mathfrak {D}}\), we first show that the fluxes equilibrate as well. For this, the following convexity property is important.

Lemma 4.3

Let X be a separable and reflexive Banach space and let \(\mathsf {F}:X\rightarrow \mathbb {R}_{\infty }\) be convex and lsc. Let the function \(\widetilde{\mathsf {F}}:[0,\infty [\times X\rightarrow \mathbb {R}_{\infty }\) be defined as in (3.4). Then, we have

$$\begin{aligned} \widetilde{\mathsf {F}}\left( \sum _{i=1}^{I}a_{i},\sum _{i=1}^{I}x_{i}\right) \le \sum _{i=1}^{I}\widetilde{\mathsf {F}}(a_{i},x_{i}). \end{aligned}$$

If \(\mathsf {F}\) is strictly convex, then equality holds if and only if \((a_{i},x_{i})=(0,0)\) whenever \(a_{i}=0\), and \(x_{i}/a_{i}=x_{j}/a_{j}\) whenever \(a_{i},a_{j}>0\). Moreover, if \(\mathsf {F}(0)=0\), we have the following monotonicity property

$$\begin{aligned} \widetilde{\mathsf {F}}(a_{1},x)\le \widetilde{\mathsf {F}}(a_{2},x),\ \ \mathrm {if}\ a_{1}\ge a_{2}. \end{aligned}$$

Proof

Let pairs \((a_{i},x_{i})\) for \(i=1,\dots ,I\) be given. If \(a_{i}=0\) for some i, then either \(x_{i}=0\) and the claim has to be shown for the remaining \(I-1\) pairs, or \(x_{i}\ne 0\) and the right-hand side is infinite, so that the claim is trivial. So let us assume that \(a_{i}>0\) for all \(i=1,\dots ,I\). Then \(\widetilde{\mathsf {F}}(a_{i},x_{i})=a_{i}\mathsf {F}(x_{i}/a_{i})\) and the claim is equivalent to

$$\begin{aligned} \sum _{i=1}^{I}\frac{a_{i}}{\sum _{k=1}^{I}a_{k}}\mathsf {F}(x_{i}/a_{i})\ge \mathsf {F}\left( \sum _{i=1}^{I}\frac{a_{i}}{\sum _{k=1}^{I}a_{k}}\frac{x_{i}}{a_{i}}\right) =\mathsf {F}\left( \frac{\sum _{i=1}^{I}x_{i}}{\sum _{i=1}^{I}a_{i}}\right) , \end{aligned}$$

which holds since \(\mathsf {F}\) is convex. If \(\mathsf {F}\) is strictly convex, then equality in this Jensen-type inequality forces \(\tfrac{x_{i}}{a_{i}}=\tfrac{x_{j}}{a_{j}}\) whenever \(a_{i},a_{j}>0\), and, by the first case, \(x_{i}=0\) whenever \(a_{i}=0\).

To see the monotonicity property, we observe that \(\widetilde{\mathsf {F}}(a,0)=0\) for all \(a\ge 0\). Hence, we have

$$\begin{aligned} \widetilde{\mathsf {F}}(a_{1}+a_{2},x)\le \widetilde{\mathsf {F}}(a_{1},x)+\widetilde{\mathsf {F}}(a_{2},0)=\widetilde{\mathsf {F}}(a_{1},x), \end{aligned}$$

which proves the claim. \(\square \)
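
To illustrate the lemma in the case used below, take the quadratic function \(\mathsf {F}(x)=\tfrac{1}{2}|x|^{2}\) (we assume here, for illustration, that this is the form of \(\mathsf {Q}\) up to normalization), so that \(\widetilde{\mathsf {F}}(a,x)=\tfrac{|x|^{2}}{2a}\) for \(a>0\). For two pairs with \(a_{1},a_{2}>0\) the defect in the subadditivity estimate can be computed explicitly:

$$\begin{aligned} \frac{|x_{1}|^{2}}{a_{1}}+\frac{|x_{2}|^{2}}{a_{2}}-\frac{|x_{1}+x_{2}|^{2}}{a_{1}+a_{2}}=\frac{|a_{2}x_{1}-a_{1}x_{2}|^{2}}{a_{1}a_{2}(a_{1}+a_{2})}\ge 0. \end{aligned}$$

The defect vanishes exactly when \(x_{1}/a_{1}=x_{2}/a_{2}\), in agreement with the equality condition for strictly convex \(\mathsf {F}\). Choosing \(x_{1}=x\) and \(x_{2}=0\) gives \(\tfrac{|x|^{2}}{a_{1}+a_{2}}\le \tfrac{|x|^{2}}{a_{1}}\), which is the monotonicity property in this special case.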

Recalling formula (4.5) of the effective dissipation functional \(\mathfrak {D}_{0}^{V}\) and using the above lemma, we observe that the velocity part of the dissipation functional \(\mathfrak {D}_{0}^{V}\) can now be estimated. In particular, we will see that the limit dissipation functional \(\mathfrak {D}_{0}^{V}\) can be equivalently expressed in coarse-grained variables \((\hat{\mu },\hat{J})\) by using that an equilibration of concentrations also provides an equilibration of the corresponding fluxes. In the reconstruction strategy in Sect. 5.3 this equilibration is explicitly used (see (4.11) and (5.3)).

Proposition 4.4

Let \(\mu \in \mathrm {AC}([0,T],Q)\) with \(\mathfrak {D}_{0}^{V}(\mu )<\infty \) and \(\mathrm {ess\,\sup }_{t\in [0,T]}\mathcal {E}(\mu (t))<\infty \). Then the following holds:

  1.

    We have \(\mathfrak {D}_{0}^{V}(\mu )=\hat{\mathfrak {D}}(\hat{\mu })\) where \(\hat{\mu }=\mu _{1}+\mu _{2}\) and

    $$\begin{aligned} \hat{{\mathfrak {D}}}(\hat{\mu })&=\int _{0}^{T}\hat{\mathcal {R}}(\hat{\mu },\dot{\hat{\mu }})+\hat{\mathcal {R}}^{*}(\hat{\mu },-\mathrm {D}\hat{\mathcal {E}}(\hat{\mu }))\ \mathrm {d}t\\&=\inf _{\hat{J}:\dot{\hat{\mu }}+\mathrm {div}\hat{J}=0}\left\{ \int _{0}^{T}\int _{\Omega } \widetilde{\mathsf {Q}}(\hat{\delta }^{V}\hat{c},\hat{J})+ \frac{\hat{\delta }^V\hat{w}^{V}}{2}\frac{|\nabla \hat{\rho }^{V}|^{2}}{\hat{\rho }^{V}}\ \mathrm {d}x\mathrm {d}t\right\} , \end{aligned}$$

    where the infimum is taken over all Borel fluxes \(\hat{J}\in \mathcal {M}([0,T]\times \Omega ,\mathbb {R}^d)\) which satisfy the coarse-grained continuity equation in the sense of distributions, i.e.

    $$\begin{aligned} \forall \phi \in \mathrm {C}^\infty _{c}(]0,T[\times \Omega )\ : \ \int _0^T\int _\Omega \dot{\phi }\hat{c} \ \mathrm {d}x \ \mathrm {d}t = -\int _0^T\int _\Omega \nabla \phi \cdot \hat{J}\ \mathrm {d}x\ \mathrm {d}t, \quad \hat{J}\cdot \nu =0 \ \mathrm {on}\ \partial \Omega . \end{aligned}$$
  2.

    The chain-rule holds for \([0,T]\ni t\mapsto \hat{\mathcal {E}}(\hat{\mu }(t))\in \mathbb {R}\).

  3.

    The gradient flow equation of the gradient system \((\hat{Q},\hat{\mathcal {E}},\hat{\mathcal {R}}^{*})\) is given by

    $$\begin{aligned} \dot{\hat{c}}=-\mathrm {div}\left( \hat{\delta }^{V}\hat{c}\,\nabla \left( -\mathrm {D}\hat{\mathcal {E}}(\hat{\mu })\right) \right) =\mathrm {div}\left( \hat{\delta }^{V}\nabla \hat{c}+\hat{\delta }^{V}\hat{c}\nabla \hat{V}\right) , \end{aligned}$$
    (4.8)

    with the potential \(\hat{V}=-\log \hat{w}^{V}\) and stationary measure \(\hat{w}^{V}\).

Equation (4.8) shows that the coarse-grained gradient flow equation induced by \((\hat{Q},\hat{\mathcal {E}},\hat{\mathcal {R}}^{*})\) is a drift-diffusion equation for the coarse-grained concentration \(\hat{c}\) with mixed diffusion constant \(\hat{\delta }^{V}\). In particular, in the tilt-free case we have \(\hat{\delta }^{V=\mathrm {const}}=\frac{\beta \delta _{1}+\alpha \delta _{2}}{\alpha +\beta }\), and we recover the result of [6].
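
The tilt-free value can be checked directly. As a sketch, assume that for constant V the weights \(w_{j}^{V}\) are proportional to the stationary weights \(w_{j}\) of the reaction, and that detailed balance for (1.2), i.e. \(\sqrt{\tfrac{\alpha }{\beta }}\,w_{1}=\sqrt{\tfrac{\beta }{\alpha }}\,w_{2}\) together with \(w_{1}+w_{2}=1\), gives \(w_{1}=\tfrac{\beta }{\alpha +\beta }\) and \(w_{2}=\tfrac{\alpha }{\alpha +\beta }\). Then

$$\begin{aligned} \hat{\delta }^{V=\mathrm {const}}=\frac{\delta _{1}w_{1}+\delta _{2}w_{2}}{w_{1}+w_{2}}=\delta _{1}\frac{\beta }{\alpha +\beta }+\delta _{2}\frac{\alpha }{\alpha +\beta }=\frac{\beta \delta _{1}+\alpha \delta _{2}}{\alpha +\beta }. \end{aligned}$$

For instance, for \(\alpha =\beta \) the mixed diffusion constant is the arithmetic mean \(\tfrac{1}{2}(\delta _{1}+\delta _{2})\).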

Proof

To prove Part 1, we first observe that the bounds on the energy and the dissipation along the trajectory \(\mu \) imply \(\frac{c_{1}}{w_{1}^{V}}=\frac{c_{2}}{w_{2}^{V}}\) a.e. in \([0,T]\times \Omega \), see Lemma 5.10. Using \(\hat{c}=c_{1}+c_{2}\) for the densities, we get

$$\begin{aligned} \hat{c}=\frac{w_{1}^{V}+w_{2}^{V}}{w_{1}^{V}}c_{1}=\frac{w_{1}^{V}+w_{2}^{V}}{w_{2}^{V}}c_{2}. \end{aligned}$$
(4.9)

The Fisher information \(\mathcal{S}_{0}^{V}(\mu ):=\mathcal {R}_{\mathrm {eff}}^{*}(\mu ,-V-\mathrm {D}\mathcal {E}(\mu ))\) has the form

$$\begin{aligned} \mathcal{S}_{0}^{V}(\mu )&=\frac{1}{2}\int _{\Omega }\sum _{j=1}^{2}\delta _{j}w_{j}^{V}\frac{\left| \nabla \rho _{j}^{V}\right| ^{2}}{\rho _{j}^{V}}\mathrm {d}x=\frac{1}{2}\int _{\Omega }\left( \frac{\delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}}{w_{1}^{V}+w_{2}^{V}}\right) \left( w_{1}^{V}+w_{2}^{V}\right) \frac{|\nabla \hat{\rho }^{V}|^{2}}{\hat{\rho }^{V}}\mathrm {d}x\\&=\frac{1}{2}\int _{\Omega }\hat{\delta }^{V}\hat{w}^{V}\frac{|\nabla \hat{\rho }^{V}|^{2}}{\hat{\rho }^{V}}\mathrm {d}x=\hat{\mathcal {R}}^{*}(\hat{\mu },-\mathrm {D}\hat{\mathcal {E}}(\hat{\mu })). \end{aligned}$$

Now, take diffusion fluxes \(J_1,J_2\in \mathcal {M}([0,T]\times \Omega ,\mathbb {R}^d)\) which provide the infimum in the velocity part of the dissipation functional \(\mathfrak {D}_0^V(\mu )\). Lemma 5.1 implies that \(J_j\) have Lebesgue densities (which are also denoted by \(J_j\)) that are in \(\mathrm {L}^1([0,T]\times \Omega ,\mathbb {R}^d)\).

Lemma 4.3 shows that the fluxes equilibrate as well. Indeed, defining the coarse-grained flux \(\hat{J}=J_{1}+J_{2}\), we obtain the pointwise estimate

$$\begin{aligned} \frac{|J_{1}|^{2}}{\delta _{1}c_{1}}+\frac{|J_{2}|^{2}}{\delta _{2}c_{2}}\ge \frac{|J_{1}+J_{2}|^{2}}{\delta _{1}c_{1}+\delta _{2}c_{2}}=\frac{|\hat{J}|^{2}}{\frac{\delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}}{w_{1}^{V}+w_{2}^{V}}\hat{c}}=\frac{|\hat{J}|^{2}}{\hat{\delta }^V\hat{c}}, \end{aligned}$$
(4.10)

where \(\hat{\delta }^{V}:=\frac{\delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}}{w_{1}^{V}+w_{2}^{V}}\). Equality holds if and only if \((J_{1},c_{1})=0\) or \((J_{2},c_{2})=0\) or \(J_{1}/(\delta _{1}c_{1})=J_{2}/(\delta _{2}c_{2})=\hat{J}/(\hat{\delta }^{V}\hat{c})\). The last condition is equivalent to

$$\begin{aligned} \hat{J}=\frac{\delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}}{\delta _{1}w_{1}^{V}}J_{1}=\frac{\delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}}{\delta _{2}w_{2}^{V}}J_{2}\ , \end{aligned}$$
(4.11)

which provides an explicit formula for the coarse-grained diffusion flux.
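
As a consistency check (a short computation added for illustration), solving (4.11) for the individual fluxes gives \(J_{j}=\frac{\delta _{j}w_{j}^{V}}{\delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}}\hat{J}\), so that indeed \(J_{1}+J_{2}=\hat{J}\). Inserting these fluxes together with (4.9) into the left-hand side of (4.10) yields equality:

$$\begin{aligned} \sum _{j=1}^{2}\frac{|J_{j}|^{2}}{\delta _{j}c_{j}}=\sum _{j=1}^{2}\frac{\delta _{j}w_{j}^{V}\left( w_{1}^{V}+w_{2}^{V}\right) }{\left( \delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}\right) ^{2}}\frac{|\hat{J}|^{2}}{\hat{c}}=\frac{w_{1}^{V}+w_{2}^{V}}{\delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}}\frac{|\hat{J}|^{2}}{\hat{c}}=\frac{|\hat{J}|^{2}}{\hat{\delta }^{V}\hat{c}}. \end{aligned}$$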

Since \((\hat{c},\hat{J})\) solves the coarse-grained continuity equation, we obtain for the dissipation functional that

$$\begin{aligned} \mathfrak {D}_{0}^{V}(\mu )&=\inf _{(c,J,b)\in \mathrm {(gCE)}}\int _{0}^{T}\left\{ \int _{\Omega }\sum _{j=1}^{2}\widetilde{\mathsf {Q}}(\delta _{j}c_{j},J_{j})\mathrm {d}x+\frac{1}{2}\int _{\Omega }\sum _{j=1}^{2}\delta _{j}w_{j}^{V}\frac{|\nabla \rho _{j}^{V}|^{2}}{\rho _{j}^{V}}\mathrm {d}x\right\} \mathrm {d}t\nonumber \\&\ge \inf _{\dot{\hat{c}}+\mathrm {div}\hat{J}=0}\int _{0}^{T}\left\{ \int _{\Omega }\widetilde{\mathsf {Q}}\left( \hat{\delta }^V\hat{c},\hat{J}\right) \mathrm {d}x+\frac{1}{2}\int _{\Omega }\hat{\delta }^V\hat{w}^{V}\frac{|\nabla \hat{\rho }^V|^{2}}{\hat{\rho }^V}\mathrm {d}x\right\} \mathrm {d}t. \end{aligned}$$
(4.12)

To prove equality, let \({\hat{\mu }}\) with \({\hat{\mathfrak {D}}}({\hat{\mu }})<\infty \) be given. We define the reconstructed concentrations \(c_1,c_2\) by (4.9). Moreover, let \(\hat{J}\) be a diffusion flux satisfying the continuity equation, which can again be assumed to be in \(\mathrm {L}^1([0,T]\times \Omega ,\mathbb {R}^d)\). We define \(J_1,J_2\in \mathrm {L}^1([0,T]\times \Omega , \mathbb {R}^d)\) by formula (4.11); they satisfy the same boundary conditions. Using the explicitly derived reaction fluxes \(b_{1},b_{2}\) from (5.3), defined in the sense of distributions, we observe that \((c,J,b)\) satisfies the generalized continuity equation. Hence, taking the infimum, we conclude

$$\begin{aligned} {\hat{\mathfrak {D}}}({\hat{\mu }}) = \int _{0}^{T}\left\{ \int _{\Omega }\sum _{j=1}^{2}\widetilde{\mathsf {Q}}(\delta _{j}c_{j},J_{j})\mathrm {d}x+\frac{1}{2}\int _{\Omega }\sum _{j=1}^{2}\delta _{j}w_{j}^{V}\frac{|\nabla \rho _{j}^{V}|^{2}}{\rho _{j}^{V}}\mathrm {d}x\right\} \mathrm {d}t \ge \mathfrak {D}_0^V(\mu ), \end{aligned}$$

which proves the first part.

For the chain rule in Part 2, we refer to the proof of [18, Lem. 4.8], since we consider the purely diffusive situation. The proof uses a time-regularization argument and convexity of the Fisher information following the ideas of [35, Prop. 2.4].

For Part 3, we compute the evolution equation that is induced by the gradient system \((\hat{Q},\hat{\mathcal {E}},\hat{\mathcal {R}}^{*})\). We have

$$\begin{aligned} \partial _{\hat{\xi }}\hat{\mathcal {R}}^{*}(\hat{\mu },\hat{\xi })&=-\mathrm {div}\left( \hat{\delta }^{V}\hat{c}\nabla \hat{\xi }\right) ,\\ \mathrm {D}\hat{\mathcal {E}}(\hat{\mu })&=\log \hat{\mu }+1-\log \hat{w}^{V}-\log Z,\\ \nabla \left( -\mathrm {D}\hat{\mathcal {E}}(\hat{\mu })\right)&=-\frac{\nabla \hat{\mu }}{\hat{\mu }}+\frac{\nabla \hat{w}^{V}}{\hat{w}^{V}}=-\frac{\nabla \hat{\mu }}{\hat{\mu }}-\nabla \hat{V}, \end{aligned}$$

which results in

$$\begin{aligned} \dot{\hat{c}}=-\mathrm {div}\left( \hat{\delta }^{V}\hat{c}\,\nabla \left( -\mathrm {D}\hat{\mathcal {E}}(\hat{\mu })\right) \right) =\mathrm {div}\left( \hat{\delta }^{V}\nabla \hat{c}+\hat{\delta }^{V}\hat{c}\nabla \hat{V}\right) . \end{aligned}$$

\(\square \)
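
A brief check of the statement about the stationary measure in Part 3 (added for illustration): inserting \(\hat{c}=\hat{w}^{V}\) into the flux of (4.8) and using \(\nabla \hat{V}=-\nabla \hat{w}^{V}/\hat{w}^{V}\), as computed in the proof, gives

$$\begin{aligned} \hat{\delta }^{V}\nabla \hat{w}^{V}+\hat{\delta }^{V}\hat{w}^{V}\nabla \hat{V}=\hat{\delta }^{V}\left( \nabla \hat{w}^{V}-\hat{w}^{V}\frac{\nabla \hat{w}^{V}}{\hat{w}^{V}}\right) =0. \end{aligned}$$

Hence the flux vanishes identically, so \(\hat{w}^{V}\) (and every constant multiple of it) is a stationary solution of (4.8).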

We finally remark that the coarse-grained gradient flow Eq. (4.8) is equivalent to the gradient flow equation with Lagrange multipliers (4.6). Indeed, adding both equations in (4.6) and using that the original concentrations can be expressed by the coarse-grained concentrations via (4.9), the coarse-grained gradient flow Eq. (4.8) with the drift term \(\hat{\delta }^{V}\hat{c}\nabla \hat{V}\) can be readily derived. Conversely, using (4.9), we see that \(c=(c_{1},c_{2})\) are on the slow manifold and satisfy (4.6). The corresponding Lagrange multipliers \(\lambda =(\lambda _{1},\lambda _{2})\) can be explicitly calculated. Introducing the difference of the diffusion constants \(\overline{\delta }=\delta _{1}-\delta _{2}\) and the potentials \(\overline{V}=V_{1}-V_{2}\), we have

$$\begin{aligned} \lambda _{1}&=\frac{w_{2}\mathrm {e}^{-V_{2}}}{w_{1}\mathrm {e}^{-V_{1}}+w_{2}\mathrm {e}^{-V_{2}}}\left( -\overline{\delta }\Delta c_{1}+\left( \delta _{2}\nabla \overline{V}-\overline{\delta }\nabla V_{1}\right) \cdot \nabla c_{1}+c_{1}\left\{ \delta _{2}\nabla \overline{V}\,\nabla V_{1}-\overline{\delta }\Delta V_{1}\right\} \right) \\ \lambda _{2}&=\frac{w_{1}\mathrm {e}^{-V_{1}}}{w_{1}\mathrm {e}^{-V_{1}}+w_{2}\mathrm {e}^{-V_{2}}}\left( \overline{\delta }\Delta c_{2}+\left( -\delta _{1}\nabla \overline{V}+\overline{\delta }\nabla V_{2}\right) \cdot \nabla c_{2}+c_{2}\left\{ -\delta _{1}\nabla \overline{V}\,\nabla V_{2}+\overline{\delta }\Delta V_{2}\right\} \right) . \end{aligned}$$

We observe that the Lagrange multiplier \(\lambda _{i}\) has the same regularity as the right-hand side of the evolution equation of \(c_{i}\). Moreover, both evolution equations are completely uncoupled but contain a linear annihilation/creation term, which depends on the potential \(V=(V_{1},V_{2})\) and the diffusion coefficient \(\delta =\left( \delta _{1},\delta _{2}\right) \). A lengthy calculation shows that indeed we have \(\lambda _{1}+\lambda _{2}=0\).

5 Proof of \(\Gamma \)-convergence

In this section we prove the \(\Gamma \)-convergence result of Theorem 4.1. As usual, we prove \(\Gamma \)-convergence in three steps: first we derive compactness, secondly we establish the liminf-estimate by exploiting the compactness, and thirdly we construct the recovery sequence for the limsup-estimate.

The following lemma will be useful throughout.

Lemma 5.1

Let \(\mathsf {F}:\mathbb {R}^{k}\rightarrow [0,\infty [\) be a convex, lsc. function of superlinear growth, i.e. \(\mathsf {F}(r)/|r|\rightarrow \infty \) as \(|r|\rightarrow \infty \). Then there is a constant \(k_{\mathsf {F}}>0\) such that for any measurable functions \(W:\Omega \rightarrow \mathbb {R}^{k}\) and \(\rho :\Omega \rightarrow \mathbb {R}_{\ge 0}\) it holds

$$\begin{aligned} \int _{\Omega }|W|\ \mathrm {d}x\le \int _{\Omega }\widetilde{\mathsf {F}}(\rho ,W)\ \mathrm {d}x+k_{\mathsf {F}}\int _{\Omega }\rho \ \mathrm {d}x\ . \end{aligned}$$

Proof

Let W and \(\rho \) be given. We define three measurable subsets of \(\Omega \):

$$\begin{aligned} \Omega _{0}=\{x:\rho (x)=0\},\ \Omega _{1}=\{x:\rho \ne 0,\tfrac{1}{\rho }|W|\le \mathsf {F}(\tfrac{1}{\rho }W)\},\ \Omega _{2}=\{x:\rho \ne 0,\tfrac{1}{\rho }|W|>\mathsf {F}(\tfrac{1}{\rho }W)\}. \end{aligned}$$

Since \(\mathsf {F}\) is superlinear, there is a constant \(k_{\mathsf {F}}>0\) such that \(|W|/\rho \le k_{\mathsf {F}}\) on \(\Omega _{2}\). Hence we can estimate

$$\begin{aligned} \int _{\Omega }|W|\mathrm {d}x&\le \int _{\Omega _{0}}|W|\mathrm {d}x+\int _{\Omega _{1}}\frac{|W|}{\rho }\rho \mathrm {d}x+\int _{\Omega _{2}}\frac{|W|}{\rho }\rho \mathrm {d}x\\&\le \int _{\Omega _{0}}\widetilde{\mathsf {F}}(\rho ,W)\mathrm {d}x+\int _{\Omega _{1}}\mathsf {F}(\tfrac{1}{\rho }W)\rho \mathrm {d}x+k_{\mathsf {F}}\int _{\Omega _{2}}\rho \mathrm {d}x \le \int _{\Omega }\widetilde{\mathsf {F}}(\rho ,W)\mathrm {d}x+k_{\mathsf {F}}\int _{\Omega }\rho \mathrm {d}x\ . \end{aligned}$$

\(\square \)
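
For orientation, the constant can be made explicit in the quadratic case. Assume \(\mathsf {F}(r)=\tfrac{1}{2}|r|^{2}\) (an illustrative choice, not fixed by the lemma). On \(\Omega _{2}\) we have \(\tfrac{|W|}{\rho }>\tfrac{1}{2}\tfrac{|W|^{2}}{\rho ^{2}}\), hence \(\tfrac{|W|}{\rho }<2\), so the proof yields \(k_{\mathsf {F}}=2\) and

$$\begin{aligned} \int _{\Omega }|W|\ \mathrm {d}x\le \int _{\Omega }\widetilde{\mathsf {F}}(\rho ,W)\ \mathrm {d}x+2\int _{\Omega }\rho \ \mathrm {d}x,\qquad \widetilde{\mathsf {F}}(\rho ,W)=\frac{|W|^{2}}{2\rho }\ \ \text {where}\ \rho >0. \end{aligned}$$

Young's inequality \(|W|\le \tfrac{|W|^{2}}{2\rho }+\tfrac{\rho }{2}\) shows that this constant is not optimal; the lemma only asserts that some such constant exists.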

Moreover, we need the following classical lemma. It guarantees the necessary regularity for the limits, and moreover, it provides the desired liminf-estimate.

Lemma 5.2

(Lemma 9.4.3, [1]) Let \(F:[0,\infty [\rightarrow [0,\infty ]\) be a proper, lsc, convex function with superlinear growth. We define the related functional

$$\begin{aligned} \mathcal {F}(\mu ,\gamma )={\left\{ \begin{array}{ll} \int _{A}F(\frac{\mathrm {d}\mu }{\mathrm {d}\gamma })\mathrm {d}\gamma , &{} \mathrm {if}\ \ \mu \ll \gamma ,\\ \infty , &{} \mathrm {otherwise}. \end{array}\right. } \end{aligned}$$

Let \(\mu ^{\varepsilon },\gamma ^{\varepsilon }\in \mathrm {Prob}(A)\) be two sequences with \(\mu ^{\varepsilon }{\mathop {\rightharpoonup }\limits ^{*}}\mu ^{0}\) and \(\gamma ^{\varepsilon }{\mathop {\rightharpoonup }\limits ^{*}}\gamma ^{0}\). Then

$$\begin{aligned} \liminf _{\varepsilon \rightarrow 0}\mathcal {F}(\mu ^{\varepsilon },\gamma ^{\varepsilon })\ge \mathcal {F}(\mu ^{0},\gamma ^{0}). \end{aligned}$$

In particular, if the left-hand side is finite, then for the limits it holds \(\mu ^{0}\ll \gamma ^{0}.\)

5.1 Compactness

Recall that for a given potential \(V\in \mathrm {C}^{1}(\Omega ,\mathbb {R}^{2})\) the dissipation functional \(\mathfrak {D}_{\varepsilon }^{V}\) is defined on the space of trajectories equipped with the weak topology, see (4.1). In the following we want to derive compactness for a sequence \((\mu ^{\varepsilon })_{\varepsilon >0}\) of trajectories, satisfying the a priori bounds (4.2). Using the bound of the dissipation functional \(\mathfrak {D}_{\varepsilon }^{V}(\mu ^{\varepsilon })\le C\), we conclude that there exist diffusive fluxes \(J^{\varepsilon }=(J_{1}^{\varepsilon },J_{2}^{\varepsilon })\) and reaction fluxes \(b^{\varepsilon }=(b_{1}^{\varepsilon },b_{2}^{\varepsilon })\) such that \(J_{i}^{\varepsilon }\in \mathcal{M}(\Omega ,\mathbb {R}^{d})\), \(b_{i}^{\varepsilon }\in \mathcal{M}(\Omega ,\mathbb {R})\) and \((\mu ^{\varepsilon },J^{\varepsilon },b^{\varepsilon })\) satisfies the continuity equation

$$\begin{aligned} (c,J,b)\in \mathrm {{(gCE)}}\ \ \Leftrightarrow \ \ \left\{ b_{1}+b_{2}=0\ \mathrm {and}\ \left\{ \begin{array}{c} \dot{c}_{1}=-\mathrm {div}J_{1}+b_{1}\\ \dot{c}_{2}=-\mathrm {div}J_{2}+b_{2} \end{array}\right\} \right\} . \end{aligned}$$

Moreover, we obtain the bounds

$$\begin{aligned} i=1,2&:\ \ \int _{0}^{T}\int _{\Omega }\frac{|\nabla \rho _{i}^{V,\varepsilon }|^{2}}{\rho _{i}^{V,\varepsilon }}\mathrm {d}x\mathrm {d}t\le C,\quad \int _{0}^{T}\int _{\Omega }\widetilde{\mathsf {Q}}\left( c_{i}^{\varepsilon },J_{i}^{\varepsilon }\right) \mathrm {d}x\mathrm {d}t\le C,\\&\int _{0}^{T}\int _{\Omega }\widetilde{\mathsf {C}}\left( \frac{\sqrt{c_{1}^{\varepsilon }c_{2}^{\varepsilon }}}{\varepsilon },b_{2}^{\varepsilon }(x)\right) \mathrm {d}x\mathrm {d}t\le C,\quad \frac{1}{\varepsilon }\int _{0}^{T}\int _{\Omega }\left( \sqrt{\rho _{1}^{V,\varepsilon }}-\sqrt{\rho _{2}^{V,\varepsilon }}\right) ^{2}\mathrm {d}x\mathrm {d}t\le C\ . \end{aligned}$$

Remark 5.3

Following [26], a distributional solution \((\mu ,J,B)\) of the generalized continuity equation \(\dot{\mu }=-\mathrm {div}J+B\) satisfying \(\int _{0}^{T}\int _{\Omega }|B|+|J|\mathrm {d}x\mathrm {d}t<\infty \) can be assumed to be absolutely continuous in time. The bounds can be obtained easily using Lemma 5.1 for fixed \(\varepsilon >0\).

Because the functional is convex in the concentration c and in the fluxes J and b, weak convergence would be sufficient to prove a liminf-estimate using a Ioffe-type argument. But, as in the PDE-result [6], we aim to prove even strong convergence for the densities \(c^{\varepsilon }\rightarrow c^{0}\) in \(\mathrm {L}^{1}([0,T]\times \Omega ,\mathbb {R}_{\ge 0}^{2})\). This is done in two steps: first, compactness of the coarse-grained variables, and secondly, convergence towards the slow manifold is shown, which together imply strong compactness. This strategy has already been applied successfully in the space-independent case in [30, 34]. Moreover, we show that the limit trajectory \(\mu ^{0}=c^{0}\mathrm {d}x\) has a representative which is in \(\mathrm {AC}([0,T],Q)\). Note that it is not possible to prove pointwise convergence \(\mu ^{\varepsilon }(t){\mathop {\rightharpoonup }\limits ^{*}}\mu ^{0}(t)\) for all \(t\in [0,T]\). Instead, pointwise convergence is only shown for the coarse-grained variables \(\hat{\mu }^{\varepsilon }:=\mu _{1}^{\varepsilon }+\mu _{2}^{\varepsilon }\).

First, we derive weak compactness in space-time, which immediately follows from the uniform bound in \(\varepsilon \) and time on the energy.

Lemma 5.4

("Very" weak compactness in space-time) Let \((\mu ^{\varepsilon })_{\varepsilon >0}\), \(\mu ^{\varepsilon }\in \mathrm {L}_{w}^{\infty }([0,T],Q)\) satisfy \(\sup _{\varepsilon >0}\ \underset{t\in [0,T]}{\mathrm {ess\,sup}\ }\mathcal {E}(\mu ^{\varepsilon }(t))\le C\). Then for a.e. \(t\in [0,T]\) the measure \(\mu ^{\varepsilon }(t)\) has a Lebesgue density \(c^{\varepsilon }(t,\cdot )\). Moreover, there is a subsequence (not relabeled), such that their densities \(c^{\varepsilon }\) are uniformly integrable in \(\Omega \times [0,T]\times \left\{ 1,2\right\} \) and hence, \(c_{i}^{\varepsilon }\) converges weakly in \(\mathrm {L}^{1}([0,T]\times \Omega )\) to \(c_{i}^{0}\) for \(i=1,2\).

Proof

The bound on the energy implies that for a.e. \(t\in [0,T]\) the measure \(\mu ^{\varepsilon }(t,\cdot )\) has a Lebesgue density \(c^{\varepsilon }(t,\cdot )\). Moreover, the functional \(\mu \mapsto \int _{0}^{T}\mathcal{E}(\mu )\mathrm {d}t\) is superlinear and convex. Hence, it follows by the theorem of de la Vallée Poussin that the densities \(c^{\varepsilon }\) are uniformly integrable and hence, \(c_{i}^{\varepsilon }\) converges weakly in \(\mathrm {L}^{1}([0,T]\times \Omega )\) to \(c_{i}^{0}\). The corresponding limit trajectory of measures is denoted by \(\mu ^0(t)=(\mu ^0_1(t),\mu ^0_2(t))\in Q\). \(\square \)

In the following, we are going to derive compactness for the concentrations \(c_{i}^{\varepsilon }\) and the diffusive fluxes \(J_{i}^{\varepsilon }\). It is not possible to get compactness for the fast reaction flux \(b_{2}^{\varepsilon }\) by bounding the dissipation functional, which can be seen in the next remark.

Remark 5.5

To see that compactness for the fast reaction flux \(b_{2}^{\varepsilon }\) cannot be obtained from the a priori bounds (4.2), we set \(\rho ^{\varepsilon }=1\) constant in \([0,T]\times \Omega \times \left\{ 1,2\right\} \) and \(b_{2}^{\varepsilon }=b^{\varepsilon }\) constant in \([0,T]\times \Omega \). Then, a bound on the dissipation functional implies a bound

$$\begin{aligned} \infty >\int _{0}^{T}\int _{\Omega }\widetilde{\mathsf {C}}\left( \frac{\sqrt{c_{1}^{\varepsilon }c_{2}^{\varepsilon }}}{\varepsilon },b_{2}^{\varepsilon }(x)\right) \mathrm {d}x \mathrm {d}t\approx \widetilde{\mathsf {C}}(\frac{1}{\varepsilon },b^{\varepsilon })=\frac{1}{\varepsilon }\mathsf {C}(\varepsilon b^{\varepsilon })\approx |b^{\varepsilon }|\log (\varepsilon |b^{\varepsilon }|+1). \end{aligned}$$

Setting \(b^{\varepsilon }=-\log \varepsilon \), we easily see that \(|b^{\varepsilon }|\log (\varepsilon |b^{\varepsilon }|+1)\rightarrow 0\) as \(\varepsilon \rightarrow 0\), however, \(b^{\varepsilon }\rightarrow \infty \). Hence, it is not possible to obtain compactness for the fast reaction flux \(b_{i}^{\varepsilon }\). Later in Lemma 5.15 the “converse” statement is proved: If \(\int \!\!\int \mathsf {C}(b)\mathrm {d}x\mathrm {d}t<\infty \) then \(\int \!\!\int \widetilde{\mathsf {C}}(\frac{1}{\varepsilon },b)\mathrm {d}x\mathrm {d}t\rightarrow 0\).
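
The limit claimed above can be verified in one line (a short check added for illustration). Using the elementary inequality \(\log (1+s)\le s\) for \(s\ge 0\) and \(b^{\varepsilon }=-\log \varepsilon >0\) for \(\varepsilon <1\), we get

$$\begin{aligned} |b^{\varepsilon }|\log (\varepsilon |b^{\varepsilon }|+1)\le \varepsilon |b^{\varepsilon }|^{2}=\varepsilon (\log \varepsilon )^{2}\rightarrow 0\quad \text {as}\ \varepsilon \rightarrow 0, \end{aligned}$$

while \(b^{\varepsilon }=|\log \varepsilon |\rightarrow \infty \).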

Next, we derive time regularity for the sequence \((\mu ^{\varepsilon })_{\varepsilon >0}\) by proving compactness for the coarse-grained trajectories \(\hat{\mu }^{\varepsilon }=\mu _{1}^{\varepsilon }+\mu _{2}^{\varepsilon }\). In particular, we are able to prove pointwise convergence in time.

Lemma 5.6

(Time Regularity of \(\mu ^{\varepsilon }\)) Let \((\mu ^{\varepsilon })_{\varepsilon >0}\), \(\mu ^{\varepsilon }\in \mathrm {L}_{w}^{\infty }([0,T],Q)\) satisfy the a priori bounds (4.2). Then the curves \(t\mapsto \hat{\mu }^{\varepsilon } (t):=\mu _{1}^{\varepsilon }(t)+\mu _{2}^{\varepsilon }(t)\) have \(\varepsilon \)-uniformly bounded total variation in the space \(\mathrm {Prob}(\Omega )\) equipped with the 1-Wasserstein distance, i.e.

$$\begin{aligned} \Vert \hat{\mu }^{\varepsilon }\Vert _{TV}:=\sup \left\{ \sum _{k=1}^{K}\mathcal{W}_{1}(\hat{\mu }^{\varepsilon }(t_{k}),\hat{\mu }^{\varepsilon }(t_{k-1}))\ :\ 0=t_{0}<\cdots<t_{k}<\cdots <t_{K}=T\right\} . \end{aligned}$$

In particular, by Helly’s selection principle, we conclude pointwise convergence \(\hat{\mu }^{\varepsilon }(t):=\mu _{1}^{\varepsilon }(t)+\mu _{2}^{\varepsilon }(t){\mathop {\rightharpoonup }\limits ^{*}}\hat{\mu }^{0}(t):=\mu _{1}^{0}(t)+\mu _{2}^{0}(t)\) for all \(t\in [0,T]\) in \(\mathrm {Prob}(\Omega )\) along a suitable subsequence, where \(\mu ^0=(\mu ^0_1,\mu ^0_2)\) is from Lemma 5.4.

Proof

Let a partition \(\{t_k\}_{k=0,\dots , K}\) of the interval [0, T] be given. We exploit the dual formulation of the 1-Wasserstein distance, which is given by integrating against Lipschitz functions (see e.g. [15]),

$$\begin{aligned} \mathcal {W}_1(\hat{\mu },\hat{\nu }) = \sup _{\phi :\Vert \phi \Vert _{\mathrm {W}^{1,\infty }(\Omega )}\le 1} \int _\Omega \phi (\mathrm {d}\hat{\mu }-\mathrm {d}\hat{\nu }). \end{aligned}$$

Taking \(\phi \in \mathrm {C}^{1}(\Omega )\) with \(\Vert \phi \Vert _{\mathrm {W}^{1,\infty }(\Omega )}\le 1\) and using the continuity equations \(b_{1}+b_{2}=0\) and \( \left\{ \begin{array}{c} \dot{c}_{1}=-\mathrm {div}J_{1}+b_{1}\\ \dot{c}_{2}=-\mathrm {div}J_{2}+b_{2} \end{array}\right\} \), we conclude for all \([t_{k-1},t_{k}]\subset [0,T]\) that

$$\begin{aligned} \int _{\Omega }\phi \,(\mathrm {d}\hat{\mu }^{\varepsilon }(t_{k})-\mathrm {d}\hat{\mu }^{\varepsilon }(t_{k-1}))&=\int _{t_{k-1}}^{t_{k}}\langle \phi ,\dot{\hat{\mu }}^{\varepsilon }\rangle \mathrm {d}t\\&=\int _{t_{k-1}}^{t_{k}}\!\!\int _{\Omega }\nabla \phi \cdot \left( J_{1}^{\varepsilon }+J_{2}^{\varepsilon }\right) \mathrm {d}x\mathrm {d}t\le \int _{t_{k-1}}^{t_{k}}\!\!\int _{\Omega }\sum _{i=1}^{2}|J_{i}^{\varepsilon }|\mathrm {d}x\mathrm {d}t. \end{aligned}$$

Taking the supremum over all such \(\phi \) and summing over k, we get that

$$\begin{aligned} \sum _{k=1}^{K}\mathcal{W}_{1}(\hat{\mu }^{\varepsilon }(t_{k}),\hat{\mu }^{\varepsilon }(t_{k-1}))\le \int _{0}^{T}\!\!\int _{\Omega }\sum _{i=1}^{2}|J_{i}^{\varepsilon }|\mathrm {d}x\mathrm {d}t. \end{aligned}$$

By the bound on the dissipation functional, we obtain \(\varepsilon \)-uniform bounds on the term \(\int _{0}^{T}\int _{\Omega }\mathsf {Q}(\delta _{j}c_{j}^\varepsilon ,J_{j}^\varepsilon )\mathrm {d}x\mathrm {d}t\). Moreover, we have \(\int _{\Omega }c_{j}^{\varepsilon }\mathrm {d}x\le 1\) for all \(t\in [0,T]\). Hence by Lemma 5.1 we conclude that \(\Vert \hat{\mu }^{\varepsilon }\Vert _{TV}\) is \(\varepsilon \)-uniformly bounded. \(\square \)

Next, the compactness result from Lemma 5.2 is used in order to prove compactness for the fluxes and spatial regularity.

Lemma 5.7

(Regularity for the fluxes and spatial regularity) Let \((\mu ^{\varepsilon })_{\varepsilon >0}\) with \(\mu ^{\varepsilon }\in \mathrm {L}_{w}^{\infty }([0,T],Q)\) satisfy the a priori bounds (4.2). Then the corresponding diffusive fluxes \(J^{\varepsilon }:[0,T]\times \Omega \rightarrow \mathbb {R}^{2}\) converge weakly-star \(J^{\varepsilon }{\mathop {\rightharpoonup }\limits ^{*}}J^{0}\) in \(\mathcal{M}([0,T]\times \Omega \times \left\{ 1,2\right\} )\) and \(J_{j}^{0}\ll \mu _{j}^{0}\). Moreover \(\nabla \rho ^{V,\varepsilon }{\mathop {\rightharpoonup }\limits ^{*}}\nabla \rho ^{V,0}\) in \(\mathcal{M}([0,T]\times \Omega \times \left\{ 1,2\right\} )\) and \(\nabla \rho _{j}^{0}\ll \mu _{j}^{0}\). In particular, we conclude that \(\rho _{j}^{\varepsilon }\) is uniformly bounded in \(\mathrm {L}^{1}([0,T],\mathrm {W}^{1,1}(\Omega ))\), which also implies that \(\hat{c}^{\varepsilon }\) is uniformly bounded in \(\mathrm {L}^{1}([0,T],\mathrm {W}^{1,1}(\Omega ))\).

Proof

By the bound on the dissipation functional, we get (after extracting a suitable subsequence of \(\varepsilon \rightarrow 0\)) that \(J^{\varepsilon }{\mathop {\rightharpoonup }\limits ^{*}}J^{0}\). Moreover, we have \(\int _{0}^{T}\int _{\Omega }\frac{|J_{j}^{\varepsilon }|^{2}}{\rho _{j}^{\varepsilon }}\mathrm {d}x\mathrm {d}t\le C\) and \(\rho _{j}^{\varepsilon }{\mathop {\rightharpoonup }\limits ^{*}}\rho _{j}^{0}\). Hence, applying Lemma 5.2, we conclude that \(J_{j}^{0}\ll \mu _{j}^{0}\). Similarly, we conclude compactness for the gradients \(\nabla \rho ^{V,\varepsilon }\). It only remains to identify the limit. But this is clear by the definition of weak derivatives, i.e. integrating against smooth test functions, because this is captured by the weak*-convergence. Lemma 5.1 implies that \(\rho _{j}^{\varepsilon }\) is uniformly bounded in \(\mathrm {L}^{1}([0,T],\mathrm {W}^{1,1}(\Omega ))\). \(\square \)

The spatial regularity and the temporal regularity provide a compactness result via a BV-generalization of the Aubin-Lions-Simon lemma.

Theorem 5.8

([3, 21]) Let X, Y, Z be Banach spaces such that X is compactly embedded in Y, and Y is continuously embedded in \(Z^{*}\). Let \(u^{\varepsilon }\) be a bounded sequence in \(\mathrm {L}^{1}([0,T],X)\) and in \(BV([0,T],Z^{*})\). Then (up to a subsequence) \(u^{\varepsilon }\) strongly converges in \(\mathrm {L}^{1}([0,T],Y)\).

In our situation we immediately conclude that \(\hat{c}^{\varepsilon }\) converges strongly.

Corollary 5.9

(Strong convergence of coarse-grained variables) Let \((\mu ^{\varepsilon })_{\varepsilon >0}\) with \(\mu ^{\varepsilon }\in \mathrm {L}_{w}^{\infty }([0,T],Q)\) satisfy the a priori bounds (4.2). Then the coarse-grained densities \(\hat{c}^{\varepsilon }\) converge strongly in \(\mathrm {L}^{1}([0,T]\times \Omega )\).

Proof

Lemma 5.6 shows that \(\hat{c}^{\varepsilon }\) is bounded in \(BV([0,T],\mathrm {W}^{1,\infty }(\Omega )^{*})\) and Lemma 5.7 shows that \(\hat{c}^{\varepsilon }\) is bounded in \(\mathrm {L}^{1}([0,T],\mathrm {W}^{1,1}(\Omega ))\). Since the embedding \(\mathrm {W}^{1,1}(\Omega )\subset \mathrm {L}^{1}(\Omega )\) is compact and the embedding \(\mathrm {L}^{1}(\Omega )\subset \mathrm {W}^{1,\infty }(\Omega )^{*}\) is continuous, Theorem 5.8 yields that the sequence \(\hat{c}^{\varepsilon }\) is compact in \(\mathrm {L}^{1}([0,T]\times \Omega )\). \(\square \)

It is also clear that we get convergence towards the slow manifold, which results from the fast-reaction part of the slope term.

Lemma 5.10

(Convergence towards microscopic equilibrium and strong compactness) Let \((\mu ^{\varepsilon })_{\varepsilon >0}\), \(\mu ^{\varepsilon }\in \mathrm {L}_{w}^{\infty }([0,T],Q)\) satisfy the a priori bounds (4.2). Then there is a subsequence such that \(c^{\varepsilon }\rightarrow c^{0}\) strongly in \(\mathrm {L}^{1}([0,T]\times \Omega ,\mathbb {R}^2)\) and, moreover, it holds \(\rho _{1}^{V,0}=\rho _{2}^{V,0}\) a.e. in \([0,T]\times \Omega \).

Proof

The bound on the dissipation functional provides \(\int _{0}^{T}\int _{\Omega }\left( \sqrt{\rho _{1}^{V,\varepsilon }}-\sqrt{\rho _{2}^{V,\varepsilon }}\right) ^{2}\mathrm {d}x\mathrm {d}t\le C\varepsilon \). Hence, we conclude \(\Vert \sqrt{\rho _{1}^{V,\varepsilon }}-\sqrt{\rho _{2}^{V,\varepsilon }}\Vert _{\mathrm {L}^{2}([0,T]\times \Omega )}\rightarrow 0\) as \(\varepsilon \rightarrow 0\). In particular, we conclude that \(\rho _{1}^{V,0}=\rho _{2}^{V,0}\). The strong convergence towards the slow manifold provides strong convergence for the whole sequence. Indeed, using the Cauchy-Schwarz inequality and \(x-y=\left( \sqrt{x}-\sqrt{y}\right) \left( \sqrt{x}+\sqrt{y}\right) \), we have

$$\begin{aligned} \Vert \rho _{1}^{V,\varepsilon }-\rho _{2}^{V,\varepsilon }\Vert _{\mathrm {L}^{1}([0,T]\times \Omega )}&\le \Vert \sqrt{\rho _{1}^{V,\varepsilon }}-\sqrt{\rho _{2}^{V,\varepsilon }}\Vert _{\mathrm {L}^{2}([0,T]\times \Omega )}\Vert \sqrt{\rho _{1}^{V,\varepsilon }}+\sqrt{\rho _{2}^{V,\varepsilon }}\Vert _{\mathrm {L}^{2}([0,T]\times \Omega )}. \end{aligned}$$

The last term can be estimated by the inequality of arithmetic and geometric mean

$$\begin{aligned}&\Vert \sqrt{\rho _{1}^{V,\varepsilon }}+\sqrt{\rho _{2}^{V,\varepsilon }}\Vert _{\mathrm {L}^{2}([0,T]\times \Omega )}^{2}=\int _{0}^{T}\!\!\!\int _{\Omega }\rho _{1}^{V,\varepsilon }+\rho _{2}^{V,\varepsilon }+2\sqrt{\rho _{1}^{V,\varepsilon }\rho _{2}^{V,\varepsilon }}\mathrm {d}x\mathrm {d}t\\&\quad \le 2\int _{0}^{T}\!\!\!\int _{\Omega }\left( \rho _{1}^{V,\varepsilon }+\rho _{2}^{V,\varepsilon }\right) \mathrm {d}x\mathrm {d}t, \end{aligned}$$

and the right-hand side is bounded since \(\mu ^{\varepsilon }(t)\in Q\) for \(t\in [0,T]\). Hence, we conclude that \(\Vert \rho _{1}^{V,\varepsilon }-\rho _{2}^{V,\varepsilon }\Vert _{\mathrm {L}^{1}([0,T]\times \Omega )}\rightarrow 0\) as \(\varepsilon \rightarrow 0\).

Using this convergence, we have that also \(c_{i}^{\varepsilon }\rightarrow c_{i}^{0}\) strongly in \(\mathrm {L}^{1}([0,T]\times \Omega )\). Indeed, a direct computation shows that

$$\begin{aligned} c_{i}^{\varepsilon }-w_{i}^{V}\frac{c_{1}^{0}+c_{2}^{0}}{w_{1}^{V}+w_{2}^{V}} =(-1)^{i}\frac{w_{1}^{V}w_{2}^{V}}{w_{1}^{V}+w_{2}^{V}}\left( \frac{c_{2}^{\varepsilon }}{w_{2}^{V}}-\frac{c_{1}^{\varepsilon }}{w_{1}^{V}}\right) +\frac{w_{i}^{V}}{w_{1}^{V}+w_{2}^{V}}\left( c_{1}^{\varepsilon }+c_{2}^{\varepsilon }-(c_{1}^{0}+c_{2}^{0})\right) \ , \end{aligned}$$

and both terms converge strongly to zero as \(\varepsilon \rightarrow 0\) by convergence of \(\rho _{1}^{V,\varepsilon }{-}\rho _{2}^{V,\varepsilon }\rightarrow 0\) and \(\hat{c}^{\varepsilon }\rightarrow \hat{c}^{0}\). \(\square \)

Finally, we show that the limit \(\mu ^{0}=c^{0}\mathrm {d}x\) has an absolutely continuous representative in the space of probability measures. To do this, we exploit the characterization of absolutely continuous curves as solutions of the continuity equation following [1].

Proposition 5.11

Let \((\mu ^{\varepsilon })_{\varepsilon >0}\), \(\mu ^{\varepsilon }\in \mathrm {L}^{\infty }([0,T],Q)\), satisfy the a priori bounds (4.2) and let \(c^{0}\) be the limit of the densities \(c^{\varepsilon }\). Then the coarse-grained slow variable \(\hat{\mu }=\mu _{1}^{0}+\mu _{2}^{0}=\left( c_{1}^{0}+c_{2}^{0}\right) \,\mathrm {d}x\in \mathrm {L}_{w}^{\infty }([0,T],\mathrm {Prob}(\Omega ))\) has a representative (in time) which is absolutely continuous in the space of probability measures equipped with the 2-Wasserstein metric. Moreover, each component \(\mu _{i}^{0}\) has a representative (in time) which is absolutely continuous in the space of non-negative Radon measures equipped with the 1-Wasserstein metric.

Proof

The coarse-grained measures \(\hat{\mu }^{\varepsilon }\) satisfy the continuity equation \(\dot{\hat{\mu }}^{\varepsilon }+\mathrm {div}(\hat{J}^{\varepsilon })=0\) in the sense of distributions, where \(\hat{J}^{\varepsilon }=J_{1}^{\varepsilon }+J_{2}^{\varepsilon }\) is the coarse-grained diffusion flux. Since the linear continuity equation is stable under weak convergence, we conclude that also the limits satisfy the same continuity equation \(\dot{\hat{\mu }}^{0}+\mathrm {div}(\hat{J}^{0})=0\), where \(\hat{J}^{0}\) is the weak*-limit of \(\hat{J}^{\varepsilon }\) (see Lemma 5.7). Using (4.10), the bound on the dissipation functional (4.2) implies a bound \(\int _{0}^{T}\int _{\Omega }\mathsf {Q}(\hat{c}^{0},\hat{J}^{0})\mathrm {d}x\mathrm {d}t<\infty \). Let us define the transport velocity \(\hat{v}\in \mathcal{M}([0,T]\times \Omega ,\mathbb {R})\)

$$\begin{aligned} \hat{v}={\left\{ \begin{array}{ll} \frac{\hat{J}}{\hat{c}} &{} \mathrm {for}\ \hat{c}>0\\ 0 &{} \mathrm {for}\ \hat{c}=0 \end{array}\right. }\ . \end{aligned}$$

Then \(\int _{0}^{T}\int _{\Omega }\mathsf {Q}(\hat{c},\hat{J})\mathrm {d}x\mathrm {d}t=\frac{1}{2}\int _{0}^{T}\int _{\Omega }|\hat{v}|^{2}\hat{c}\mathrm {d}x\mathrm {d}t=\frac{1}{2}\int _{0}^{T}\int _{\Omega }|\hat{v}|^{2}\mathrm {d}\hat{\mu }\mathrm {d}t\) and the bound on the dissipation functional implies the bound on the Borel velocity field \(\Vert \hat{v}\Vert _{\mathrm {L}^{2}(\hat{\mu })}<\infty \). Hence, by Theorem 8.3.1 from [1] it follows that \(t\mapsto \hat{\mu }(t)\in \left( \mathrm {Prob}(\Omega ),{\mathcal{W}_{2}}\right) \) has an absolutely continuous representative.

To prove time-regularity for \(\mu _{i}^{0}\) for \(i=1,2\), we first observe that \(\mu _{i}^{0}=\frac{w_{i}^{V}}{w_{1}^{V}+w_{2}^{V}}\hat{\mu }\) is a non-negative Radon measure. To show that it has an absolutely continuous representative, we proceed exactly as in Lemma 5.6 and exploit again the dual formulation of the 1-Wasserstein distance on the space of non-negative Radon measures. Here, we use that \(\frac{w_{i}^{V}}{w_{1}^{V}+w_{2}^{V}}\) is in \(\mathrm {C}^1(\Omega )\).

5.2 Liminf-estimate

Once compactness is established, the proof of the liminf-estimate is comparatively easy.

Theorem 5.12

Let \(\mu ^{\varepsilon }\rightarrow \mu ^{0}\) in \({ \mathrm {L}_{w}^{\infty }}([0,T],Q)\) such that \(\sup _{\varepsilon \in ]0,1]}\sup _{t\in [0,T]}\mathcal {E}(\mu ^{\varepsilon }(t))<\infty \). Then, we have the liminf-estimate

$$\begin{aligned} \liminf _{\varepsilon \rightarrow 0}\mathfrak {D}_{\varepsilon }^V(\mu ^{\varepsilon })\ge \mathfrak {D}_{0}^V(\mu ^{0}), \end{aligned}$$

where the limit dissipation functional is defined as in Theorem 4.1.

Proof

We may assume that \(\mathfrak {D}_{\varepsilon }^V(\mu ^{\varepsilon })\le C<\infty \) (otherwise the claim is trivial). For the given curves \(t\mapsto \mu ^{\varepsilon }(t)\in Q\) take diffusive fluxes \(J^{\varepsilon }\) and reactive fluxes \(b^{\varepsilon }\), which satisfy the generalized continuity equation

$$\begin{aligned} (c,J,b)\in \mathrm {{(gCE)}}\ \ \Leftrightarrow \ \ \left\{ b_{1}+b_{2}=0\ \mathrm {and}\ \left\{ \begin{array}{c} \dot{c}_{1}=-\mathrm {div}J_{1}+b_{1}\\ \dot{c}_{2}=-\mathrm {div}J_{2}+b_{2} \end{array}\right\} \right\} , \end{aligned}$$

and approximate the infimum in \(\mathfrak {D}_{\varepsilon }^V(\mu ^{\varepsilon })\) arbitrarily closely, i.e.

$$\begin{aligned} \mathfrak {D}_{\varepsilon }^V(\mu ^{\varepsilon })+\varepsilon \ge \int _{0}^{T}\mathcal {D}_{\varepsilon }^V(\mu ^{\varepsilon },J^{\varepsilon },b^{\varepsilon })\mathrm {d}t. \end{aligned}$$

The integrand \(\mathcal{D}^V_{\varepsilon }\) consists of a velocity and a slope part and both of them split into a reaction and a diffusion part:

$$\begin{aligned} {{\mathcal {D}^V_{\varepsilon }}}{(\mu ,J,b)}=&\int _{\Omega }\sum _{j=1}^{2}\widetilde{\mathsf {Q}}\left( \delta _{j}c_{j},J_{j}\right) \mathrm {d}x+\int _{\Omega }\widetilde{\mathsf {C}}\left( \frac{\sqrt{c_{1}c_{2}}}{\varepsilon },b_{2}\right) \mathrm {d}x+\nonumber \\&\quad +\frac{1}{2}\int _{\Omega }\sum _{j=1}^{2}\delta _{j}w_{j}^{V}\frac{\left| \nabla \rho _{j}^{V}\right| ^{2}}{\rho _{j}^{V}}\mathrm {d}x+\frac{2}{\varepsilon }\int _{\Omega }\sqrt{w_{1}^{V}w_{2}^{V}}\left( \sqrt{\rho _{1}^{V}}-\sqrt{\rho _{2}^{V}}\right) ^{2}\mathrm {d}x\!\\ =:&\mathcal {V}_{\mathrm {diff}}(\mu ,J)+\mathcal {V}_{\mathrm {react,}\varepsilon }(\mu ,b)+\mathcal {S}_{\mathrm {diff}}(\mu )+\mathcal {S}_{\mathrm {react,}\varepsilon }(\mu )\nonumber \ . \end{aligned}$$
(5.1)

Clearly, we also have \(\int _{0}^{T}\mathcal {D}^V_{\varepsilon }(\mu ^{\varepsilon },J^{\varepsilon },b^{\varepsilon })\mathrm {d}t\le C+\varepsilon <\infty \). By Lemma 5.10, we conclude compactness for the densities \(c^{\varepsilon }\rightarrow c^{0}\) in \(\mathrm {L}^{1}([0,T]\times \Omega \times \left\{ 1,2\right\} )\) and by Lemma 5.7 that \(J^{\varepsilon }{\mathop {\rightharpoonup }\limits ^{*}}J^{0}\) in \(\mathcal{M}([0,T]\times \Omega \times \left\{ 1,2\right\} )\). Using the lower-semicontinuity result from Lemma 5.2 (which implies the liminf-estimates for \(\mathcal{V}_{\mathrm {diff}}\) and \(\mathcal{S}_{\mathrm {diff}}\)) and that \(\mathcal{V}_{\mathrm {react},\varepsilon },\mathcal{S}_{\mathrm {react},\varepsilon }\ge 0\), we obtain the estimate \(\liminf _{\varepsilon \rightarrow 0}\int _{0}^{T}\mathcal {D}^V_{\varepsilon }(\mu ^{\varepsilon },J^\varepsilon ,b^\varepsilon ) \mathrm {d}t\ge \int _{0}^{T} \mathcal {D}^V_0(\mu ^0,J^0,b^0)\ \mathrm {d}t\), where we analogously define

$$\begin{aligned} \mathcal {D}_0^V(\mu ,J,b) := \int _{\Omega }\sum _{j=1}^{2}\widetilde{\mathsf {Q}}(\delta _{j}c_{j},J_{j})\mathrm {d}x+\frac{1}{2}\int _{\Omega }\sum _{j=1}^{2}\delta _{j}w_{j}^{V}\frac{|\nabla \rho _{j}^{V}|^{2}}{\rho _{j}^{V}}\mathrm {d}x. \end{aligned}$$
(5.2)

Let us define \(u_{i}^{\varepsilon }=\dot{c}_{i}^{\varepsilon }+\mathrm {div}J_{i}^{\varepsilon }\). We conclude convergence for \(u_{i}^{\varepsilon }\rightarrow u_{i}^{0}\) in the sense of distributions, and, moreover, we have \(u_{1}^{\varepsilon }+u_{2}^{\varepsilon }\rightarrow \dot{c}_{1}^{0}+\dot{c}_{2}^{0}+\mathrm {div}J_{1}^{0}+\mathrm {div}J_{2}^{0}=0=u^0_1+u^0_2\). In particular, we conclude the pointwise estimate

$$\begin{aligned} \int _{\Omega }\sum _{j=1}^{2}\widetilde{\mathsf {Q}}(\delta _{j}c_{j}^{0},J_{j}^{0})\mathrm {d}x\ge \inf _{(J,u)}\left\{ \int _{\Omega }\sum _{j=1}^{2}\widetilde{\mathsf {Q}}(\delta _{j}c_{j}^{0},J_{j})\mathrm {d}x:\left\{ \begin{array}{c} \dot{c}_{1}^{0}+\mathrm {div}J_{1} = u_{1}\\ \dot{c}_{2}^{0}+\mathrm {div}J_{2} = u_{2}\\ u_{1}+u_{2}=0 \end{array}\right\} \right\} \ , \end{aligned}$$

which finally establishes the liminf-estimate \( \liminf _{\varepsilon \rightarrow 0}\int _{0}^{T}\mathcal {D}^V_{\varepsilon }(\mu ^{\varepsilon },J^\varepsilon ,b^\varepsilon )\mathrm {d}t\ge \mathfrak {D}_{0}^{V}(\mu ^{0})\).

5.3 Construction of the recovery sequence

In this section, we construct the recovery sequence for the functional \(\mathfrak {D}_{0}^{V}\) to finish the \(\Gamma \)-convergence result in Theorem 4.1. To be precise, we will show the following:

Theorem 5.13

Let \(\mu ^{0}\in \mathrm {L}_{w}^{\infty }([0,T]\times \Omega ,\mathbb {R}_{\ge 0}^{2})\) such that the a priori bounds \(\mathfrak {D}_{0}^{V}(\mu ^{0})<\infty \) and \(\mathrm {ess\,sup}_{t\in [0,T]}\mathcal {E}(\mu ^{0}(t))<\infty \) hold. Then there is a sequence \((\mu ^{\varepsilon })_{\varepsilon >0}\), \(\mu ^{\varepsilon }\in \mathrm {AC}([0,T],Q)\), \(\sup _{\varepsilon >0}\ \mathrm {ess\,sup}_{t\in [0,T]}\mathcal {E}(\mu ^{\varepsilon }(t))<\infty \), such that the densities converge \(c^{\varepsilon }\rightarrow c^{0}\) strongly in \(\mathrm {L}^{1}([0,T]\times \Omega \times \left\{ 1,2\right\} )\) and we have \(\mathfrak {D}_{\varepsilon }^{V}(\mu ^{\varepsilon })\rightarrow \mathfrak {D}_{0}^{V}(\mu ^{0})\). \(\square \)

Using Proposition 4.4, we see that the limit functionals do not contain more information than the functionals in coarse-grained variables, and that \(\mathfrak {D}_{0}^{V}(\mu ^{0})=\hat{\mathfrak {D}}(\hat{\mu }^{0})\) holds. Hence, we may reconstruct the dissipation functional with the corresponding diffusion and reaction flux \((J,b)\) from the coarse-grained variables \((\hat{c},\hat{J})\).

Technical difficulties arise because the dissipation functional \(\mathfrak {D}_{0}^{V}\) is defined not only for solutions of the evolution equation (1.2) but for general trajectories or fluctuations. These fluctuations are in general neither strictly positive nor smooth. The proof of the limsup-estimate is done in several steps, which are elaborated in the next lemmas. The bound \(\mathfrak {D}_{0}^{V}(\mu ^{0})<\infty \) can be assumed without loss of generality because the other case is already treated in the liminf-estimate.

Proof of Theorem 5.13

We do the reconstruction in three steps using different approximation methods:

  1. Proposition 5.21 shows that for \(\mu ^{0}=c^{0}\mathrm {d}x\) with sufficiently smooth and positive density \(c^{0}\) the constant sequence \(\mu ^{\theta ,\gamma }=\mu ^{0}\) satisfies \(\left| \mathfrak {D}_{\varepsilon }^V(\mu ^{\theta ,\gamma })-\mathfrak {D}_{0}^V(\mu ^{\theta ,\gamma })\right| \rightarrow 0\).

  2. Lemma 5.16 overcomes the positivity assumption, i.e. it shows that for all \(\mu ^{0}=c^{0}\mathrm {d}x\) there is a positive \(c^{\gamma }\) such that \(c^{\gamma }\rightarrow c^{0}\) and \(\mathfrak {D}_{0}^V(\mu ^{\gamma })\rightarrow \mathfrak {D}_{0}^V(\mu ^{0})\) as \(\gamma \rightarrow 0\).

  3. A mollification argument as in [1, Lemma 8.1.10] and stated in Lemma 5.20 allows us to overcome regularity by smoothing, which shows \(\mathfrak {D}_{0}^V(\mu ^{\theta ,\gamma })\rightarrow \mathfrak {D}_{0}^V(\mu ^{\gamma })\) as \(\theta \rightarrow 0\).

Hence, defining the recovery sequence \(\mu ^{\varepsilon }:=\mu ^{\theta ,\gamma }\), we have

$$\begin{aligned} \left| \mathfrak {D}_{\varepsilon }^V(\mu ^{\varepsilon })-\mathfrak {D}_{0}^V(\mu ^{0})\right| \le \left| \mathfrak {D}_{\varepsilon }^V(\mu ^{\theta ,\gamma })-\mathfrak {D}_{0}^V(\mu ^{\theta ,\gamma })\right| +\left| \mathfrak {D}_{0}^V(\mu ^{\theta ,\gamma })-\mathfrak {D}_{0}^V(\mu ^{\gamma })\right| +\left| \mathfrak {D}_{0}^V(\mu ^{\gamma })-\mathfrak {D}_{0}^V(\mu ^{0})\right| , \end{aligned}$$

where the first term tends to zero by the first reconstruction step, the second term tends to zero by the third reconstruction step and the third term tends to zero by the second reconstruction step, which in total proves the desired convergence. \(\square \)

Before performing the three recovery steps in Sect. 5.3.2, we first illustrate the general idea of constructing the recovery sequence neglecting positivity and regularity issues for the moment.

5.3.1 Construction of recovery sequence for smooth and positive measures

To show the general idea, let us first assume that the density of \(\hat{\mu }\) is sufficiently smooth and positive, i.e. we assume that its Lebesgue density satisfies \(\hat{c}\ge \frac{1}{C}>0\) on \(\Omega \times [0,T]\) and has enough regularity that will be specified below. Let \(\hat{J}\) be the diffusion flux which provides the minimum in \(\hat{\mathfrak {D}}(\hat{\mu }^{0})=\mathfrak {D}_{0}^{V}(\mu )\) and satisfies \(\dot{\hat{c}}+\mathrm {div}(\hat{J})=0\). We define the reconstructed variables (pointwise in \(\Omega \times [0,T]\)) by

$$\begin{aligned} c_{1}&=\frac{w_{1}^{V}}{w_{1}^{V}+w_{2}^{V}}\hat{c},\quad c_{2}=\frac{w_{2}^{V}}{w_{1}^{V}+w_{2}^{V}}\hat{c},\quad J_{1}=\frac{\delta _{1}w_{1}^{V}}{\delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}}\hat{J},\quad J_{2}=\frac{\delta _{2}w_{2}^{V}}{\delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}}\hat{J},\nonumber \\ b_{1}&=\left( \frac{\delta _{1}-\delta _{2}}{\delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}}\frac{w_{1}^{V}w_{2}^{V}}{w_{1}^{V}+w_{2}^{V}}\right) \mathrm {div}\hat{J}+\hat{J}\cdot \nabla \left( \frac{\delta _{1}w_{1}^{V}}{\delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}}\right) ,\quad b_{2}=-b_{1}. \end{aligned}$$
(5.3)

The reconstructed concentrations \(c=(c_{1},c_{2})\) and diffusion fluxes \(J=(J_{1},J_{2})\) are proportional to the coarse-grained concentration \(\hat{c}\) and diffusion flux \(\hat{J}\), respectively. On the coarse-grained level, which considers only one species, there is no reaction flux anymore. (This changes when considering large reaction-diffusion systems as explained in Sect. 6.) The reactive flux \(b=(b_{1},b_{2})\) is given as a function of the coarse-grained diffusion flux \(\hat{J}\), which means that in the limit the diffusion determines the hidden reaction.

Concerning regularity issues, we immediately observe the following. Since \(w^{V}\) is smooth and positive, \(c_{1},c_{2}\) have the same regularity as \(\hat{c}\) and also \(J_{1},J_{2}\) have both the same regularity as \(\hat{J}\). In contrast, the reaction fluxes \(b_i\) are a priori not well-defined by the pointwise expression (5.3) for general diffusion fluxes \(\hat{J}\in \mathcal{M}(\Omega \times [0,T],\mathbb {R}^{d})\). In the following we are going to regularize \(\hat{J}\) to make the above formula exact.

Remark 5.14

Note that regularity assumptions for \(\mathrm {div}\hat{J}\) are not needed if \(\delta _{1}=\delta _{2}\), i.e. if both species diffuse with the same diffusion constant. In particular, in this situation no regularization argument as in Lemma 5.20 is necessary. Moreover, no additional regularity for \(\hat{J}\) is needed if \(\frac{\delta _{1}w_{1}^{V}}{\delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}}=\vartheta \in \ ]0,1[\) is constant. This is equivalent to

$$\begin{aligned} V_{1}(x)-V_{2}(x)=\log \left( \frac{1-\vartheta }{\vartheta }\frac{\delta _{1}}{\delta _{2}}\frac{\beta }{\alpha }\right) =\mathrm {const}, \end{aligned}$$

which means that the potentials \(V_{1},V_{2}\) differ by a constant on \(\Omega \), implying that \(\nabla \hat{V}=\nabla V_{1}=\nabla V_{2}\). As we will see, enough regularity for \(\hat{J}\) is already obtained from bounds on the dissipation functional.

So let us assume for the moment that \(b_{i}\) is well-defined. Then we conclude that \((c,J,b)\) solves the generalized continuity equation (gCE), because we have

$$\begin{aligned} \dot{c}_{1}+\mathrm {div}J_{1}&=\frac{w_{1}^{V}}{w_{1}^{V}+w_{2}^{V}}\dot{\hat{c}}+\mathrm {div}\left( \frac{\delta _{1}w_{1}^{V}}{\delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}}\hat{J}\right) \\&=\frac{w_{1}^{V}}{w_{1}^{V}+w_{2}^{V}}\dot{\hat{c}}+\hat{J}\cdot \nabla \left( \frac{\delta _{1}w_{1}^{V}}{\delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}}\right) +\left( \frac{\delta _{1}w_{1}^{V}}{\delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}}\right) \mathrm {div}\hat{J}\\&=-\frac{w_{1}^{V}}{w_{1}^{V}+w_{2}^{V}}\mathrm {div}\hat{J}+\hat{J}\cdot \nabla \left( \frac{\delta _{1}w_{1}^{V}}{\delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}}\right) +\left( \frac{\delta _{1}w_{1}^{V}}{\delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}}\right) \mathrm {div}\hat{J}=b_{1}, \end{aligned}$$

where we used that \((\hat{c},\hat{J})\) solves \(\dot{\hat{c}}+\mathrm {div}\hat{J}=0\). Similarly, we see that \(\dot{c}_{2}+\mathrm {div}J_{2}=b_{2}\) and, by definition, we have \(b_{1}+b_{2}=0\). Moreover, boundary properties of \(\hat{J}\) remain for \(J=(J_{1},J_{2})\).
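The last equality in the computation above rests on a purely algebraic identity for the prefactor of \(\mathrm {div}\hat{J}\). As a hedged illustration, the following sketch checks this identity in exact rational arithmetic for a few sample values; the function names and the sample values for \(\delta _{j}\) and \(w_{j}^{V}\) are assumptions chosen for this example.

```python
from fractions import Fraction


def div_coefficient_gap(delta1, delta2, w1, w2):
    """Coefficient of div(J_hat) in c1_dot + div(J1), computed directly."""
    return delta1 * w1 / (delta1 * w1 + delta2 * w2) - w1 / (w1 + w2)


def a1_from_formula(delta1, delta2, w1, w2):
    """Prefactor of div(J_hat) in the formula (5.3) for b_1."""
    return ((delta1 - delta2) / (delta1 * w1 + delta2 * w2)
            * (w1 * w2) / (w1 + w2))


samples = [
    (Fraction(1), Fraction(3), Fraction(2, 5), Fraction(3, 5)),
    (Fraction(7, 2), Fraction(1, 4), Fraction(1, 3), Fraction(5, 3)),
    (Fraction(2), Fraction(2), Fraction(1, 2), Fraction(9, 2)),  # equal diffusion
]
for delta1, delta2, w1, w2 in samples:
    gap = div_coefficient_gap(delta1, delta2, w1, w2)
    assert gap == a1_from_formula(delta1, delta2, w1, w2)
    print(delta1, delta2, gap)
```

In the third sample the prefactor vanishes, in line with Remark 5.14: for \(\delta _{1}=\delta _{2}\) no regularity of \(\mathrm {div}\hat{J}\) is needed.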

For the dissipation functionals, we obtain that

$$\begin{aligned} \mathfrak {D}_\varepsilon ^V(\mu )\le \int _0^T\mathcal {D}^V_\varepsilon (\mu ,J,b)\,\mathrm {d}t = \mathfrak {D}_0^V (\mu ) + \int _{0}^{T}\!\!\int _{\Omega }\!\widetilde{\mathsf {C}}\left( \frac{\sqrt{c_{1}c_{2}}}{\varepsilon },b_{2}\right) \mathrm {d}x\mathrm {d}t, \end{aligned}$$

where we used formula (5.1) and that the reconstructed concentrations satisfy \(\frac{c_{1}}{w_{1}^{V}}=\frac{c_{2}}{w_{2}^{V}}\).

This means that, in order to prove that the constant sequence \(\mu ^{\varepsilon }=\mu \) is a recovery sequence, it suffices to show that \(\int _{0}^{T}\!\!{\int _{\Omega }}\! \widetilde{\mathsf {C}}\left( \frac{\sqrt{c_{1}c_{2}}}{\varepsilon },b_{2}\right) \mathrm {d}x\mathrm {d}t\rightarrow 0\) as \(\varepsilon \rightarrow 0\). This is, in fact, shown in the next lemma under the assumption that \(\mathrm {div}\hat{J},\hat{J}\in \mathrm {L}^{\mathsf {C}}([0,T]\times \Omega )\). The proof basically uses the monotonicity property of the Legendre dual function, \(\widetilde{\mathsf {C}}(a_{1},b)\le \widetilde{\mathsf {C}}(a_{2},b)\) for \(a_{1}\ge a_{2}\) (see Lemma 4.3), its superlinear growth and the dominated convergence theorem.

Lemma 5.15

Let \(\hat{c}\in \mathrm {L}^{1}(\Omega \times [0,T])\) with \(\hat{c}\ge \frac{1}{C}\) a.e. in \([0,T]\times \Omega \) for a constant \(C>0\), and let \(\hat{J}:[0, T]\, \times \Omega \rightarrow \mathbb {R}^{d}\) satisfy \(\mathrm {div}\hat{J},|\hat{J}|\in \mathrm {L}^{\mathsf {C}}([0,T]\times \Omega )\). Then for \(b_2\) defined by (5.3) we have \(\int _{0}^{T}\!\!{\int _{\Omega }}\widetilde{\mathsf {C}}\left( \frac{\sqrt{c_{1}c_{2}}}{\varepsilon },b_{2}\right) \mathrm {d}x\mathrm {d}t\rightarrow 0\) as \(\varepsilon \rightarrow 0\).

Proof

Since \(\hat{c}\ge \frac{1}{C}\) a.e., Lemma 4.3 yields the pointwise estimate \(\widetilde{\mathsf {C}}\left( \frac{\sqrt{c_{1}c_{2}}}{\varepsilon },b_{2}\right) \le \widetilde{\mathsf {C}}\left( \frac{1}{C\varepsilon },b_{2}\right) =\frac{1}{C\varepsilon }\mathsf {C}\left( C\varepsilon b_{2}\right) \). Moreover, using the inequality (3.5) we have

$$\begin{aligned} \int _{0}^{T}{\int _{\Omega }}\widetilde{\mathsf {C}}\left( \frac{\sqrt{c_{1}c_{2}}}{\varepsilon },b_{2}\right) \mathrm {d}x\mathrm {d}t\le \tfrac{1}{C\varepsilon }\int _{0}^{T}\int _{\Omega }\mathsf {C}\left( C\varepsilon b_{2}\right) \mathrm {d}x\mathrm {d}t\le \int _{0}^{T}\int _{\Omega }2|b_{2}|\log (C\varepsilon |b_{2}|+1)\mathrm {d}x\mathrm {d}t. \end{aligned}$$

By assumption, we have that \(\mathrm {div}\hat{J},|\hat{J}|\in \mathrm {L}^{\mathsf {C}}([0,T]\times \Omega )\). Since \(V\in \mathrm {C}^{1}(\Omega )\) and the Orlicz space \(\mathrm {L}^{\mathsf {C}}([0,T]\times \Omega )\) is a Banach space, we conclude that \(b_{2}\in \mathrm {L}^{\mathsf {C}}([0,T]\times \Omega )\). By the inequality (3.5) for \(\mathsf {C}\), this implies that for \(\varepsilon <\frac{1}{C}\), the right-hand side is bounded. We show that (for a subsequence) the integrand converges to zero pointwise a.e. in \([0,T]\times \Omega \). By the dominated convergence theorem, this would imply that \(\int _{0}^{T}{\int _{\Omega }}\widetilde{\mathsf {C}}\left( \frac{\sqrt{c_{1}c_{2}}}{\varepsilon },b_{2}\right) \mathrm {d}x\mathrm {d}t\rightarrow 0\) as \(\varepsilon \rightarrow 0\).

To see that the integrand converges to zero pointwise, we first observe that \(b_{2}\in \mathrm {L}^{\mathsf {C}}([0,T]\times \Omega )\subset \mathrm {L}^{1}([0,T]\times \Omega )\), which means that

$$\begin{aligned} \int _{0}^{T}\int _{\Omega }\log (\varepsilon |b_{2}|+1)\mathrm {d}x\mathrm {d}t\le \varepsilon \int _{0}^{T}\int _{\Omega }|b_{2}|\mathrm {d}x\mathrm {d}t\rightarrow 0. \end{aligned}$$

Hence, (for a subsequence) \(\log (\varepsilon |b_{2}|+1)\) converges pointwise to zero and, thus, also \(|b_{2}|\log (C\varepsilon |b_{2}|+1)\). \(\square \)
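The mechanism of this proof can be illustrated numerically. The sketch below is a rough illustration under stated assumptions, not a substitute for the argument: it evaluates only the dominating integrand \(2|b_{2}|\log (C\varepsilon |b_{2}|+1)\) from the estimate above on a one-dimensional grid, for a hypothetical integrable sample flux `b2` and a hypothetical constant `C`, and records the Riemann sums for decreasing \(\varepsilon \).

```python
import math


def dominating_integrand(b, eps, C):
    """Upper bound 2|b| log(C*eps*|b| + 1) from the estimate above."""
    return 2.0 * abs(b) * math.log(C * eps * abs(b) + 1.0)


def riemann_sum(values, cell):
    """Midpoint-rule approximation of an integral on a uniform grid."""
    return sum(values) * cell


# Hypothetical sample of an integrable (but unbounded) flux on ]0, 1[.
n = 2000
grid = [(k + 0.5) / n for k in range(n)]
b2 = [x ** (-0.25) * math.sin(12.0 * x) for x in grid]
C = 4.0

integrals = []
for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    vals = [dominating_integrand(b, eps, C) for b in b2]
    integrals.append(riemann_sum(vals, 1.0 / n))

print(integrals)
```

For fixed \(b_{2}\) the integrand is monotone in \(\varepsilon \), so its value at the largest admissible \(\varepsilon \) serves as the integrable majorant in the dominated convergence theorem.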

In fact, the above proof is quite robust and already suggests that the same convergence holds even if \(\hat{c}^{\varepsilon }\) slowly converges to zero in \(\Omega \times [0,T]\) and \(\Vert b_{2}^{\varepsilon }\Vert _{\mathrm {L}^p}\approx \varepsilon ^{-\alpha }\) for some \(\alpha >0\) and \(p>1\) (see Proposition 5.21).

5.3.2 Auxiliary results for constructing the recovery sequence for general measures

First, we show how to overcome the positivity assumption by a controlled positive shift.

Lemma 5.16

For all \(\mu ^{0}=c^{0}\,\mathrm {d}x\) satisfying \(\mathfrak {D}_{0}^{V}(\mu ^{0})=\hat{\mathfrak {D}}(\hat{\mu }^{0})<\infty \) there is a sequence \((\hat{c}^{\gamma })_{\gamma >0}\) of densities satisfying \(\hat{c}^{\gamma }\ge \gamma \), \(\hat{c}^{\gamma }\rightarrow \hat{c}\) in \(\mathrm {L}^{1}([0,T]\times \Omega )\) as \(\gamma \rightarrow 0\) such that for their corresponding measures we have \(\hat{\mathfrak {D}}(\hat{\mu }^{\gamma })=\mathfrak {D}_{0}(\mu ^{\gamma })\rightarrow \mathfrak {D}_{0}(\mu ^{0})=\hat{\mathfrak {D}}(\mu ^{0})\) as \(\gamma \rightarrow 0\) and \(\sup _{\gamma \in ]0,1]}\mathrm {ess\,sup}_{t\in [0,T]}\mathcal {E}(\mu ^{\gamma }(t))<\infty \).

Proof

For small \(\gamma >0\), we define \(\hat{c}^{\gamma }:=\frac{1}{Z_{\gamma }}(\hat{c}+2\gamma )\), where \(Z_{\gamma }=1+2\gamma |\Omega |=1+2\gamma >0\) is the normalization factor such that \(\int _{\Omega }\hat{c}^{\gamma }\mathrm {d}x=1\). Hence, \(Z_{\gamma }\searrow 1\), \(\hat{c}^{\gamma }\rightarrow \hat{c}\), \(\sup _{\gamma \in ]0,1]}\mathrm {ess\,sup}_{t\in [0,T]}\mathcal {E}(\mu ^{\gamma }(t))<\infty \) and w.l.o.g. we assume that \(\hat{c}^{\gamma }\ge \gamma \). Moreover, we define \(\hat{J}^{\gamma }:=\frac{1}{Z_{\gamma }}\hat{J}\). Clearly, \(\hat{J}^{\gamma }\cdot \nu =0\) on \(\partial \Omega \) and \((\hat{c}^{\gamma },\hat{J}^{\gamma })\) solves the continuity equation \(\dot{\hat{c}}^{\gamma }+\mathrm {div}\hat{J}^{\gamma }=0\). We compute the terms in the dissipation functional \(\mathfrak {D}_{0}^{V}(\mu ^{\gamma })\). We have \(\hat{\rho }^{V,\gamma }=\frac{\hat{c}^{\gamma }}{w^{V}}=\frac{1}{Z_{\gamma }}\frac{\hat{c}+2\gamma }{w^{V}}\). Using \(\max \left\{ \Vert \nabla \left( 1/w^{V}\right) \Vert _{\mathrm {L}^\infty },\Vert w^{V}\Vert _{\mathrm {L}^\infty }\right\} \le C\), we get

$$\begin{aligned}&\nabla \hat{\rho }^{V,\gamma }=\frac{1}{Z_{\gamma }}\left\{ \nabla \left( \frac{\hat{c}}{w^{V}}\right) +2\gamma \nabla \left( \frac{1}{w^{V}}\right) \right\} \ \ \Rightarrow \ \ \left| \nabla \hat{\rho }^{V,\gamma }\right| \le \frac{1}{Z_{\gamma }}\left\{ \left| \nabla \hat{\rho }^{V}\right| +2\gamma C\right\} . \end{aligned}$$

Using the estimate \(\frac{1}{a+\delta }\le \frac{1}{a}\) for \(a,\delta >0\), \(\frac{1}{Z_{\gamma }}\le 1\) and the inequality \(2xy\le \sqrt{\gamma }x^{2}+\frac{1}{\sqrt{\gamma }}y^{2}\), we get the pointwise estimate

$$\begin{aligned} \frac{|\nabla \hat{\rho }^{V,\gamma }|^{2}}{\hat{\rho }^{V,\gamma }}&\le \frac{1}{Z_{\gamma }}\frac{\left( \left| \nabla \hat{\rho }^{V}\right| +2\gamma C\right) ^{2}}{\hat{\rho }^{V}+\frac{2\gamma }{w^{V}}}=\frac{1}{Z_{\gamma }}\frac{\left| \nabla \hat{\rho }^{V}\right| ^{2}+4\gamma C\left| \nabla \hat{\rho }^{V}\right| +4\gamma ^{2}C^{2}}{\hat{\rho }^{V}+\frac{2\gamma }{w^{V}}}\\&\le \frac{\left| \nabla \hat{\rho }^{V}\right| ^{2}}{\hat{\rho }^{V}}+2\frac{2\gamma C\left| \nabla \hat{\rho }^{V}\right| }{\hat{\rho }^{V}+\frac{2\gamma }{w^{V}}}+\frac{4\gamma ^{2}C^{2}}{\hat{\rho }^{V}+\frac{2\gamma }{w^{V}}}\\&\le \frac{\left| \nabla \hat{\rho }^{V}\right| ^{2}}{\hat{\rho }^{V}}+\frac{1}{\hat{\rho }^{V}+\frac{2\gamma }{w^{V}}}\left( \sqrt{\gamma }\left| \nabla \hat{\rho }^{V}\right| ^{2}+\tfrac{1}{\sqrt{\gamma }}\left\{ 2\gamma C\right\} ^{2}\right) +2\gamma C^{2}w^{V}\\&\le \frac{\left| \nabla \hat{\rho }^{V}\right| ^{2}}{\hat{\rho }^{V}}(1+\sqrt{\gamma })+2\sqrt{\gamma }w^{V}C^{2}(1+\sqrt{\gamma }). \end{aligned}$$

Hence, \(\int _{0}^{T}\int _{\Omega }\hat{\delta }^{V}w^{V}\frac{|\nabla \hat{\rho }^{V,\gamma }|^{2}}{\hat{\rho }^{V,\gamma }}\mathrm {d}x\mathrm {d}t\rightarrow \int _{0}^{T}\int _{\Omega }\hat{\delta }^{V}w^{V}\frac{|\nabla \hat{\rho }^{V}|^{2}}{\hat{\rho }^{V}}\mathrm {d}x\mathrm {d}t\) as \(\gamma \rightarrow 0\).

Similarly, we get

$$\begin{aligned} \int _{0}^{T}\int _{\Omega }\frac{|\hat{J}^{\gamma }|^{2}}{\hat{\delta }^V\hat{c}^{\gamma }}\mathrm {d}x\mathrm {d}t=\frac{1}{Z_{\gamma }}\int _{0}^{T}\int _{\Omega }\frac{|\hat{J}|^{2}}{\hat{\delta }^V\left( \hat{c}+2\gamma \right) }\mathrm {d}x\mathrm {d}t\le \int _{0}^{T}\int _{\Omega }\frac{|\hat{J}|^{2}}{\hat{\delta }^V\hat{c}}\mathrm {d}x\mathrm {d}t, \end{aligned}$$

which implies that \(\int _{0}^{T}\!\!\int _{\Omega }\widetilde{\mathsf {Q}}(\hat{\delta }^V\hat{c}^{\gamma },\hat{J}^{\gamma })\mathrm {d}x\mathrm {d}t\le \int _{0}^{T}\!\!\int _{\Omega }\widetilde{\mathsf {Q}}(\hat{\delta }^V\hat{c},\hat{J})\mathrm {d}x\mathrm {d}t\). Hence, we conclude that \(\hat{c}^{\gamma }\ge \gamma \), \(\hat{c}^{\gamma }\rightarrow \hat{c}\) and \(\hat{\mathfrak {D}}(\hat{c}^{\gamma })\rightarrow \hat{\mathfrak {D}}(\hat{c})\) as \(\gamma \rightarrow 0\). \(\square \)
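The pointwise estimate for the shifted slope term used in this proof can be checked on random samples. The following sketch is an illustration only; the function names and the sampling ranges for \(|\nabla \hat{\rho }^{V}|\), \(\hat{\rho }^{V}\), \(\gamma \), \(w^{V}\) and \(C\) are assumptions made for this example. It compares the quantity \(\frac{1}{Z_{\gamma }}\bigl (|\nabla \hat{\rho }^{V}|+2\gamma C\bigr )^{2}/\bigl (\hat{\rho }^{V}+2\gamma /w^{V}\bigr )\) with the final bound of the chain of inequalities.

```python
import math
import random


def shifted_fisher_ratio(grad, rho, gamma, w, C):
    """Upper bound for |grad rho^gamma|^2 / rho^gamma used in the proof.

    grad : |grad rho|, rho : relative density, w : stationary weight,
    C    : bound for |grad(1/w)| (all hypothetical sample values here).
    """
    Z = 1.0 + 2.0 * gamma
    return (grad + 2.0 * gamma * C) ** 2 / (Z * (rho + 2.0 * gamma / w))


def final_bound(grad, rho, gamma, w, C):
    """Last line of the chain of inequalities."""
    s = math.sqrt(gamma)
    return grad ** 2 / rho * (1.0 + s) + 2.0 * s * w * C ** 2 * (1.0 + s)


random.seed(1)
worst = -math.inf
for _ in range(10000):
    grad = random.uniform(0.0, 50.0)
    rho = random.uniform(1e-3, 10.0)
    gamma = random.uniform(1e-6, 1.0)
    w = random.uniform(0.1, 5.0)
    C = random.uniform(0.1, 5.0)
    gap = (shifted_fisher_ratio(grad, rho, gamma, w, C)
           - final_bound(grad, rho, gamma, w, C))
    worst = max(worst, gap)
print(worst <= 0.0)
```

Since the additive error is of order \(\sqrt{\gamma }\) and the multiplicative factor is \(1+\sqrt{\gamma }\), the bound becomes sharp as \(\gamma \rightarrow 0\).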

Next, we are going to show that the reaction flux \(b_{1}=-b_{2}\) given by (5.3) can be made sufficiently smooth, i.e. at least in \(\mathrm {L}^{\mathsf {C}}\), which would allow us to proceed similarly to Lemma 5.15.

Recalling the formula for the reconstructed flux, we have

$$\begin{aligned} b_{1}=\left( \frac{\delta _{1}-\delta _{2}}{\delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}}\frac{w_{1}^{V}w_{2}^{V}}{w_{1}^{V}+w_{2}^{V}}\right) \mathrm {div}\hat{J}+\hat{J}\cdot \nabla \left( \frac{\delta _{1}w_{1}^{V}}{\delta _{1}w_{1}^{V}+\delta _{2}w_{2}^{V}}\right) =:a_{1}\mathrm {div}\hat{J}+\hat{J}\cdot a_{2}, \end{aligned}$$
(5.4)

and \(b_{2}=-b_{1}\), where \(a_{1}\in \mathrm {C}^{1}(\Omega ,\mathbb {R})\), \(a_{2}\in \mathrm {C}^{0}(\Omega ,\mathbb {R}^{d})\). In particular, the regularity of \(b_{1}=-b_{2}\) does not depend on \(a_{1},a_{2}\). We are going to prove that \(\mathrm {div}\hat{J}\) and \(\hat{J}\) have enough regularity. The regularity of \(\hat{J}\) follows from the bound on the dissipation functional. The regularity of \(\mathrm {div}\hat{J}\) is achieved by mollification.

First, we show that \(\hat{J}\in \mathrm {L}^{\tilde{p}}([0,T]\times \Omega )\) for some \(\tilde{p}>1\). Clearly, we have \(\hat{J}\in \mathrm {L}^{1}([0,T]\times \Omega )\) by Lemma 5.1 and the bound on the dissipation functional. To improve the regularity of \(\hat{J}\), we show that \(\hat{c}\in \mathrm {L}^{p}([0,T]\times \Omega )\), which follows from the bound on the dissipation functional \(\mathfrak {D}_{0}^V\) and the energy functional \(\mathcal {E}\).

Lemma 5.17

Let \(\mu ^{0}\in \mathrm {L}_{w}^{\infty }([0,T],Q)\) such that \(\mathfrak {D}_{0}^{V}(\mu ^{0})<\infty \) and \(\mathrm {ess\,sup}_{t\in [0,T]}\mathcal {E}(\mu ^{0}(t))<\infty \). Then the density \(\hat{c}^{0}\) is in \(\mathrm {L}^{p}([0,T]\times \Omega )\) with \(p=\frac{d+1}{d}>1\).

Proof

The bound on the energy yields \(\hat{c}\in \mathrm {L}^{\infty }([0,T],\mathrm {L}^{1}(\Omega ))\). The bound on the slope part of the dissipation functional together with Lemma 5.1 provides that \(\hat{c}\in \mathrm {L}^{1}([0,T],\mathrm {W}^{1,1}(\Omega ))\) (see also Lemma 5.7). By the Sobolev embedding theorem, we have the continuous embedding \(\mathrm {W}^{1,1}(\Omega )\subset \mathrm {L}^{q}(\Omega )\) for \(\tfrac{1}{q}\ge 1-\tfrac{1}{d}\Leftrightarrow q\le \tfrac{d}{d-1}\) (the embedding is compact if the inequality is strict). Thus \(\hat{c}\in \mathrm {L}^{\infty }([0,T],\mathrm {L}^{1}(\Omega ))\cap \mathrm {L}^{1}([0,T],\mathrm {L}^{\tfrac{d}{d-1}}(\Omega ))\).

Next, we apply a classical interpolation result for Lebesgue spaces, see e.g. Theorem 5.1.2 of [5]. We have for all \(\theta \in ]0,1[\) that

$$\begin{aligned} {[}\mathrm {L}^{p_{1}}([0,T],X_{1}),\mathrm {L}^{p_{2}}([0,T],X_{2})]_{\theta }\simeq \mathrm {L}^{p_{\theta }}([0,T],[X_{1},X_{2}]_{\theta }), \end{aligned}$$

where \(\tfrac{1}{p_{\theta }}=\tfrac{1-\theta }{p_{1}}+\tfrac{\theta }{p_{2}}\). In our situation we have \(p_{1}=1,p_{2}=\infty \), \(X_{1}=\mathrm {L}^{\tfrac{d}{d-1}}(\Omega ),X_{2}=\mathrm {L}^{1}(\Omega )\). Hence, \(p_{\theta }=\tfrac{1}{1-\theta }>1\). Moreover, \([X_1,X_2]_{\theta }=[\mathrm {L}^{q}(\Omega ),\mathrm {L}^{1}(\Omega )]_{\theta }\simeq \mathrm {L}^{q_{\theta }}(\Omega )\), where \(\tfrac{1}{q_{\theta }}=\tfrac{1-\theta }{q}+\tfrac{\theta }{1}\). Setting \(p_{\theta }=q_{\theta }\), we conclude \(\tfrac{1-\theta }{1-2\theta }=q=\tfrac{d}{d-1}\). Solving \(p_{\theta }=q_{\theta }\) for \(\theta \), we obtain \(\theta =\tfrac{1}{1+d}\), and hence \(p_{\theta }=q_{\theta }=\tfrac{d+1}{d}>1\). Summarizing, we conclude \( \hat{c}\in \mathrm {L}^{\tfrac{d+1}{d}}([0,T]\times \Omega )\). \(\square \)
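The exponent arithmetic in this proof is easy to verify in exact rational arithmetic. The following sketch is a small illustration for \(d\ge 2\); the function name `interpolation_exponent` is an assumption made for this example. It evaluates \(p_{\theta }\) and \(q_{\theta }\) for \(\theta =\tfrac{1}{d+1}\) and confirms that both equal \(\tfrac{d+1}{d}\).

```python
from fractions import Fraction


def interpolation_exponent(d):
    """Return (theta, p_theta, q_theta) for the interpolation between
    L^1(0,T; L^q) and L^infty(0,T; L^1) with q = d/(d-1), d >= 2."""
    q = Fraction(d, d - 1)
    theta = Fraction(1, d + 1)
    p_theta = 1 / (1 - theta)                # 1/p_theta = (1-theta)/1 + theta/infinity
    q_theta = 1 / ((1 - theta) / q + theta)  # 1/q_theta = (1-theta)/q + theta/1
    return theta, p_theta, q_theta


for d in (2, 3, 4, 10):
    theta, p_theta, q_theta = interpolation_exponent(d)
    assert p_theta == q_theta == Fraction(d + 1, d)
    print(d, theta, p_theta)
```

For instance, \(d=2\) gives \(\theta =\tfrac{1}{3}\) and \(\hat{c}\in \mathrm {L}^{3/2}([0,T]\times \Omega )\).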

Remark 5.18

Iterating the above procedure, it is possible to show \(\hat{c}\in \mathrm {L}^{(d+2)/d}([0,T]\times \Omega )\) for \(d\ge 2\).

Knowing integrability of \(\hat{c}\), we get also better integrability of the fluxes \(\hat{J}\in \mathrm {L}^{\tilde{p}}([0,T]\times \Omega )\) for some \(\tilde{p}>1\), which follows from the next lemma.

Lemma 5.19

Let \(\mu ^{0}\in \mathrm {L}_{w}^{\infty }([0,T]\times \Omega ,\mathbb {R}_{\ge 0}^{2})\) such that \(\mathfrak {D}_{0}^{V}(\mu ^{0})<\infty \) and \(\mathrm {ess\,sup}_{t\in [0,T]}\mathcal {E}(\mu ^{0}(t))<\infty \) and let \(\hat{J}\in \mathcal {M}([0,T]\times \Omega ,\mathbb {R}^{d})\) be the corresponding diffusion flux satisfying the continuity equation. Then \(\hat{J}\in \mathrm {L}^{\tilde{p}}([0,T]\times \Omega ,\mathbb {R}^{d})\) for \(\tilde{p}=\frac{2d+2}{2d+1}>1\).

Proof

First, we show that for all \(J\in \mathbb {R}^{d}\) and \(c>0\) we have

$$\begin{aligned} \frac{|J|^{2}}{c}+\frac{1}{p}c^{p}\ge \left( 1+\frac{1}{p}\right) |J|^{\frac{2p}{p+1}}. \end{aligned}$$

To see this, let us define for fixed \(J\in \mathbb {R}^{d}\) the function \(F:]0,\infty [\rightarrow \mathbb {R},\ F(c):=\frac{|J|^{2}}{c}+\frac{1}{p}c^{p}\). Clearly, \(F\ge 0\) and \(F(c)\rightarrow \infty \) as \(c\rightarrow 0\) or \(c\rightarrow \infty \). We compute the minimum. We have \(F'(c)=-|J|^{2}c^{-2}+c^{p-1}\) and hence the critical point is at \(c_{0}=|J|^{2/(p+1)}\). Inserting \(c_{0}\) into F we get \(F(c)\ge F(c_{0})=|J|^{2}|J|^{-2/(p+1)}+\frac{1}{p}|J|^{2p/(p+1)}=(1+\frac{1}{p})|J|^{2p/(p+1)}\).

Having this estimate, we see that by Lemma 5.17 we have \(\hat{c}\in \mathrm {L}^{p}([0,T]\times \Omega )\) for \(p=\frac{d+1}{d}\). This implies \(\hat{J}\in \mathrm {L}^{\tilde{p}}([0,T]\times \Omega ,\mathbb {R}^{d})\) for \(\tilde{p}=\frac{2\frac{d+1}{d}}{\frac{d+1}{d}+1}=\frac{2d+2}{2d+1}\). \(\square \)
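Both ingredients of this proof, the scalar inequality and the resulting exponent \(\tilde{p}=\tfrac{2p}{p+1}\), can be checked quickly. The sketch below is an illustration under assumptions: the function names and the sampling ranges for \(J\), \(c\) and \(p\) are chosen for this example only, and the inequality is tested for scalar \(J\) (it depends on \(J\) only through \(|J|\)).

```python
import random
from fractions import Fraction


def flux_exponent(d):
    """p_tilde = 2p/(p+1) for p = (d+1)/d."""
    p = Fraction(d + 1, d)
    return 2 * p / (p + 1)


def young_gap(J, c, p):
    """Left-hand side minus right-hand side of the scalar inequality."""
    return (J ** 2 / c + c ** p / p
            - (1.0 + 1.0 / p) * abs(J) ** (2.0 * p / (p + 1.0)))


random.seed(2)
min_gap = min(
    young_gap(random.uniform(-20.0, 20.0),
              random.uniform(1e-3, 20.0),
              random.uniform(1.0, 3.0))
    for _ in range(10000)
)
print(min_gap >= -1e-9, [flux_exponent(d) for d in (1, 2, 3)])
```

Equality in the inequality is attained exactly at the critical point \(c_{0}=|J|^{2/(p+1)}\) computed in the proof, which is why a small floating-point tolerance is used in the check.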

To obtain regularity for the whole reaction flux \(b_{1} = -b_{2}\), we have to obtain regularity also for \(\mathrm {div}\hat{J}\). This is done by mollifying the solution \((\hat{c},\hat{J})\) of the continuity equation \(\dot{\hat{c}}+\mathrm {div}\hat{J}=0\) in time. We already know that \(\hat{c}:[0,T]\rightarrow \mathrm {Prob}(\Omega )\) is continuous by Proposition 5.11. With a slight abuse of notation, we denote by \(\hat{c}\) also a continuous extension on \(\mathbb {R}\) such that \(\hat{c}\in \mathrm {L}^{p}(\mathbb {R},\mathrm {L}^{p}(\Omega ))\). Now, we mollify in time and define \(\hat{c}^{\theta }(t)=\int _{\mathbb {R}}\hat{c}(s)\psi _{\theta }(t-s)\mathrm {d}s\), where \(\psi _{\theta }\) is a positive and symmetric mollifier that approximates a Dirac distribution as \(\theta \rightarrow 0\). Analogously, we define \(\hat{J}^{\theta }\) by convolution, i.e. \(\hat{J}^{\theta }(t)=\int _{\mathbb {R}}\hat{J}(s)\psi _{\theta }(t-s)\mathrm {d}s\). Since the continuity equation is linear, the smoothed functions \((\hat{c}^{\theta },\hat{J}^{\theta })\) satisfy again the continuity equation with the same no-flux boundary conditions. The next lemma shows how the dissipation functional \(\mathfrak {D}^V_0(\mu )\) can be approximated using mollification.
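That mollification in time preserves the continuity equation and the no-flux condition is a consequence of linearity only. As a hedged illustration, the following sketch shows the discrete analogue in one space dimension: a finite-volume continuity equation with random interface fluxes is averaged in time with a positive symmetric kernel. The grid sizes, the random fluxes, the kernel weights and the helper name `mollify_in_time` are all assumptions made for this example.

```python
import random


def mollify_in_time(series, weights):
    """Discrete convolution in time with a symmetric kernel of odd length.

    series[k] is the list of spatial values at time step k; only interior
    time steps (where the full kernel fits) are returned.
    """
    m = len(weights) // 2
    out = []
    for k in range(m, len(series) - m):
        row = [sum(weights[i + m] * series[k - i][j] for i in range(-m, m + 1))
               for j in range(len(series[k]))]
        out.append(row)
    return out


random.seed(3)
n_x, n_t, dx, dt = 8, 40, 0.125, 0.001

# Fluxes at the n_x + 1 cell interfaces; no-flux condition: J = 0 at both ends.
J = [[0.0] + [random.uniform(-1.0, 1.0) for _ in range(n_x - 1)] + [0.0]
     for _ in range(n_t)]
# Build c from (c^{k+1}_j - c^k_j)/dt + (J^k_{j+1} - J^k_j)/dx = 0.
c = [[1.0] * n_x]
for k in range(n_t):
    c.append([c[k][j] - dt / dx * (J[k][j + 1] - J[k][j]) for j in range(n_x)])

weights = [0.1, 0.2, 0.4, 0.2, 0.1]  # positive, symmetric, sums to 1
c_s = mollify_in_time(c, weights)
J_s = mollify_in_time(J, weights)

# Residual of the discrete continuity equation for the mollified pair.
residual = max(
    abs((c_s[k + 1][j] - c_s[k][j]) / dt + (J_s[k][j + 1] - J_s[k][j]) / dx)
    for k in range(len(J_s)) for j in range(n_x)
)
print(residual)
```

The residual vanishes up to rounding errors, the boundary fluxes of the mollified flux remain zero, and the total mass is unchanged.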

Lemma 5.20

Let \(\mu ^{0}\in \mathrm {L}_{w}^{\infty }([0,T]\times \Omega ,\mathbb {R}_{\ge 0}^{2})\) be such that the a priori bounds \(\mathfrak {D}_{0}^{V}(\mu ^{0})<\infty \) and \(\mathrm {ess\,sup}_{t\in [0,T]}\mathcal {E}(\mu ^{0}(t))<\infty \) hold. Let \(\hat{J}\in \mathcal {M}([0,T]\times \Omega ,\mathbb {R}^{d})\) be the corresponding diffusion flux satisfying the continuity equation. Let \(\psi _{\theta }:\mathbb {R}\rightarrow \mathbb {R}\) be a positive and symmetric mollifier. Define \(\hat{c}^{\theta }(t)=\int _{\mathbb {R}}\hat{c}(s)\psi _{\theta }(t-s)\mathrm {d}s\) and \(\hat{J}^{\theta }(t)=\int _{\mathbb {R}}\hat{J}(s)\psi _{\theta }(t-s)\mathrm {d}s\). Then, we have \(\sup _{\theta \in ]0,1]}\mathrm {ess\,sup}_{t\in [0,T]}\mathcal {E}(\mu ^{\theta }(t))<\infty \) and \(\int _0^T\mathcal {D}^V_0(\mu ^\theta ,J^\theta ,b^\theta ) \mathrm {d}t\rightarrow \int _0^T\mathcal {D}^V_0(\mu ^0,J^0,b^0)\mathrm {d}t \) as \(\theta \rightarrow 0\).

Proof

The energy bound on \(\mu ^{\theta }\) is trivially satisfied. The convergence of the dissipation functional \(\int _0^T\mathcal {D}_0^V\mathrm {d}t\) follows as in the proof of Lemma 8.1.10 in [1], since the integrand is convex in \((\hat{c},\hat{J})\). \(\square \)
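The linearity argument behind the time mollification can be spelled out formally. The following sketch assumes that \(\hat{J}\) is extended to \(\mathbb {R}\) in such a way that the continuity equation holds in the distributional sense on the support of \(s\mapsto \psi _{\theta }(t-s)\); this extension is an assumption of the illustration.

$$\begin{aligned} \dot{\hat{c}}^{\theta }(t)+\mathrm {div}\hat{J}^{\theta }(t)&=\int _{\mathbb {R}}\left( \dot{\hat{c}}(s)+\mathrm {div}\hat{J}(s)\right) \psi _{\theta }(t-s)\,\mathrm {d}s=0. \end{aligned}$$

The same computation applies to the boundary term, since the convolution acts only in time and therefore commutes with the normal trace on \(\partial \Omega \).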

With these preparations, we are able to show the remaining step in the proof of Theorem 5.13.

Proposition 5.21

Let \(\mu ^{0}\in \mathrm {L}_{w}^{\infty }([0,T]\times \Omega ,\mathbb {R}_{\ge 0}^{2})\) be such that \(\mathfrak {D}_{0}^{V}(\mu ^{0})=\hat{\mathfrak {D}}_{0}(\hat{\mu })<\infty \) and \(\mathrm {ess\,sup}_{t\in [0,T]}\mathcal {E}(\mu ^{0}(t))<\infty \), and let \(\hat{J}\in \mathcal {M}([0,T]\times \Omega ,\mathbb {R}^{d})\) be the corresponding diffusion flux satisfying the continuity equation. Let \(\psi _{\theta }:\mathbb {R}\rightarrow \mathbb {R}\) be a positive and symmetric mollifier, which is specified below. Let \(\hat{c}^{\theta },\hat{J}^{\theta }\) be the mollified functions as in Lemma 5.20 and \(\hat{c}^{\theta ,\gamma },\hat{J}^{\theta ,\gamma }\) the mollified and shifted functions as in Lemma 5.16.

Let \(\psi _{\theta }\) be such that \(\Vert \dot{\hat{c}}^{\theta }\Vert _{\mathrm {L}^{\tilde{p}}([0,T]\times \Omega )}\le C \varepsilon ^{-\lambda _1}\) and \(\gamma \ge C\varepsilon ^{1-\lambda _2}\) for \(\varepsilon , \theta ,\gamma \rightarrow 0\), where \(\tilde{p}=\tfrac{2d+2}{2d+1}\) is the integrability exponent of the fluxes as in Lemma 5.19, \(C>0\) is a positive constant, and \(\lambda _1,\lambda _2\in [0,1[\) satisfy the inequality \(2\lambda _1(d+1)\le \lambda _2\). Then, we have \(|\mathfrak {D}_{\varepsilon }^V(\mu ^{\theta ,\gamma })-\mathfrak {D}^V_{0}(\mu ^{\theta ,\gamma })|\rightarrow 0\) as \(\varepsilon , \theta ,\gamma \rightarrow 0\).

Proof

First, we observe that for given \(\hat{c}\) such a mollifier and these constants \(\lambda _1,\lambda _2\) satisfying all the conditions can be easily constructed.

To prove the convergence, we follow the same strategy as in the proof of Lemma 5.15. Defining the reconstructed concentrations and fluxes as in (5.3) and using the formulas (5.1) and (5.2), we observe that

$$\begin{aligned} \mathfrak {D}_{\varepsilon }^V(\mu ^{\theta ,\gamma }) \le&\int _0^T \mathcal {D}^V_\varepsilon (\mu ^{\theta ,\gamma },J^{\theta ,\gamma },b^{\theta ,\gamma }) \ \mathrm {d}t \\ =&\int _0^T \mathcal {D}^V_0(\mu ^{\theta ,\gamma },J^{\theta ,\gamma },b^{\theta ,\gamma }) \ \mathrm {d}t +\int _{0}^{T}\int _{\Omega }\widetilde{\mathsf {C}}\left( \frac{\sqrt{c_{1}^{\theta ,\gamma }c_{2}^{\theta ,\gamma }}}{\varepsilon }, b_{2}^{\theta ,\gamma }\right) \ \mathrm {d}x\ \mathrm {d}t\,. \end{aligned}$$

Lemma 5.16 and Lemma 5.20 show that the first term converges to \(\int _0^T\mathcal {D}_0^V(\mu ^0,J^0,b^0)\mathrm {d}t\) as \(\varepsilon ,\theta ,\gamma \rightarrow 0\). Hence, in order to prove the desired convergence it suffices to show that the second term tends to zero.

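Using the bound from below on \(\hat{c}^{\theta ,\gamma }\) (which, by \(\gamma \ge C\varepsilon ^{1-\lambda _2}\), yields \(\sqrt{c_{1}^{\theta ,\gamma }c_{2}^{\theta ,\gamma }}/\varepsilon \ge C\varepsilon ^{-\lambda _2}\)) and the inequality \(\log (x+1)\le C_{\tilde{p}}x^{\tilde{p}-1}\) for \(x\ge 0\), we get the pointwise estimate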

$$\begin{aligned} \widetilde{\mathsf {C}}\left( \frac{\sqrt{c_{1}^{\theta ,\gamma }c_{2}^{\theta ,\gamma }}}{\varepsilon },b_{2}^{\theta ,\gamma }\right)&\le \widetilde{\mathsf {C}}\left( C\varepsilon ^{-\lambda _2},b_{2}^{\theta ,\gamma }\right) \le C\varepsilon ^{-\lambda _2}\mathsf {C}\left( C^{-1}\varepsilon ^{\lambda _2}b_{2}^{\theta ,\gamma }\right) \\&\le 2C\varepsilon ^{-\lambda _2}|b_{2}^{\theta ,\gamma }|C^{-1}\varepsilon ^{\lambda _2}\log \left( C^{-1}\varepsilon ^{\lambda _2}|b_{2}^{\theta ,\gamma }|+1\right) \le \widetilde{C}|b_{2}^{\theta ,\gamma }|^{\tilde{p}}\varepsilon ^{\lambda _2(\tilde{p}-1)}, \end{aligned}$$
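The logarithmic inequality used in the last step admits an explicit constant. The following short derivation is an added illustration; the choice \(C_{\tilde{p}}=\tfrac{1}{\tilde{p}-1}\) is one admissible value, not necessarily the one the authors have in mind. It uses only \((1+x)^{q}\le 1+x^{q}\) for \(q\in ]0,1]\), \(x\ge 0\), and \(\log (1+y)\le y\).

$$\begin{aligned} \log (1+x)&=\frac{1}{q}\log \left( (1+x)^{q}\right) \le \frac{1}{q}\log \left( 1+x^{q}\right) \le \frac{1}{q}x^{q},\qquad q=\tilde{p}-1=\frac{1}{2d+1},\\ 2|b|\log \left( C^{-1}\varepsilon ^{\lambda _2}|b|+1\right)&\le 2C_{\tilde{p}}\,C^{1-\tilde{p}}\,|b|^{\tilde{p}}\varepsilon ^{\lambda _2(\tilde{p}-1)},\qquad C_{\tilde{p}}=\frac{1}{\tilde{p}-1}=2d+1. \end{aligned}$$

With this choice, the constant in the pointwise estimate can be taken as \(2C_{\tilde{p}}C^{1-\tilde{p}}\).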

where \(\widetilde{C}=\widetilde{C}(C_{\tilde{p}},C)\). Using the continuity equation for \((\hat{c}^\theta , \hat{J}^\theta )\), we get from \(\Vert \dot{\hat{c}}^{\theta }\Vert _{\mathrm {L}^{\tilde{p}}([0,T]\times \Omega )}\le C\varepsilon ^{-\lambda _1}\) that \(\Vert \mathrm {div}\hat{J}^{\theta }\Vert _{\mathrm {L}^{\tilde{p}}([0,T]\times \Omega )}\le C\varepsilon ^{-\lambda _1}\). By Lemma 5.19, we also have \(\hat{J}^{\theta }\in \mathrm {L}^{\tilde{p}}([0,T]\times \Omega )\). Using the explicit formula (5.4) for \(b^{\theta ,\gamma }_2\), we have \(\Vert b_{2}^{\theta ,\gamma }\Vert _{\mathrm {L}^{\tilde{p}}([0,T]\times \Omega )}\le \bar{C} \varepsilon ^{-\lambda _1}\) with a new constant \(\bar{C}>0\). Hence, we get

$$\begin{aligned} \int _{0}^{T}\int _{\Omega }\widetilde{\mathsf {C}}\left( \frac{\sqrt{c_{1}^{\theta ,\gamma }c_{2}^{\theta ,\gamma }}}{\varepsilon }, b_{2}^{\theta ,\gamma }\right) \ \mathrm {d}x\ \mathrm {d}t \lesssim \varepsilon ^{-\tilde{p}\lambda _1} \varepsilon ^{\lambda _2(\tilde{p}-1)}=\varepsilon ^{\frac{1}{2d+1}\left\{ \lambda _2- (2d+2)\lambda _1\right\} }. \end{aligned}$$

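Since \(\lambda _1, \lambda _2\) satisfy \((2d+2)\lambda _1\le \lambda _2\), the exponent on the right-hand side is nonnegative. It is strictly positive, so that the right-hand side converges to zero, whenever the inequality is strict, i.e. \((2d+2)\lambda _1<\lambda _2\). This proves the claim. \(\square \)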
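For the reader's convenience, the exponent arithmetic in the last estimate can be checked directly from \(\tilde{p}=\tfrac{2d+2}{2d+1}\) and \(\tilde{p}-1=\tfrac{1}{2d+1}\). The numerical values of \(\lambda _1,\lambda _2\) below are hypothetical and serve only as an example.

$$\begin{aligned} -\tilde{p}\lambda _1+(\tilde{p}-1)\lambda _2&=-\frac{2d+2}{2d+1}\lambda _1+\frac{1}{2d+1}\lambda _2=\frac{1}{2d+1}\left\{ \lambda _2-(2d+2)\lambda _1\right\} ,\\ d=3,\ \lambda _1=\tfrac{1}{16},\ \lambda _2=\tfrac{3}{4}:\qquad&\frac{1}{7}\left\{ \tfrac{3}{4}-8\cdot \tfrac{1}{16}\right\} =\frac{1}{28}>0. \end{aligned}$$

In this example the error term is of order \(\varepsilon ^{1/28}\), which illustrates that the convergence may be slow but still holds.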

6 Remarks for reaction-diffusion systems involving more species

In the final section, we describe the situation not for two but for a general number \(I\in \mathbb {N}\) of species. Considering \(I\) species \(X_i\) which diffuse and react linearly, \(X_i\leftrightharpoons X_j\), the evolution of their concentrations \(c\in \mathbb {R}_{\ge 0}^{I}\) is given by

$$\begin{aligned} \dot{c}=\mathrm {diag}(\delta _{1},\dots ,\delta _{I})\Delta c+A^{\varepsilon }c\ , \end{aligned}$$

where \(\delta _i>0\) are diffusion coefficients and \(A^{\varepsilon }=A^{S}+\frac{1}{\varepsilon }A^{F}\) is a Markov generator (preserving positivity and total mass), which consists of a slow part and a fast part. The main assumption is that \(A^{\varepsilon }\) satisfies detailed balance with respect to its stationary measure \(w^{\varepsilon }\). Similar to [30, 34], we are going to assume that the stationary vector \(w^{\varepsilon }\) satisfies \(w^{\varepsilon }\rightarrow w^{0}\) as \(\varepsilon \rightarrow 0\) and that \(w^{0}>0\). The positivity of the limit stationary measure \(w^{0}\) means that in the limit the evolution is not degenerate and all species are present with positive concentration.

The gradient structure is defined on the state space

$$\begin{aligned} Q=\mathrm {Prob}(\Omega \times \left\{ 1,\cdots ,I\right\} ):=\left\{ \mu =(\mu _{1},\dots ,\mu _{I})\ : \ \mu _{i}\in \mathcal{M}(\Omega ), \ \mu _{i}\ge 0, \sum ^{I}_{i=1} \mu _{i}(\Omega )=1\right\} \!. \end{aligned}$$

The driving energy functional \(\mathcal {E}_{\varepsilon }:Q\rightarrow \mathbb {R}_{\infty }\) has the form

$$\begin{aligned} \mathcal {E}_{\varepsilon }(\mu )={\left\{ \begin{array}{ll} \int _{\Omega }\sum _{j=1}^{I}E_{B}\left( \frac{c_{j}}{w_{j}^{\varepsilon }}\right) w_{j}^{\varepsilon }\mathrm {d}x, &{} \mathrm {~if~}\mu =c\,\mathrm {d}x,\\ \infty , &{} \mathrm {otherwise}, \end{array}\right. } \end{aligned}$$

and the dual dissipation potential splits into two parts

$$\begin{aligned} \mathcal {R}_{\varepsilon }^{*}(\mu ,\xi )&=\mathcal {R}_{\mathrm {diff}}^{*}(\mu ,\xi )+\mathcal {R}_{\mathrm {react},\varepsilon }^{*}(\mu ,\xi ),\\ \mathcal {R}_{\mathrm {diff}}^{*}(\mu ,\xi )&=\frac{1}{2}\int _{\Omega }\sum _{j=1}^{I}\delta _{j}|\nabla \xi _{j}(x)|^{2}\mathrm {d}\mu _{j},\quad \mathcal {R}_{\mathrm {react},\varepsilon }^{*}(\mu ,\xi )=\int _{\Omega }\sum _{i<j}\kappa _{ij}^{\varepsilon }\mathsf {C}^{*}(\xi _{i}(x)-\xi _{j}(x))\,\mathrm {d}\sqrt{\mu _{i}\mu _{j}}, \end{aligned}$$

where \(\kappa _{ij}^{\varepsilon }:=A_{ij}^{\varepsilon }\left( \frac{w_{j}^{\varepsilon }}{w_{i}^{\varepsilon }}\right) ^{1/2}\). In particular, the reaction part of the dissipation potential splits into a fast part and a slow part

$$\begin{aligned} \mathcal {R}_{\mathrm {react},\varepsilon }^{*}(\mu ,\xi )&=\mathcal {R}_{\mathrm {slow},\varepsilon }^{*}(\mu ,\xi )+\frac{1}{\varepsilon }\mathcal {R}_{\mathrm {fast},\varepsilon }^{*}(\mu ,\xi )\\ \mathcal {R}_{\mathrm {xy},\varepsilon }^{*}(\mu ,\xi )&=\int _{\Omega }\sum _{i<j}\tilde{\kappa }_{ij}^{\varepsilon }\,\mathsf {C}^{*}(\xi _{i}(x)-\xi _{j}(x))\ \mathrm {d}\sqrt{\mu _{i}\mu _{j}},\quad \mathrm {xy}\in \left\{ \mathrm {slow},\mathrm {fast}\right\} , \end{aligned}$$

where \(\tilde{\kappa }_{ij}^{\varepsilon }\) are bounded and positive uniformly in \(\varepsilon >0\). In particular, we call a reaction and its flux \(b_{ij}\) slow if \(A_{ij}^{\varepsilon }=O(1)\) and fast if \(A_{ij}^{\varepsilon }=O(\varepsilon ^{-1})\). Due to the detailed balance assumption and by \(w^{0}>0\), the distinction between fast and slow reactions is indeed well-defined.
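As a consistency check, the two-species system (1.2) can be written in this framework. The computation below is an added illustration and assumes the index convention \(\dot{c}_{i}=\sum _{j}A_{ij}^{\varepsilon }c_{j}+\dots \), i.e. \(A_{ij}^{\varepsilon }\) is the rate from species \(j\) to species \(i\); the normalization of \(w\) to total mass one is also an assumption.

$$\begin{aligned} A^{S}&=0,\qquad A^{F}=\begin{pmatrix} -\sqrt{\alpha /\beta } &{} \sqrt{\beta /\alpha }\\ \sqrt{\alpha /\beta } &{} -\sqrt{\beta /\alpha } \end{pmatrix},\qquad w^{\varepsilon }=w^{0}=\frac{1}{\alpha +\beta }\begin{pmatrix} \beta \\ \alpha \end{pmatrix},\\ A_{12}^{\varepsilon }w_{2}&=\frac{1}{\varepsilon }\frac{\sqrt{\alpha \beta }}{\alpha +\beta }=A_{21}^{\varepsilon }w_{1},\qquad \kappa _{12}^{\varepsilon }=A_{12}^{\varepsilon }\left( \frac{w_{2}}{w_{1}}\right) ^{1/2}=\frac{1}{\varepsilon }\sqrt{\tfrac{\beta }{\alpha }}\sqrt{\tfrac{\alpha }{\beta }}=\frac{1}{\varepsilon }. \end{aligned}$$

Hence the single reaction is fast with \(\tilde{\kappa }_{12}^{\varepsilon }=1\), which matches the factor \(\sqrt{c_{1}c_{2}}/\varepsilon \) appearing in the reaction part of the dissipation in the two-species case.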

In the remainder of the section, we briefly explain how to generalize the proof of the EDP-convergence result to this situation. Major differences occur at two stages, namely in (1) deriving compactness for slow reaction fluxes and (2) proving the limsup-estimate. The reaction fluxes of the fast reactions are not seen in the limit and have to be reconstructed in an analogous way as in the two-species situation.

6.1 Compactness for slow reaction fluxes and liminf-estimate

Similarly to the two-species situation, compactness for the concentrations can be derived by using strong convergence for coarse-grained variables and convergence towards the slow manifold (cf. Lemmas 5.6, 5.10). Moreover, compactness of diffusion fluxes and spatial regularity follow, too (cf. Lemma 5.19).

In contrast to the situation of two species connected by one fast reaction, where no slow reaction fluxes exist, compactness for slow reaction fluxes \(b_{ij}^{\varepsilon }\) has to be derived in the multi-species case. This follows immediately from Lemma 5.2, once compactness of \(\sqrt{c_{i}^{\varepsilon }c_{j}^{\varepsilon }}\) is obtained. At this point it is clear that weak convergence \(c^{\varepsilon }\rightharpoonup c^{0}\) is not sufficient. Instead, the previously derived strong convergence \(c^{\varepsilon }\rightarrow c^{0}\) implies by dominated convergence also strong convergence \(\sqrt{c_{i}^{\varepsilon }c_{j}^{\varepsilon }}\rightarrow \sqrt{c_{i}^{0}c_{j}^{0}}\) in \(\mathrm {L}^{1}([0,T]\times \Omega )\), and hence compactness for the slow fluxes \(b_{ij}^{\varepsilon }\). Compactness for fast reaction fluxes cannot be obtained, as already mentioned in Remark 5.5. Having derived compactness, the proof of the liminf-estimate is exactly the same as for Theorem 5.12, since the functional \(\mathcal {D}_{\varepsilon }\) is jointly convex in all variables \((c,J,b)\).

6.2 Equilibration and reconstruction of reaction fluxes and recovery sequence

A crucial observation throughout the proof of the \(\Gamma \)-convergence was Lemma 4.3, which provides an equilibration of fluxes assuming microscopic equilibria for the concentrations. In Lemma 4.4, we derived equilibration for the diffusion fluxes. Similarly, also an equilibration of the slow reaction fluxes can be derived. In [30] a general operator-theoretic coarse-graining and reconstruction procedure has been developed. This method can also be applied to derive coarse-grained fluxes and a coarse-grained continuity equation, see [42]. Importantly for us, as in (5.3) the reconstructed slow reaction fluxes depend linearly on the coarse-grained reaction fluxes. The fast reaction fluxes are then of the form

$$\begin{aligned} b_{ij}=a_{1}\mathrm {div}\hat{J}_{i}+a_{2}\hat{J}_{i}+\sum _{i,j}a_{ij}\hat{b}_{ij}, \end{aligned}$$

where \(a_{j},a_{ij}\in \mathrm {C}^{0}(\Omega ,\mathbb {R}^{k})\), \(k\in \mathbb {N}\).

In order to prove that the constant sequence for smooth and positive concentrations is indeed a recovery sequence, we follow the same reasoning as in Lemma 5.15 and Proposition 5.21. The only difference comes from the explicit dependence on the coarse-grained reaction flux \(\hat{b}_{ij}\). Using the bound on the limit dissipation functional (which provides bounds on \(\int _{0}^{T}\!\!\int _{\Omega }\widetilde{\mathsf {C}}(\sqrt{\hat{c}_{i}\hat{c}_{j}},\hat{b}_{ij})\mathrm {d}x\mathrm {d}t\)) and Lemma 6.1 below, we obtain that \(\hat{b}_{ij}\in \mathrm {L}^{\mathsf {C}}([0,T]\times \Omega )\) (we refer to [19] for a proof). Since \(\mathrm {L}^{\mathsf {C}}\) is an Orlicz space, we conclude that the reconstructed fluxes \(b_{i}\) are in \(\mathrm {L}^{\mathsf {C}}\). This allows us to proceed as in Proposition 5.21 and proves the existence of a recovery sequence.

Lemma 6.1

([19]) Let \(p>1\). Then, for all \(a\ge 0\) and \(B\in \mathbb {R}\) we have

$$\begin{aligned} \widetilde{\mathsf {C}}(a,B)\ge \left( 1-\frac{1}{p}\right) \mathsf {C}(B)-\frac{2}{p}a^{p}. \end{aligned}$$

In particular, setting \(a=\sqrt{c_{i}c_{j}}\) and \(B=b_{ij}\) we have

$$\begin{aligned} \int _{0}^{T}\!\!\int _{\Omega }\widetilde{\mathsf {C}}(\sqrt{c_{i}c_{j}},b_{ij})\mathrm {d}x\mathrm {d}t+\Vert \sqrt{c_{i}c_{j}}\Vert _{\mathrm {L}^{p}([0,T]\times \Omega )}^{p}\gtrsim \int _{0}^{T}\!\!\int _{\Omega }\mathsf {C}(b_{ij})\mathrm {d}x\mathrm {d}t, \end{aligned}$$

which proves that \(b_{ij}\in \mathrm {L}^{\mathsf {C}}([0,T]\times \Omega )\) if \(c_{i},c_{j}\in \mathrm {L}^{p}([0,T]\times \Omega )\) for some \(p>1\).
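The passage from the pointwise inequality of Lemma 6.1 to the integrated bound is a rearrangement; the explicit constants below are an added illustration obtained by multiplying the pointwise inequality by \(\tfrac{p}{p-1}\) and integrating.

$$\begin{aligned} \mathsf {C}(B)&\le \frac{p}{p-1}\,\widetilde{\mathsf {C}}(a,B)+\frac{2}{p-1}\,a^{p},\\ \int _{0}^{T}\!\!\int _{\Omega }\mathsf {C}(b_{ij})\,\mathrm {d}x\,\mathrm {d}t&\le \frac{p}{p-1}\int _{0}^{T}\!\!\int _{\Omega }\widetilde{\mathsf {C}}(\sqrt{c_{i}c_{j}},b_{ij})\,\mathrm {d}x\,\mathrm {d}t+\frac{2}{p-1}\Vert \sqrt{c_{i}c_{j}}\Vert _{\mathrm {L}^{p}([0,T]\times \Omega )}^{p}. \end{aligned}$$

The norm on the right is finite for \(c_{i},c_{j}\in \mathrm {L}^{p}([0,T]\times \Omega )\), since \(\sqrt{c_{i}c_{j}}\le \tfrac{1}{2}(c_{i}+c_{j})\).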