Non-equilibrium large deviations and parabolic-hyperbolic PDE with irregular drift

Large deviations of conservative interacting particle systems, such as the zero range process, about their hydrodynamic limit and their respective rate functions lead to the analysis of the skeleton equation; a degenerate parabolic-hyperbolic PDE with irregular drift. We develop a robust well-posedness theory for such PDEs in energy-critical spaces based on concepts of renormalized solutions and the equation's kinetic form. We establish these properties by proving that renormalized solutions are equivalent to classical weak solutions, extending concepts of [DiPerna, Lions; Ann. Math., 1989], [Ambrosio; Invent. Math., 2004] to the nonlinear setting. The relevance of the results toward large deviations in interacting particle systems is demonstrated by applications to the identification of l.s.c. envelopes of restricted rate functions, to zero noise large deviations for conservative (singular) SPDE, and to the $\Gamma$-convergence of rate functions. The first of these solves a long-standing open problem in the large deviations for zero range processes. The second makes rigorous an informal link between the non-equilibrium statistical mechanics approaches of macroscopic fluctuation theory and fluctuating hydrodynamics.


Introduction
Large deviations of conservative interacting particle systems, such as the zero range process, about their hydrodynamic limit are described by the rate function
$$I(\rho) = \frac{1}{2}\inf\Big\{\|g\|^2_{L^2(\mathbb{T}^d\times[0,T])^d} : \partial_t\rho = \Delta\Phi(\rho) - \nabla\cdot\big(\Phi^{1/2}(\rho)g\big)\Big\}, \qquad (1)$$
where $\Phi$ is a monotone nonlinearity. This motivates the analysis of the corresponding skeleton equation, a parabolic-hyperbolic PDE with irregular drift,
$$\partial_t\rho = \Delta\Phi(\rho) - \nabla\cdot\big(\Phi^{1/2}(\rho)g\big), \qquad (2)$$
for $g \in L^2(\mathbb{T}^d\times[0,T])^d$, posed on the $d$-dimensional torus $\mathbb{T}^d$. A primary result of this work is the well-posedness and stability of (2). These results make important contributions to the understanding of large deviations in non-equilibrium conservative fluctuating systems, including applications to the large deviations of the zero range process (Theorem 39 below), to large deviations of conservative SPDE (Theorem 29 below), which rigorously establishes the link between the non-equilibrium statistical mechanics theories of fluctuating hydrodynamics and macroscopic fluctuation theory, and to the $\Gamma$-convergence of the large deviations rate functions (Theorem 33 below).
To appreciate the difficulty in proving well-posedness and stability for (2), let us consider the model case given by the porous media equation $\Phi(\rho) = \rho^m$, for some $m \in [1,\infty)$. Scaling arguments (see Sect. 2.1 below) demonstrate that (2) is critical for controls $g \in L^2(\mathbb{T}^d\times[0,T])^d$ and initial conditions in $L^1(\mathbb{T}^d)$, and is supercritical for initial conditions in $L^r(\mathbb{T}^d)$, for every $r \in (1,\infty)$. Hence, even in the case of independent particles $\Phi(\rho) = \rho$, the skeleton equation (2) with $g \in L^2(\mathbb{T}^d\times[0,T])^d$ is not of semilinear nature, since even on small scales the diffusive operator does not dominate the convective, irregular term. Consequently, the well-posedness and stability of solutions to (2) are challenging problems.
In this paper, we obtain a new a-priori estimate, which leads to the existence of weak solutions (see Definition 11 below) and optimal regularity estimates. In contrast to linear Fokker-Planck equations (see, for example, Le Bris and Lions [3,15] and Figalli [6], who also treat $L^p$-drifts $g$ under an additional (semi-)boundedness assumption on the divergence $\nabla\cdot g$), the supercriticality of (2) in $L^r(\mathbb{T}^d)$ for $r \in (1,\infty)$ makes it impossible to exploit $L^r(\mathbb{T}^d)$-based estimates. Our estimates are based on entropy-entropy dissipation, which requires the nonnegativity of the initial data and yields that, for some $c \in (0,\infty)$,
$$\int_{\mathbb{T}^d}\Psi_\Phi(\rho(x,t))\,dx\,\Big|_{t=0}^{t=T} + \int_0^T\!\!\int_{\mathbb{T}^d}\big|\nabla\Phi^{1/2}(\rho)\big|^2\,dx\,dt \le c\int_0^T\!\!\int_{\mathbb{T}^d}|g|^2\,dx\,dt, \qquad (3)$$
with $\Psi_\Phi(\xi) = \int_0^\xi \log(\Phi(\xi'))\,d\xi'$. We therefore define the energy space for the initial data to be
$$\mathrm{Ent}_\Phi(\mathbb{T}^d) = \Big\{\rho_0 \in L^1(\mathbb{T}^d) : \rho_0 \ge 0 \text{ and } \int_{\mathbb{T}^d}\Psi_\Phi(\rho_0)\,dx < \infty\Big\}. \qquad (4)$$
The proof of uniqueness is significantly more complicated due to the lower integrability of the solution provided by (3). For this reason, previous techniques such as Otto [16] do not apply. Indeed, the arguments of [16] were based on $L^r(\mathbb{T}^d)$-estimates, for $r \in (1,\infty)$. We instead introduce the concept of a renormalized kinetic solution (see Definition 3 below) to recover uniqueness, which introduces several new difficulties. Foremost, it is not obvious that weak solutions are renormalized kinetic solutions, a difficulty that should be expected from the first-order, linear case. In DiPerna and Lions [1] and Ambrosio [2], this equivalence is shown for drifts $g$ with spatial regularity in $W^{1,p}$ (respectively, $BV$) and spatially bounded divergence, conditions that are almost optimal due to the counterexample of Depauw [17]. In treating (2) it is therefore necessary to exploit the degenerate regularity implied by (3), and to estimate additional commutator errors appearing in the nonlinear terms using the regularity of $\Phi^{1/2}(\rho)$ instead of $\rho$ itself.
Furthermore, even on the renormalized level, standard techniques for uniqueness (see, for example, [11]) are not applicable, since the estimate (3) does not imply the decay of entropy and parabolic defect measures at infinity.
The three main theorems below are obtained under general assumptions that apply, in particular, to the porous media nonlinearity $\Phi(\xi) = \xi^m$ for every $m \in [1,\infty)$. The first main result is the well-posedness and stability of (2).
(1) For every $\rho_0 \in \mathrm{Ent}_\Phi(\mathbb{T}^d)$ and $g \in L^2(\mathbb{T}^d\times[0,T])^d$ there exists a unique weak solution of (2). Moreover, if $\rho^1, \rho^2$ are weak solutions to (2) with initial data $\rho^1_0, \rho^2_0 \in \mathrm{Ent}_\Phi(\mathbb{T}^d)$ and the same control $g \in L^2(\mathbb{T}^d\times[0,T])^d$,
(2) Let $\rho^n_0, \rho_0 \in \mathrm{Ent}_\Phi(\mathbb{T}^d)$ and $g_n, g \in L^2(\mathbb{T}^d\times[0,T])^d$ satisfy $\rho^n_0 \rightharpoonup \rho_0$ and $g_n \rightharpoonup g$ weakly in $L^1(\mathbb{T}^d)$ and $L^2(\mathbb{T}^d\times[0,T])^d$ respectively. Then, the corresponding weak solutions $\rho^n$ and $\rho$ satisfy
Having established the well-posedness and stability of the skeleton equation, we next describe the consequences of our results for the non-equilibrium large deviations of interacting particle systems.
The hydrodynamic limit of the empirical density field $\mu^n$ of the zero range process on the torus with mean local jump rate $\Phi$ is the solution to the nonlinear diffusion equation $\partial_t\rho = \Delta\Phi(\rho)$; see, for example, Ferrari, Presutti, and Vares [18] and Kipnis and Landim [19]. The particle system will exhibit large fluctuations about this limit which, though infrequent, can have catastrophic effects, such as earthquakes or mechanical failure (see, for example, [20][21][22]). It is thus important to understand and simulate such rare events. On an exponential scale, the (im)probability of a large fluctuation is described in terms of a large deviations principle with rate function $I$. The general approach to large deviations in interacting particle systems, introduced in the seminal works [23,24], relies on two ingredients to be verified in a case-by-case manner: first, the so-called superexponential estimates; second, the identification of the lower semicontinuous (l.s.c.) envelope of the rate function restricted to smooth and strictly positive fluctuations, see [19, Chap. 10]. This can be a challenging problem, see, for example, Bodineau, Gallagher, Saint-Raymond, and Simonella [25, Theorem 3], and the second ingredient is required, since, in a first step, the large deviations lower bound can be established only for nice enough fluctuations. More precisely, one first restricts to the space $S$ of positive fluctuations $\rho$ so that for some strictly positive $\rho_0 \in C^\infty(\mathbb{T}^d)$ and $H \in C^{3,1}(\mathbb{T}^d\times[0,T])$. The extension to a full large deviations result requires the identification of the l.s.c. envelope of the rate function (1) restricted to $S$. For the symmetric simple exclusion process, this envelope can be identified due to the convexity of the rate function [19, Chap. 10, Lemma 5.5], a property not available for the zero range process.
For this reason, for the zero range process, so far, only a restricted large deviations estimate is known, in the sense that the lower bound is known only for nice enough fluctuations, see [26, Theorem 1, and the discussion on p. 66]. The identification of the l.s.c. envelope has remained an open problem for over twenty years since then, and is resolved in the present work on the torus.
Theorem (Theorem 39 below) Let $\Phi$ satisfy Assumptions 6, 10, and 15 below. Then, the rate function (1) is equal to the lower semicontinuous envelope of its restriction to the space $S$.
Since the assumptions on $\Phi$ in this theorem include the case of degenerate $\Phi$, e.g. $\Phi(\rho) = \rho^m$ for every $m \in [1,\infty)$, its applicability goes much beyond the fluctuations of the zero range process. We next demonstrate this by its application to the large deviations for degenerate stochastic PDEs with conservative noise, and their relation to the fluctuations of interacting particle systems.
Macroscopic fluctuation theory (MFT) introduces a general framework for non-equilibrium diffusive systems (see, for example, Bertini, De Sole, Gabrielli, Jona-Lasinio, and Landim [27], Derrida [28]), thus extending Onsager-like near-equilibrium theories for non-equilibrium thermodynamics. MFT, which is based on a constitutive formula for large fluctuations around thermodynamic variables, like density and current, can be justified by fluctuating hydrodynamics (see, for example, Hohenberg and Halperin [29], Landau and Lifshitz [30], Spohn [31], Bouchet, Gawȩdzki, and Nardini [32]). The latter postulates a conservative, singular stochastic PDE describing fluctuations in systems out of equilibrium. The fundamental ansatz of MFT can then be obtained, informally, from fluctuating hydrodynamics as the zero noise large deviations principle for this stochastic PDE. In addition to this conceptual relevance, the relation between zero noise large deviations for conservative stochastic PDE and MFT may serve as the basis for the development of importance sampling techniques to numerically simulate rare events in systems far from equilibrium (see, for example, E, Ren, and Vanden-Eijnden [20], Grafke and Vanden-Eijnden [21], Vanden-Eijnden and Weare [22]). We provide a rigorous link between the zero noise large deviations of conservative stochastic PDE and the large deviations rate functions appearing in interacting particle systems, including zero range processes, the latter being an example of MFT.
The rate function (1) is informally connected to the small-noise large deviations ($\varepsilon \to 0$, $K \to \infty$) of the stochastic PDE (5), where $\xi^K$ is an $\mathbb{R}^d$-valued spatially correlated noise, converging to space-time white noise $\xi$ as $K \to \infty$, see, for example, Dirr, Stamatakis, and Zimmer [33] and Giacomin, Lebowitz, and Presutti [34, Sect. 4]. In particular, in the ultraviolet limit $K \to \infty$, this observation links the case of independently diffusing particles to the Dean-Kawasaki equation (6), which has been considered, for example, by Dean [35], Kawasaki [36], Donev, Fai, and Vanden-Eijnden [37], Donev and Vanden-Eijnden [38], Konarovskyi and von Renesse [39,40], and Lehmann, Konarovskyi, and von Renesse [41]. This relationship, however, had to remain informal because (5) has only recently been shown to be well-posed by the authors in [42]. In addition, the ultraviolet limit $K \to \infty$ of (5) is supercritical in the language of singular stochastic PDE and regularity structures (see, for example, Hairer [43], Gubinelli, Imkeller, and Perkowski [44]), and therefore falls outside the scope of the theory. For a discussion of this issue and the appearance of ultraviolet divergences in fluctuating hydrodynamics see [27, p. 595]. Furthermore, it has been observed in [41] that a renormalization is necessary in order to obtain function-valued solutions to (6). These renormalization terms destroy the relationship between the stochastic PDE and the zero range process, since they appear in the rate function for (6) and thereby lead to incorrect predictions of rare events for the particle process. We refer to Hairer and Weber [45] for a discussion of related aspects in the context of the (singular) stochastic quantization equation.
To address the issue of renormalization, the main result of this work identifies a joint scaling regime ε 1 /K(ε) such that possible renormalization constants vanish, and such that the solutions $\rho^{\varepsilon,K(\varepsilon)}$ constructed in [42] correctly simulate the rare events in the particle system by satisfying a large deviations principle with rate function (1) (see (110) for the precise definition) in the strong $L^1_{t,x}$-topology. A simplified version of these techniques proves that, for every fixed $K \in \mathbb{N}$, the solutions to (5) satisfy a small-noise large deviations principle with rate function (7), for $P_K g \in L^2(\mathbb{T}^d\times[0,T])^d$ the Fourier projection of $g$ onto the Fourier modes of the noise $\xi^K$. The final result of this work is the $\Gamma$-convergence of these rate functions, as $K \to \infty$, to the rate function (1).

Comments on the literature
The well-posedness of (degenerate) nonlinear parabolic-hyperbolic PDEs: for a detailed overview of the available methods for the well-posedness of (degenerate) PDEs of porous medium type, that is, (2) with $g = 0$, we refer to Vázquez [46]. In particular, this includes several approaches to the uniqueness of solutions, for example, based on $H^{-1}$-monotonicity [46, Sect. 6.7], going back to Brezis [47], on the well-posedness of weak solutions [46, Sect. 5.3], due to Gilding [48] and Gilding and Peletier [49], on the accretivity in $L^1$ and mild solutions [46, Sect. 10.2], due to Bénilan [50] and Crandall [51], and on the $L^1$-contraction by duality arguments [46, Sect. 6.2], going back to Bénilan, Crandall, and Pierre [52]. For non-trivial $g$, equation (2) is a combination of a porous-medium type operator and a scalar conservation law. Since both of these are formally $L^1$-accretive, the literature on such PDEs concentrates on $L^1$-based approaches. The abstract theory of accretive operators in Banach spaces, by Bénilan [50], Crandall [51], and Crandall and Liggett [53] can, in principle, be used to prove the well-posedness of limit solutions of implicit Euler schemes. Besides being restricted to coefficients that are regular in time, this general theory does not offer a characterization of said limit solutions in terms of the PDE, and therefore it cannot offer a solution to the uniqueness and stability problems addressed in this work.
In the context of hyperbolic-parabolic PDEs with regular coefficients, the notion of entropy solutions goes back to Vol'pert and Hudjaev [54], proving the existence and partial uniqueness of $BV$-solutions. This was extended by Alt and Luckhaus [55] and Otto [16] to the doubly nonlinear setting. The uniqueness of entropy solutions was first shown by Carrillo [56,57]. Notably, these works do not touch the issue of irregular coefficients, and they rely on the notion of weak solutions, that is, solutions satisfying $\nabla\Phi(\rho) \in L^2_{t,x}$, a property that cannot be expected in the case of the irregular coefficients of the present work due to the supercriticality of the equation, see Sects. 2.1 and 2.2 below.
A kinetic approach to spatially homogeneous, parabolic-hyperbolic PDEs has been developed by Chen and Perthame [11], for ≥ 0 and for the flux 1 2 and locally Lipschitz continuous, which was complemented by a renormalized entropy solutions approach by Bendahmane and Karlsen [13]. For simplicity, in the following discussion, we use the terms kinetic solutions and renormalized entropy solutions synonymously.
Parabolic-hyperbolic PDEs with inhomogeneous, regular coefficients have been treated in Karlsen and Ohlberger [58], Chen and Karlsen [59], and Dalibard [60,61], and the references therein. The case of less regular coefficients was first considered by Karlsen and Risebro [12], which when applied to the setting of the present paper proves the uniqueness of (bounded) entropy solutions, assuming that x with bounded divergence and that $\Phi^{1/2}$ is locally Lipschitz continuous. In this sense, this work partially generalized [1] to the case of nonlinear fluxes. This was subsequently extended by Wang, Wang, and Li [62] and Barbu and Röckner [63], still in the setting analogous to [1], and with locally Lipschitz flux $\Phi^{1/2}$. All of the above works are conceptually different from the present paper in several regards: (I) In the present work, we treat fluxes that are only $L^2_{t,x}$-integrable, thus going far beyond the classical setting of [1]. In addition, we only assume $1/2$-Hölder continuity of the flux in the function variable. (II) The contributions [11][12][13]62] work entirely on the level of kinetic solutions. This does not imply the uniqueness of weak solutions. In fact, deducing this would require proving that all weak solutions are kinetic solutions, which would precisely lead to the nonlinear commutator errors controlled in Sect. 4. We emphasize that passing through the concept of weak solutions is essential in the present work, in order to establish the stability of solutions with respect to the control in Proposition 21 and Theorem 28. (III) In addition, compared to the existing theory of kinetic solutions, e.g. [11], a new treatment of the entropy dissipation measure, as developed in Sect. 2.3 and specifically (15), is necessary, since the decay condition [11, Definition 2.2 (iv)] is not satisfied.
Duality arguments: We will now discuss the (in-)applicability of the duality argument from [52] to the parabolic-hyperbolic PDEs with irregular coefficients considered in this work. Following [46], in a nutshell, the duality method is based on the following identity, for the difference of two solutions $\rho^1$ and $\rho^2$, , and on choosing the test function $\varphi$ as a sufficiently regular approximation of the linear dual equation, for smooth forcing $\theta$, This method fails in the context of (2) for several reasons: (I) The possible degeneracy of the diffusion in (2) and, thus, of $a$ in (8) renders the regularity required on $\varphi$ or approximations thereof problematic. In fact, even employing the improved argument for nonnegative solutions from [64] requires the use of maximum principles, which in turn require $g$ to have bounded divergence, and sufficient regularity for the dual equation requires $g$ bounded. (II) The duality method also requires sufficient integrability for the class of solutions considered. Precisely, see [46, Sect. 6.2], it requires $\rho, \Phi(\rho) \in L^2_{t,x}$. Due to the irregularity of the drift $g \in L^2_{t,x}$, such a higher integrability cannot be expected in the setting of the present work. See also the discussion on $L^p$-estimates and the entropy-dissipation estimate in Sect. 2.2. (III) Aside from these two issues, the duality argument relies on choosing a test function that solves a linear dual equation (8). However, for linear Kolmogorov equations with irregular coefficients it is well-known that the critical integrability of $g \in L^p_t L^q_x$ is given by the Ladyzhenskaya-Prodi-Serrin (LPS) condition, $p, q \ge 2$, $\frac{d}{p} + \frac{2}{q} \le 1$, see, for example [65,66]. Again, this is not satisfied in the present work.
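The last point can be checked by direct arithmetic. The following is only a sketch, using the LPS condition exactly as quoted above and the control class $g \in L^2_{t,x}$ of this paper, that is, $p = q = 2$.

```latex
% LPS condition as quoted above: p, q >= 2 and d/p + 2/q <= 1.
% For the controls of this paper, p = q = 2 (the value is symmetric in p and q):
\[
  \frac{d}{p} + \frac{2}{q}\,\bigg|_{p=q=2}
  \;=\; \frac{d}{2} + 1 \;>\; 1
  \qquad \text{for every dimension } d \ge 1 .
\]
```

The condition therefore fails in every dimension, so the linear dual equation with drift in $L^2_{t,x}$ lies outside the range covered by the LPS theory.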
Fluctuations of the zero range process: fluctuations have been analyzed, for example, by Benois, Kipnis, and Landim in [26]. We also refer to Kipnis and Landim [19], Evans and Hanney [70], and the references therein for a detailed account of the theory. Equilibrium large deviations for the zero-range process with long jumps and reservoirs have been analyzed by Bernardin, Gonçalves, Oviedo-Jiménez, and Scotta [71]. For large deviations results in mean-field interacting particle systems we refer to the works by Barré, Bernardin, Chétrite, Chopra, and Mariani [72], Dawson and Gärtner [73,74], and Gvalani and Schlichting [75].
Large deviations for conservative stochastic PDE: large deviations principles for conservative SPDEs have previously been considered in the works [76] by Mariani and [77] by Bellettini, Bertini, Mariani, and Novaga. We here compare the methods and results when applied in the framework of conservative SPDE (5). We emphasize, however, that the scope of the works [76,77] goes beyond this by including the case of asymptotically vanishing dissipation, which heuristically corresponds to asymmetric simple exclusion processes. In this sense, the present work extends the first order large deviations obtained in [76,77] in two directions: first, the rate functions are identified on the space of function-valued solutions. Second, this is achieved under significantly more general conditions on the coefficients, which include nonlinear dissipation and degenerate noise σ . Since we allow for unbounded initial data and σ , the methods developed here handle possible concentration and vacuum effects. Heuristically, this corresponds to potentially degenerate zero range processes, as compared to regularized exclusion processes in [76,77].
In [76,77] the rate function could be identified only after passing to a measure-valued formulation for which the skeleton equation is linear, and the corresponding rate function is convex. This significantly simplifies the identification, for example, since the linear structure is nicely compatible with mollification, a fact which is not true in the nonlinear setting, and the control of the resulting nonlinear commutators is a key technical step in the present work. The linearization also comes at the cost of working on much larger spaces, and on the level of function-valued large deviations no explicit representation of the rate function has been obtained, compare [76, Corollary 1]. This issue is fully resolved in the present work. We present an approach that entirely avoids passing to measure-valued solutions, and therefore establishes an LDP with an explicit rate function for the function-valued solutions that is consistent with the LDP for the particle system.
The second main generalization of [76,77] lies in the assumptions on the coefficients. In [76] it is assumed that $\sigma$ is $C^2$-smooth with $\sigma(0) = \sigma(1) = 0$ and that $0 \le \rho_0 \le 1$. As a consequence, solutions $\rho$ also take values in the interval $[0,1]$, which rules out concentration and vacuum phenomena. As demonstrated in [78], this significantly simplifies the analysis of the large deviations, since the solutions are bounded and satisfy standard $L^p$-based regularity estimates. In addition, simple exclusion processes and zero range processes heuristically correspond to singular diffusion coefficients $\sigma(\rho) = \sqrt{\rho(1-\rho)}$ and $\sigma(\rho) = \sqrt{\rho}$ respectively, thus violating the $C^2$-smoothness assumption in [76,77]. The results of the present work also include these singular cases.
Large deviation estimates for singular stochastic PDE: such estimates have been derived by Faris and Jona-Lasinio [100], Jona-Lasinio and Mitter [101], Cerrai and Freidlin [102], and Hairer and Weber [45] in the context of stochastic Allen-Cahn equations. In particular, we emphasize that in [45] it is observed that renormalization constants may enter the rate function in the setting of singular stochastic PDE. The results treated in these works are quite different from the present paper, since, due to the additive noise structure of the stochastic Allen-Cahn equation, the treatment of the corresponding skeleton equation does not pose a major difficulty.
The weak convergence approach to large deviation principles: the weak convergence approach goes back to Budhiraja, Dupuis, and Maroulas [103], and it has been used to derive large deviation estimates for singular SPDE by Cerrai and Debussche [104]. Further applications to stochastic PDE with multiplicative noise, depending only on the values of the solution, have been given by Brzeźniak, Goldys, and Jegaraj [105] and Dong, Wu, Zhang, and Zhang [106].

Overview
We introduce the assumptions on the nonlinearity $\Phi$ at the beginning of each section, and observe now that every assumption is satisfied by the model example $\Phi(\xi) = \xi^m$, for every $m \in [1,\infty)$. In Sect. 2, we present an informal analysis of the skeleton equation broken down as follows. In Sect. 2.1, we argue by scaling that the skeleton equation (2) is energy critical for $L^1(\mathbb{T}^d)$ and is energy supercritical for $L^r(\mathbb{T}^d)$, if $r \in (1,\infty)$. We obtain formal a-priori estimates for the solution of (2) in Sect. 2.2 and thereby identify the correct energy space (4) for the initial data. Based on these estimates, in Sect. 2.3 we define a renormalized kinetic solution (see Definition 3 below).
In Sect. 3, we prove the uniqueness of renormalized kinetic solutions (see Theorem 8 below). In Sect. 4, we prove in Theorem 14 the equivalence of renormalized kinetic solutions and weak solutions (see Definition 11 below). In Sect. 5, we prove the existence of renormalized kinetic solutions (see Proposition 20 below), and obtain in Proposition 21 the strong continuity of the solutions with respect to weak convergence of the controls. In Sect. 6, we prove the uniform large deviations principle, which relies on the weak convergence approach to large deviations developed in [103]. In Sect. 7, in Theorem 33, we prove the $\Gamma$-convergence of the large deviations rate functions (7) to the rate function (1). In Sect. 8, we characterize the l.s.c. envelope of the restricted rate function.

The skeleton equation
The equation defining the large deviations rate function of the zero range process (1) is the so-called skeleton equation. In Sects. 3, 4, and 5 we will prove the existence and uniqueness of solutions to the equation, for $\rho_0 \in \mathrm{Ent}_\Phi(\mathbb{T}^d)$ and for $g \in L^2(\mathbb{T}^d\times[0,T])^d$,
$$\partial_t\rho = \Delta\Phi(\rho) - \nabla\cdot\big(\Phi^{1/2}(\rho)g\big) \text{ in } \mathbb{T}^d\times(0,T) \quad\text{with}\quad \rho(\cdot,0) = \rho_0. \qquad (9)$$
We first argue formally in Sect. 2.1 below that equation (9) is energy critical for $L^1(\mathbb{T}^d)$ and energy supercritical for $L^r(\mathbb{T}^d)$, if $r \in (1,\infty)$. This argument suggests that no standard $L^p$-theory can be applied, and indeed in Sect. 2.2 below we derive an energy estimate for solutions with initial data in the space $\mathrm{Ent}_\Phi(\mathbb{T}^d)$ defined in (4). This estimate will be the basis for Definition 3 below, where we present the definition of a renormalized kinetic solution.
We observe in particular that the formal estimates obtained in Sect. 2.2 are significantly weaker than are required to apply standard techniques based on the entropy or kinetic formulation of the equation (see [11,Sect. 2]). This can be seen on the level of the parabolic defect measure (see (15) below), which is neither globally integrable nor decaying at infinity. The proof of uniqueness therefore requires new techniques to control errors at infinity, and the proof of existence is based on a compactness argument that requires optimal estimates for the solution.
It follows that $\tilde\rho$ solves the equation We are interested in understanding the effect of this scaling on the balance between the parabolic and hyperbolic terms. We preserve the diffusion by fixing $\frac{\tau}{\eta^2\lambda^{m-1}} = 1$, and for $r \in [1,\infty)$ we preserve the $L^r(\mathbb{R}^d)$-norm of the initial data by fixing $\lambda = \eta^{d/r}$. It then follows from (10) that To ensure that this norm does not diverge as $\eta \to 0$, we require that If $p = q = 2$, we conclude that $\frac{d}{2r} \ge \frac{d}{2}$ and therefore that $r = 1$. Conversely, since the left-hand side of this equality is largest for $r = 1$, and since the case $r = 1$ yields the inequality we conclude that $p = q = 2$ is critical for $L^1(\mathbb{T}^d)$ and supercritical for $L^r(\mathbb{R}^d)$, for every $r \in (1,\infty)$.
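The scaling computation can be written out as follows. This is only a sketch under stated assumptions: the model case $\Phi(\xi) = \xi^m$ on the whole space, and a rescaling $\tilde\rho(x,t) = \lambda\rho(\eta x, \tau t)$, which is our reading of the normalization $\lambda = \eta^{d/r}$ used above. The rescaled control $\tilde g$ is notation introduced here for illustration.

```latex
% Assumptions (ours): \Phi(\xi) = \xi^m, whole space R^d,
% \tilde\rho(x,t) = \lambda \rho(\eta x, \tau t), where \rho solves
% \partial_t \rho = \Delta(\rho^m) - \nabla\cdot(\rho^{m/2} g).
\begin{align*}
  \partial_t\tilde\rho
    &= \frac{\tau}{\eta^{2}\lambda^{m-1}}\,\Delta\big(\tilde\rho^{\,m}\big)
       - \nabla\cdot\big(\tilde\rho^{\,m/2}\,\tilde g\big),
  \qquad
  \tilde g(x,t) = \frac{\tau\,\lambda^{1-\frac{m}{2}}}{\eta}\; g(\eta x,\tau t),\\[2pt]
  \|\tilde\rho(\cdot,0)\|_{L^r(\mathbb{R}^d)}
    &= \lambda\,\eta^{-\frac{d}{r}}\,\|\rho(\cdot,0)\|_{L^r(\mathbb{R}^d)},\\[2pt]
  \|\tilde g\|_{L^p_tL^q_x}
    &= \tau^{1-\frac{1}{p}}\,\lambda^{1-\frac{m}{2}}\,\eta^{-1-\frac{d}{q}}\,
       \|g\|_{L^p_tL^q_x}.
\end{align*}
% Insert \tau = \eta^2 \lambda^{m-1}, \lambda = \eta^{d/r} and p = q = 2.
% The total power of \eta in front of \|g\| is
\[
  \Big(1 + \frac{d(m-1)}{2r}\Big) + \frac{d}{r}\Big(1 - \frac{m}{2}\Big) - 1 - \frac{d}{2}
  \;=\; \frac{d}{2r} - \frac{d}{2}.
\]
% The norm stays bounded as \eta \to 0 iff d/(2r) - d/2 >= 0, i.e. iff r <= 1.
```

Under these assumptions the exponent vanishes exactly for $r = 1$, so the $L^2_{t,x}$-norm of the control is scale invariant, and it is negative for $r > 1$, so the norm of the rescaled control blows up on small scales. Note that the dependence on $m$ cancels.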

A-priori estimates
In this section, we will motivate the definition of a renormalized kinetic solution of the skeleton equation (see Definition 3 below). This definition is the foundation of the existence and uniqueness theory to follow. We will first derive a formal energy estimate for the solution, and thereby identify the correct energy space for the initial data.
We will restrict attention to nonnegative initial data, which is a necessary assumption for the following estimates to be true (see Remark 1 below). After testing the equation with the composition $\psi(\rho)$, The nonnegativity of $\rho$, Hölder's inequality, and Young's inequality prove that To close the estimate we require that $\Phi(\xi)\psi'(\xi)^2 \le \Phi'(\xi)\psi'(\xi)$ and hence that $\psi'(\xi) \le \frac{\Phi'(\xi)}{\Phi(\xi)}$. We therefore fix $\psi(\xi) = \log(\Phi(\xi))$ and define $\Psi_\Phi(\xi) = \int_0^\xi \log(\Phi(\xi'))\,d\xi'$. The formal estimate (11) then follows from the identity $2\Phi^{1/2}(\rho)\nabla\Phi^{1/2}(\rho) = \Phi'(\rho)\nabla\rho$.

Remark 1 Estimate (11) is in general false for signed initial data, which can be seen for the heat equation. Since $\rho(x,t) = x$ solves the heat equation with linear initial data, and since x , we conclude after localizing this argument to the torus.
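The formal energy estimate of this section can be spelled out in a few lines. The computation below is a sketch that assumes a smooth, strictly positive solution of $\partial_t\rho = \Delta\Phi(\rho) - \nabla\cdot(\Phi^{1/2}(\rho)g)$ on the torus, with $\psi = \log\Phi$ and $\Psi_\Phi' = \psi$. The explicit constants come from one particular use of Young's inequality and are not claimed to match those of estimate (11).

```latex
% Formal computation; smoothness and positivity of \rho are assumed.
% Uses \psi' = \Phi'/\Phi and \nabla\Phi^{1/2}(\rho) = \frac{\Phi'(\rho)}{2\Phi^{1/2}(\rho)}\nabla\rho.
\begin{align*}
  \frac{d}{dt}\int_{\mathbb{T}^d}\Psi_\Phi(\rho)\,dx
    &= \int_{\mathbb{T}^d}\psi(\rho)\,\partial_t\rho\,dx \\
    &= -\int_{\mathbb{T}^d}\Phi'(\rho)\psi'(\rho)\,|\nabla\rho|^2\,dx
       + \int_{\mathbb{T}^d}\Phi^{1/2}(\rho)\psi'(\rho)\; g\cdot\nabla\rho\,dx \\
    &= -4\int_{\mathbb{T}^d}\big|\nabla\Phi^{1/2}(\rho)\big|^2\,dx
       + 2\int_{\mathbb{T}^d} g\cdot\nabla\Phi^{1/2}(\rho)\,dx \\
    &\le -2\int_{\mathbb{T}^d}\big|\nabla\Phi^{1/2}(\rho)\big|^2\,dx
       + \frac12\int_{\mathbb{T}^d}|g|^2\,dx .
\end{align*}
% The last step is Young's inequality 2ab <= 2a^2 + b^2/2.
% Integrating in time over [0,T]:
\[
  \int_{\mathbb{T}^d}\Psi_\Phi(\rho)\,dx\,\Big|_{t=0}^{t=T}
  + 2\int_0^T\!\!\int_{\mathbb{T}^d}\big|\nabla\Phi^{1/2}(\rho)\big|^2\,dx\,dt
  \;\le\; \frac12\int_0^T\!\!\int_{\mathbb{T}^d}|g|^2\,dx\,dt .
\]
```

The choice $\psi = \log\Phi$ is what turns both the dissipation and the control term into expressions in $\nabla\Phi^{1/2}(\rho)$, which is why the estimate closes using only the $L^2_{t,x}$-norm of $g$.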

Remark 2
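We observe that estimate (11) is based on the physical entropy of the initial data in the case that $\Phi(\xi) = \xi^m$, for some $m \in (0,\infty)$. In this case, the antiderivative $\Psi_\Phi(\xi) = \int_0^\xi \log(\Phi(\xi'))\,d\xi'$ of $\log(\Phi)$ is given by $\Psi_\Phi(\xi) = m(\xi\log(\xi) - \xi)$, from which it follows from the preservation of the $L^1_x$-norm that the physical entropy is a nondecreasing function of time.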

Renormalized kinetic solutions
In this section, we will define renormalized kinetic solutions to the skeleton equation (9). Based on estimate (11), we first rewrite the equation in the form On this level, the criticality of the equation can be seen by analyzing the integrability of the products (12), which are formally $L^1_{t,x}$-integrable. Indeed, even in one dimension, embedding theorems do not readily yield an improvement because they do not improve the integrability in time.
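One rewriting consistent with the energy estimate of Sect. 2.2 is sketched below. It assumes the skeleton equation has the form $\partial_t\rho = \Delta\Phi(\rho) - \nabla\cdot(\Phi^{1/2}(\rho)g)$ and uses only the chain rule; it is an illustration of the borderline integrability and may differ in presentation from the displays of the original text.

```latex
% Chain rule: \nabla\Phi(\rho) = 2\,\Phi^{1/2}(\rho)\,\nabla\Phi^{1/2}(\rho).
\[
  \partial_t\rho
  \;=\; 2\,\nabla\cdot\big(\Phi^{1/2}(\rho)\,\nabla\Phi^{1/2}(\rho)\big)
        \;-\; \nabla\cdot\big(\Phi^{1/2}(\rho)\,g\big).
\]
% Both fluxes are products of two factors that are at best square integrable
% in space-time, so Cauchy-Schwarz gives (formally) no more than L^1_{t,x}:
\[
  \big\|\Phi^{1/2}(\rho)\nabla\Phi^{1/2}(\rho)\big\|_{L^1_{t,x}}
    \le \big\|\Phi^{1/2}(\rho)\big\|_{L^2_{t,x}}\big\|\nabla\Phi^{1/2}(\rho)\big\|_{L^2_{t,x}},
  \qquad
  \big\|\Phi^{1/2}(\rho)\,g\big\|_{L^1_{t,x}}
    \le \big\|\Phi^{1/2}(\rho)\big\|_{L^2_{t,x}}\|g\|_{L^2_{t,x}} .
\]
```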
The borderline integrability of the products (12) and the lack of regularity for the solution make classical techniques untenable and suggest the necessity of a generalized solution theory. We therefore pass to the equation's kinetic formulation. The kinetic function $\chi : \mathbb{R}^2 \to \mathbb{R}$ is defined by $\chi(s,\xi) = \mathbf{1}_{\{0<\xi<s\}} - \mathbf{1}_{\{s<\xi<0\}}$, and the kinetic function $\chi$ of $\rho$ is defined by $\chi(x,\xi,t) = \chi(\rho(x,t),\xi)$.

The identities (13) show formally that the kinetic function $\chi$ of $\rho$ satisfies the equation (14), which involves the parabolic defect measure (15). In view of Sect. 2.2, this measure will be neither finite nor decaying at infinity. We then formally rewrite (14) in the conservative form (16) and apply the distributional identities (13) to formally derive equation (18) below. Finally, we observe that in the following definition of a renormalized kinetic solution, the formal estimates of Sect. 2.2 are implicit in (17). In particular, while we prove the uniqueness of solutions for nonnegative initial data in $L^1(\mathbb{T}^d)$, the proof of existence and, in particular, estimate (17) will require that we consider initial data with finite entropy in the sense of (3).
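A few elementary identities show why the kinetic function is a convenient substitute for the solution itself. They follow directly from the definition $\chi(s,\xi) = \mathbf{1}_{\{0<\xi<s\}} - \mathbf{1}_{\{s<\xi<0\}}$; the test function $\psi$ below is a hypothetical $C^1$ function introduced only for illustration.

```latex
% For s, s' real and \psi in C^1(R):
\[
  \int_{\mathbb{R}}\chi(s,\xi)\,\psi'(\xi)\,d\xi \;=\; \psi(s) - \psi(0),
  \qquad
  \int_{\mathbb{R}}\big|\chi(s,\xi) - \chi(s',\xi)\big|\,d\xi \;=\; |s - s'| .
\]
% For s, s' >= 0 both kinetic functions take values in {0,1}, hence
\[
  \big|\chi(s,\xi) - \chi(s',\xi)\big|
  \;=\; \big|\chi(s,\xi) - \chi(s',\xi)\big|^2
  \;=\; \chi(s,\xi) + \chi(s',\xi) - 2\,\chi(s,\xi)\,\chi(s',\xi).
\]
```

The first identity recovers nonlinear functions of $\rho$ by integrating against $\chi$ in the velocity variable, and the second shows that the $L^1$-distance of two solutions equals the $L^1$-distance of their kinetic functions. The last identity splits that distance into single-solution terms and a mixed product term.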
is a renormalized kinetic solution of (16) with initial data ρ 0 if ρ satisfies the following two properties.
(a) We have that

Remark 4 We observe that the equality in equation (18) is satisfied due to the optimal regularity (17), which requires the nonnegativity of the initial data (see Remark 1). In general, we would only expect to obtain an inequality due to the presence of a nonnegative entropy defect measure (see [11, Sect. 2]).
The following lemma proves that renormalized kinetic solutions in the sense of Definition 3 satisfy an integration by parts formula on the level of their kinetic functions.
loc ((0, ∞)) be nondecreasing, let ρ : T d → R be measurable, let χ be the kinetic function of ρ, and assume that Proof The proof is a small modification of [11, Appendix A] and follows from the change of variables formula and the fact that is nondecreasing.

Uniqueness of renormalized kinetic solutions
In this section, we will prove the uniqueness of renormalized kinetic solutions in the sense of Definition 3 for nonlinearities $\Phi$ that satisfy Assumption 6 below. The proof of uniqueness is significantly complicated by the fact that the parabolic defect measure is neither globally integrable nor decaying at infinity with respect to the velocity variable $\xi \in \mathbb{R}$. It is for this reason that we introduce a cutoff in velocity. Lemma 7 below is used to control the error terms that arise when removing this cutoff function. We then prove the uniqueness of renormalized kinetic solutions in Theorem 8 below, the starting point for which is based on the techniques of [11, Theorem 1.1, Sect. 4].
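The removal of the velocity cutoff rests on an elementary fact about summable sequences: for positive numbers $a_n$, if $\sum_{n=1}^\infty a_n < \infty$ then $\liminf_{n\to\infty}(n a_n) = 0$. A short proof sketch by contradiction, with the constants $c$ and $N$ introduced here for illustration, reads as follows.

```latex
% Suppose the liminf is strictly positive (possibly infinite) and pick c below it.
\[
  \liminf_{n\to\infty} n\,a_n \;>\; c \;>\; 0
  \;\Longrightarrow\;
  a_n \ge \frac{c}{n}\ \text{ for all } n \ge N
  \;\Longrightarrow\;
  \sum_{n\ge N} a_n \;\ge\; c\sum_{n\ge N}\frac{1}{n} \;=\; \infty ,
\]
% contradicting \sum_n a_n < \infty. Hence \liminf_{n\to\infty} n a_n = 0.
```

In particular, there is a subsequence along which $n a_n \to 0$, which is the form in which such a statement is typically applied to error terms indexed by a cutoff level.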
Proof The proof of this claim is a straightforward consequence of the fact that, for a sequence of positive numbers $a_n$, if $\sum_{n=1}^\infty a_n < \infty$ then $\liminf_{n\to\infty}(n a_n) = 0$.

(9) in the sense of Definition 3 with control $g$ and with initial data $\rho^1_0, \rho^2_0$, then

Proof In order to simplify the notation, we will write the proof in the porous media and fast diffusion case $\Phi(\xi) = \xi^m$, for some $m \in (0,\infty)$, and observe in the proof where the more general conditions of Assumption 6 are needed. For the kinetic function $\chi^i$ of $\rho^i$, we will write $\chi^i_t(x,\xi) = \chi^i(x,\xi,t)$ and we will make similar conventions for the defect measure $p^i_t$ and all other time-dependent functions or measures appearing in the proof. For every $\varepsilon \in (0,1)$ let $\kappa^\varepsilon$ be a standard convolution kernel on $\mathbb{T}^d$ of scale $\varepsilon$, for every $\delta \in (0,1)$ let $\kappa^\delta$ be a standard convolution kernel on $\mathbb{R}$ of scale $\delta$, and for every ε, δ where in what follows the convolution kernel $\kappa^{\varepsilon,\delta}$ will play the role of the test function in Definition 3. It is also necessary to introduce a cutoff in the velocity variable. In what follows, when necessary we will write $(x,\xi) \in \mathbb{T}^d\times\mathbb{R}$ for the variables inside the convolutions, and we will write $(y,\eta) \in \mathbb{T}^d\times\mathbb{R}$ for the outer integration variables. The fact that the kinetic function is $\{0,1\}$-valued proves that Let $\kappa^{\varepsilon,\delta}_i$ be defined by and observe from the equation satisfied by $\rho^i$ that, as distributions on $\mathbb{T}^d\times\mathbb{R}\times(0,T)$, It then follows from Lemma 5 and the symmetry of the convolution kernel that We define (22) We will analyze the terms involving the sgn function and the mixed term separately. For every $i \in \{1,2\}$ let In what follows, we will let $t \in [0,T]$, $M \in (0,\infty)$, $\varepsilon \in (0,1)$, and $\delta \in (0,(2M)^{-1})$ be arbitrary. In particular, this choice of $\delta$ guarantees that In the argument, we will first let $\varepsilon \to 0$, then $\delta \to 0$, and then $M \to \infty$, which is consistent with this choice.

The sign terms. We will first analyze the sgn terms (23), and we will first consider the case $i = 1$.
It follows from (21) that, as distributions on $(0, T)$, The symmetry of the convolution kernel and the distributional equality $\partial_\xi \mathrm{sgn} = 2\delta_0$ prove that The case $i = 2$ is identical and this completes the initial analysis of the sgn terms.

The mixed term. As distributions on $(0, T)$, It follows from (21) that Since the integration by parts formula of Lemma 5 proves that and since we have the distributional equality We obtain an identical formula for $I^{\varepsilon,\delta,M}_{t,2,\mathrm{mix}}$ after swapping the roles of $i \in \{1, 2\}$, and this completes the initial analysis of the mixed term.
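As a reading aid, the split into sgn terms and a mixed term can be traced back to an elementary identity for kinetic functions. The sketch below assumes, as in the nonnegative setting used later in the text, that $\chi^i = \mathbf{1}_{\{0 < \xi < \rho^i\}}$; it is offered for orientation only and does not reproduce the velocity cutoff or the convolutions of the actual proof.

```latex
% Assumption for this sketch: \chi^i(x,\xi,t) = 1_{\{0 < \xi < \rho^i(x,t)\}}, \rho^i \ge 0.
% Each \chi^i is \{0,1\}-valued and supported in \{\xi > 0\}, so
% (\chi^i)^2 = \chi^i = \chi^i \,\mathrm{sgn}(\xi), and therefore
\[
  |\chi^1 - \chi^2|^2
  = \chi^1\,\mathrm{sgn}(\xi) + \chi^2\,\mathrm{sgn}(\xi) - 2\chi^1\chi^2 .
\]
% Integrating in the velocity variable recovers the L^1-distance pointwise in (x,t):
\[
  \int_{\mathbb{R}} |\chi^1 - \chi^2|^2 \, d\xi
  = \rho^1 + \rho^2 - 2\min(\rho^1,\rho^2)
  = |\rho^1 - \rho^2| .
\]
```

The first two summands are what the text calls the sgn terms, and the product $\chi^1\chi^2$ is the mixed term.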
The full derivative. We will decompose the full derivative of (22) defined by (23), (25), (26), and (27) into the four terms defined by the parabolic term the hyperbolic term the term involving the control and the term defined by the cutoff The four terms on the righthand side of (28) will be handled separately.
The parabolic terms. After adding and subtracting 2(ρ 1 ρ 2 ) 1−m 2 and using that the parabolic term defined in (29) satisfies The definition of the parabolic defect measures, Hölder's inequality, and Young's inequality prove that and it therefore follows from (33) that It then follows from a direct computation using (24), where in the general case we would use here the local 1 /2-Hölder continuity of and the local boundedness of and away from zero and infinity on compact subsets of (0, ∞), that for some c ∈ (0, ∞) depending on M, and therefore, for some c ∈ (0, ∞) depending on M, The boundedness of δκ δ in δ ∈ (0, 1), the supports of κ δ and ζ M , Hölder's inequality, and Young's inequality prove that, for some c ∈ (0, ∞) depending on M, The H 1 x -regularity of (ρ i ) m 2 and the dominated convergence theorem then prove that which completes the analysis of the parabolic terms. The hyperbolic terms. A similar analysis relying on the supports of κ δ and ζ M , the local Lipschitz continuity of and therefore 1 2 on (0, ∞), (24), and the boundedness of δκ δ in δ ∈ (0, 1) proves that the hyperbolic terms (30) satisfy, for some c ∈ (0, ∞) depending on M ∈ (0, ∞), Hölder's inequality and Young's inequality then prove that lim sup and the L 2 t,x -integrability of g, the H 1 x -regularity of (ρ i ) m 2 , and the dominated convergence theorem prove that which completes the analysis of the hyperbolic terms.
The control terms. It follows from (24) and the choice of δ and M that there exists c ∈ (0, ∞) depending on M such that and therefore the control terms (31) satisfy The L 2 t,x -integrability of g and H 1 x -regularity of (ρ i ) m 2 then prove that lim sup and, therefore, which completes the analysis of the control terms.
The cutoff in velocity. It follows from the definition of the parabolic defect mea- It is at this point that we would use assumption (19) in the general case to bound ( ) −1 . It then follows from the boundedness of |η∂ η ζ M | for η ∈ (0, 1), the support of ∂ η ζ M , Hölder's inequality, and Young's inequality that, for c which completes the analysis of the cutoff in velocity.
The conclusion. Returning to (20), it follows from (34), (35), (36), and (37) that, for c ∈ (0, ∞) independent of M ∈ (0, ∞), For the final term on the righthand side of (38), the dominated convergence theorem proves that For the second term on the righthand side of (38), a straightforward application of Lemma 7 applied to the functions Returning to (38), we conclude from (39) and (40) that, after taking M → ∞, which completes the proof.
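The summability fact quoted in the proof of Lemma 7, which is what allows the cutoff to be removed along a suitable sequence, admits a short argument by contradiction. The following is a sketch of that standard argument and is not taken from the source.

```latex
% Claim: if a_n > 0 and \sum_{n \ge 1} a_n < \infty, then \liminf_{n \to \infty} (n a_n) = 0.
% Suppose instead that \liminf_{n \to \infty} (n a_n) > 0 (possibly +\infty).
% Then there exist c > 0 and N \in \mathbb{N} with n a_n \ge c for all n \ge N, so
\[
  \sum_{n=N}^{\infty} a_n \;\ge\; c \sum_{n=N}^{\infty} \frac{1}{n} \;=\; \infty ,
\]
% which contradicts the summability of (a_n).
```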

Equivalence of renormalized kinetic solutions and weak solutions
In this section, we will prove in Theorem 14 that, for nonlinearities satisfying the additional Assumption 10, renormalized kinetic solutions (see Definition 3) are equivalent to classical weak solutions (see Definition 11 below). Assumption 10 is broken into two cases, and for the model example $\Phi(\xi) = \xi^m$ an explicit computation proves that case (i) applies to $m \in [1, 2]$ and case (ii) applies to $m \in [2, \infty)$. The equivalence of renormalized kinetic solutions and classical weak solutions is used in Proposition 21 below to prove that a weakly convergent sequence of controls induces a strongly convergent sequence of solutions. This fact is not obvious for renormalized kinetic solutions, since the fourth term on the righthand side of (18) will contain what is in general the product of a weakly convergent gradient and a weakly convergent control.
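For the model example, the threshold $m = 2$ between the two cases matches the change from concavity to convexity of $\Phi^{1/2}$, which is the property the two propositions of this section exploit. The computation below is a sketch for $\Phi(\xi) = \xi^m$ only; Assumption 10 may impose further conditions that are not reproduced here.

```latex
% Model case: \Phi(\xi) = \xi^m, so \Phi^{1/2}(\xi) = \xi^{m/2} on (0,\infty).
\[
  \frac{d^2}{d\xi^2}\, \xi^{m/2}
  = \frac{m}{2}\Big(\frac{m}{2} - 1\Big)\, \xi^{\frac{m}{2} - 2} .
\]
% For m \in [1,2] the prefactor is \le 0, so \Phi^{1/2} is concave (case (i));
% for m \in [2,\infty) the prefactor is \ge 0, so \Phi^{1/2} is convex (case (ii)).
```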
(ii) We have that, for every t ∈ [0, T ], for every ψ ∈ C ∞ (T d ), It is essentially clear that a renormalized kinetic solution is a weak solution. In order to prove the converse statement, let ρ be a weak solution in the sense of Definition 11, let ψ ∈ C ∞ c (T d × (0, ∞)), and let ∂ ξ (x, ξ ) = ψ(x, ξ ). The composition (x, ρ ε ), for the spatial convolution ρ ε = ρ * κ ε , is then a distributional solution of the equation (44) Both terms on the righthand side of (44) will be treated identically using only the L 2 t,x -integrability of g and ∇ 1 2 (ρ) respectively. We therefore let F ∈ L 2 (T d × [0, T ]) d and observe that, after integrating by parts in space, The L 1 t,x -integrability of the product 1 2 (ρ)F and boundedness of ∇ x ψ make it relatively straightforward to pass to the ε → 0 limit in the first term on the righthand side of (45). The second term is more difficult even if ρ is H 1 x -valued, due to the fact that the unboundedness of 1 2 (ρ) means that ( t,x -strongly to its limit 1 2 (ρ)F . In fact, since our analysis holds in an arbitrary dimension, this limit effectively holds only L 1 t,x -strongly and not better. In order to improve this convergence, we observe that the compact support of the test function ψ yields an L ∞ t,x -bound for the convolution ρ ε . The goal of our analysis is to transfer the L ∞ t,x -bound for the convolution to an L ∞ t,x -bound for ρ itself, and by doing so to establish the L 2 t,x -strong convergence of the convolutions ( 1 2 (ρ)F * κ ε ) away from the vanishing set that ρ is large. The argument is complicated by the fact that ρ itself is not regular, and we must therefore exploit the regularity of 1 2 (ρ). For this reason, we separate the analysis into two cases, depending on whether 1 2 is concave or convex.
We will first prove that the first term on the righthand side of (46) vanishes as ε → 0. In effect, this is what transfers the L ∞ -bound for ρ ε to ρ in the ε → 0 limit. The first term of (46). We aim to show that Hölder's inequality proves that We will first prove that the final term on the righthand side of (48) remains bounded.
For every x ∈ T d and k ∈ N, for the indicator function Since the map ξ → ξ m 2 is concave, or more generally since 1 2 is concave, Jensen's inequality proves that Therefore, for every (y, t) ∈ Supp(1 k,k+1 ψ(· + x, ρ ε (· + x, ·))), It follows from (49) and (50) that and it follows from the change of variables formula, the fact that is increasing, the definition of A k , and (51) that there exists c ∈ (0, ∞) depending on the L ∞ -norm of ψ such that It remains to estimate the measure of the sets B ε k,x . It follows from Chebyshev's inequality, Jensen's inequality, and the fundamental theorem of calculus that for κ ε 1 k,k+1 = κ ε (y + x − y )1 k,k+1 (y, t) in the final line. Since the triangle inequality proves that |y − y | ≤ |x| + ε on the support of the convolution kernel, it follows from (53) that Returning to (52), it follows from (54) that, for c ∈ (0, ∞) depending on M, Returning to (52), it follows from the support of the convolution kernel and (55) that Returning to (48), it follows from (56) that there exists c ∈ (0, ∞) such that Since for c 0 = T d |∇κ(x)| the functions {c −1 0 |ε∇κ ε |} ε∈(0,1) are a Dirac sequence on T d and since ψ ∈ C ∞ c (T d × (0, ∞)), it follows from the definition of A 1 and the dominated convergence theorem that which in combination with (57) proves that which completes the proof of (47). The second term of (46). We will prove that the second term on the righthand side of (46) satisfies After integrating by parts, For the first term on the righthand side of (46), the L 1 -strong convergence of (ρ m 2 F * κ ε ) to ρ m 2 F , the boundedness of ∇ x ψ , the triangle inequality, and the dominated convergence theorem prove that It remains to treat the second term. In order to treat the potential degeneracy of ρ, it is necessary to replace the convolution of ∇ρ with an appropriate convolution of ∇ρ m 2 . For this, we make a preliminary calculation. 
In the general case, we let ( 1 2 ) −1 denote the function inverse to 1 2 , and use the equality d dξ together with assumption (41) to deduce that ( In the case (ξ ) = ξ m , equality (62) shows that the second term on the righthand side of (46) satisfies The first convolution will be decomposed in terms of A 0 and A 1 . We will first prove that In the general case, we will use Hölder's inequality and the nonnegativity of ρ to conclude that there exists c ∈ (0, ∞) such that, for p ∈ [2, ∞) as in (41), Since ρ ε (x, t) is bounded on the support of ψ(x, ρ ε (x, t)), it follows from Hölder's inequality and the boundedness of ρ on the set A 0 that there exists c ∈ (0, ∞) depending on M such that (65) is bounded by A calculation identical to (60) using the definition of A 1 proves that In combination (66) and (67) complete the proof of (64). It remains to treat the term involving the integral over A 0 , which is Since m ∈ [1, 2] implies that ρ 1− m 2 and ρ are bounded on the support of 1 A 0 , where in general we will use (41) to conclude that both In combination (60), (63), (64), and (68) complete the proof of (59). In combination,   (0, ∞)), and let κ ε be standard convolution kernel of scale ε on T d . Then, for ρ ε = ρ * κ ε , for every t ∈ [0, T ], Proof The proof is similar to Proposition 12, and to simplify the notation we will write the proof in the case (ξ ) = ξ m , for m ∈ [2, ∞), and indicate where the general version of (ii) in Assumption 10 is used. However, in order to exploit the convexity of the nonlinearity, it is necessary to change the definition of the sets As in the proof of Proposition 13, we will first effectively transfer the L ∞ -bound for ρ ε to ρ by proving that the first term on the righthand side of (69) vanishes as ε → 0. The first term on (69). We aim to show that Hölder's inequality proves that We will first prove that the final term on the righthand side of (71) remains bounded. 
Following the identical derivation in Proposition 12, we have that, for $\mathbf{1}_{k,k+1}$ the indicator function of the set For every $x \in \mathbb{T}^d$ and $k \in \mathbb{N}$ let $B^\varepsilon_{k,x} = \{(y, t) \in \mathbb{T}^d \times [0, T] : |\rho(y, t) - \rho^\varepsilon(y + x, t)|\,\mathbf{1}_{k,k+1}(y, t) \ge k\}$.
Since it follows by definition of the sets B ε k,x and A k that it follows from (72) and (73) that, for c ∈ (0, ∞) depending on ψ, It remains to estimate the measure of the sets B ε k,x . Chebyshev's inequality and Jensen's inequality prove that At this step, in the general case since the convexity proves that the derivative of the inverse of Returning to (75) and applying (77) to (ξ ) = ξ m , it follows from the definition of the sets A k and m ∈ [2, ∞) that, for κ ε 1 k,k+1 = κ ε (y + x − y )1 k,k+1 (y, t), It follows from Jensen's inequality and the fundamental theorem of calculus that Since on the support of the convolution kernel we have |y − y | ≤ |x| + ε, it follows from (78) and (79) that there exists c ∈ (0, ∞) such that k+1 (y, t). (80) Returning to (74), where in the general case assumption (42) is used here to control the growth of (M + k + 1) (M + k) −1 , it follows from (80) Since |x| ≤ ε on the support of the convolution kernel, it follows from (74) and (81) that, for every ε ∈ (0, 1), Then, returning to (71), it follows from (82) that A repetition of the argument leading to (58) proves that lim sup which with (83) completes the proof of (70).
The second term of (69). We will now prove that We first observe after integrating by parts that this term satisfies The first term on the righthand side of (84) is treated identically to the case of (60) and satisfies We will now treat the second term on the righthand side of (84). Since the solution is not in general $H^1_x$-regular, we introduce a cutoff in the velocity variable to isolate the potential singularities of $\nabla\rho$ on the set $\{\rho \simeq 0\}$. Let $\phi : \mathbb{R} \to [0, 1]$ be defined by $\phi(\xi) = 1$ if $|\xi| \le 1$, $\phi(\xi) = 2 - |\xi|$ if $1 \le |\xi| \le 2$, and $\phi(\xi) = 0$ if $|\xi| > 2$, let $\phi_\eta(\xi) = \phi(\xi/\eta)$ for every $\eta \in (0, 1)$, and observe that The integration by parts formula of Lemma 5 proves that the final term of (85) satisfies, for every ( It then follows from (85) and (86) that The two terms on the righthand side of (87) will be treated separately.
We will first prove that the first term on the righthand side of (87) satisfies In the general case, we use the support of φ η and assumption (43) to conclude that is bounded by a constant depending on η, that 1 2 (ρ) is bounded on A 0 , and that assumption (19) implies that lim ξ →0 + (ξ )( (ξ )) −1 = 0. Returning to (88) for the case (ξ ) = ξ m , these considerations and the dominated convergence theorem prove that which completes the proof of (88). It remains to treat the second term of (87), for which we will prove that lim η,ε→0 We commute the convolution on ρ m 2 using the decomposition The first two terms on the righthand side of (91) are treated almost identically. We first observe using the definition of the convolution kernel and the fundamental theorem of calculus that, for c ∈ (0, ∞) depending on ψ, Since it follows from the support of φ η that R φ η (ξ )χ(z, ξ, t) dξ ≤ 2η, it follows from (92), Hölder's inequality, and the supports of the convolution kernels that lim sup The second term on the righthand side of (91) is treated identically, and it follows from (93) that both the first and second terms on the righthand side of (91) vanish as η, ε → 0. It remains to treat the third term on the righthand side of (91). Let φ η (ξ ) = ξ 0 φ η (ξ ) dξ . Since χ = 1 {0<ξ <ρ} the third term on the righthand side of (91) satisfies, after integrating in ξ , Since in the general case we have that ρ(z, t))), it follows from the H 1 x -regularity of 1 2 (ρ) applied to the specific case ρ m 2 and (61) that (94) satisfies, after integrating by parts in the convolution, In the general case, we use the fact that 0 ≤ φ η ≤ 2η and assumption (19) to conclude that, for some c ∈ (0, ∞) independent of η ∈ (0, 1), In the case (ξ ) = ξ m , these considerations and Hölder's inequality prove that there exists c ∈ (0, ∞) independent of ε, η ∈ (0, 1) such that (95) satisfies which vanishes in the η → 0 limit. In combination (91), (93), and (96) complete the proof of (90). 
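The two bounds involving the velocity cutoff that are used above, namely $\int_{\mathbb{R}} \phi_\eta \chi \, d\xi \le 2\eta$ and the bound by $2\eta$ on the primitive of $\phi_\eta$, follow from the support and range of $\phi$ alone. In the sketch below the symbol $\bar\phi_\eta$ for the primitive is notation assumed for illustration.

```latex
% Uses only: 0 \le \phi \le 1, \phi = 0 outside [-2,2], \phi_\eta(\xi) = \phi(\xi/\eta),
% and \chi = 1_{\{0 < \xi < \rho\}} \in \{0,1\}.
\[
  0 \le \int_{\mathbb{R}} \phi_\eta(\xi)\, \chi(z,\xi,t)\, d\xi
  \le \int_0^{2\eta} 1 \, d\xi = 2\eta ,
\]
% and, writing \bar\phi_\eta for the primitive of \phi_\eta vanishing at zero,
\[
  0 \le \bar\phi_\eta(\xi) = \int_0^{\xi} \phi_\eta(\xi')\, d\xi' \le 2\eta
  \qquad \text{for every } \xi \ge 0 .
\]
```

Both quantities therefore vanish uniformly as $\eta \to 0$, which is what the limits in (93) and (96) rely on.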
In combination (87), (89), and (90)  Proof It is straightforward to see that Definition 3 implies Definition 11. We will therefore only prove that Definition 11 implies Definition 3. Assume that ρ is a weak solution and let χ denote the kinetic function of ρ and let p denote the parabolic defect measure. For every ε ∈ (0, 1) let κ ε : T d → R be a standard symmetric convolution kernel of scale ε ∈ (0, 1) and let ρ ε = (ρ * κ ε ). Let ψ ∈ C ∞ c (T d × R) and let (x, ξ ) = ξ 0 ψ(x, ξ ) dξ . It follows that ρ ε ∈ W 1,1 ([0, T ]; L 1 (T d )) with derivative and that (ρ ε ) ∈ W 1,1 ([0, T ]; L 1 (T d )) with derivative given by (46). Therefore, for every t ∈ [0, T ], ρ ε (x, t)).
We pass to the limit $\varepsilon \to 0$ using Proposition 12 or Proposition 13, depending on which case of Assumption 10 is satisfied, applied once with $F = \nabla\Phi^{1/2}(\rho)$ and once with $F = g$. This completes the proof.
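The link between the kinetic formulation and the composition with $\rho$ used in this proof is the following elementary identity. It is written here under the assumptions that $\chi = \mathbf{1}_{\{0 < \xi < \rho\}}$ and that the primitive of $\psi$ in the velocity variable is denoted $\Psi$; the name $\Psi$ is chosen for this sketch only.

```latex
% Assumptions: \rho \ge 0, \chi(x,\xi,t) = 1_{\{0 < \xi < \rho(x,t)\}},
% and \Psi(x,\xi) = \int_0^{\xi} \psi(x,\xi')\, d\xi'.
\[
  \int_{\mathbb{R}} \psi(x,\xi)\, \chi(x,\xi,t)\, d\xi
  = \int_0^{\rho(x,t)} \psi(x,\xi)\, d\xi
  = \Psi\big(x, \rho(x,t)\big) .
\]
```

Testing the kinetic equation against $\psi$ therefore amounts to writing an equation for the composition $\Psi(x, \rho)$, which is why the convergence of $\Psi(x, \rho^\varepsilon)$ is the quantity tracked in the two propositions.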

Existence of weak solutions
In this section, we will prove that there exists a weak solution to the skeleton equation for controls $g \in L^2(\mathbb{T}^d \times [0, T])^d$ and initial data $\rho_0 \in \mathrm{Ent}_\Phi(\mathbb{T}^d)$ in the sense of Definition 11. The existence of renormalized kinetic solutions (see Definition 3) is then a consequence of Theorem 14. We emphasize that these methods can also be used to prove the existence of renormalized kinetic solutions in the sense of Definition 3 directly, but we omit the details since this more general result will not be used here.

Preliminaries
In this section, we introduce in Assumption 15 conditions on $\Phi$ that will be used to guarantee the regularity and compactness of approximate solutions. This assumption is satisfied by every fast diffusion and porous media nonlinearity. Lemma 16 and Lemma 17 collect two important consequences of Assumption 15.
Proof The lemma relies on part (i) of Assumption 15 and is a straightforward consequence of the Sobolev embedding theorem and an interpolation estimate. The details can be found in [42,Lemma 5.4].

Regularity and existence of solutions
In this section, we will construct a solution to (97) in Proposition 20 based on the entropy dissipation estimate of Proposition 19. The estimate will be obtained for initial data with finite entropy in the sense of Definition 18, and the proof relies on testing the equation with $\log(\Phi(\rho))$, which relies essentially on the nonnegativity of the initial data. The existence of solutions is then a consequence of Lemma 17.

Proposition 19
Let $T \in (0, \infty)$, let $\Phi$ satisfy Assumptions 6, 10, and 15, let $g \in L^2(\mathbb{T}^d \times [0, T])^d$, and let $\rho_0 \in \mathrm{Ent}_\Phi(\mathbb{T}^d)$. Then, if $\rho$ is a weak solution of (97) in the sense of Definition 11, for some $c \in (0, \infty)$,

Proof The nonnegativity of $\rho$, the $H^1_x$-regularity of $\Phi^{1/2}(\rho)$, and an approximation argument by convolution prove that, for $\Psi_{\Phi,\delta}(\xi) = 2\int_0^\xi \log(\delta + \Phi^{1/2}(\xi'))\, d\xi'$, we have that the composition $\Psi_{\Phi,\delta}(\rho)$ satisfies, for every $t \in [0, T]$, It then follows from Hölder's inequality, Young's inequality, and ξ(ξ We pass to the limit $\delta \to 0$ using the monotone convergence theorem, which completes the proof.

Proof The proof is based on constructing smooth solutions to a sequence of approximating equations for $\rho_0 \in \mathrm{Ent}_\Phi(\mathbb{T}^d)$, for $\eta_2 \in (0, 1)$, and for , and, for some $c \in (0, \infty)$ depending on $\eta_1$, and that satisfy, for every compact set $A \subseteq [0, \infty)$, It is straightforward to prove the existence of continuous $L^1$-valued solutions to (99) taking values in $L^2([0, T]; H^1(\mathbb{T}^d))$. The reason for considering such specific approximations of $\Phi^{1/2}$ is that, using the properties of these approximations, a repetition of Proposition 19 proves that the approximating solutions satisfy estimate (98) uniformly in $\eta_1$. The claim then follows by an application of Lemma 17.
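For orientation, the formal computation behind the entropy dissipation estimate can be sketched as follows. It assumes the skeleton equation in the form $\partial_t \rho = \Delta\Phi(\rho) - \nabla\cdot(\Phi^{1/2}(\rho) g)$, as it appears later in the text, and a regularized entropy whose derivative is $2\log(\delta + \Phi^{1/2})$. The symbols $\Psi_\delta$ and $w_\delta$ are shorthand introduced only for this sketch, the manipulations are formal, and the constants need not match those of (98).

```latex
% Notation assumed for this sketch only:
%   \Psi_\delta'(\xi) = 2\log(\delta + \Phi^{1/2}(\xi)),
%   w_\delta = \Phi^{1/2}(\rho) / (\delta + \Phi^{1/2}(\rho)) \in [0,1).
\[
  \nabla \Psi_\delta'(\rho) = \frac{2\,\nabla\Phi^{1/2}(\rho)}{\delta + \Phi^{1/2}(\rho)},
  \qquad
  \nabla\Phi(\rho) = 2\,\Phi^{1/2}(\rho)\,\nabla\Phi^{1/2}(\rho) .
\]
% Testing the equation with \Psi_\delta'(\rho) and integrating by parts on the torus:
\[
  \frac{d}{dt}\int_{\mathbb{T}^d} \Psi_\delta(\rho)
  = -4\int_{\mathbb{T}^d} w_\delta\, |\nabla\Phi^{1/2}(\rho)|^2
    + 2\int_{\mathbb{T}^d} w_\delta\; g\cdot\nabla\Phi^{1/2}(\rho) .
\]
% Young's inequality 2ab \le 2a^2 + b^2/2 together with 0 \le w_\delta \le 1 gives
\[
  \frac{d}{dt}\int_{\mathbb{T}^d} \Psi_\delta(\rho)
  + 2\int_{\mathbb{T}^d} w_\delta\, |\nabla\Phi^{1/2}(\rho)|^2
  \le \frac{1}{2}\int_{\mathbb{T}^d} |g|^2 .
\]
```

Since $w_\delta$ increases to $1$ on $\{\rho > 0\}$ as $\delta \to 0$, the monotone convergence theorem is the natural tool for removing the regularization.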

Weak-strong continuity
We conclude this section with the proof that a weakly convergent sequence of controls induces a strongly convergent sequence of solutions. This fact will be important in the proof of the large deviations principle below.

Proposition 21
Let $T \in (0, \infty)$, let $\Phi$ satisfy Assumptions 6, 10, and 15, let $\rho_0 \in \mathrm{Ent}_\Phi(\mathbb{T}^d)$, and let $g_n \rightharpoonup g$ weakly in $L^2(\mathbb{T}^d \times [0, T])^d$. Then, for the solutions $\rho_n$ and $\rho$ of (97) in the sense of Definition 11 with initial data $\rho_0$ and with controls $g_n$ and $g$ respectively, as $n \to \infty$, $\rho_n \to \rho$ strongly in $L^1(\mathbb{T}^d \times [0, T])$.

Proof
The proof is essentially a repetition of Proposition 20 using the fact that the weak convergence of the g n implies that they are uniformly L 2 t,x -bounded, and therefore that the solutions ρ n are relatively compact in L 1 (T d × [0, T ]) by Lemma 17. The claim then follows from the uniqueness of the limit, which follows from Theorem 8 and Theorem 14.

The uniform large deviations principle for conservative SPDE
In this section, we identify a scaling limit for which the solutions of the equation for correlated noise ξ K defined in Sect. 6.1, satisfy a uniform large deviations principle with respect to initial data in weakly compact subsets of L 1 (T d ) with bounded entropy.

The noise and well-posedness of (100)
We will prove the LDP for certain spectral approximations of space-time white noise. However, our methods apply without any essential changes to noise of a general type, such as that in Remark 30 below.

Definition 22
Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, let $\{\mathcal{F}_t\}_{t \in [0,\infty)}$ be a filtration on $(\Omega, \mathcal{F})$, and let $\{B^k, W^k\}_{k \in \mathbb{N}_0}$ be independent, $\mathcal{F}_t$-adapted, $d$-dimensional Brownian motions. For every $K \in \mathbb{N}$ we define the noise for the sum over elements $k \in \mathbb{Z}^d$ with $|k| \le K$.
We will understand (100), for $\xi^K$ as in Definition 22, in its Itô form (101) for $N_K = \#\{k \in \mathbb{Z}^d : |k| \le K\}$. To prove the large deviations result, we will also consider the controlled equation where $P_K g$ denotes the Fourier projection of a random control onto the span of We now summarize the well-posedness theory for (101) and (102). The essential observation of Theorem 23 is that the $\mathbb{P}$-a.s. $L^1_x$-contraction implies the existence of a measurable solution map, where we view the Brownian motions $(B^k, W^k)_{k \in \mathbb{N}_0}$ as taking values in the space $C([0, \infty); \mathbb{R}^\infty)$ where $\mathbb{R}^\infty$ is equipped with the metric topology of coordinate-wise convergence.

Proof
The existence of an F t -measurable solution and the almost sure L 1 -contraction estimate (103) T ]) such that, P-a.s., Since it follows from Proposition 27 below that Ent (T d ) equipped with strong L 1 (T d )-topology is separable, let {ρ n } n∈N be a countable dense subset of Ent (T d ).
It follows from (103) and the countability of the set N × N that, on a measurable subset of full probability, for every n, m ∈ N, It follows from the density of the {ρ n } n∈N with respect to the strong L 1 (T d )norm and (105) that there almost surely exists a strongly continuous function S ε,K T ]) such that, for every n ∈ N, on a subset of full probability independent of n, It then follows from (105), (106), and a simplified version of [42, Theorem 5.29, Corollary 5.31] that, for every ρ ∈ Ent (T d ), on a subset of full probability depending on ρ, We define S ε,K : from which it follows from (104) that, for every ρ ∈ Ent (T d ), and from (107) that, for almost every realization of B, In combination (108), (109), and the separability of Ent (T d ) prove that S ε,K is measurable, which with (104) and (107) completes the proof.
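The extension of the solution map from the countable dense family to all of the entropy space is a standard density argument. The sketch below assumes that the contraction estimate (103) controls the supremum in time of the $L^1(\mathbb{T}^d)$-distance between two solutions by the $L^1(\mathbb{T}^d)$-distance of their initial data; the precise norm in (103) is not reproduced here, and $S$ abbreviates the solution map for a fixed realization of the noise.

```latex
% Assumed form of (103), almost surely, for solutions \rho^1, \rho^2 with data \rho^1_0, \rho^2_0:
%   \sup_{t \in [0,T]} \|\rho^1(\cdot,t) - \rho^2(\cdot,t)\|_{L^1(\mathbb{T}^d)}
%     \le \|\rho^1_0 - \rho^2_0\|_{L^1(\mathbb{T}^d)} .
% If \rho_{n_j} \to \rho in L^1(\mathbb{T}^d) along the dense family, then for all i, j,
\[
  \sup_{t \in [0,T]} \big\| S(\rho_{n_j})(t) - S(\rho_{n_i})(t) \big\|_{L^1(\mathbb{T}^d)}
  \le \| \rho_{n_j} - \rho_{n_i} \|_{L^1(\mathbb{T}^d)} \longrightarrow 0 ,
\]
% so (S(\rho_{n_j}))_j is Cauchy and its limit defines S(\rho). The limit does not depend
% on the approximating sequence, and the extended map is again 1-Lipschitz.
```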
Proof The proof is a consequence of Theorem 23 and the Girsanov theorem (see, for example, [103,Theorem 10]).

The uniform large deviations principle
We will now use the well-posedness results of Theorem 23 and Proposition 24, the estimates of Proposition 25, and the convergence of Theorem 28 below to establish a large deviations principle for the solutions of (100) along the scaling limit identified in Proposition 26. The LDP is a consequence of the weak approach to large deviations established in [103], as well as Dupuis and Ellis [110] and Budhiraja and Dupuis [111]. Precisely, we will show that the solutions satisfy a large deviations principle with rate function I ρ 0 defined by for the skeleton equation understood in the sense of Definition 11. In fact, relying on the methods of [103, Theorem 6] and Budhiraja, Dupuis, and Salins [112, Theorem 4.3], we will prove the stronger statement that the solutions to (100) satisfy a large deviations principle uniformly with respect to weakly L 1 x -compact subsets of the initial data with bounded entropy.
The proof of the LDP relies on establishing two facts. First, we require the compactness of the family of solutions to the skeleton equation with $L^2_{t,x}$-bounded controls. This is a straightforward consequence of Lemma 17 and Proposition 19. Second, in the scaling regime that $\varepsilon K(\varepsilon)^{d+2} \to 0$, we require that the solutions to the controlled equation (102) converge to the solution of the skeleton equation. We prove this statement in Theorem 28, and using these two statements we prove the uniform LDP in Theorem 29.
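The power $d + 2$ in the scaling regime matches the growth of the sum of squared frequencies over the modes $|k| \le K$ that enter the noise of Definition 22. A quick count is sketched below as orientation rather than as the authors' derivation; here $M_K$ is taken to denote the sum of $|k|^2$ over these modes, which is an assumption about notation made for this sketch.

```latex
% Assumed notation: N_K = \#\{k \in \mathbb{Z}^d : |k| \le K\}, \quad M_K = \sum_{|k| \le K} |k|^2 .
% The lattice points with |k| \le K lie in the cube [-K,K]^d, so for K \ge 1,
\[
  N_K \le (2K+1)^d \le 3^d K^d ,
  \qquad
  M_K \le K^2 N_K \le 3^d K^{d+2} .
\]
% Consequently \varepsilon K(\varepsilon)^{d+2} \to 0 implies both
% \varepsilon N_{K(\varepsilon)} \to 0 and \varepsilon M_{K(\varepsilon)} \to 0 .
```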

Proposition 27
Let $\Phi$ satisfy Assumptions 6, 10, and 15, let $R \in (0, \infty)$, and let $\mathrm{Ent}_{\Phi,R}(\mathbb{T}^d)$ be defined in Definition 18. Then, $\mathrm{Ent}_{\Phi,R}(\mathbb{T}^d)$ equipped with the strong $L^1(\mathbb{T}^d)$-topology is a complete separable metric space.
Proof The completeness is a consequence of the completeness of $L^1(\mathbb{T}^d)$ and Fatou's lemma. The separability follows from the convexity of $\Psi_\Phi$ and Assumption 15, which implies for each $R \in (0, \infty)$ that $L^2(\mathbb{T}^d) \cap \mathrm{Ent}_{\Phi,R}(\mathbb{T}^d)$ is dense in $\mathrm{Ent}_{\Phi,R}(\mathbb{T}^d)$. Since $L^2(\mathbb{T}^d) \cap \mathrm{Ent}_{\Phi,R}(\mathbb{T}^d)$ is separable as a closed, convex subset of the separable, reflexive, strictly convex Banach space $L^2(\mathbb{T}^d)$, this completes the proof.
Proof We will first consider nonnegative initial data ρ ε 0 and ρ 0 that are uniformly such that the kinetic function χ ε of ρ ε satisfies P-a.s. the equation, for every t ∈ [0, T ] and ψ ∈ C ∞ c (T d × (0, ∞)), for M K(ε) = |k|≤K(ε) k 2 K(ε) d+2 and for P K(ε) g ε the Fourier projection of g ε . See [42,Sect. 3] for a full derivation of this equation, where we observe that the fourth and final terms on the righthand side of (111) are integrable due to the compact support of ψ on (0, ∞).
The essential difficulty in passing to the ε → 0 limit appears in the final term of (111), which as ε → 0 contains the product of weakly convergent sequences ∇ 1 2 (ρ ε ) and g ε . We observe, however, that this term does not appear in the weak formulation of the skeleton equation. We will therefore quantify the maximal contribution of this term with the measures p ε below, and derive the weak formulation of the skeleton equation by choosing an appropriate sequence of test functions ψ that approach one in the velocity variable. More precisely, let p ε denote the nonnegative, almost surely finite measure defined by and observe that Following the methods of [42,Theorem 5.29], we characterize the limiting behavior of the solutions ρ ε by establishing the tightness of the random variables , in the product metric topology of the state space To prove (112), we will recover the classical weak formulation of the skeleton equation by passing to the limit along the subsequence ε → 0 in each of the terms on the righthand side of (111).
Proof The proof relies on an application of the weak approach to large deviations [103,Theorem 6], the equivalence of uniform Laplace and large deviations principles with respect to compact subsets of the initial data [112,Theorem 4.3], Lemma 17, Proposition 19, and Theorem 28. To apply the framework of [103,Theorem 6], it is necessary to prove the compactness of the solution set of the skeleton equation with uniformly bounded initial data and uniformly bounded controls, and to prove the convergence in distribution of the controlled SPDE (102) to the skeleton equation. The compactness is an immediate consequence of Lemma 17 and Proposition 19, and the required convergence of the controlled SPDE to the skeleton equation is exactly Theorem 28. This completes the proof.

Remark 30
We emphasize that these techniques apply for general noise of the form 2 and ∞ k=1 |∇f δ k | 2 are continuous on T d and provided that ξ δ is probabilistically stationary in the sense that the quadratic variation ∞ k=1 (f δ k ) 2 is constant on T d . This includes, with no changes to the arguments, spatial smoothings ξ δ = (ξ * κ δ ) of space-time white noise for which the solutions satisfy the uniform large deviations principle in the scaling regime εδ(ε) −(d+2) → 0.

$\Gamma$-Convergence of rate functions
In this section, we will first show that, for every $K \in \mathbb{N}$, the solutions to the equation satisfy a small-noise large deviations principle with rate function for $P_K g$ the Fourier projection of $g$ onto the span of $\{\sqrt{2}\sin(k \cdot x), \sqrt{2}\cos(k \cdot x)\}_{\{|k| \le K\}}$. Then, in Lemma 32 and Theorem 33 we prove that the rate functions (124) converge as $K \to \infty$ in the sense of $\Gamma$-convergence to the rate function (110).

Proof The proof is identical to the proof of Theorem 29 and is obtained by repeating the same argument in the scaling regime that $K \in \mathbb{N}$ is fixed.

Lemma 32
Let $\Phi$ satisfy Assumptions 6, 10, and 15, let $\rho_0 \in \mathrm{Ent}_\Phi(\mathbb{T}^d)$, let $I_{\rho_0}$ be defined in (110), and let $I^K_{\rho_0}$ be defined in (124). Then, for every $K \in \mathbb{N}$, the infimums appearing in (110) and (124) are attained. Furthermore, for every $\rho \in L^1(\mathbb{T}^d \times [0, T])$ satisfying $I_{\rho_0}(\rho) < \infty$ (respectively, $I^K_{\rho_0}(\rho) < \infty$ for $K \in \mathbb{N}$), there exists a unique $g \in L^2(\mathbb{T}^d \times [0, T])^d$ satisfying

Proof The fact that the infimums appearing in (110) and (124) are achieved is an immediate consequence of Theorem 14, Lemma 17, and Proposition 19, since Definition 11 is stable with respect to weak convergence of the control. For the second statement, suppose that $\rho \in L^1(\mathbb{T}^d \times [0, T])$ satisfies $I_{\rho_0}(\rho) < \infty$ (respectively, $I^K_{\rho_0}(\rho) < \infty$ for $K \in \mathbb{N}$). Then, by the above, the set is a non-empty, weakly closed, convex subset of $L^2(\mathbb{T}^d \times [0, T])^d$. The fact that the set is weakly closed implies the existence of a minimizer. The convexity of the set implies that if $g_1$ and $g_2$ are two distinct minimizers then $g_3 = \frac{1}{2}(g_1 + g_2)$ also belongs to the set, and the uniform convexity of the $L^2_{t,x}$-norm proves that $\|g_3\|_{L^2} < \|g_1\|_{L^2}$, which contradicts the minimality of $g_1$ and completes the proof.
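The strict inequality used in the uniqueness step is an instance of the parallelogram law in the Hilbert space $L^2(\mathbb{T}^d \times [0, T])^d$. The following short computation is a sketch of this standard fact.

```latex
% Let g_1 \ne g_2 both be minimizers, so \|g_1\|_{L^2} = \|g_2\|_{L^2}, and let g_3 = (g_1 + g_2)/2.
% The parallelogram law \|a+b\|^2 + \|a-b\|^2 = 2\|a\|^2 + 2\|b\|^2 gives
\[
  \|g_3\|_{L^2}^2
  = \tfrac{1}{2}\|g_1\|_{L^2}^2 + \tfrac{1}{2}\|g_2\|_{L^2}^2 - \tfrac{1}{4}\|g_1 - g_2\|_{L^2}^2
  = \|g_1\|_{L^2}^2 - \tfrac{1}{4}\|g_1 - g_2\|_{L^2}^2
  < \|g_1\|_{L^2}^2 .
\]
```

The midpoint $g_3$ is admissible because, for fixed $\rho$, the skeleton equation is linear in the control.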

Theorem 33
Let $\Phi$ satisfy Assumptions 6, 10, and 15, let $\rho_0 \in \mathrm{Ent}_\Phi(\mathbb{T}^d)$, let $I_{\rho_0}$ be defined in (110), and let $I^K_{\rho_0}$ be defined in (124). Then, as $K \to \infty$, in the sense of $\Gamma$-convergence, $I^K_{\rho_0} \to I_{\rho_0}$.

In order to establish the $\Gamma$-convergence, it is necessary to prove the following two properties. First, for every sequence $\{\rho^{K_n}\}_{n \in \mathbb{N}}$ satisfying as $n \to \infty$ that $K_n \to \infty$ and $\rho^{K_n} \to \rho$ strongly in Second, there exists a sequence $\{\rho^{K_n}\}_{n \in \mathbb{N}}$ satisfying as $n \to \infty$ that $K_n \to \infty$ and that $\rho^{K_n} \to \rho$ strongly in

Proof of (125). Let $\rho \in L^1(\mathbb{T}^d \times [0, T])$ and suppose that $\{\rho^{K_n}\}_{n \in \mathbb{N}}$ satisfies as $n \to \infty$ that $K_n \to \infty$ and that If $\liminf_{n \to \infty} I^{K_n}(\rho^{K_n}) = \infty$ then (125) is satisfied. If not, fix a subsequence $\{K_m\}_{m \in \mathbb{N}}$ which satisfies for every $m \in \mathbb{N}$ that and that By Lemma 32, for every $m \in \mathbb{N}$ fix the unique $g_m \in L^2(\mathbb{T}^d \times [0, T])^d$ satisfying It follows from (127) and (128) Since by assumption, as $m \to \infty$, it follows from Theorem 14, Proposition 20, and Proposition 21 that $\rho$ is a solution of (97) in the sense of Definition 11. The weak lower-semicontinuity of the $L^2$-norm, the definition of $I_{\rho_0}$, and (127) prove that lim inf which completes the proof of (125).

The proof of (126). Let $\rho \in L^1(\mathbb{T}^d \times [0, T])$. If $I_{\rho_0}(\rho) = \infty$, then (126) is satisfied. If not, Lemma 32 implies that there exists a unique $g \in L^2(\mathbb{T}^d \times [0, T])^d$ such that For every $K \in \mathbb{N}$ let $\rho^K \in L^1(\mathbb{T}^d \times [0, T])$ denote the unique solution of the equation Since, as $K \to \infty$, the methods of Proposition 21 prove that there exists $\tilde\rho \in L^1(\mathbb{T}^d \times [0, T])$ such that, as $K \to \infty$, and such that $\tilde\rho$ solves (97) in the sense of Definition 3. Since Theorem 8 and Theorem 14 prove that the solution is unique, Therefore, it follows from (131) and (132) that, as $K \to \infty$, In combination (133) and (134) prove (126), which completes the proof.
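For reference, the two properties proved above correspond to the standard sequential definition of $\Gamma$-convergence on a metric space, here taken to be $L^1(\mathbb{T}^d \times [0, T])$ with its strong topology. The display below is a sketch of that standard definition together with the two properties of the Fourier projection that the recovery sequence relies on; it is not a reconstruction of the missing displays (125) and (126).

```latex
% (liminf inequality) for every \rho and every sequence \rho^{K_n} \to \rho with K_n \to \infty:
\[
  I_{\rho_0}(\rho) \le \liminf_{n \to \infty} I^{K_n}_{\rho_0}(\rho^{K_n}) .
\]
% (recovery sequence) for every \rho there exists a sequence \rho^{K_n} \to \rho with K_n \to \infty and
\[
  \limsup_{n \to \infty} I^{K_n}_{\rho_0}(\rho^{K_n}) \le I_{\rho_0}(\rho) .
\]
% Fourier projection onto the modes |k| \le K, for g \in L^2(\mathbb{T}^d \times [0,T])^d:
\[
  \|P_K g\|_{L^2} \le \|g\|_{L^2},
  \qquad
  P_K g \to g \ \text{strongly in } L^2(\mathbb{T}^d \times [0,T])^d \ \text{as } K \to \infty .
\]
```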

The lower semicontinuous envelope of the restricted rate function
As outlined in the introduction, the general approach to large deviations in interacting particle systems introduced in [23,24] relies on a separate derivation of large deviations lower and upper bounds. As a consequence of these arguments, both estimates lead to possibly different rate functions. The proof of a large deviations principle relies on establishing their identity, which proves to be a challenging problem. In particular, this requires the characterization of the l.s.c. envelope of the lower-bound rate function restricted to smooth fluctuations, which in the case of the zero range process has remained an open problem since [26].
That is, I lo is the l.s.c. envelope of I 0 restricted to smooth fluctuations ρ ∈ S. In contrast, the results of [26,Theorem 1], together with subsequent contributions [116][117][118] [26] is to identify the resulting rate functions I lo and I up , with a key difficulty being the characterization of the l.s.c. It then follows from Hölder's inequality that, for every g ∈ Cont ρ , H ρ H 1 Proof It follows from Proposition 37 and Definition 38 that I = I 0 = I up on S. It is therefore sufficient to show that I is equal to the l.s.c. envelope of its restriction to S. Since I is itself l.s.c. it follows that I ≤ I |S (ρ). It remains to show the opposite inequality. To do this, it is necessary to show that for any ρ satisfying I (ρ) < ∞ there exists a sequence ρ n ∈ S such that, as n → ∞, ρ n → ρ strongly in L 1 (T d × [0, T ]) and I (ρ n ) → I (ρ).
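The reduction used in this proof rests on the sequential description of the l.s.c. envelope on a metric space. The sketch below assumes that the envelope is taken with respect to strong convergence in $L^1(\mathbb{T}^d \times [0, T])$, which is the convergence named in the text; the overline notation is introduced for this sketch only.

```latex
% Sequential characterization of the l.s.c. envelope of the restriction of I to \mathcal{S}:
\[
  \overline{I|_{\mathcal{S}}}(\rho)
  = \inf\Big\{ \liminf_{n \to \infty} I(\rho_n) \;:\;
      \rho_n \in \mathcal{S},\ \rho_n \to \rho \ \text{in } L^1(\mathbb{T}^d \times [0,T]) \Big\} .
\]
% If I is l.s.c., then I(\rho) \le \liminf_n I(\rho_n) for every such sequence,
% hence I \le \overline{I|_{\mathcal{S}}}.
% Conversely, a sequence \rho_n \in \mathcal{S} with \rho_n \to \rho and I(\rho_n) \to I(\rho)
% gives \overline{I|_{\mathcal{S}}}(\rho) \le I(\rho).
```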
Using Lemma 32, let $g \in L^2(\mathbb{T}^d \times [0, T])^d$ be the unique control satisfying We will now construct the smooth approximation. This occurs in three steps: we first introduce a smoothing of the control $g$, then a smoothing and perturbation of the initial data $\rho(\cdot, 0) \in \mathrm{Ent}_\Phi(\mathbb{T}^d)$ so that it becomes strictly positive, and finally a cutoff that effectively "turns off" the control when it forces the solution near zero or infinity. This guarantees that the approximate solutions remain strictly bounded away from zero and infinity, and therefore restricts the solution to a region in which the nonlinearity is uniformly elliptic, and allows for the application of standard parabolic and elliptic regularity estimates to prove the regularity of the control in the sense of Definition 38.
A small adaptation of the proofs of Theorem 8, Proposition 9, and Theorem 14 to this simplified setting proves that, for every $n \in \mathbb{N}$, there exists a unique solution $\rho_n \in C([0,T]; L^1(\mathbb{T}^d))$ of the equation
\[ \partial_t \rho_n = \Delta\Phi(\rho_n) - \nabla\cdot\big(\Phi^{1/2}(\rho_n)\psi_n(\rho_n) g_n\big) \ \text{in}\ \mathbb{T}^d \times (0,T) \ \text{with}\ \rho_n(\cdot,0) = \rho_{0,n}, \]
in the sense of Definition 11 with control $\psi_n(\rho_n) g_n$. We will first show that the $\rho_n$ correspond to smooth fluctuations in the sense of Definition 38. The comparison principle, the definition of $\psi_n$, and the definition of $\rho_{0,n}$ prove that
\[ \tfrac{1}{2n} \le \rho_n \le 2n. \]
Since $\psi_n$, $g_n$, and $\rho_{0,n}$ are smooth and bounded, since $\Phi \in C^3_{loc}((0,\infty))$ with $\Phi'$ strictly positive on compact subsets of $(0,\infty)$, and since $\rho_n$ is bounded and bounded away from zero, it follows from interior Schauder estimates (see, for example, Ladyzhenskaya, Solonnikov, and Ural'ceva [119]) that $\rho_n \in C^{3,2}(\mathbb{T}^d \times [0,T])$. We view the $\rho_n$ as satisfying the equation
\[ \partial_t \rho_n = \Delta\Phi(\rho_n) - \nabla\cdot\big(\Phi^{1/2}(\rho_n)\tilde g_n\big) \ \text{in}\ \mathbb{T}^d \times (0,T) \ \text{with}\ \rho_n(\cdot,0) = \rho_{0,n}, \]
with the control $\tilde g_n = \psi_n(\rho_n) g_n$. In view of Remark 35, the boundedness of $\rho_n$ away from zero and infinity and the positivity of $\Phi$ on $(0,\infty)$ prove that each element of $H^1_{\Phi(\rho_n)}$ can be identified with a unique element of the Sobolev space $L^2([0,T]; H^1(\mathbb{T}^d))$ that has zero spatial mean on almost every time slice. It then follows from the proof of Proposition 37 that, for every $n \in \mathbb{N}$, there exists $H_n \in L^2([0,T]; H^1(\mathbb{T}^d))$ such that
\[ \partial_t \rho_n = \Delta\Phi(\rho_n) - \nabla\cdot\big(\Phi(\rho_n)\nabla H_n\big) \ \text{in}\ \mathbb{T}^d \times (0,T) \ \text{with}\ \rho_n(\cdot,0) = \rho_{0,n}. \]
Hence, for every $n \in \mathbb{N}$, the function $H_n$ solves the uniformly elliptic equation
\[ -\nabla\cdot\big(\Phi(\rho_n)\nabla H_n\big) = \partial_t \rho_n - \Delta\Phi(\rho_n) \ \text{in}\ \mathbb{T}^d \times (0,T), \]
and it follows from the bounds $\tfrac{1}{2n} \le \rho_n \le 2n$, the local $C^3$-regularity and positivity of $\Phi$ on $(0,\infty)$, the $C^{3,2}$-regularity of $\rho_n$, and interior elliptic regularity estimates (see, for example, [119]) that $H_n \in C^{3,1}(\mathbb{T}^d \times [0,T])$. This completes the proof that $\rho_n \in S$ for every $n \in \mathbb{N}$.
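The link between the control and $H_n$, and the source of the uniform ellipticity, can be made explicit. The following is a sketch only, writing $\Phi$ for the nonlinearity and $\tilde g_n$ for the cut-off control $\psi_n(\rho_n) g_n$, and assuming that $\Phi$ is nondecreasing and strictly positive on $(0,\infty)$ and that $\tfrac{1}{2n} \le \rho_n \le 2n$.

```latex
% Subtracting the two formulations of the equation for \rho_n,
%   \partial_t\rho_n = \Delta\Phi(\rho_n) - \nabla\cdot(\Phi^{1/2}(\rho_n)\tilde g_n),
%   \partial_t\rho_n = \Delta\Phi(\rho_n) - \nabla\cdot(\Phi(\rho_n)\nabla H_n),
% gives, in the sense of distributions,
\[
  \nabla\cdot\big(\Phi(\rho_n)\nabla H_n\big)
  = \nabla\cdot\big(\Phi^{1/2}(\rho_n)\,\tilde g_n\big).
\]
% Uniform ellipticity for fixed n: for every \xi \in \mathbb{R}^d,
\[
  0 < \Phi\big(\tfrac{1}{2n}\big)\,|\xi|^2
  \le \Phi(\rho_n)\,|\xi|^2
  \le \Phi(2n)\,|\xi|^2 < \infty,
\]
% using the monotonicity and positivity of \Phi together with the
% two-sided bound on \rho_n.
```

The ellipticity constants depend on $n$, which is harmless here because the regularity of $H_n$ is only needed for each fixed $n$.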
Step 2: The strong convergence of the smooth approximations. It remains to prove that, along a subsequence as $n \to \infty$, $\rho_n \to \rho$ strongly in $L^1(\mathbb{T}^d \times [0,T])$ and $I(\rho_n) \to I(\rho)$.
A small adaptation of Proposition 20 to this simplified setting proves that, for every $n \in \mathbb{N}$ and every $t \in [0,T]$, we have $\|\rho_n(\cdot,t)\|_{L^1(\mathbb{T}^d)} = \|\rho_{0,n}\|_{L^1(\mathbb{T}^d)}$, and the estimates proven in Proposition 19 and the definitions of $\rho_{0,n}$, $g_n$, and $\psi_n$ prove that there exists $c \in (0,\infty)$ independent of $n \in \mathbb{N}$ such that

We have furthermore from the equation and the Sobolev embedding theorem that, for $c \in (0,\infty)$ independent of $n \in \mathbb{N}$,

which in view of (146) is uniformly bounded in $n \in \mathbb{N}$.
Since $g_n$ is a convolution of $g$, we have that $g_n \to g$ strongly in $L^2(\mathbb{T}^d \times [0,T])^d$ as $n \to \infty$. It then follows from the almost sure convergence of (148), the support and boundedness of $\psi_n$, (149), and Hölder's inequality that, as $n \to \infty$,

It follows similarly from the almost sure convergence of (148), the continuity of $\Phi$ and $\Phi(0) = 0$, the fact that $g_n \to g$ strongly as $n \to \infty$, (149), and the support and boundedness of $\psi_n$ that, as $n \to \infty$,

In combination, it follows from (151), (152), and $\Phi(0) = 0$ that, as $n \to \infty$,

in the sense of Definition 11. We therefore conclude using Theorem 8 and Theorem 14 that $\tilde\rho = \rho$ and that the $\rho_n$ converge strongly to $\rho$ along the subsequence $n \to \infty$.
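The two standard properties of mollification used in this step and the next can be sketched as follows. The source only states that $g_n$ is a convolution of $g$; the specific form $g_n = g * \kappa^{\varepsilon_n}$, with $\kappa^{\varepsilon}$ a nonnegative mollifier of unit mass and $\varepsilon_n \to 0$, is an assumption made here for illustration, and the treatment of the time boundary is ignored.

```latex
% Assumed form of the smoothing (hypothetical notation):
%   g_n = g * \kappa^{\varepsilon_n}, \quad \kappa^\varepsilon \ge 0, \quad
%   \int \kappa^\varepsilon = 1, \quad \varepsilon_n \to 0.
% (i) Contraction in L^2, by Young's convolution inequality:
\[
  \|g_n\|_{L^2} \le \|\kappa^{\varepsilon_n}\|_{L^1}\,\|g\|_{L^2} = \|g\|_{L^2}.
\]
% (ii) Strong convergence, by density of continuous functions in L^2:
% for every continuous \varphi,
\[
  \|g_n - g\|_{L^2}
  \le \|(g-\varphi)*\kappa^{\varepsilon_n}\|_{L^2}
    + \|\varphi*\kappa^{\varepsilon_n} - \varphi\|_{L^2}
    + \|\varphi - g\|_{L^2}
  \le 2\,\|g-\varphi\|_{L^2}
    + \|\varphi*\kappa^{\varepsilon_n} - \varphi\|_{L^2},
\]
% where the last term vanishes as n \to \infty by uniform continuity of \varphi
% and the first is arbitrarily small by the choice of \varphi.
```

Property (i) is the kind of uniform bound that allows the $L^2$-norms of the approximating controls to be compared with $\|g\|_{L^2}$.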
Step 3: Convergence of the rate functions. It remains only to show that, along the subsequence $n \to \infty$, we have that $I(\rho_n) \to I(\rho)$.
By Lemma 32, let $h_n \in L^2(\mathbb{T}^d \times [0,T])^d$ be the unique function satisfying, using the definitions of the rate function, $\psi_n$, and $g_n$,

from which it follows from (154) that $\|h\|_{L^2} \le \|g\|_{L^2}$. Since Proposition 21 proves that $\rho$ solves

we have by definition of the rate function that $\|h\|_{L^2} \ge \|g\|_{L^2}$ and therefore that $\|h\|_{L^2} = \|g\|_{L^2}$. Lemma 32 proves that $h = g$, which implies that