Weak–Strong Uniqueness for the Navier–Stokes Equation for Two Fluids with Surface Tension

In the present work, we consider the evolution of two fluids separated by a sharp interface in the presence of surface tension—like, for example, the evolution of oil bubbles in water. Our main result is a weak–strong uniqueness principle for the corresponding free boundary problem for the incompressible Navier–Stokes equation: as long as a strong solution exists, any varifold solution must coincide with it. In particular, in the absence of physical singularities, the concept of varifold solutions—whose global in time existence has been shown by Abels (Interfaces Free Bound 9(1):31–65, 2007) for general initial data—does not introduce a mechanism for non-uniqueness. The key ingredient of our approach is the construction of a relative entropy functional capable of controlling the interface error. If the viscosities of the two fluids do not coincide, even for classical (strong) solutions the gradient of the velocity field becomes discontinuous at the interface, introducing the need for a careful additional adaption of the relative entropy.


Introduction
In evolution equations for interfaces, topological changes and geometric singularities often occur naturally, one basic example being the pinchoff of liquid droplets (see Fig. 1). As a consequence, strong solution concepts for such PDEs are naturally limited to short-time existence results or particular initial configurations like perturbations of a steady state. At the same time, the transition from strong to weak solution concepts for PDEs is prone to incurring unphysical non-uniqueness of solutions; for example, Brakke's concept of varifold solutions for mean curvature flow admits sudden vanishing of the evolving surface at any time [22]; for the This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Grant Agreement No. 665385.

Fig. 1. Pinchoff of a liquid droplet driven by surface tension
Euler equation, even for vanishing initial data there exist nonvanishing solutions with compact support [89], and the notion of mild solutions to the Navier-Stokes equation allows any smooth flow to transition into any other smooth flow [26]. In the context of fluid mechanics, the concept of relative entropies has proven successful in ruling out the aforementioned examples of non-uniqueness; energy-dissipating weak solutions e. g. to the incompressible Navier-Stokes equation are subject to a weak-strong uniqueness principle [75,81,93], which states that as long as a strong solution exists, any weak solution satisfying the precise form of the energy dissipation inequality must coincide with it. However, in the context of evolution equations for interfaces, to the best of our knowledge the concept of relative entropies has not been applied successfully so far to obtain weak-strong uniqueness results.
In the present work, we are concerned with the most basic model for the evolution of two fluids separated by a sharp interface (like, for instance, the evolution of oil bubbles in water). The flow of each single fluid is described by the incompressible Navier-Stokes equation, while the fluid-fluid interface evolves by pure transport along the fluid flow and a surface tension force acts at the fluid-fluid interface. For this free boundary problem for the flow of two immiscible incompressible fluids with surface tension, Abels [2] has established the global existence of varifold solutions for quite general initial data.
The main result of the present work is a weak-strong uniqueness result for this free boundary problem for the Navier-Stokes equation for two fluids with surface tension. In Theorem 1, below, we prove that as long as a strong solution to this evolution problem exists, any varifold solution in the sense of Abels [4] must coincide with it.

Free Boundary Problems for the Navier-Stokes Equation
The free boundary problem for the Navier-Stokes equation has been studied in mathematical fluid mechanics for several decades. Physically, it describes the evolution of a viscous incompressible fluid surrounded by or bordering on vacuum. The (local-in-time) existence of strong solutions for the free boundary problem for the Navier-Stokes equation has been proven by Solonnikov [96][97][98] in the presence of surface tension and by Shibata and Shimizu [94] in the absence of surface tension; see also Beale [19,20], Abels [1], Coutand and Shkoller [37], Guo and Tice [62,63], as well as [10,15,21,66,78,95,99,100] for related and further results. While the existence theory for global weak solutions for the Navier-Stokes equation in a fixed domain like R d , d 3, has been developed starting with the seminal work of Leray [75] in 1934, the question of the global existence of any kind of solution to the free boundary problem for the Navier-Stokes equation has remained an open problem. An important challenge for a global existence theory of weak solutions to the free boundary problem for the Navier-Stokes equation is the possible formation of "splash singularities", which are smooth solutions to the Lagrangian formulation of the equations which develop self-interpenetration. Such solutions have been constructed by Castro, Cordoba, Fefferman, Gancedo, and Gomez-Serrano [30], see also [29,39,50] for splash singularities in related models in fluid mechanics.
In the present work we consider a closely related problem, namely the flow of two incompressible and immiscible fluids with surface tension at the fluid-fluid interface, like for example the flow of oil bubbles immersed in water or vice versa. For this free boundary problem for the Navier-Stokes equation for two fluidsdescribed by the system of PDEs (1) below-a global existence theory for generalized solutions is in fact available: in a rather recent work, Abels [2] has constructed varifold solutions which exist globally in time. In an earlier work, Plotnikov [80] had treated the case of non-Newtonian (shear-thickening) fluids. The local-in-time existence of strong solutions has been established by Denisova [46] and Köhne, Prüss, and Wilke [73]; for an interface close to the half-space, an existence and instant analyticity result has been derived by Prüss and Simonett [83,84]. Existence results for the two-phase Stokes and Navier-Stokes equation in the absence of surface tension have been established by Giga and Takahashi [60] and Nouri and Poupaud [79]. Note that in contrast to the case of a single fluid in vacuum, for the flow of two incompressible immiscible inviscid fluids splash singularities cannot occur as shown by Fefferman, Ionescu, and Lie [53] and Coutand and Shkoller [38]; one would expect a similar result to hold for viscous fluids, however solutions may be subject to the Rayleigh-Taylor instability as proven by Prüss and Simonett [82].
In terms of a PDE formulation, the flow of two immiscible incompressible fluids with surface tension may be described by the indicator function χ = χ(x, t) of the volume occupied by the first fluid, the local fluid velocity v = v(x, t), and the local pressure p = p(x, t). The fluid-fluid interface moves just according to the fluid velocity, the evolution of the velocity of each fluid and the pressure are determined by the Navier-Stokes equation, and the fluid-fluid interface exerts a surface tension force on the fluids proportional to the mean curvature of the interface. Together with the natural no-slip boundary condition and the appropriate boundary conditions for the stress tensor on the fluid-fluid interface, one may assimilate the Navier-Stokes equations for the two fluids into a single one, resulting in the system of equations where H denotes the mean curvature vector of the interface ∂{χ = 0} and |∇χ | denotes the surface measure H d−1 | ∂{χ =0} . Here, μ(0) and μ (1) are the shear viscosities of the two fluids and ρ(0) and ρ (1) are the densities of the two fluids. The constant σ is the surface tension coefficient. The total energy of the system is given by the sum of kinetic and surface tension energies It is at least formally subject to the energy dissipation inequality Note that the concept of varifold solutions requires a slight adjustment of the definition of the energy: the surface area R d 1 d|∇χ | is replaced by the corresponding quantity of the varifold, namely its mass. A widespread numerical approximation method for the free boundary problem (1a)-(1c) capable of capturing geometric singularities and topological changes in the fluid phases are phase-field models of Navier-Stokes-Cahn-Hilliard type or Navier-Stokes-Allen-Cahn type, see for example the review [13], [6,64,67,76,77] for modeling aspects, [3,5] for the existence analysis of the corresponding PDE systems, and [7][8][9] for results on the sharp-interface limit.

Weak Solution Concepts in Fluid Mechanics and (Non-)uniqueness
In the case of the free boundary problem for the Navier-Stokes equation, both for a single fluid and for a fluid-fluid interface, a concept of weak solutions is expected to play an even more central role in the mathematical theory than in the case of the standard Navier-Stokes equation: In three spatial dimensions d = 3, even for smooth initial interfaces topological changes may occur naturally in finite time, for example by asymptotically self-similar pinchoff of bubbles [49] (see Fig. 1). In contrast, for the incompressible Navier-Stokes equation without free boundary the global existence of strong solutions for any sufficiently regular initial data remains a possibility.
However, in general, weakening the solution concept for a PDE may lead to artificial (unphysical) non-uniqueness, even in the absence of physically expected singularities. A particularly striking instance of this phenomenon is the recent example of non-uniqueness of mild (distributional) solutions to the Navier-Stokes equation by Buckmaster and Vicol [28] and Buckmaster, Colombo, and Vicol [26]. In the framework of mild solutions to the Navier-Stokes equation, any smooth flow may transition into any other smooth flow [26]. The result of [26,28] are based on convex integration techniques for the Euler equation, which have been developed starting with the works of De Lellis and Székelyhidi [43,44] (see also [25,27,42,69]).
In contrast to the case of distributional or mild solutions, for the stronger notion of weak solutions to the Navier-Stokes equation with energy dissipation in the sense of Leray [75] a weak-strong uniqueness theorem is available; as long as a strong solution to the Navier-Stokes equation exists, any weak solution with energy dissipation must coincide with it. Recall that for a weak solution to the Navier-Stokes equation v, besides the Ladyzhenskaya-Prodi-Serrin regularity criterion v ∈ L p ([0, T ]; L q (R 3 ; R 3 )) with 2 p + 3 q 1 and p 2 [81,92], both a lower bound on the pressure [90] and a geometric assumption on the vorticity [36] are known to imply smoothness of v. Interestingly, weak-strong uniqueness of energy-dissipating solutions fails if the Laplacian in the Navier-Stokes equation is replaced by a fractional Laplacian −(− ) α with power α < 1 3 , see Colombo, De Lellis, and De Rosa [35] and De Rosa [86].
Another way of interpreting a weak-strong uniqueness result is that nonuniqueness of weak solutions may only arise as a consequence of physical singularities. Only when the unique strong solution develops a singularity, the continuation of solutions beyond the singularity-by means of the weak solution concept-may be nonunique. The main theorem of our present work provides a corresponding result for the flow of two incompressible immiscible fluids with surface tension: varifold solutions to the free boundary problem for the Navier-Stokes equation for two fluids are unique until the strong solution for the free boundary problem develops a singularity.

(Non)-uniquenesss in Interface Evolution Problems
Weak solution concepts for the evolution of interfaces are often subject to nonuniqueness, even in the absence of topology changes. For example, Brakke's concept of varifold solutions for the evolution of surfaces by mean curvature [22] suffers from a particularly drastic failure of uniqueness: the interface is allowed to suddenly vanish at an arbitrary time. In the context of viscosity solutions to the level-set formulation of two-phase mean curvature flow, the formation of geometric singularities may still cause fattening of level-sets [18] and thereby nonuniqueness of the mean-curvature evolution, even for smooth initial surfaces [14].
To the best of our knowledge, the only known uniqueness result for weak or varifold solutions for an evolution problem for interfaces is a consequence of the relation between Brakke solutions and viscosity solutions for two-phase meancurvature flow, see Ilmanen [68]. As long as a smooth solution to the level-set formulation exists, the support of any Brakke solution must be contained in the corresponding level-set of the viscosity solution. As a consequence, as long as a smooth evolution of the interface by mean curvature exists, the "maximal" unitdensity Brakke solution corresponds to the smoothly evolving interface. The proof of this inclusion relies on the properties of the distance function to a surface undergoing evolution by mean curvature respectively the comparison principle for mean curvature flow. Both of these properties do not generalize to other interface evolution equations.
Besides Ilmanen's varifold comparison principle, the only uniqueness results in the context of weak solutions to evolution problems for lower-dimensional objects that we are aware of are a weak-strong uniqueness principle for the highercodimension mean curvature flow by Ambrosio and Soner [12] and a weak-strong uniqueness principle for binormal curvature motion of curves in R 3 by Jerrard and Smets [71]. The interface contribution in our relative entropy (13) may be regarded as the analogue for surfaces of the relative entropy for curves introduced in [71].

The Concept of Relative Entropies
The concept of relative entropies in continuum mechanics has been introduced by Dafermos [40,41] and DiPerna [47] in the study of the uniqueness properties of systems of conservation laws. Proving weak-strong uniqueness results for conservation laws or even the incompressible Navier-Stokes equation typically faces the problem that an error between a weak solution u and a strong solution v must be measured by a quantity E [u|v] which is nonlinear even as a function of u alone, like a norm ||u − v||. To evaluate the time evolution d dt E [u|v] of such a quantity, one would need to test the evolution equation for u by the nonlinear function D u E [u|v], which is often not possible due to the limited regularity of u. The concept of relative entropies overcomes this issue if the physical system possesses a strictly convex entropy (or energy) E [u] subject to a dissipation estimate d dt E[u] −D [u]: For strictly convex entropies E [u], the relative entropy is a measure for the error between u and v, as it is nonzero if and only if u = v. At the same time, to evaluate the time evolution d dt E [u|v] of the relative entropy, it is sufficient to exploit the entropy dissipation inequality d dt E[u] −D [u] for the weak solution u and test the weak formulation of the evolution equation for u by the typically more regular test function DE [v]. Having derived an explicit expression for the time derivative d dt E [u|v] of the relative entropy, it is often possible to derive a Gronwall-type estimate like d dt E [u|v] C E [u|v] for the relative entropy and thereby a weak-strong uniqueness result.
Since Dafermos [40], the concept of relative entropies has found many applications in the analysis of continuum mechanics, providing weak-strong uniqueness results for the compressible Navier-Stokes equation [54,58], the Navier-Stokes-Fourier system [55], fluid-structure interaction problems [31], renormalized solutions for dissipative reaction-diffusion systems [32,57], as well as weak-strong uniqueness results for measure-valued solutions for the Euler equation [24], compressible fluid models [65], wave equations in nonlinear elastodynamics [45], and models for liquid crystals [51], to name just a few.
The concept of relative entropies has also been employed in the justification of singular limits of PDEs, see for example the work of Yau [101] on the hydrodynamic limit of the Ginzburg-Landau lattice model, the works of Bardos, Golse, and Levermore [17], Saint-Raymond [87,88], and Golse and Saint-Raymond [61] on the derivation of the Euler equation and the incompressible Navier-Stokes equation from the Boltzmann equation, the work of Brenier [23] on the Euler limit of the Vlasov-Poissson equation, and the works of Serfaty [91] and Duerinckx [48] on mean-field limits of interacting particles. In the context of numerical analysis, it may also be used to derive a posteriori estimates for model simplification errors [56,59].
Jerrard and Smets [71] have used a relative entropy ansatz to establish a weakstrong uniqueness principle for the evolution of curves in R 3 by binormal curvature flow. Their relative entropy may be regarded as the analogue for curves of the interfacial energy contribution to our relative entropy (that is, the terms σ R d 1 − ξ(·, T ) · ∇χ u (·, T ) |∇χ u (·,T )| d|∇χ u (·, T )| + σ R d 1 − θ T d|V T | S d−1 in (13) below). It has subsequently been used by Jerrard and Seis [70] to prove that the evolution of solutions to the Euler equation with near-vortex-filament initial data is governed by binormal curvature flow, as long as a strong solution to the latter (without selfintersections) exists and as long as the vorticity remains concentrated along some curve.
One of the key challenges in the derivation of our result is the development of a notion of relative entropy which provides strong enough control of the interfacial error. The key idea to control the error between an interface ∂{χ u (·, t) = 1} and a smoothly evolving interface I v (t) = ∂{χ v (·, t) = 1} by a relative entropy is to introduce a vector field ξ which is an extension of the unit normal of I v (t), multiplied with a cutoff. The interfacial contribution σ R d 1 − ξ(·, T ) · ∇χ u (·, T ) |∇χ u (·,T )| d|∇χ u (·, T )| to the relative entropy then controls the interface error in a sufficiently strong way, see Section 3 for details.
However, in the case of different viscosities μ + = μ − of the two fluids, the velocity gradient of the strong solution ∇v at the interface will be discontinuous. This necessitates an additional adaption of our relative entropy; if one were to directly compare the velocity fields u and v of two solutions by the relative entropy, the difference of the viscous stresses μ(χ u )D sym u −μ(χ v )D sym v could not be estimated appropriately to derive a Gronwall-type estimate. We rather have to compare the velocity field u to an adapted velocity field v + w, where w is constructed in a way that the adapted velocity gradient ∇v + ∇w approximately accounts for the shifted location of the interface.
The approximate adaption of the interface of the strong solution to the higherorder approximation for the interface is distantly reminiscent of an ansatz by Leger and Vasseur [74] and Kang, Vasseur, and Wang [72], who establish L 2 contractions up to a shift for solutions to conservation laws close to a shock profile. However, it differs both in purpose and in the actual construction from [72,74]. The interfacial shift in [72,74] essentially serves the purpose of compensating the difference in the propagation speed of the shocks of the two solutions, while we need the higherorder approximation of the interface to compensate for the discontinuity in the velocity gradient at the interface. While the interfacial shift in [72] is given as the solution to an appropriately defined time-dependent PDE, in our case we obtain the interfacial shift by applying at any fixed time a suitable regularization operator to the interface of the weak solution near the interface of the strong solution.

Derivation of the Model
Let us briefly comment on the derivation of the system of equations (1). We consider the flow of two viscous, immiscible, and incompressible fluids. Each fluid occupies a domain + t resp. − t , t 0, and the interface separating both phases will be denoted by I (t). In particular, R d = + t ∪ − t ∪ I (t) for every t 0. Within each of these domains ± t , the evolution of the fluid velocity is modeled by means of the incompressible Navier-Stokes equations for a Newtonian fluid ∂ t (ρ ± v ± of the two fluids, and μ + and μ − the shear viscosities. On the interface of the two fluids I (t) a no-slip boundary condition v + t = v − t is imposed. As the two velocities v + t and v − t are defined on complementary domains and coincide on the interface, this enables us to assimilate the two velocity fields into a single velocity Note that the velocity field v inherits the incompressibility (1c) from the incompressibility of v + and v − (2b). We also assimilate the pressures p + t and p − t into a single pressure p, which however may be discontinuous across the interface.
Additionally, we assume that the evolution of the interface I (t) occurs only as a result of the transport of the two fluids along the flow. Denoting by n the outward unit normal vector field of the interface I (t) and by V n the associated normal speed of the interface, this gives rise to the equation This condition may equivalently be rewritten as the transport equation for the indicator function χ of the first fluid phase see for example Remark 9 below for the (standard) arguments. In order to assimilate the equations (2a) for the velocities v ± of the two fluids into the single equation (1b), a condition on the jump of the normal component of the stress tensor T = μ ± (∇v + ∇v T ) − p Id at the interface I (t) is required. In the case of positive surface tension constant σ > 0 at the interface, the balance of forces at the interface reads where the right-hand side σ H accounts for the surface tension force. Here, H denotes the mean curvature vector of the interface and [[ f ]] denotes the jump in normal direction of a quantity f . In combination with (2a) and the no-slip boundary condition v + = v − on I (t), this yields the equation for the momentum balance (1b).

Main Results
The main result of the present work is the derivation of a weak-strong uniqueness principle for varifold solutions to the free boundary problem for the Navier-Stokes equation for two immiscible incompressible fluids with surface tension. As long as a strong solution to the free boundary problem (1a)-(1c) exists, any varifold solution must coincide with it. In particular, the concept of varifold solutions developed by Abels [2] (see Definition 2 below for a precise definition) does not introduce an additional mechanism for non-uniqueness, at least as long as a classical solution exists. At the same time, the concept of varifold solutions of Abels allows for the construction of globally existing solutions [2], while any concept of strong solutions is limited to the absence of geometric singularities and therefore-at least in three spatial dimensions d = 3-to short-time existence results. Furthermore, we prove a quantitative stability result (5) for varifold solutions with respect to changes in the data: As long as a classical solution exists, any varifold solution with slightly perturbed initial data remains close to it.
holds for almost every T ∈ [0, T strong ), provided that the initial relative entropy The constants c > 0 and C > 0 depend only on the data and the strong solution.
In particular, if the initial data of the varifold solution and the strong solution coincide, the varifold solution must be equal to the strong solution in the sense that hold almost everywhere for almost every t ∈ [0, T strong ), and the varifold is given for almost every t ∈ [0, T strong ) by We emphasize that our main result in Theorem 1 remains valid if we allow for a density-dependent bulk force like gravity, that is, if we add a term of the form ρ(χ)g on the right-hand side of (1b). Details are provided in Remark 35. The notion of varifold solutions for the free boundary problem associated with the flow of two immiscible incompressible viscous fluids with surface tension has been introduced by Abels [2]. For Newtonian fluids, the global-in-time existence of such varifold solutions has been proven for quite general initial data in two and three spatial dimensions in [2]. For the notion of an oriented varifold, see the section on notation just prior to Section 3.

Definition 2. (Varifold solution for the two-phase Navier-Stokes equation)
Let a surface tension constant σ > 0, the densities and shear viscosities of the two fluids ρ ± , μ ± > 0, a finite time T vari > 0, a solenoidal initial velocity profile v 0 ∈ L 2 (R d ; R d ), and an indicator function of the volume occupied initially by the first fluid χ 0 ∈ BV(R d ) be given.
A triple (χ , v, V ) consisting of a velocity field v, an indicator function χ of the volume occupied by the first fluid, and an oriented varifold V with is called a varifold solution to the free boundary problem for the Navier-Stokes equation for two fluids with initial data (χ 0 , v 0 ) if the following conditions are satisfied: i) The velocity field v has vanishing divergence ∇ · v = 0 and the equation for the momentum balance is satisfied for almost every T ∈ [0, T vari ) and every smooth vector field η ∈ For the sake of brevity, we have used the abbreviations ρ(χ) ii) The indicator function χ of the volume occupied by the first fluid satisfies the weak formulation of the transport equation for almost every T ∈ [0, T vari ) and all ϕ ∈ C ∞ cpt (R d × [0, T vari )). iii) The energy dissipation inequality is satisfied for almost every T ∈ [0, T vari ), and the energy is a nonincreasing function of time. iv) The phase boundary ∂{χ(·, t) = 0} and the varifold V satisfy the compatibility condition for almost every T ∈ [0, T vari ) and every smooth function ψ ∈ C ∞ cpt (R d ).
Let us continue with a few comments on the relation between the varifold V t and the interface described by the indicator function χ(·, t).
denote the non-negative measure representing (at time t) the varifold associated to a varifold solution (χ , v, V ) to the free boundary problem for the incompressible Navier-Stokes equation for two fluids. The compatibility condition (6e) entails that |∇χ u (t)| is absolutely continuous with respect to |V t | S d−1 . Hence, we may define the Radon-Nikodym derivative which is for every f ∈ L 1 (R d , |∇χ(·, t)|) and almost every t ∈ [0, T vari ).
The compatibility condition between the varifold V t and the interface described by the indicator function χ(·, t) has the following consequence.
Remark 4. Consider a varifold solution (χ , v, V ) to the free boundary problem for the incompressible Navier-Stokes equation for two fluids. Let E t be the measurable set {x ∈ R d : χ(x, t) = 1}. Note that for almost every t ∈ [0, T vari ) this set is then a Caccioppoli set in R d . Let n(·, t) = ∇χ |∇χ | denote the measure theoretic unit normal vector field on the reduced boundary ∂ * E t . By means of the compatibility condition (6e) and the definition (7) we obtain for almost every t ∈ [0, T vari ) and |V t | S d−1 -almost every x ∈ R d .
In order to define a notion of strong solutions to the free boundary problem for the flow of two immiscible fluids, let us first define a notion of smoothly evolving domains.

Definition 5. (Smoothly evolving domains and surfaces)
Let + 0 be a bounded domain of class C 3 and consider a family for every t ∈ [0, T strong ]. We say that + t , − t are smoothly evolving domains and that I (t) are smoothly evolving surfaces if we have + , subject to the following conditions: Moreover, we assume Moreover, we assume that there exists r c ∈ (0, 1 2 ] with the following property: for all t ∈ [0, T strong ) and all x ∈ I (t) there exists a function g : B 1 (0) ⊂ R d−1 → R with ∇g(0) = 0 such that after a rotation and a translation, Furthermore, for any of these functions g the pointwise bounds |∇ m g| r −(m−1) c hold for all 1 m 3.
We have everything in place to give the definition of a strong solution to the free boundary problem for the Navier-Stokes equation for two fluids. For a discussion of the conditions (10a)-(10c) we refer to Remark 36 in the "Appendix". Definition 6. (Strong solution for the two-phase Navier-Stokes equation) Let a surface tension constant σ > 0, the densities and shear viscosities of the two fluids ρ ± , μ ± > 0, a finite time horizon T strong > 0, a domain + 0 occupied initially by the first fluid with interface I v (0) := ∂ + 0 , and an initial velocity profile v 0 be given which are subject to the following regularity and compatibility conditions: Let the initial interface between the fluids I v (0) be a compact C 3 -manifold.
A pair (χ , v) consisting of a velocity field v and an indicator function χ of the volume occupied by the first fluid with is called a strong solution to the free boundary problem for the Navier-Stokes equation for two fluids with initial data (χ 0 , v 0 ) if the volume occupied by the first fluid + t := {x ∈ R d : χ(x, t) = 1} is a smoothly evolving domain and the interface I v (t) := ∂ + t is a smoothly evolving surface in the sense of Definition 5, and if additionally the following conditions are satisfied: i) The velocity field v has vanishing divergence ∇ · v = 0 and the equation for the momentum balance is satisfied for almost every T ∈ [0, T strong ) and every smooth vector field Here, H denotes the mean curvature vector of the interface I v (t). For the sake of brevity, we have used the abbreviations ρ(χ) ii) The indicator function χ of the volume occupied by the first fluid satisfies the weak formulation of the transport equation {t} all spatial derivatives of v up to third order, the time derivative ∂ t ∇v, as well as the mixed derivative ∂ t ∇v of the velocity field exist, and they satisfy the estimate We continue with a remark on the existence of strong solutions in the functional framework of the previous definition.

Remark 7.
Local-in-time existence of such strong solutions (starting with smooth initial data subject to the above compatibility conditions) is essentially shown in [73,Theorem 2], up to two details: the authors only consider the system (1) in a bounded domain (instead of R d ), and they do not state the regularity up to initial time (11c). The former restriction is just a technicality and the methods extend to unbounded domains, see [83]. The regularity up to initial time with the bound (11c), on the other hand, can be deduced by regularity theory, using the transformed formulation of the problem in [73]; this however requires higher-order regularity and compatibility conditions for the initial data in the following sense. Let p 0 be an initial pressure field. Then we assume that We refer to Remark 36 in the "Appendix" for a discussion of these conditions; we also give a brief discussion concerning the existence of strong solutions in the precise functional framework of Definition 6 under these additional regularity and compatibility conditions in Remark 37 in the "Appendix".
Before we state the main ingredient for the proof of Theorem 1, we proceed with two further remarks on the notion of strong solutions. The first concerns the consistency with the notion of varifold solutions due to Abels [2].
Remark 8. Every strong solution (χ , v) to the free boundary problem for the incompressible Navier-Stokes equation for two fluids (1a)-(1c) in the sense of Definition 6 canonically defines a varifold solution in the sense of Definition 2. Indeed, we can define the varifold V by means of dV t = δ ∇χ |∇χ | d|∇χ |. Due to the regularity requirements on the family of smoothly evolving surfaces I (t), see Definition 5, it then follows [4,Lemma 3.4]. Moreover, it follows from the regularity requirements of a strong solution that the velocity field v also satisfies the energy dissipation inequality (6c). This proves the claim.
The second remark concerns the validity of (3) in a strong sense for a strong solution, that is, that the evolution of the interface I (t) occurs only as a result of the transport of the two fluids along the flow. Remark 9. Let (χ , v) be a strong solution to the free boundary problem for the incompressible Navier-Stokes equation for two fluids (1a)-(1c) in the sense of Definition 6 on some time interval [0, T strong ). Let V n (x, t) denote the normal speed of the interface at x ∈ I v (t), that is, the normal component of ∂ t (x, t) where : R d × [0, T strong ) → R d is the family of diffeomorphisms from the definition of a family of smoothly evolving domains (Definition 5). Furthermore, let ϕ ∈ C ∞ cpt (R d × (0, T strong )). Due to the regularity requirements on a family of smoothly evolving domains, see Definition 5, we obtain (see for instance [4,Theorem 2.6]) On the other side, subtracting from the former identity the equation (11b) satisfied by the indicator function χ and making use of the incompressibility of the velocity field v we deduce Since ϕ ∈ C ∞ cpt (R d × (0, T strong )) was arbitrary we recover the identity that is to say, the kinematic condition (3) of the interface being transported with the flow is satisfied in its strong formulation.
Our weak-strong uniqueness result in Theorem 1 relies on the following relative entropy inequality. The regime of equal shear viscosities μ + = μ − corresponds to the choice of w = 0 in the statement below. Note also that in this case the viscous stress term R visc disappears due to μ(χ u ) − μ(χ v ) = 0.
be a solenoidal vector field with bounded spatial derivative ∇w L ∞ < ∞. Suppose furthermore that for almost every t 0, for every x ∈ R d either x is a Lebesgue point of ∇w(·, t) or there exists a half-space H x such that x is a Lebesgue point for both ∇w(·, t)| H x and ∇w(·, t)| R d \H x .
For a point (x, t) such that dist(x, I v (t)) < r c , denote by P I v (t) x the projection of x onto the interface I v (t) of the strong solution. Introduce the extension ξ of the unit normal n v of the interface of the strong solution defined by for some cutoff η with η(s) = 1 for s 1 2 r c and η ≡ 0 for s r c . Let be an extension of the normal velocity of the interface of the strong solution I v (t) to an r c -neighborhood of I v (t). Let θ be the density θ t = d|∇χ u (·,t)| d|V t | S d−1 as defined in (7) and let β : R → R be a truncation of the identity with β(r ) = r for |r | 1 2 , |β | 1, |β | C, and β (r ) = 0 for |r | 1.
Then the relative entropy is subject to the relative entropy inequality for almost every T ∈ (0, T strong ), where we made use of the abbreviations as well as Moreover, we have abbreviated as well as If we additionally allow for a density-dependent bulk force ρ(χ) f acting on the fluid, such as gravity, only one additional term appears on the right-hand side of the relative entropy inequality of Proposition 10, see (216). We will comment in Remark 35 on the minor changes that occur in the proof of the relative entropy inequality due to the additional bulk force.

Notation
We use a ∧ b (respectively a ∨ b) as a shorthand notation for the minimum (respectively maximum) of two numbers a, b ∈ R.
Let ⊂ R d be open. For a function u : × [0, T ] → R, we denote by ∇u its distributional derivative with respect to space and by ∂ t u its derivative with respect to time. For p ∈ [1, ∞] and an integer k ∈ N 0 , we denote by L p ( ) and W k, p ( ) the usual Lebesgue and Sobolev spaces. In the special case p = 2 we use as usual H k ( ) := W k,2 ( ) to denote the Sobolev space. For integration of a function f with respect to the d-dimensional Lebesgue measure respectively the d − 1-dimensional surface measure, we use the usual notation f dx respectively I f dS. For measures other than the natural measure (the Lebesgue measure in case of domains and the surface measure in case of surfaces I ), we denote the corresponding Lebesgue spaces by L p ( , μ). The space of all compactly supported and infinitely differentiable functions on is denoted by C ∞ cpt ( ). The closure of C ∞ cpt ( ) with respect to the Sobolev norm · W k, p ( ) is W k, p 0 ( ), and its dual will be denoted by W −1, p ( ) where p ∈ [0, ∞] is the conjugated Hölder exponent of p, that is 1/ p + 1/ p = 1. For vector-valued fields, say with range in R d , we use the notation L p ( ; R d ), and so on. For a Banach space X , a finite time T > 0 and a number p ∈ [1, ∞] we denote by we refer to the space of bounded and weak- * measurable maps f : [0, T ] → X , where X is the dual space of X . By L p ( ) + L q ( ) we denote the normed space of all functions u : → R which may be written as the sum of two functions v ∈ L p ( ) and w ∈ L q ( ). The space C k ([0, T ]; X ) contains all k-times continuously differentiable and X -valued functions on [0, T ].
In order to give a suitable weak description of the evolution of the sharp interface, we have to recall the concepts of Caccioppoli sets as well as varifolds. To this end, let ⊂ R d be open. We denote by BV( ) the space of functions with bounded variation in . A measurable subset E ⊂ is called a set of finite perimeter in (or a Caccioppoli subset of ) if its characteristic function χ E is of bounded variation in . We will write ∂ * E when referring to the reduced boundary of a Caccioppoli subset E of ; whereas n denotes the associated measure theoretic (inward pointing) unit normal vector field of ∂ * E. For detailed definitions of all these concepts from geometric measure theory, we refer to [33,52]. In case has a C 2 boundary, we denote by H(x) the mean curvature vector at x ∈ ∂ . Recall that for a convex function g : R d → R the recession function g rec : An oriented varifold is simply a non-negative measure For a varifold V , we denote by |V | S d−1 ∈ M( ) its local mass density given by for any Borel set A ⊂ . For a locally compact separable metric space X we write M(X ) to refer to the space of (signed) finite Radon-measures on X . If A ⊂ X is a measurable set and μ ∈ M(X ), we let μ A be the restriction of μ on A. The k-dimensional Hausdorff measure on R d will be denoted by H k , whereas we write L d (A) for the d-dimensional Lebesgue measure of a Lebesgue measurable set A ⊂ R d .
Finally, let us fix some tensor notation. First of all, we use (∇v) i j = ∂ j v i as well as ∇ ·v = i ∂ i v i for a Sobolev vector field v : R d → R d . The symmetric gradient is denoted by D sym v := 1 2 (∇v+∇v T ). For time-dependent fields v : R d ×[0, T ) → R n we denote by ∂ t v the partial derivative with respect to time. Tensor products of vectors u, v ∈ R d will be given by

The Relative Entropy
The basic idea of the present work is to measure the "distance" between a varifold solution to the two-phase Navier-Stokes equation (χ u , u, V ) and a strong solution to the two-phase Navier-Stokes equation (χ v , v) by means of the relative entropy functional where ξ : R d × [0, T strong ) → R d is a suitable extension of the unit normal vector field of the interface of the strong solution and where w is a vector field that will be constructed below and that vanishes in case of equal viscosities μ + = μ − . More precisely, we choose ξ as for some cutoff η with η(s) = 1 for s 1 2 r c and η ≡ 0 for s r c , where P I v (t) x denotes for each t 0 the projection of x onto the interface I v (t) of the strong solution and where the unit normal vector field n v of the interface of the strong solution is oriented to point towards {χ v (·, t) = 1}. For an illustration of the vector field ξ , see Fig. 2.
Rewriting the relative entropy functional in the form with the energy (6d), we see that we may estimate the time evolution of the relative entropy E χ u , u, V χ v , v (t) by exploiting the energy dissipation property (6c) of the varifold solution and by testing the weak formulation of the two-phase Navier-Stokes equation (6a) and (6b) against the (sufficiently regular) test functions v + w respectively 1 2 |v + w| 2 , ∇ · ξ , and β( dist ± (x,I v (t)) r c ). As usual in the derivation of weak-strong uniqueness results by the relative entropy method of Dafermos [40] and Di Perna [47], in the case of equal viscosities μ + = μ − the goal is the derivation of an estimate of the form which implies uniqueness and stability by means of the Gronwall lemma and by the coercivity properties of the relative entropy functional discussed in the next section.
In the case of different viscosities μ + = μ − , we will derive a slightly weaker (but still sufficient) result of roughly speaking the form along with estimates on w which include in particular the bound

The Error Control Provided by the Relative Entropy Functional
The relative entropy functional (14) provides control of the following quantities (up to bounded prefactors): Velocity error control. The relative entropy E[χ u , u, V |χ v , v](t) controls the square of the velocity error in the L 2 norm at any given time t. In the case of equal viscosities, this is immediate from (14) by w ≡ 0, while in the case of different viscosities this follows by the estimate |∇χ u | d|∇χ u | which is a consequence of the construction of w and the choice of ξ , see below. Interface error control. The relative entropy provides a tilt-excess type control of the error in the interface normal In particular, it controls the squared error in the interface normal The term also controls the total length respectively area (for d = 2 respectively d = 3) of the part of the interface I u which is not locally a graph over I v , see Fig. 3. For example, in the region around the left purple half-ray the interface of the weak solution is not a graph over the interface of the weak solution. Furthermore, the term controls the length respectively area (for d = 2 respectively d = 3) of the part of the interface with distance to I v (t) greater than the cutoff length r c , as there we have ξ ≡ 0.
To give another heuristic explanation of the interface error control and also introduce some notation for subsequent use (for the proof in the case of different viscosities in Section 6 and its explanation in Section 3.4), we attempt to write the interface of the weak solution as a graph over the interface of the strong solution (at least, to the extent to which this is possible). Denote the local height of the one-sided interface error by h + : I v (t) → R + 0 as measured along orthogonal rays originating on I v (t) (with some cutoff applied away from the interface I v (t) of the strong solution); denote by h − the corresponding height of the interface error as measured in the other direction. For example, in Fig. 3 the quantity h + (x) for some base point x ∈ I v (t) would correspond to the accumulated length of the solid segments in each of the purple rays, the dotted segments not being counted. Note that the rays are orthogonal on I v (t). Then the tilt-excess type term in the relative entropy also controls the gradient of the one-sided interface error heights Note that wherever I u (t) is locally a graph over I v (t) and is not too far away from I v (t), it must be the graph of the function h + − h − . Here, the graph of a function g over the curved interface I v (t) is defined by the set of points obtained by shifting the points of I v (t) by the corresponding multiple of the surface normal, Varifold multiplicity error control. For varifold solutions, the relative entropy controls the multiplicity error of the varifold corresponds to the multiplicity of the varifold), which in turn by the compatibility condition (6e) and the definition of θ t (see (7)) controls the squared error in the normal of the varifold Weighted volume error control. Furthermore, the error in the volume occupied by the two fluids weighted with the distance to the interface of the strong solution is controlled. Note that this term is the only term in the relative entropy which is not obtained by the usual relative entropy ansatz . We have added this lower-order term-as compared to the term R d 1 − ξ · ∇χ u |∇χ u | d|∇χ u | which provides tilt-excess-type control-to the relative entropy in order to remove the lack of coercivity of the term R d 1 − ξ · ∇χ u |∇χ u | d|∇χ u | in the limit of vanishing interface length of the varifold solution.
Control of velocity gradient error by dissipation. By means of Korn's inequality, the dissipation term controls the L 2 -error in the gradient

The Case of Equal Viscosities
For equal viscosities μ + = μ − , one may choose w ≡ 0. As a consequence, the right-hand side in the relative entropy inequality-see Proposition 10 above-may be estimated to yield the Gronwall-type inequality (15). The details are provided in Section 5.

Additional Challenges in the Case of Different Viscosities
In the case of different viscosities μ 1 = μ 2 of the two fluids, even for strong solutions the normal derivative of the tangential velocity features a discontinuity at the interface. By the no-slip boundary condition, the velocity is continuous across the interface [v] = 0 and the same is true for its tangential derivatives [(t ·∇)v] = 0. As a consequence of this, the discontinuity of μ(χ v ) across the interface and the equilibrium condition for the stresses at the interface entail for generic data a discontinuity of the normal derivative of the tangential velocity t · (n · ∇)v across the interface.
As a consequence, it becomes impossible to establish a Gronwall estimate for the standard relative entropy (14) with w ≡ 0. To see this, consider in the two-dimensional case d = 2 two strong solutions u and v with coinciding initial velocities u(·, 0) = v(·, 0) = u 0 (·), but slightly different initial interfaces χ v (·, 0) = χ {|x| 1} and χ u (·, 0) = χ {|x| 1−ε} for some ε > 0. The initial relative entropy is then of the order ∼ ε 2 . Suppose that (in polar coordinates) the initial velocity u 0 has a profile near the interface like Note that this velocity profile features a kink at the interface. As one verifies readily, as far as the viscosity term is concerned this corresponds to a near-equilibrium profile for the solution (χ v , v) (in the sense that the viscosity term is bounded). However, in the solution (χ u , u) the interface is shifted by ε and the profile is no longer an equilibrium profile. By the scaling of the viscosity term, the timescale within which the profile u 0 equilibrates in the annulus of width ε towards a nearaffine profile is of the order of ε 2 . After this timescale, the velocity u will have changed by about ε in a layer of width ∼ ε around the interface; at the same time, due to the mostly parallel transport at the interface the solution will not have changed much otherwise. As a consequence, the term 1 2 ρ(χ u )|u − v| 2 dx will be of the order of at least cε 3 after a time T ∼ ε 2 , while the other terms in the relative entropy are essentially the same. Thus, the relative entropy has grown by a factor of 1 + cε within a timescale ε 2 , which prevents any Gronwall-type estimate. At the level of the relative entropy inequality (see Proposition 10), the derivation of the Gronwall inequality is prevented by the viscosity terms, which read for w ≡ 0 The latter term prevents the derivation of a dissipation estimate: While it is formally quadratic in the difference of the two solutions (χ u , u) and (χ v , v), due to the (expected) jump of the velocity gradients ∇v and ∇u at the respective interfaces it is in fact only linear in the interface error.
The key idea for our weak-strong uniqueness result in the case of different viscosities is to construct a vector field w which is small in the L 2 norm but whose gradient compensates for most of the problematic term (μ(χ v ) − μ(χ u ))(∇v + ∇v T ). To be precise, it is only the normal derivative of the tangential component of v which may be discontinuous at the interface; the tangential derivatives are continuous by the no-slip boundary condition, while the normal derivative of the normal component is continuous by the condition ∇ · v = 0.
Let us explain our construction of the vector field w at the simple twodimensional example of a planar interface of the strong solution I v = {(x, 0) : x ∈ R}. In this setting, we would like to set for y > 0 that (where e x just denotes the first vector of the standard basis). Note that due to the bounded integrand, this vector field w + (x, y) is bounded by Ch + (x), i.e. it is bounded by the interface error. As we shall see in the proof, the time derivative of w + is also bounded in terms of other error terms. The tangential spatial derivative of this vector field ∂ x w + (x, y, t) is given (up to a constant factor) by in the region in which the interface of the weak solution is a graph of a function over I v . Note that our relative entropy functional provides a higher-order control of the size of the region in which the interface of the weak solution is not a graph over the interface of the strong solution.
However, with this choice of vector field w + (x, y, t), two problems occur: First, the vector field is not solenoidal. For this reason, we introduce an additional Helmholtz projection. Second and constituting a more severe problem, the vector field would not necessarily be (spatially) Lipschitz continuous (as the derivative contains a term with ∂ x h + (x) which is not necessarily bounded), which due to the surface tension terms would be required for the derivation of a Gronwall-type estimate. For this reason, we first regularize the height function h + by mollification on a scale of the order of the error. See Proposition 26 and Proposition 27 for details of our construction of the regularized height function, and Fig. 4 for an illustration of it. The actual construction of our compensation function w is performed in Proposition 28. We then derive an estimate in the spirit of (16) in Proposition 34.

Time Evolution of Geometric Quantities and Further Coercivity
Properties of the Relative Entropy Functional

Time Evolution of the Signed Distance Function
In order to describe the time evolution of various constructions, we need to recall some well-known properties of the signed distance function. Let us start by introducing notation. For a family ( + t ) t∈[0,T strong ) of smoothly evolving domains with smoothly evolving interfaces I (t) in the sense of Definition 5, the associated signed distance function is given by From Definition 5 of a family of smoothly evolving domains it follows that the family of maps t : The signed distance function (resp. its time derivative) to the interface of the strong solution is then of class C 0 t C 3 x (resp. C 0 t C 2 x ) in the space-time tubular neighborhood t∈[0,T strong ) im( t ) × {t} due to the regularity assumptions in Definition 5. We also have the bounds and in particular, for the mean curvature vector, Moreover, the projection P I (t) x of a point x onto the nearest point on the manifold I (t) is well-defined and of class C 0 t C 2 x in the same tubular neighborhood. After having introduced the necessary notation we study the time evolution of the signed distance function to the interface of the strong solution. Because of the kinematic condition that the interface is transported with the flow, we obtain the following statement: 1} is a family of smoothly evolving domains and I v (t) := ∂ + t is a family of smoothly evolving surfaces in the sense of Defini- The time evolution of the signed distance function to the interface I v (t) is then given by for any t ∈ [0, T strong ] and any x ∈ R d with dist(x, I v (t)) r c , whereV n is the extended normal velocity of the interface given bȳ Moreover, the following formulas hold true:

The gradient of the projection onto the nearest point on the interface I v (t) is given by
In particular, we have the bound This also extends away from the interface in the sense that for all (x, t) such that dist(x, I v (t)) < r c , i.e. (23) holds. Hence, we also have the formula . Differentiating this representation of the projection onto the interface and using the fact that n v is a unit vector, we also obtain, using (30), that Hence, we obtain, in addition to (29), the formula On the other side, on the interface the time derivative of the signed distance function equals up to a sign the normal speed. In our case, the latter is given by the normal component of the given velocity field v evaluated on the interface, see Remark 9. This concludes the proof of (21). Moreover, (24) as well as (25) follow immediately from differentiating |∇ dist ± (x, I v (t))| 2 = 1. Finally, (27) and (28) follow immediately from (19) and In the above considerations, we have made use of the following result: Consider the auxiliary function g( Remark 12. Consider the situation of Lemma 11. We proved that The right hand side of this identity is of class L ∞ t W 2,∞ x , as the normal component n v (P I v (t) ) · ∇v of the velocity gradient ∇v of a strong solution is continuous across the interface I v (t). To see this, one first observes that the tangential derivatives ((Id −n v (P I v (t) ) ⊗ n v (P I v (t) ))∇)v are naturally continuous across the interface; one then uses the incompressibility constraint ∇ · v = 0 to deduce that n v (P I v (t) ) · (n v (P I v (t) ) · ∇)v is also continuous across the interface.

Properties of the Vector Field ξ
The vector field ξ -as defined in Proposition 10 and illustrated in Fig. 2-is an extension of the unit normal vector field n v associated to the family of smoothly evolving domains occupying the first fluid of the strong solution. We now provide a more detailed account of its definition. The construction in fact consists of two steps. First, we extend the normal vector field n v to a (space-time) tubular neighborhood of the evolving interfaces I v (t) by projecting onto the interface. Second, we multiply this construction with a cutoff which decreases quadratically in the distance to the interface of the strong solution (see (37)).
1} is a family of smoothly evolving domains and I v (t) := ∂ + t is a family of smoothly evolving surfaces in the sense of Definition 5. Let η be a smooth cutoff function with η(s) = 1 for s 1 2 and η ≡ 0 for s 1. Define another smooth cutoff function ζ : R → [0, ∞) as follows: and ζ ≡ 0 for |r | > 1. Then, we define a vector field ξ : The definition of ξ has the following consequences: Observe that the vector field ξ is indeed well-defined in the space-time domain R d × [0, T strong ) due to the action of the cut-off function ζ ; it also satisfies |ξ | 1 or, more precisely, the sharper inequality |ξ | (1 − dist(x, I v (t)) 2 ) + . Furthermore, the extension ξ inherits its regularity from the regularity of the signed distance function to the interface I v (t). More precisely, it follows that the vector field ξ (resp. its time derivative) is of class x . This turns out to be sufficient for our purposes. The time derivative of our vector field ξ is given as follows: 1} is a family of smoothly evolving domains and I v (t) := ∂ + t is a family of smoothly evolving surfaces in the sense of Defini- . LetV n be the extended normal velocity of the interface (22). Then the time evolution of the vector field ξ from Definition 13 is given by Here, we made use of the abbreviation Proof. We start by deriving a formula for the time evolution of the normal vector field n v (P I v (t) x, t) in the space-time tubular neighborhood dist(x, I v (t)) < r c . By (23), we may use the formula for the time evolution of the signed distance function from Lemma 11. More precisely, due to the regularity of the signed distance function to the interface of the strong solution and the regularity of the vector field V (Remark 12), we can interchange the differentiation in time and space to obtain Next, we show that the normal-normal component of ∇V n vanishes. Observe that by Remark 12 and (23) it holds that Hence, by (23)- (26) and this formula we obtain as desired. In summary, we have proved so far that (34) which holds in the space-time domain dist(x, I v (t)) < r c . However, applying the chain rule to the cut-off function r → ζ(r ) from (31) together with the evolution equation (21) for the signed distance to the interface shows that the cut-off away from the interface is also subject to a transport equation By the definition of the vector field ξ , see (32), and the product rule, this concludes the proof.

Properties of the Weighted Volume Term
We next discuss the weighted volume contribution dx to the relative entropy in more detail.

Remark 16.
Let β be a truncation of the identity as in Proposition 10. Let 1} is a family of smoothly evolving domains, and I v (t) := ∂ + t is a family of smoothly evolving surfaces, in the sense of Definition 5. The map inherits the regularity of the signed distance function to the interface I v (t). More precisely, this map (resp. its time derivative) is of class 1} is a family of smoothly evolving domains and I v (t) := ∂ + t is a family of smoothly evolving surfaces in the sense of Defini- . LetV n be the extended normal velocity of the interface (22). Then the time evolution of the weight function β composed with the signed distance function to the interface I v (t) is given by the transport equation Proof. This is immediate from the chain rule and the time evolution of the signed distance function to the interface of the strong solution, see Lemma 11.

Further Coercivity Properties of the Relative Entropy
We collect some further coercivity properties of the relative entropy functional E χ u , u, V χ v , v as defined in (13). These will be of frequent use in the estimation of the terms occurring on the right hand side of the relative entropy inequality from Proposition 10. We start for reference purposes with trivial consequences of our choices of the vector field ξ and the weight function β.
Let ξ be the vector field from Definition 13 with cutoff multiplier ζ as given in (31). By the choice of the cutoff ζ , it holds that We will also make frequent use of the fact that for any unit vector b ∈ R d we have We also want to emphasize that the relative entropy functional controls the squared error in the normal of the varifold.

Lemma 19. Consider the situation of Proposition 10. We then have
for almost every t ∈ [0, T strong ).
Proof. Observe first that by means of the compatibility condition (6e) we have which holds for almost every t ∈ [0, T strong ). In addition, due to (8), one obtains for almost every t ∈ [0, T strong ). This in turn entails the following identity: which holds true for almost every t ∈ [0, T strong ). However, the functional on the right hand side controls the squared error in the normal of the varifold: . This proves the claim.
We will also refer multiple times to the following bound. In the regime of equal shear viscosities μ + = μ − we may apply this result with the choice w = 0. In the general case, we have to include the compensation function w for the velocity gradient discontinuity at the interface.
for almost every T ∈ [0, T strong ) and all 0 < δ 1. The absolute constant C > 0 only depends on the densities ρ ± .
Proof. We first argue how to control the part away from the interface of the strong solution, that is, outside of {(x, t) : dist(x, I v (t)) r c }. A straightforward estimate using Hölder's and Young's inequality yields Note that by the properties of the truncation of the identity β, see Proposition 10, it follows that |β(dist ± (x, which is indeed a bound of required order. We proceed with the bound for the contribution in the vicinity of the interface of the strong solution. To this end, recall that we are equipped with a family of maps t : Recall the estimates (18). We then move on with a change of variables, the one-dimensional Gagliardo-Nirenberg-Sobolev interpolation inequality as well as Hölder's and Young's inequality, to obtain the bound It thus suffices to derive an estimate for the L 2 -norm of the local interface error height in normal direction The proof of Proposition 26 below, where we establish next to the required L 2bound also several other properties of the local interface error height, shows that (see (60)) This then concludes the proof.
We conclude this section with an L 2 tan L ∞ nor -bound for H 1 -functions on the tubular neighborhood around the evolving interfaces as well as a bound for the derivatives of the normal velocity of the interface of a strong solution in terms of the associated velocity field v, both of which will be used several times in the estimation of the terms on the right hand side of the relative entropy inequality of Proposition 10.

Lemma 21.
Consider the situation of Proposition 10. We have the estimate valid for any g ∈ H 1 (R d ).
Proof. Let f ∈ H 1 (−r c , r c ). The one-dimensional Gagliardo-Nirenberg-Sobolev inerpolation inequality then implies . From this we obtain, together with Hölder's inequality, that This implies (42) by making use of the C 2 -diffeomorphisms t : t) and the associated change of variables, using also the bound (18).

Lemma 22. Consider the situation of Proposition 10 and define the vector field
Proof. The estimates (43) and (44) are a direct consequence of the regularity requirements on the velocity field v of a strong solution, see Definition 6, the pointwise bounds (19) and the representation of the normal vector field on the interface in terms of the signed distance function (23).

Weak-Strong Uniqueness of Varifold Solutions to Two-Fluid Navier-Stokes Flow: The Case of Equal Viscosities
In this section we provide a proof of the weak-strong uniqueness principle to the free boundary problem for the incompressible Navier-Stokes equation for two fluids (1a)-(1c) in the case of equal shear viscosities μ + = μ − . Note that in this case the problematic viscous stress term R visc in the relative entropy inequality (see Proposition 10) vanishes because of μ(χ u ) − μ(χ v ) = 0. In this setting, it is possible to choose w ≡ 0 which directly implies A visc = 0, A adv = 0, A dt = 0, A weight V ol = 0, and A sur T en = 0. It remains to estimate the terms R sur T en , R adv , R dt , and R weight V ol which are left on the right-hand side of the relative entropy inequality. We directly estimate these terms also for w = 0 in order to avoid unnecessary repetition, as the estimates for w = 0 are not more complicated but will be required for the case of different viscosities.

Estimate for the Surface Tension Terms
We start by estimating the terms related to surface tension R sur T en .

Lemma 23.
Consider the situation of Proposition 10. The terms related to surface tension R sur T en are estimated by for any δ > 0.
Proof. We start by using (38) and (32) to estimate Recall from (39) that the squared error in the varifold normal is controlled by the relative entropy functional. Together with the bound from Lemma 20,(19) as well as (47) we get an estimate for the first four terms of R sur T en for almost every T ∈ [0, T strong ) and all δ ∈ (0, 1]. To estimate the remaining two terms we decomposeV n − v as where the vector field V n is given by in the space-time domain {dist(x, I v (t)) < r c } (i. e. in contrast toV n , for V n the velocity v is evaluated not at the projection of x onto the interface, but at x itself). Note that it will not matter as to how V n and similar quantities are defined outside of the area {dist(x, I v (t)) < r c }, as the terms will always be multiplied by suitable cutoffs which vanish outside of {dist(x, I v (t)) < r c }. In the next two steps, we compute and bound the contributions from the two different parts in the decomposition (49) of the errorV n − v.

First
Step: Controlling the Error V n − v. By definition of the vector field V n in (50), we may write It is then not clear why the term should be controlled by our relative entropy functional. However, the integrands enjoy a crucial cancellation To verify this cancellation, we first recall from (23) We then start by rewriting Note that when the derivative hits the cutoff multiplier in the definition of ξ (see (32)), the resulting term on the right hand side of the last identity vanishes. Hence, we obtain, together with (25), On the other side, another application of (25) yields Therefore, the cancellation (51) indeed holds true since by (25) the right-hand side of the last computation remains unchanged after projecting via Id − n v ⊗ n v .

Second
Step: Controlling the ErrorV n − V n . It remains to control the contributions from the following two quantities: Note first that we can write Moreover, recall from (27) the formula for the gradient of the projection onto the nearest point on the interface I v (t). The definition of V n (see (50)) andV n (x) = V n (P I v (t) x), the product rule, (23), (19), and (25) imply, using the definition of ξ and the property |ξ | 1, where in the last step we have used also (27). Together with Young's inequality and the coercivity properties of the relative entropy (37) and (38) we then immediately get the estimate To estimate the second term I I , we start by adding zero and then use again (43), (19) as well as (37) and (38) Using (25), we continue by computing Hence, it follows from ζ (0) = 0 and |ζ | C as well as (45) that Third Step: Summary. Inserting (51), (52), and (53) into (48) entails the bound This yields the desired estimate.

Estimate for the Remaining Terms R adv , R dt , and R weight V ol
To bound the advection-related terms from the relative entropy inequality, the time-derivative related terms R dt , and the terms resulting from the weighted volume control term in the relative entropy and for any δ > 0.
Proof. To derive (54), we use a direct estimate for the second term in R adv as well as Lemma 20 for the first term. The bound (55) is derived similarly. Finally, we show estimate (56). Note that by definition we haveV n (x, t) = V n (P I v (t) x, t). Hence, we obtain using the bound (45) as well as (36) and |β | C An application of Lemma 20 yields (56).

The Weak-Strong Uniqueness Principle in the Case of Equal Viscosities
We conclude our discussion of the case of equal shear viscosities μ + = μ − for the free boundary problem for the incompressible Navier-Stokes equation for two fluids (1a)-(1c) with the proof of the weak-strong uniqueness principle.
We assume that the shear viscosities of the two fluids coincide, that is, Then, there exists a constant C > 0 which only depends on the data of the strong solution such that the stability estimate holds. In particular, if the initial data of the varifold solution and the strong solution coincide, the varifold solution must be equal to the strong solution in the sense that almost everywhere for almost every t ∈ [0, T strong ). Furthermore, in this case the varifold is given by for almost every t ∈ [0, T strong ).
Proof. Applying the relative entropy inequality from Proposition 10 with w = 0, using the fact that the problematic term R visc vanishes in the case of equal shear viscosities μ + = μ − , as well as using the bounds from (46), (54), (55) and (56), we observe that we established the following bound for almost every T ∈ [0, T strong ). An absorption argument along with a subsequent application of Gronwall's lemma then immediately yields the asserted stability estimate.
Consider the case of coinciding initial conditions, that is, In this case, we deduce from the stability estimate that the relative entropy vanishes for almost every t ∈ [0, T strong ): from this it immediately follows that u(·, t) = v(·, t) as well as χ u (·, t) = χ v (·, t) almost everywhere for almost every t ∈ [0, T strong ).
The asserted representation of the varifold V of the varifold solution follows from the following considerations. First, we deduce |∇χ u (·, t)| = |V t | S d−1 for almost every t ∈ [0, T strong ) as a consequence of the fact that the density of the varifold satisfies θ t = d|∇χ u (·,t)| d|V t | S d−1 ≡ 1 almost everywhere for almost every t ∈ [0, T strong ). The remaining fact that the measure on S d−1 is given by δ n u (x,t) for |V t | S d−1 -almost every x ∈ R d for almost every t ∈ [0, T strong ) then follows from the control of the squared error in the normal of the varifold by the relative entropy functional, see (39). This concludes the proof.

Weak-Strong Uniqueness of Varifold Solutions to Two-Fluid Navier-Stokes Flow: The Case of Different Viscosities
We turn to the derivation of the weak-strong uniqueness principle in the case of different shear viscosities of the two fluids. In this regime, we cannot anymore ignore the viscous stress term (μ( The key idea is to construct a solenoidal vector field w which is small in the L 2 -norm but whose gradient compensates for most of this problematic term, and then use the relative entropy inequality from Proposition 10 with this function. The precise definition as well as a list of all the relevant properties of this vector field are the content of Proposition 28. A main ingredient for the construction of w are the local interface error heights as measured in orthogonal direction from the interface of the strong solution (see Fig. 3). For this reason, we first prove the relevant properties of the local heights of the interface error in Proposition 26. However, in order to control certain surfacetension terms in the relative entropy inequality, we actually need the vector field w to have bounded spatial derivatives. To this end, we perform an additional regularization of the height functions. This will be carried out in detail in Proposition 27 by a (time-dependent) mollification. After all these preparations, in Sections 6.4-6.8 we then further estimate the additional terms A visc , A dt , A adv , and A sur T en in the relative entropy inequality from Proposition 10. Based on these bounds, in Section 6.9 we finally provide the proof of the stability estimate and the weak-strong uniqueness principle for varifold solutions to the free boundary problem for the incompressible Navier-Stokes equation for two fluids (1a)-(1c) from Theorem 1.

The Evolution of the Local Height of the Interface Error
Consider a strong solution (χ v , v) to the free boundary problem for the incompressible Navier-Stokes equation for two fluids (1a)-(1c) in the sense of Definition 6 on some time interval [0, T strong ). For the sake of better readability, let us recall some definitions and constructions related to the associated family of evolving interfaces I v (t) of the strong solution.
For the family ( + t ) t∈[0,T strong ) of smoothly evolving domains of the strong solution, the associated signed distance function is given by From Definition 5 of a family of smoothly evolving domains it follows that the family of maps t : In Lemma 11, we computed the time evolution of the signed distance function to the interface I v (t) of a strong solution. Recall also the various relations for the projected inner unit normal vector field n v (P I v (t) x, t) from Lemma 11, which will be of frequent use in subsequent computations. Finally, we remind the reader of the definition of the vector field ξ from Definition 13, which is a global extension of the inner unit normal vector field of the interface I v (t). For an illustration of the vector field ξ , we recall Fig. 2; for an illustration of h + , we refer to Fig. 3.
Similarly, we introduce the local height of the interface error in the other direction Then h + and h − have the following properties: a) (L 2 -bound) We have the estimates |h ± (x, t)| r c 2 and up to an error of such that in the domain t∈[0,T strong ) ( + t ∪ − t ) × {t} the second spatial derivatives of the vector field v exist and satisfy sup t∈[0, , we have the following estimate on the time derivative of the local interface error heights h ± : for any test function η ∈ C ∞ cpt (R d ) with n v · ∇η = 0 on the interface I v (t), and whereh ± is defined as h ± but now with respect to the modified cut-offθ(·) = θ · 2 . Proof.
Step 1: Proof of the estimate on the L 2 -norm. The trivial estimate |h ± (x, t)| r c 2 follows directly from the definition of h ± . To establish the L 2estimate, let + (x) := , t), t) dy. A straighforward estimate then gives Note that the term on the left hand side dominates |h + | 2 since we dropped the cutoff function. Hence, the desired estimate on the L 2 -norm of h + follows at once by a change of variables and recalling the fact that dist( t (x, y), I v (t)) = y. The corresponding bound for h − then follows along the same lines.
Step 2: Proof of the estimate on the spatial derivative (59b). The definition (58) is equivalent to , y)) θ y r c dy.
We compute for any smooth vector field η ∈ C ∞ cpt (R d ; R d ) (recall that t (x, 0) = x and dist( t (x, y), I v (t)) = y for any x ∈ I v (t) and any y with |y| r c ) that where in the last step we have used ∇ dist ± (x, I v (t)) = n v (P I v (t) x). This yields, by another change of variables in the second integral, the fact that χ v ( t (x, y), t) = 1 for any y > 0, (19), (20), | det ∇ −1 t | C as well as by abbreviating n u = ∇χ u for any Borel set U ⊂ R d . Recall that the indicator function χ u (·, t) of the varifold solution is of bounded variation in I : are guaranteed to be one-dimensional Caccioppoli sets in (−r c , r c ) for H d−1 -almost every x ∈ I v (t). Note that whenever |n v ·n u | 1 2 then 1−n v ·n u 1 2 , and therefore, using as well the co-area formula for rectifiable sets (see [11, (2.72) We now distinguish between different cases depending on x ∈ I v (t) up to H d−1 -measure zero. We start with the set of points , y), t)| dy By splitting the measure D tan x h + into a part which is absolutely continuous with respect to the surface measure on I v (t), for which we denote the density by ∇ tan h + , as well as a singular part D s h + , we obtain from (61) (note that the third integral in (61) does not contribute to this estimate by the definition of the set for every Borel set U ⊂ R d . Since U was arbitrary, we deduce that |∇ tan h + | is bounded on A 1 by the two integrands on the right hand side of the last inequality. Hence, we obtain The first term on the right hand side can be estimated as in the proof of the L 2 -bound for h ± . To bound the second term, we start by representing the one-dimensional Caccioppoli sets E + x as a finite union of disjoint intervals (see [11,Proposition 3.52]). It then follows from property iv) in Theorem 39 that ∂ * E + x ∩ (−r c , r c ) can only contain at most one point. Indeed, otherwise we would find at least one point t)) < 0 which is a contradiction to the definition of A 1 . By another application of the co-area formula for rectifiable sets (see [11, (2.72)]) we therefore get We now turn to the second case, namely the set of points A 2 := I v (t)\A 1 . We begin with a preliminary computation. When splitting E + x into a finite family of disjoint open intervals as before, it again follows from property iv) in Theorem 39 that every second point y ∈ ∂ * E + x ∩ (−r c , r c ) has to have the property that n v (x) · n u (x+yn v (x, t)) < 0, that is, |n v (x) − n u | 2 2(1 − n v (x) · n u ). In particular, by another application of the co-area formula for rectifiable sets (see [11, (2

.72)]) we obtain the bound
Now, we proceed as follows. By definition of A 2 , either one of the three summands in (62) has to be 1 12 . We distinguish between two cases. If the third one is not, then this actually means that the set 1 2 } is empty, that is, the third summand has to vanish. Hence, either one of the first two summands in (62) has to be 1 8 . If the first one is not, we use that , y), t)| dy r c and bound this by the second term and then (64). If the second one is not, then Now, we move on with the remaining case, that is, that the third summand in (62) does not vanish. In other words, {ỹ ∈ (−r c , r c )∩∂ * E + x : n v (x)·n u (x+ỹn v (x, t)) 1 2 } is non-empty. We then estimate Taking finally U = A 2 in (61), the conclusions of the above case study together with the three estimates (64), (65) and (66) followed by another application of the co-area formula for rectifiable sets (see [11, (2.72)]) to further estimate the latter, then imply that The two estimates (63) and (67) thus entail the desired upper bound (59b) for the (tangential) gradient of h ± with ξ replaced by n v (P I v (t) x). However, one may replace n v (P I v (t) x) by ξ because of (38).
Step 3: Proof of the approximation property for the interface (59c). In order to establish (59c), we rewrite using the coordinate transform t (recall that dist ± ( t (x, y), I v (t)) = y and that |h ± | r c ) In order to derive a bound for the first term on the right-hand side of (68), we distinguish between different cases depending on x ∈ I v (t) up to H d−1 -measure zero. We first distinguish between h + (x) r c 4 and h + (x) < r c 4 . In the former case, a straightforward estimate yields (recall (18)) which is indeed of required order after a change of variables. We now consider the other case, that is, h + (x) < r c 4 . Recall that the indicator function χ u (·, t) of the varifold solution is of bounded variation in I + := {x ∈ R d : dist ± (x, I v (t)) ∈ (0, r c )}. In particular, E + := {x ∈ R d : 1 − χ u > 0} ∩ I + is a set of finite perimeter in I + . Recall also that E + = I + ∩ {x ∈ R d : (χ v − χ u ) + > 0} since χ v ≡ 1 in I + . Applying Theorem 39 in local coordinates, the sections are guaranteed to be one-dimensional Caccioppoli sets in (0, r c ) for H d−1 -almost every x ∈ I v (t). Hence, we may represent the one-dimensional section E + x for such x ∈ I v (t) as a finite union of disjoint intervals (see [11,Proposition 3.52]): If K (x) = 0 then h + (x) = 0, and the inner integral in the first term on the right hand side of (68) vanishes for this x. If K (x) = 1 and a 1 = 0, then by definition of h + (x) we have (a 1 , b 1 ) = (0, h + (x)) (recall that we now consider the case h + (x) r c 4 ). Thus, again the inner integral in the first term on the right hand side of (68) vanishes for this x. Hence, it remains to discuss the case that there is at least one non-empty interval in the decomposition of E + x , say (a, b), such that a ∈ (0, r c ). From property iv) in Theorem 39 it then follows that Hence, we may bound Gathering the bounds from the different cases together with the estimate in (69), we therefore obtain by the co-area formula for rectifiable sets (see [11, (2.72)]) together with the change of variables t (x, y): which is by (38) as well as (37) indeed a bound of desired order. Moreover, performing analogous estimates for the second term on the right-hand side of (68) and estimating the third term on the right-hand side of (68) trivially, we then get which is precisely the desired estimate (59c).
Step 4: Proof of estimate on the time derivative (59d). To bound the time derivative, we compute using the weak formulation of the continuity equation ∂ t χ u = −∇ · (χ u u) and abbreviating I + (t) Recall from (27) the formula for the gradient of the projection onto the nearest point on the interface I v (t). Recalling also the definitions of the extended normal velocity (50) respectively (22), we also have Adding this formula to the above formula for d dt r c ), and using the fact that r c ) = 1 for any t and any x ∈ I v (t).
, the corresponding estimate (43) for the gradient of V n as well as the formula (27) for the gradient of P I v (t) . Because of (23) and the equation (34) for the time evolution of the normal vector, we thus get the bounds . Taking all of these bounds together, we obtain . As a consequence, we get Moreover, we may compute Since n v · ∇η = 0 holds on the interface I v (t) by assumption, we obtain from (72) In what follows, we will by slight abuse of notation use ∇ tan g(x) as a shorthand for (Id − n v (P I v (t) x) ⊗ n v (P I v (t) x))∇g(x) for scalar fields as well as (∇ tan · g)(x) instead of (Id − n v (P I v (t) x) ⊗ n v (P I v (t) x)) : ∇g(x) for vector fields. Let us also abbreviate P tan x : ). Moreover, it follows from (24), (25) and (23) that n v (P I v (t) x) · d dt (n v (P I v (t) x)) = 0. Hence, we may rewrite with an integration by parts (recall the notation P tan (x) = (Id −n v ⊗ n v )(P I v (t) x, t)) Using, from (25) and (23), that the spatial partial derivatives of the extended normal vector field are orthogonal to the gradient of the signed distance function, the same argument also shows that It follows from (27) as well as (25) and (23) Hence, we obtain Since the domain of integration is I + (t), we may write From this and the fact that n v (P I v (t) ) · ∇ P I v (t) (x) = 0, we deduce by another integration by parts that (where |F| r −1 Hence, plugging in (75), (74) and (76), (73) into (70) and using the esti- This yields by the change of variables t (x, y) and a straightforward estimate Using finally the Sobolev embedding to bound the L ∞ -norm of η on the interface (which is either one-or two-dimensional; note that the constant in the Sobolev embedding may be bounded by Cr −1 c for our geometry), we infer from this estimate the desired bound (59d), using also (38) and (37). This concludes the proof.

A Regularization of the Local Height of the Interface Error
In order to modify our relative entropy to compensate for the velocity gradient discontinuity at the interface, we need regularized versions of the local heights of the interface error h + and h − which in particular have Lipschitz regularity. To this end, we fix some function e(t) > 0 and basically apply a mollifier on scale e(t) to the local interface error heights h + and h − at each time. An illustration of h + and its Then h + e(t) and h − e(t) have the following properties: a) (H 1 -bound) If the interface error terms from the relative entropy are bounded by

b) (Improved approximation property) The functions h + e(t) and h − e(t) provide an approximation for the interface of the weak solution
up to an error of

{t} the second spatial derivatives of the vector field v exist and satisfy sup t∈[0,T strong
for any smooth test function η ∈ C ∞ cpt (R d ) with n v · ∇η = 0 on the interface I v (t), and whereh ± is defined as h ± but now with respect to the modified cut-off function θ(·) = θ · 2 . Proof. Proof of a). In order to estimate the spatial derivative ∇h ± e(t) , we compute using the fact that ∇ x θ |x−x| e(t) = −∇x θ |x−x| (note that all of the subsequent gradients are to be understood in the tangential sense on the manifold I v (t)) Introduce the convex function G( p) := | p| 2 for | p| 1, 2| p| − 1 for | p| 1.
Proof of b). We start with a change of variables to estimate (recall (18)) By adding zero and using (59c) we therefore obtain Observe that one can decompose A straightforward estimate in local coordinates then yields Using (59b) and summing with respect to k ∈ N, we get the desired estimate (78c).

Proof of c). Note that
Abbreviating η e (x, t) := As in the argument for (81), one checks that I v (t) |θ |( |x−x| e(t) ) dS(x) Ce(t) d−1 . Using the lower bound from (81), the proof for the standard L p -inequality for convolutions carries over and we obtain η e L p (I v (t)) C η L p (I v (t)) as well as |η(x, t)| p dS(x) for any p 1. As a consequence of (59d) and these considerations, we deduce that Using the estimate |v( where in the last step we have used the simple equality and the bounds (20) and (81). Recall from the transport theorem for moving hypersurfaces (see [85]) that we have for any with the normal velocity V n (x, t) = (v(x, t) · n v (P I v (t) x, t))n v (P I v (t) x, t). Making use of (86) and d dt P I v (t)x = −V n (x, t) forx ∈ I v (t) (see (72)), we then compute for everyx ∈ I v (t) This, together with another application of (86) and the fact that n v · ∇η = 0 on the interface I v (t), implies forx ∈ I v (t), that η(x) where F e,θ (t) : Observe that we have By the choice of the cutoff θ , we see that for every given x ∈ I v (t) the kernel F e,θ (t) is supported in B e(t)/2 (x) ∩ I v (t). Moreover, the exact same argumentation which led to the upper bound in (81) (we only used the support and upper bound for θ as well as e(t) r c ) shows that the kernel F e,θ satisfies the upper bound for any 1 p < ∞. We next intend to rewrite the function F e,θ (x, x) for fixed x as the divergence of a vector field. By the property (89), we may consider Neumann problem for the (tangential) Laplacian with right hand side F e,θ (·, x) in some neighborhood (of scale e(t)) of the point x. To do this we first rescale the setup, that is, we consider the kernel F 1 (x, x) := F e,θ (e(t)x, e(t)x) forx, x ∈ e(t) −1 I v (t). By scaling and the fact that F e,θ is supported on scale e(t)/2, it follows that F 1 (·, x) has zero average on e(t) −1 I v (t) ∩ B 1 (x) for every point x ∈ e(t) −1 I v (t) and that We fix x ∈ e(t) −1 I v (t) and solve on e(t) −1 I v (t)∩B 1 (x) the weak formulation of the equation − tañ xF 1 (·, x) = F 1 (·, x) with vanishing Neumann boundary condition. More precisely, we requireF 1 (·, x) to have vanishing average on e(t) −1 I v (t) ∩ B 1 (x) (note that in the weak formulation the curvature term does not appear because it gets contracted with the tangential derivative of the test function). By elliptic regularity and (91), it follows that We now rescale back to I v (t) and defineF e,θ (x, . It then follows from scaling, (92) as well as e(t) < r c that ∇x · F e,θ (x, x) = F e,θ and We now have everything in place to proceed with estimating the term To this end, we will make use of (87) and estimate term by term. Because of (20), (81), η e L p (I v (t)) C η L p (I v (t)) , the estimate the Lipschitz property |V n (x)−V n (x)| ||∇v|| L ∞ |x −x|, and the fact that θ(s) = 0 for s 1, the first four terms on the right-hand side of (87) are straightforward to estimate and result in the bound To estimate the fifth term, we first apply Fubini's theorem and then perform an integration by parts (recall that we imposed vanishing Neumann boundary conditions) which entails, because of the above considerations, Using (93) as well as the lower bound from (81) we see that the second term can be estimated by a term of the form (94). For the first term, note that by the properties of F e,θ we may interpret the integral in brackets as the mollification of ∇h ± on scale e(t). Applying the argument which led to (82) (for this, we only need the upper bound (93) for F e,θ , a lower bound as in (81) is only required for θ ) we observe that one can bound this term similar to ∇h ± e(t) (·, t) L 2 (I v (t)) . We therefore obtain the bound Hence, combining (83) with these estimates for the fourth term from (87) as well as (94) and (84), we obtain the desired estimate on the time derivative. This concludes the proof.

Construction of the Compensation Function w for the Velocity Gradient Discontinuity
We turn to the construction of a compensating vector field, which shall be small in the L 2 -norm but whose associated viscous stress μ(χ u )D sym w shall compensate for (most of) the problematic viscous term (μ(χ u ) − μ(χ v ))D sym v appearing on the right hand side of the relative entropy inequality from Proposition 10 in the case of different shear viscosities.
Before we state the main result of this section, we introduce some further notation. Let h + e(t) be defined as in Proposition 27. We then denote by P h + e(t) the downward projection onto the graph of h + e(t) , that is, for all (x, t) such that dist(x, I v (t)) < r c . Note that this map does not define an orthogonal projection. Analogously, one introduces the projection

-function and assume that the relative entropy is bounded by E[χ u , u, V |χ v , v](t) e(t) 2 . Let the regularized local interface error heights h + e(t) and h − e(t) be defined as in Proposition 27.
Then there exists a solenoidal vector field w ∈ L 2 ([0, T strong ]; H 1 (R d )) such that w is subject to the estimates where R > 0 is such that where the vector field W is given by with the symmetric gradient defined by D sym v := 1 2 (∇v + ∇v T ), as well as the estimates and where the vector fields g andĝ are subject to the bounds and whereh ± is defined as h ± but now with respect to the modified cut-off function θ(·) = θ · 2 , see Proposition 26. Furthermore, w may be taken to have the regu- Define the vector field W as given in (97) and set (making use of the notation a ∧ b = min{a, b} and a ∨ b = max{a, b}) as well as For this choice, we have (106) (note that this directly implies the last claim about the regularity of w, namely ∇w(·, t) ∈ W 1,∞ (R d \(I v (t) ∪ I h + e (t) ∪ I h + e (t))) for almost every t) as well as x) Moreover, note that (106) entails by the definition of the vector field W Analogous formulas and properties can be derived for w − . The function w + + w − would then satisfy our conditions, with the exception of the solenoidality ∇ ·w = 0. For this reason, we introduce the (usual) kernel It is immediately apparent that ∇ · w = 0.
Step 2: Estimates on w and ∇w. From (106), |∇η| Cr −1 c as well as the bounds (19) and (28) we deduce the pointwise bound and therefore, by integration and a change of variables t , Observe that this also implies, by (97), that |h + e(t) | 2 + |∇h + e(t) | 2 dS. (112) From this, Theorem 38, and the fact that ∇θ is a singular integral kernel subject to the assumptions of Theorem 38, we deduce Combining the estimates (111) and (113) with the corresponding inequalities for w − and θ * ∇ · w − , we deduce our estimate (96).
Now, let R > 1 be big enough such that We then estimate with an integration by parts and Theorem 38 applied to the singular integral operator ∇θ By Young's inequality for convolutions, (112), (114) and (115), we then obtain Together with the respective estimates for w − and θ * (∇ · w − ), this implies (95). The estimate (98) follows directly from (104) and the estimates (113) and (116) on the H 1 -norm of θ * (∇ · w + ) as well as the definition of w − and the analogous estimates for θ * (∇ · w − ).
Step 3: L ∞ -estimates for ∇w. Regarding the estimate (99) on ∇w L ∞ we have by (110) and the estimates |∇h + e(t) | Cr −2 c and |h + e(t) | r c 1 from Proposition 27 To estimate |∇(θ * (∇ · w + ))|, we first compute, starting with (108), where F(x, t) is subject to a bound of the form |F(x, t)| Cr −5 ) and supported in I v (t) + B r c . Next, we decompose the kernel θ as θ = ∞ k=−∞ θ k with smooth functions θ k with supp θ k ⊂ B 2 k+1 \B 2 k−1 . More precisely, we first choose a smooth function ϕ : R + → [0, 1] such that ϕ(s) = 0 whenever s / ∈ [−1/2, 2] and such that k∈Z ϕ(2 k s) = 1 for all s > 0. Such a function indeed exists, see for instance [16]. We then let θ k (x) := ϕ(2 k |x|)θ (x). Note that Using Young's inequality for convolutions as well as the estimate ∇θ k L 1 (R d ) C we obtain 0 k= log e 2 (t) Moreover, it follows from |∇θ k | C(2 k ) −d , the precise formula for ∇ · w + in (108), (19), (28), a change of variables and Hölder's inequality that |∇h + e(t) | 2 + |h + e(t) | 2 dS where the three terms on the right hand side are given by and as well as To estimate the latter term, we proceed first by noting the definition of h + in (77), as well as the trivial bound |h + | r c it holds |h + e(t) | r c . Then for all x ∈ I v (t) + {|x| > r c + 2 log e 2 (t) } and all k log e 2 (t) − 1 we observe that χ {dist ± (x,I v (t))>h + e(t) (P Iv (t) x)} (x) = 1 for all x ∈ R d such that |x −x| 2 k+1 . In particular, for suchx the third term on the right hand side of (122) vanishes since the corresponding second term in the formula for ∇(∇ · w + ) (see (118)) does not appear anymore. Hence, letx ∈ I v (t) + {|x| r c + 2 log e 2 (t) } and denote by F the tangent plane to the manifold {dist ± (x, I v (t)) = h + e(t) (P I v (t) x)} at the nearest point tox. We then have for any and as a consequence, Recall that we defined θ k (x) := ϕ(2 k |x|)θ (x) where ϕ : R + → [0, 1] is a smooth function such that ϕ(s) = 0 whenever s / ∈ [−1/2, 2] and such that k∈Z ϕ(2 k s) = 1 for all s > 0. Hence, x). It follows from elliptic regularity thatθ(·,x) is C ∞ . Moreover, since we could have rescaled θ k first to unit scale, then solved the associated problem on that scale, and finally rescaled the solution back to the dyadic scale k we see that |θ k (x,x)| C(2 k ) 2−d . We then have by an integration by parts Using these considerations in the previous formula, we obtain Making use of the fact that the integral vanishes for dist(x, F) 2 k+1 and the bounds (19) and (28), we obtain Using |∇h + e(t) | Cr −2 c and |∇ 2 h + e(t) | Cr −4 c e(t) −1 from Proposition 27 as well, we get and Using (126), (127), (128) and (129) to estimate the term in (125), we get In turn, combining this with (123) and (124) and also gathering (120), (121) and (117), as well as the corresponding bounds for ∇w − and ∇(θ * ∇ · w − ), we then finally deduce (99).
Step 4: L 2 L ∞ -estimate for ∇w. By making use of the precise formula (106) for ∇w + and the definition of the vector field W in (97), we immediately get To estimate the contribution from |∇(θ * (∇ ·w + ))| we use the same dyadic decomposition as in (119). We start with the terms in the range k = log e 2 (t) , . . . , 0. Let x ∈ I v (t) and y ∈ (−r c , r c ) be fixed. We abbreviatex := x + yn v (x, t). Denote by F x the tangent plane of the interface I v (t) at the point x. Let F x : F x × R → R d be the diffeomorphism given by F x (x,ŷ) :=x +ŷn F x (x). We start estimating using the change of variables F x , the bound |∇θ k (x)| Cχ 2 k−1 |x| 2 k+1 |x| −d , as well as the fact thatx + yn F x (x) =x + yn v (x, t) is exactly the point on the ray originating fromx ∈ F x in normal direction which is closest tox: Note that the right hand side is independent of y. Hence, we may estimate, with Minkowski's inequality, The inner integral is to be understood in the Cauchy principal value sense. To proceed we use the L 2 -theory for singular operators of convolution type, the precise formula (108) for ∇ · w + as well as (19) and (28) which entails An application of (78a) and the assumption E[χ u , u, V |χ v , v](t) e 2 (t) finally yields We move on with the contributions in the range k = 1, . . . , ∞. Note that by (121) we may directly infer from (78a) and the assumption E[χ u , u, V |χ v , v](t) e 2 (t) Moreover, the contributions estimated in (123) and (124) result in a bound of the form (recall that e(t) < r c ) Note that when summing the respective bounds from (128) and (129) over the relevant range k = −∞, . . . , log e 2 (t) − 1, we actually gain a factor e(t), that is, the contributions estimated in (128) and (129) then directly yield a bound of the form Finally, the contribution from (127) may be estimated as follows. Let x ∈ I v (t), y ∈ [−r c , r c ] and denote by Fx the tangent plane to the manifold {dist ± (x, In light of (127), we start estimating for k log e 2 (t) − 1 by using Jensen's inequality, the bound |∇h + e(t) | Cr −2 c from Proposition 27, as well as the fact that |x −x| |x −x| for Since this bound does not depend anymore on y ∈ [−r c , r c ], we may estimate the contributions from (127) using Minkowski's inequality as well as once more the L 2 -theory for singular operators of convolution type to reduce everything to the H 1 -bound (78a) for the local interface error heights. All in all, the contributions from (127) are therefore bounded by The asserted bound (100) then finally follows from collecting the estimates (131), (132), (133), (134), (135) and (136) together with the analogous bounds for ∇w − and ∇(θ * ∇ · w − ).
We now aim to make use of (78d) to further estimate the second term in the right hand side of (137). To establish the corresponding L 2 -resp. L 4 3 -contributions, we first need to perform an integration by parts in order to use (78d). The resulting curvature term as well as all other terms which do not appear in the third term of (137) can be directly bounded by a term whose associated L 2 -norm is controlled by Cr −1 with the corresponding L 2 -bound and L 4 3 -estimate In both bounds, we add and subtract the compensation function w and therefore obtain together with (98) and (42) |h ± e(t) | 2 + |∇h ± e(t) | 2 dS Analogous estimates may be derived for w − . We therefore proceed with the terms related to θ * ∇ · w ± . First of all, note that the singular integral operator (θ * ∇·) satisfies (see Theorem 38) θ * ∇ ·ĝ We want to exploit the fact that the vector field V n has bounded derivatives up to second order, see (43) and (44). Moreover, the kernel gives rise to a singular integral operator of convolution type, as does ∇θ . To see this, we need to check whether its average over S d−1 vanishes. We write Passing to the limit r → 1 shows that ∇ F, and therefore also ∇ 2 θ(x) ⊗ x, have vanishing average on S d−1 . We may now compute (where the integrals are well defined in the Cauchy principal value sense due to the above considerations) for almost every We then estimate using Young's inequality for convolutions and |∇ 2 θ(x)| |x| −d−1 : As a consequence, we obtain from (143), Young's inequality for convolutions, (114) and (44) that Applying Theorem 38 to the singular integral operators ∇θ resp. ∇ 2 θ ⊗ x as well as making use of (43), (114) and (144) we then obtain the estimate It remains to estimate θ * ∇ · ( denoting the tangential velocity of v. To this end, note that we may rewrite Using Theorem 38, (111) as well as (112) we then obtain θ * ∇ · ((V tan · ∇)w + ) − (V tan · ∇)(θ * ∇ · w + ) 2 |h + e(t) | 2 + |∇h + e(t) | 2 dS.

Estimate for the Additional Surface Tension Terms
Having established all the relevant properties of the compensating vector field w in Proposition 28, we can now estimate the additional terms in the relative entropy inequality from Proposition 10. To this end, we start with the additional surface tension terms given by A precise estimate for these terms is the content of the following result: Lemma 29. Let the assumptions and notation of Proposition 28 be in place. In particular, we assume that there exists a C 1 -function e : [0, T strong ) → [0, r c ) such that the relative entropy is bounded by

t). Then the additional surface tension terms A sur T en are bounded by a Gronwall-type term
A sur T en C r 10 Proof. We estimate term by term in (147). A straightforward estimate for the first two terms using also the coercivity property (39) yields Making use of (19), a change of variables t , Hölder's and Young's inequality, (98), (41), (78a) as well as the coercivity property (36) the term I I I may be bounded by For the term I V , we first add zero, then perform an integration by parts which is followed by an application of Hölder's inequality to obtain )(w · ∇)(∇ · ξ) dx dt By definition of ξ (see (32)) recall that Recalling also (96), (97) and (113) as well as making use of (78c), (19), (28), (78a) and finally the coercivity property (36) the term (I V ) a from (151) is estimated by Recalling from (78b) the definition of , we may estimate the term (I V ) b from (151) by a change of variables t , (19), Hölder's and Young's inequality, (98) as well as (78a) To estimate the term (I V ) c from (151), we again make use of the definition of , (19), Hölder's and Young's inequality, (98) as well as (78a) which yields the following bound: Hence, taking together the bounds from (152), (153) and (154), we obtain In order to estimate the term V , we argue as follows: in a first step, we split R d into the region I v (t) + B r c near to and the region R d \(I v (t) + B r c ) away from the interface of the strong solution. Recall then that the indicator function χ u (·, t) of the varifold solution is of bounded variation in I v (t) + B r c . In particular, Applying Theorem 39 in local coordinates, the sections are guaranteed to be one-dimensional Caccioppoli sets in (−r c , r c ), and such that all of the four properties listed in Theorem 39 hold true for H d−1 -almost every x ∈ I v (t). Recall from [11,Proposition 3.52] that one-dimensional Caccioppoli sets are in fact finite unions of disjoint intervals. We then distinguish for H d−1almost every x ∈ I v (t) between the cases that H 0 ( In other words, we distinguish between those sections which consist of at most one interval and those which consist of at least two intervals. It also turns out to be useful to further keep track of whether n v · n u 1 2 or n v · n u 1 2 holds. We then obtain by Young's and Hölder's inequality, as well as the fact that due to Definition 13 the vector field ξ is supported in To estimate V a from (156), we use the co-area formula for rectifiable sets (see [11, (2.72)]), (100), Hölder's inequality and the coercivity property (38), which together yield (we abbreviate in the first line F(x, y, t) : It remains to bound the term V b from (156). To this end, we make use of the fact that it follows from property iv) in Theorem 39 that every second point y ∈ x, t)). We may therefore estimate, with the help of the co-area formula for rectifiable sets (see [11, (2.72)]) and the bound (99), All in all, we obtain from the assumption E[χ u , u, V |χ v , v](t) e 2 (t) as well as (156), (157), (158) and (99) Hence, we deduce from the bounds (149), (150), (155), (159) as well as (99) the asserted estimate for the additional surface tension terms.

Estimate for the Viscosity Terms
In contrast to the case of equal shear viscosities μ + = μ − , we have to deal with the problematic viscous stress term given by (μ(χ v ) − μ(χ u ))(∇v + ∇v T ).
We now show that the choice of w indeed compensates for (most of) this term in the sense that the viscosity terms from Proposition 10 Proof. We argue pointwise for the time variable and start by adding zero: We start by estimating the first four terms. Note that μ( from (78b) we see that ).
Hence, we may rewrite Carrying out an analogous computation for I V , using again the definition of the smoothed approximation for χ u from (78b) and using (96) as well as (97), we then get the bound Plugging in the estimates (78a) and (78c), we obtain, by Young's inequality, for every δ ∈ (0, 1). To estimate the last two terms V and V I in (162), we may rewrite, making use of the definition (97) of the vector field W and abbreviating where in the penultimate step we have used the fact that ∇ · (u − v − w) = 0, and in the last step we added zero. This yields, after an integration by parts, .
As a consequence of (96), (78a), (19) and the global Lipschitz estimate |∇h ± e (·, t)| Cr −2 c from Proposition 27, we obtain By a change of variables t , (18), (42), (78a) and an application of Young's and Korn's inequality, the latter two terms may be further estimated by for every δ ∈ (0, 1]. In total, we obtain the bound where δ ∈ (0, 1) is again arbitrary. Analogously, one can derive a bound of the same form for the last term V I in (162). Together with the bounds from (163) as well as (164) this concludes the proof.

Estimate for Terms with the Time Derivative of the Compensation Function
We proceed with the estimate for the terms from the relative entropy inequality of Proposition 10: which are related to the time derivative of the compensation function w.
Lemma 31. Let the assumptions and notation of Proposition 28 be in place. In particular, we assume that there exists a C 1 -function e : [0, T strong ) → [0, r c ) such that the relative entropy is bounded by E[χ u , u, V, |χ v , v](t) e 2 (t).
Then, for any δ > 0 there exists a constant C > 0 such that A dt may be estimated by Proof. To estimate the terms involving the time derivative of w we make use of the decomposition of ∂ t w + (v · ∇)w from (101): Employing the bounds (59a), (59b) and the assumption E[χ u , u, V |χ v , v](t) e(t) 2 together with the Orlicz-Sobolev embedding (228) from Proposition 41 or (231) from Lemma 42 depending on the dimension, we obtain Making use of (78a), the bound for the vector fieldĝ from (102), the Gagliardo- 2 for d = 2 and α = 1 4 for d = 3, as well as the assumption Now, by an application of Young's and Korn's inequality for all the terms on the right hand side of (168) which include an L 2 -norm of the gradient of u−v−w (in the case d = 3 we use a , we obtain where δ ∈ (0, 1) is arbitrary. This gives the desired bound for the L 4 3 -contribution of ∂ t w +(v ·∇)w. Concerning the L 2 -contribution, we estimate using (59a), (78a), the bound for g L 2 from (103) as well as the assumption E[χ u , u, V |χ v , v](t) e(t) 2 to get Hence, by another application of Young's and Korn's inequality, we may bound where δ ∈ (0, 1] is again arbitrary. All in all, (169) and (171) therefore imply the desired bound.

Estimate for the Additional Advection Terms
We move on with the additional advection terms from the relative entropy inequality of Proposition 10: A precise estimate is the content of Lemma 32. Let the assumptions and notation of Proposition 28 be in place. In particular, we assume that there exists a C 1 -function e : [0,

Proof. A straightforward estimate yields
Making use of (95), (99) as well as (78a) immediately shows that the desired bound holds true.

Estimate for the Additional Weighted Volume Term
It finally remains to state the estimate for the additional weighted volume term from the relative entropy inequality of Proposition 10: Proof. We may use the exact same argument as in the derivation of the estimate for the term I I I from the additional surface tension terms A sur T en , see (150).

The Weak-Strong Uniqueness Principle with Different Viscosities
Before we proceed with the proof of Theorem 1, let us summarize the estimates from the previous sections in the form of a post-processed relative entropy inequality. The proof is a direct consequence of the relative entropy inequality from Proposition 10 and the bounds (46) Then the relative entropy is subject to the estimate for almost every T ∈ [0, T strong ). Here, C > 0 is a constant which is structurally of the form C = Cr −22 , depending on the various norms of the velocity field of the strong solution, the regularity parameter r c of the interface of the strong solution, and the physical parameters ρ ± , μ ± , and σ .
We have everything in place to to prove the main result of this work.
Proof of Theorem 1. The proof of Theorem 1 is based on the post-processed relative entropy inequality of Proposition 34. It amounts to nothing but a more technical version of the upper bound valid for all solutions of the differential inequality d dt E(t) C E(t)| log E(t)|. However, it is made more technical by the more complex right-hand side (34) in the relative entropy inequality (which involves the anticipated upper bound e(t) 2 ) and the smallness assumption on the relative entropy E[χ u , u, V |χ v , v](t) needed for the validity of the relative entropy inequality.
We start the proof with the precise choice of the function e(t) as well as the necessary smallness assumptions on the initial relative entropy. We then want to exploit the post-processed form of the relative entropy inequality from Proposition 34 to compare E[χ u , u, V |χ v , v](t) with e(t).
Let C > 0 be the constant from Proposition 34 and choose δ > 0 such that δ < 1 6(C+1) . Let ε > 0 (to be chosen in a moment, but finally we will let ε → 0) and consider the strictly increasing function Note that For technical reasons to be seen later, we will also require c > 0 be small enough such that whenever E[χ u , u, V |χ v , v](0) < c and ε < c. We proceed with some further computations. We start with d dt

|e(t). (180)
This, in particular, entails After these preliminary considerations, let us consider the relative entropy inequality from Proposition 10. Arguing similarly to the derivation of the relative entropy inequality in Proposition 10 but using the energy dissipation inequality in Arguing by contradiction, we assume T = ∅ and define Since E[χ u , u, V |χ v , v](0) < e 2 (0) and e 2 is strictly increasing, we deduce by the same argument which established (182) that T * > 0. Hence, we can apply Proposition 34 at least for times T < T * (with τ = 0). However, by the same argument as before the relative entropy inequality from Proposition 10 shows that (T ) may be bounded by means of the post-processed relative entropy inequality. Hence, we obtain using also (177) and (180) We compare this to the equation (181) for e 2 (t) (with τ = 0 and T = T * ). Recall that e 2 (0) strictly dominates the relative entropy at the initial time. Because of (179), the second term on the right hand side of (183) is dominated by one third of the right hand side of (181). Because of (178) and the choice δ < 1 6(C+1) the same is true for the other two terms on the right hand side of (183). In particular, we obtain, using (182) as well, that lim sup which contradicts the definition of T * . This concludes the proof since the asserted stability estimate as well as the weak-strong uniqueness principle is now a consequence of letting ε → 0.

Derivation of the Relative Entropy Inequality
Proof of Proposition 10. We start with the following observation. Since the phasedependent density ρ(χ v ) depends linearly on the indicator function χ v of the volume occupied by the first fluid, it consequently satisfies for almost every T ∈ [0, T strong ) and all ϕ ∈ C ∞ cpt (R d × [0, T strong )). By approximation, the equation holds for all ϕ ∈ W 1,∞ (R d × [0, T strong )). Testing this is a smooth vector field, we then obtain for almost every T ∈ [0, T strong ). Note that the velocity field v of a strong solution has the required regularity to justify the preceding step. Next, we subtract from (185) the equation for the momentum balance (11a) of the strong solution evaluated with a test function η ∈ C ∞ cpt (R d × [0, T strong ); R d ) such that ∇ · η = 0. This shows that the velocity field v of the strong solution satisfies which holds for almost every T ∈ [0, T strong ) and all η ∈ C ∞ cpt (R d × [0, T strong ); R d ) such that ∇ · η = 0. The aim is now to test the latter equation with the field u − v − w. To this end, we fix a radial mollifier φ : R d → [0, ∞) such that φ is smooth, supported in the unit ball and R d φ dx = 1. For n ∈ N we define φ n (·) := n d φ(n ·) as well as u n := φ n * u and analogously v n and w n . We then test (186) with the test function u n − v n − w n and let n → ∞. Since the traces of u n , v n and w n on I v (t) converge pointwise almost everywhere to the respective traces of u, v and w, we indeed may pass to the limit in the surface tension term of (186). Hence, we obtain the identity which holds true for almost every T ∈ [0, T strong ).
In the next step, we test the analogue of (184) for the phase-dependent density ρ(χ u ) of the varifold solution with the test function 1 2 |v + w| 2 and obtain R d for almost every T ∈ [0, T strong ). Recall also from the definition of a varifold solution that we are equipped with the energy dissipation inequality which holds for almost every T ∈ [0, T strong ). Finally, we want to test the equation for the momentum balance (6a) of the varifold solution with the test function v + w. Since the normal derivative of the tangential velocity of a strong solution may feature a discontinuity at the interface, we have to proceed by an approximation argument, that is, we use the mollified version v n + w n as a test function. Note that v n resp. w n are elements of L ∞ ([0, T strong ); C 0 (R d )). Hence, we may indeed use v n + w n as a test function in the surface tension term of the equation for the momentum balance (6a) of the varifold solution. However, it is not clear a priori why one may pass to the limit n → ∞ in this term.
To argue that this is actually possible, we choose a precise representative for ∇v resp. ∇w on the interface I v (t). This is indeed necessary also for the velocity field of the strong solution since the normal derivative of the tangential component of v may feature a jump discontinuity at the interface. However, by the regularity assumptions on v, see Definition 6 of a strong solution, and the assumptions on the compensating vector field w, for almost every t ∈ [0, T strong ) every point x ∈ R d is either a Lebesgue point of ∇v (respectively ∇w) or there exist two half spaces H 1 and H 2 passing through x such that x is a Lebesgue point for both ∇v| H 1 and ∇v| H 2 (respectively ∇w| H 1 and ∇w| H 2 ). In particular, by the L ∞ bounds on ∇v and ∇w the limit of the mollifications ∇v n respectively ∇w n exist at every point x ∈ R d and we may define ∇v respectively ∇w at every point x ∈ R d as this limit.
Recall then that we have chosen the mollifiers φ n to be radially symmetric. Hence, the approximating sequences ∇v n resp. ∇w n converge pointwise everywhere to the precise representation as chosen before. Since both limits are bounded, we may pass to the limit n → ∞ in every term appearing from testing the equation for the momentum balance (6a) of the varifold solution with the test function v n + w n . This entails for almost every T ∈ [0, T strong ). The next step consists of summing (187), (188), (189) and (190). We represent this sum as follows:

L H S kin (T ) + L H S visc + L H S sur En (T )
where each individual term is obtained in the following way. The terms related to kinetic energy at time T on the left hand side of (188), (189) and (190) in total yield the contribution The same computation may be carried out for the initial kinetic energy terms Note that because of (8), it holds that The terms in the energy dissipation inequality related to surface energy are therefore given by as well as Moreover, collecting all advection terms on the right hand side of (187), (188), and (190), as well as adding zero, gives the contribution Next, we may rewrite those terms on the right hand side of (187), (188), and (190) which contain a time derivative as follows: Furthermore, the terms related to surface tension on the right hand side of (187) and (190) are given by (Id−s ⊗ s) : ∇v dV t (x, s) dt function does not contribute to the divergence on the interface), that the latter quantity equals the scalar mean curvature (recall that n v = ∇χ v |∇χ v | points inward) as well as once more the incompressibility of the velocity fields v resp. u we may also rewrite The preceding five identities together then imply that Following the computation which led to (199) we also obtain the identity Using the fact that w is divergence-free, we may also rewrite In particular, as an intermediate summary we obtain the following bound making already use of the notation of Proposition 10: taking the bound (191) together with the identities from (192) to (197) as well as (200) and (201) yields The aim of the next step is to use σ (∇ · ξ) (see Definition 13) as a test function in the transport equation (6b) for the indicator function χ u of the varifold solution.
For the sake of brevity, we will write again n u = ∇χ u |∇χ u | . Plugging in σ (∇ · ξ) and integrating by parts yields for almost every T ∈ [0, T strong ). Making use of the evolution equation (33) for ξ and the fact that ξ is supported in the space-time domain {dist(x, I v (t)) < r c }, we get, by adding zero, that which holds for almost every T ∈ [0, T strong ). Next, we study the quantity Due to the regularity of v resp. ξ as well as the incompressibility of the velocity field v we get Remark 35. Let us comment on the minor changes that occur in the proof of Proposition 10 when allowing for a bulk force ρ(χ) f such as gravity in Definition 2 of a varifold solution (resp. Definition 6 of a strong solution), where In this case, the right hand side of the equation for the momentum balance (6a) for the varifold solution (χ u , u, V ) has to be amended by the term whereas the right hand side of (11a) for the strong solution (χ v , v) in addition includes Moreover, the energy dissipation inequality (6c) of the varifold solution (χ u , u, V ) now also features on the right hand side the term Hence, as a consequence of including a bulk force it is clear that an additional term R H S bulk Force has to appear in the inequality (191), and therefore also in the relative entropy inequality of Proposition 10. We derive the term R H S bulk Force by a quick review of the changes to be made for the argument from (185) to (190) which are the basis for (191).
First, the identity (186) was obtained from (185) (which itself remains unchanged) by subtracting the equation for the momentum balance of the strong solution. Due to the additional term (212), this means that we pick up in (187) after essentially testing with η = u − v − w an extra term of the form Second, we note that (188) remains unchanged under the inclusion of bulk forces, whereas (189) now includes on the right hand side the term (213) as it is merely a reminder of the energy dissipation inequality for the varifold solution. Third, since (190) arose essentially from testing (6a) with η = v + w and multiplying the resulting identity by −1, we pick up in (190) due to (211) the additional term Finally, as (191) was obtained by summing (187), (188), (189) and (190), it thus follows from (213), (214) and (215) that

R H S bulk Force
Since the whole argument after the derivation of (191) is unaffected from the inclusion of bulk forces, we deduce that the only additional term appearing in the relative entropy inequality from Proposition 10 is given by (216). Because of the simple computation ρ(χ u )−ρ(χ v ) = (ρ + −ρ − )(χ u −χ v ) and f ∈ L ∞ (R d ×[0, T strong ); R d ) it follows from an application of Lemma 20 and an absorption argument that the weak-strong uniqueness principle as well as the stability estimate of Theorem 1 are still valid.
We proceed with a remark on the existence of strong solutions in the precise functional framework of Definition 6 based on the assumption of the higher order regularity and compatibility conditions for the initial data (10a)-(12c).
Remark 37. We start by making precise what one can infer from the existing literature about the existence of strong solutions to the two-phase Navier-Stokes problem with surface tension. Note that all what is said until (220) also holds true if one considers gravity, see for instance the works of Prüss and Simonett [82], [84] and [85, p. 581]. The remaining claims hold true after a suitable adaptation of the higher order compatibility conditions for the initial data, see the end of Remark Furthermore, it entails the existence of a pressure field p with regularity ∇ p ∈ L q ([0, T strong ]; L q (R d )) as well as the existence of a family of smoothly evolving sets ( + t ) t∈[0,T strong ] with smoothly evolving surfaces (I v (t)) t∈[0,T strong ] with indicator function χ in the sense of Definition 5. More precisely, the diffeomorphisms in Definition 5 inherit the regularity of the height function h constructed in [73,Theorem 2] and are thus, for the time being, short of one degree of spatial regularity to what is called for in Definition 5. Moreover, it is proved in [73] that in the time interval (0, T strong ] the interface is actually real analytic and that the velocity field v and the pressure p are real analytic as well; at least for positive times and away from the interface. Hence, the triple (v, p, χ) is for positive times a classical solution to the free boundary problem for the incompressible Navier-Stokes equation for two fluids (1a)-(1c). Since (217) also entails that it remains to establish the estimate (11c) for spatial derivatives of order k ∈ {2, 3}, for the time derivative ∂ t v, and the mixed derivative ∂ t ∇v, as well as that the diffeomorphisms from Definition 5 have one additional order of spatial regularity. For this, one relies on the higher order regularity and compatibility conditions for the initial data as given in Definition 6. Let us sketch how this works. The argument uses the transformed formulation of the problem, see [73, (2.2)], stating it on a fixed domain R d \ with a real analytic reference interface . At least for short times, the evolving interface I v (t) is then described by means of the graph of a height function h over this reference surface . Moreover, the evolving domains occupied by the two fluids are described by means of the associated Hanzawa transform (see [73, p. where we abbreviated with ∇ tan the surface gradient on . It is crucial that the nonlinearities on the right hand side are at least quadratic, and that each term which is of the same order as the principal linear part on the left hand side comes with a factor of h or its derivative. Let us denote in the following byv ± resp.p ± the transformed velocity fields resp. the transformed pressures of the two fluids in their respective domains ± h . All regularity properties stated before hold naturally for the transformed data (v,p, h). Moreover, regularity up to the interface is established in [85,Section 9.4] in the sense that we have, for the respective one-sided traces, v ± ∈ C ∞ ( × (0, T strong ]; R d ), p ± ∈ C ∞ ( × (0, T strong ]).
Let us only focus on how to establish the estimate (11c) in the vicinity of the interface; in the bulk one may proceed more directly without having to distinguish between tangential and normal directions. The first step then consists of taking the derivative with respect to a tangential vector field t in the transformed problem (221); this shows that the tangential derivatives (t · ∇)v, (t · ∇)p, and (t · ∇)h satisfy a system analogous to (221). Recalling that we have assumed the higher regularity conditions (12a), we conclude that the theory of [73] applies to the tangential derivatives, yielding the regularity (t · ∇)v ∈ L q ([0, By transforming back to the original variables, we deduce a corresponding estimate for ∇((Id−n I v ⊗n I v )∇)v ± . Differentiating the constraint ∇ ·v ± = 0 in the bulk and using Schwarz's theorem to change the order of differentiation (which is admissible by the smoothness of the velocity in the bulk), one infers that actually all components of the second derivative ∇ 2 v ± except for the normal-normal second derivative of the tangential velocity satisfy an analogous bound to (225). To establish the regularity for the last missing component, the idea is to extract from the Laplacian the normalnormal second derivative and to use the equation for v ± . For this, however, we first need to establish regularity for the time derivative ∂ t v ± . This is basically done by differentiating the transformed problem (221) in time, from which one derives an analogous problem for the time derivatives ∂ tv , ∂p, and ∂ t h of the transformed velocityv, pressurep, and height h. Arguing as before and using the compatibility conditions (12c)-(12d), we infer that and thus that also ∂ t v satisfies a corresponding estimate, since we already know that ∂ t h is continuous up to the initial time t = 0. From this one may then infer that (11c) holds true for k = 2 by using the equation for v ± as already explained before.
Up until now, we only know that h ∈ C([0, T strong ]; C 2 ( )) ∩ C 1 ([0, T strong ]; C 1 ( )) such that sup (x,t)∈ ×[0,T strong ] |∇ tan ∇ tan h(x, t)| + |∂ t ∇ tan h(x, t)| < ∞. However, taking tangential derivatives of the equation for h in the transformed problem (221) together with the one order higher regularity for the velocity field shows that h ∈ C([0, T strong ]; C 3 ( )) ∩ C 1 ([0, T strong ]; C 2 ( )) with a corresponding bound for the highest derivatives. In particular, the diffeomorphisms h share the same properties from which we conclude that the diffeomorphisms from Definition 5 satisfy the required regularity and bounds. Finally, one may follow the above argument to verify that (11c) also holds true for k = 3 and the mixed derivative ∂ t ∇v. To this end, one relies on the higher regularity condition (12a) as well as the higher compatibility conditions (12d)-(12e). First, one differentiates the equation for the tangential derivatives of v ± another time in the tangential direction to obtain boundedness of the gradient of the tangential-tangential second derivatives of the velocity fields v ± . Differentiating the constraint ∇ · v ± = 0 in the bulk twice, using Schwarz's theorem to change the order of differentiation, and differentiating in time the equation for ∇v (leading again to a similar system) yields the bound for ∂ t ∇v ± and all third spatial derivatives of v ± except for the normal-normal-normal third derivative. For this, one differentiates in the bulk the equation for v ± in normal direction concluding that the missing third derivative can be expressed by terms which are already controlled. This concludes the remark on the existence of strong solutions in the precise functional framework of Definition 6.
We rely several times in this work on the following standard result for singular integral operators of convolution type: We also state a non-trivial result from geometric measure theory on properties of one-dimensional sections of Caccioppoli sets.
Theorem 39. ([33, Theorem G]) Consider a set G of finite perimeter in R d , denote by ν G = (ν G x 1 , . . . , ν G x d−1 , ν G y ) ∈ R d the associated measure theoretic inner unit normal vector field of the reduced boundary ∂ * G, and let χ * G be the precise representative of the bounded variation function χ G . Then for Lebesgue almost every x ∈ R d−1 the one-dimensional sections G x := {y ∈ R : (x, y) ∈ G} satisfy the following properties: i) G x is a set of finite perimeter in R, χ G (x, ·) = χ * G (x, ·) Lebesgue almost everywhere in G x , ii) (∂ * G) x = ∂ * G x , iii) ν G y (x, y) = 0 for all y ∈ R such that (x, y) ∈ ∂ * G, and iv) lim y→y + To bound the L 4 -norm of the interface error heights h ± in the case of a twodimensional interface, we employ the following optimal Orlicz-Sobolev embedding:  Then for any weakly differentiable function u decaying to 0 at infinity in the sense {|u(x)| > s} < ∞ for all s > 0, the following estimate holds true: The application of the optimal Orlicz-Sobolev embedding to our setting is stated and proved next. We also set A e(t) (Du(t)) := I (t) A e(t) (|∇u(t)|) dS + |D s u(t)|(I (t)). Then the following estimate holds true: × e(t) 4 + 1 e(t) 2 u(t) 6 L 2 (I (t)) +A 3 e(t) (Du(t)) + u(t) 4 L 2 (I (t)) + A 2 e(t) (Du(t)) , for almost every t ∈ [0, T ] and a constant C > 0.
Proof. Let U ⊂ R 2 be an open and bounded set and consider u ∈ C 1 cpt (U ) such that u L ∞ 1. For the sake of brevity, let us suppress for the moment the dependence on the variable t ∈ [0, T ). The idea is to apply the optimal Orlicz-Sobolev embedding provided by the preceding theorem with respect to the convex function A e . Observe first that A e indeed satisfies all the assumptions. Moreover, since d = 2, we compute for r e, 1 + log r e for e r 1, 1 + log 1 e + r −1 2 + 1 4 log(2r − 1) for r 1.
As a consequence, we get We then deduce from Theorem 40, d = 2, u L ∞ 1, the bound exp(s 2 ) which is precisely what is claimed. Note that since u is continuously differentiable, the singular part in the definition of A e (Du) vanishes. In a next step, we want to extend to smooth functions u on the manifold I (t).
By assumption, we may cover I (t) with a finite family of open sets of the form U (x i ) := I (t) ∩ B 2r c (x i ), x i ∈ I (t), such that U (x i ) can be represented as the graph of a function g : B 1 (0) ⊂ R 2 → R with |∇g| 1 and |∇ 2 g| r −1 c . We fix a partition of unity {ϕ i } i subordinate to this finite cover of I (t). Note that |∇ϕ i | Cr −1 c . Note also that the cardinality of the open cover is uniformly bounded in t. Hence, we proceed with deriving the desired bound only for one uϕ, where ϕ = ϕ i is supported in U = U (x i ). Abbreviatingũ = u • g andφ = ϕ • g, we obtain from the previous step Using the bounds A e (t +t) C A e (t) + C A e (t) and A e (λt) C(λ + λ 2 )A e (t), which hold for all λ > 0 and all t,t 0, as well as the product and chain rule we compute This yields the claim in the case of a smooth function u : I (t) → R.
Since the analogous bound to (81) holds true, we infer u n L ∞ 1 as well as u n − u L 1 (I (t)) → 0 as n → ∞. In particular, we have pointwise almost everywhere convergence at least for a subsequence. This in turn implies by Lebesgue's dominated convergence theorem that u n − u L 4 (I (t)) → 0 as n → ∞ at least for a subsequence. Moreover, the exact same computation which led to (80) shows that .
Integrating this bound over the manifold and then using Fubini shows that

A e(t) (Du n (t)) Cr −2 c A e(t) (Du(t))
holds true uniformly over all n ∈ N. By applying the bound (230) from the second step, we may conclude the proof.
Proof. Fix t > 0. First, observe that I (t) essentially consists of a finite number of nonintersecting curves. By approximation, we may assume u(t) ∈ W 1,1 (I (t)). Let η i be a partition of unity on I (t) with |∇ tan η i (x)| Cr −1 c such that the support of each η i is isometrically equivalent to a bounded interval (note that the Definition 5 implies a lower bound of cr c for the length of any connected component of I (t)) and such that at any point x ∈ I (t) there are at most two i with η i (x) > 0. Treating by abuse of notation the function η i u as if defined on a real interval I = (a, b), we then write From this we infer the desired estimate by approximation.