Weak-strong uniqueness for the Navier-Stokes equation for two fluids with surface tension

In the present work, we consider the evolution of two fluids separated by a sharp interface in the presence of surface tension - like, for example, the evolution of oil bubbles in water. Our main result is a weak-strong uniqueness principle for the corresponding free boundary problem for the incompressible Navier-Stokes equation: As long as a strong solution exists, any varifold solution must coincide with it. In particular, in the absence of physical singularities the concept of varifold solutions - whose global in time existence has been shown by Abels [2] for general initial data - does not introduce a mechanism for non-uniqueness. The key ingredient of our approach is the construction of a relative entropy functional capable of controlling the interface error. If the viscosities of the two fluids do not coincide, even for classical (strong) solutions the gradient of the velocity field becomes discontinuous at the interface, introducing the need for a careful additional adaption of the relative entropy.


Introduction
In evolution equations for interfaces, topological changes and geometric singularities often occur naturally, one basic example being the pinchoff of liquid droplets (see Figure 1). As a consequence, strong solution concepts for such PDEs are naturally limited to short-time existence results or particular initial configurations like perturbations of a steady state. At the same time, the transition from strong to weak solution concepts for PDEs is prone to incurring unphysical non-uniqueness of solutions: For example, Brakke's concept of varifold solutions for mean curvature flow admits sudden vanishing of the evolving surface at any time [19]; for the Euler equation, even for vanishing initial data there exist nonvanishing solutions with compact support [81], and the notion of mild solutions to the Navier-Stokes equation allows any smooth flow to transition into any other smooth flow [23]. In the context of fluid mechanics, the concept of relative entropies has proven successful in ruling out the aforementioned examples of non-uniqueness: Energy-dissipating weak solutions e. g. to the incompressible Navier-Stokes equation are subject to a weak-strong uniqueness principle [68,73,85], which states that as long as a strong solution exists, any weak solution satisfying the precise form of the energy dissipation inequality must coincide with it. However, in the context of evolution equations for interfaces, to the best of our knowledge the concept of relative entropies has not been applied successfully so far to obtain weak-strong uniqueness results.
In the present work, we are concerned with the most basic model for the evolution of two fluids separated by a sharp interface (like, for instance, the evolution of oil  For this free boundary problem for the flow of two immiscible incompressible fluids with surface tension, Abels [2] has established the global existence of varifold solutions for quite general initial data.
The main result of the present work is a weak-strong uniqueness result for this free boundary problem for the Navier-Stokes equation for two fluids with surface tension: In Theorem 1 below we prove that as long as a strong solution to this evolution problem exists, any varifold solution in the sense of Abels [4] must coincide with it.
1.1. Free boundary problems for the Navier-Stokes equation. The free boundary problem for the Navier-Stokes equation has been studied in mathematical fluid mechanics for several decades. Physically, it describes the evolution of a viscous incompressible fluid surrounded by or bordering on vacuum. The (local-intime) existence of strong solutions for the free boundary problem for the Navier-Stokes equation has been proven by Solonnikov [87,88,89] in the presence of surface tension and by Shibata and Shimizu [86] in the absence of surface tension; see also Beale [17,18], Abels [1], and Coutand and Shkoller [34] for related or further results. While the existence theory for global weak solutions for the Navier-Stokes equation in a fixed domain like R d , d ≤ 3, has been developed starting with the seminal work of Leray [68] in 1934, the question of the global existence of any kind of solution to the free boundary problem for the Navier-Stokes equation has remained an open problem. An important challenge for a global existence theory of weak solutions to the free boundary problem for the Navier-Stokes equation is the possible formation of "splash singularities", which are smooth solutions to the Lagrangian formulation of the equations which develop self-interpenetration. Such solutions have been constructed by Castro, Cordoba, Fefferman, Gancedo, and Gomez-Serrano [27], see also [26,36,47] for splash singularities in related models in fluid mechanics.
In the present work we consider a closely related problem, namely the flow of two incompressible and immiscible fluids with surface tension at the fluid-fluid interface, like for example the flow of oil bubbles immersed in water or vice versa. For this free boundary problem for the Navier-Stokes equation for two fluids -described by the system of PDEs (1) below -, a global existence theory for generalized solutions is in fact available: In a rather recent work, Abels [2] has constructed varifold solutions which exist globally in time. In an earlier work, Plotnikov [72] had treated the case of non-Newtonian (shear-thickening) fluids. The local-in-time existence of strong solutions has been established by Denisova [43]; for an interface close to the half-space, an existence and instant analyticity result has been derived by Prüss and Simonett [75,76]. Existence results for the two-phase Stokes and Navier-Stokes equation in the absence of surface tension have been established by Giga and Takahashi [57] and Nouri and Poupaud [71]. Note that in contrast to the case of a single fluid in vacuum, for the flow of two incompressible immiscible inviscid fluids splash singularities cannot occur as shown by Fefferman, Ionescu, and Lie [50] and Coutand and Shkoller [35]; one would expect a similar result to hold for viscous fluids. However, solutions may be subject to the Rayleigh-Taylor instability as proven by Prüss and Simonett [74].
In terms of a PDE formulation, the flow of two immiscible incompressible fluids with surface tension may be described by the indicator function χ = χ(x, t) of the volume occupied by the first fluid, the local fluid velocity v = v(x, t), and the local pressure p = p(x, t). The fluid-fluid interface moves just according to the fluid velocity, the evolution of the velocity of each fluid and the pressure are determined by the Navier-Stokes equation, and the fluid-fluid interface exerts a surface tension force on the fluids proportional to the mean curvature of the interface. Together with the natural no-slip boundary condition and the appropriate boundary conditions for the stress tensor on the fluid-fluid interface, one may assimilate the Navier-Stokes equations for the two fluids into a single one, resulting in the system of equations where H denotes the mean curvature vector of the interface ∂{χ = 0} and |∇χ| denotes the surface measure H d−1 | ∂{χ=0} . Here, µ(0) and µ(1) are the shear viscosities of the two fluids and ρ(0) and ρ(1) are the densities of the two fluids. The constant σ is the surface tension coefficient. The total energy of the system is given by the sum of kinetic and surface tension energies It is at least formally subject to the energy dissipation inequality Note that the concept of varifold solutions requires a slight adjustment of the definition of the energy: The surface area´R d 1 d|∇χ| is replaced by the corresponding quantity of the varifold, namely its mass. A widespread numerical approximation method for the free boundary problem (1a)-(1c) capable of capturing geometric singularities and topological changes in the fluid phases are phase-field models of Navier-Stokes-Cahn-Hilliard type or Navier-Stokes-Allen-Cahn type, see for example the review [12], [6,69,59,61,70] for modeling aspects, [3,5] for the existence analysis of the corresponding PDE systems, and [7,8,9] for results on the sharp-interface limit.

Weak solution concepts in fluid mechanics and (non-)uniqueness.
In the case of the free boundary problem for the Navier-Stokes equation -both for a single fluid and for a fluid-fluid interface -, a concept of weak solutions is expected to play an even more central role in the mathematical theory than in the case of the standard Navier-Stokes equation: In three spatial dimensions d = 3, even for smooth initial interfaces topological changes may occur naturally in finite time, for example by asymptotically self-similar pinchoff of bubbles [46] (see Figure 1). In contrast, for the incompressible Navier-Stokes equation without free boundary the global existence of strong solutions for any sufficiently regular initial data remains a possibility.
However, in general weakening the solution concept for a PDE may lead to artificial (unphysical) non-uniqueness, even in the absence of physically expected singularities. A particularly striking instance of this phenomenon is the recent example of non-uniqueness of mild (distributional) solutions to the Navier-Stokes equation by Buckmaster and Vicol [25] and Buckmaster, Colombo, and Vicol [23]: In the framework of mild solutions to the Navier-Stokes equation, any smooth flow may transition into any other smooth flow [23]. The result of [23,25] are based on convex integration techniques for the Euler equation, which have been developed starting with the works of De Lellis and Székelyhidi [40,41] (see also [22,24,39,63]).
In contrast to the case of distributional or mild solutions, for the stronger notion of weak solutions to the Navier-Stokes equation with energy dissipation in the sense of Leray [68] a weak-strong uniqueness theorem is available: As long as a strong solution to the Navier-Stokes equation exists, any weak solution with energy dissipation must coincide with it. Recall that for a weak solution to the Navier-Stokes equation v, besides the Ladyzhenskaya-Prodi-Serrin regularity criterion v ∈ L p ([0, T ]; L q (R 3 ; R 3 )) with 2 p + 3 q ≤ 1 and p ≥ 2 [73,84], both a lower bound on the pressure [82] and a geometric assumption on the vorticity [33] are known to imply smoothness of v. Interestingly, weak-strong uniqueness of energydissipating solutions fails if the Laplacian in the Navier-Stokes equation is replaced by a fractional Laplacian −(−∆) α with power α < 1 3 , see Colombo, De Lellis, and De Rosa [32] and De Rosa [78].
Another way of interpreting a weak-strong uniqueness result is that nonuniqueness of weak solutions may only arise as a consequence of physical singularities: Only when the unique strong solution develops a singularity, the continuation of solutions beyond the singularity -by means of the weak solution concept -may be nonunique. The main theorem of our present work provides a corresponding result for the flow of two incompressible immiscible fluids with surface tension: Varifold solutions to the free boundary problem for the Navier-Stokes equation for two fluids are unique until the strong solution for the free boundary problem develops a singularity.
1.3. (Non)-Uniquenesss in interface evolution problems. Weak solution concepts for the evolution of interfaces are often subject to nonuniqueness, even in the absence of topology changes. For example, Brakke's concept of varifold solutions for the evolution of surfaces by mean curvature [19] suffers from a particularly drastic failure of uniqueness: The interface is allowed to suddenly vanish at an arbitrary time. In the context of viscosity solutions to the level-set formulation of two-phase mean curvature flow, the formation of geometric singularities may still cause fattening of level-sets [16] and thereby nonuniqueness of the mean-curvature evolution, even for smooth initial surfaces [13].
To the best of our knowledge, the only known uniqueness result for weak or varifold solutions for an evolution problem for interfaces is a consequence of the relation between Brakke solutions and viscosity solutions for two-phase mean-curvature flow, see Ilmanen [62]: As long as a smooth solution to the level-set formulation exists, the support of any Brakke solution must be contained in the corresponding levelset of the viscosity solution. As a consequence, as long as a smooth evolution of the interface by mean curvature exists, the "maximal" unit-density Brakke solution corresponds to the smoothly evolving interface. The proof of this inclusion relies on the properties of the distance function to a surface undergoing evolution by mean curvature respectively the comparison principle for mean curvature flow. Both of these properties do not generalize to other interface evolution equations.
Besides Ilmanen's varifold comparison principle, the only uniqueness results in the context of weak solutions to evolution problems for lower-dimensional objects that we are aware of are a weak-strong uniqueness principle for the highercodimension mean curvature flow by Ambrosio and Soner [11] and a weak-strong uniqueness principle for binormal curvature motion of curves in R 3 by Jerrard and Smets [65]. The interface contribution in our relative entropy (11) may be regarded as the analogue for surfaces of the relative entropy for curves introduced in [65]. 1.4. The concept of relative entropies. The concept of relative entropies in continuum mechanics has been introduced by Dafermos [37,38] and DiPerna [44] in the study of the uniqueness properties of systems of conservation laws. Proving weak-strong uniqueness results for conservation laws or even the incompressible Navier-Stokes equation typically faces the problem that an error between a weak solution u and a strong solution v must be measured by a quantity E[u|v] which is nonlinear even as a function of u alone, like a norm ||u − v||. To evaluate the time evolution d dt E[u|v] of such a quantity, one would need to test the evolution equation for u by the nonlinear function D u E[u|v], which is often not possible due to the limited regularity of u. The concept of relative entropies overcomes this issue if the physical system possesses a strictly convex entropy ( [u|v] for the relative entropy and thereby a weak-strong uniqueness result.
Since Dafermos [37], the concept of relative entropies has found many applications in the analysis of continuum mechanics, providing weak-strong uniqueness results for the compressible Navier-Stokes equation [51,55], the Navier-Stokes-Fourier system [52], fluid-structure interaction problems [28], renormalized solutions for dissipative reaction-diffusion systems [29,54], as well as weak-strong uniqueness results for measure-valued solutions for the Euler equation [21], compressible fluid models [60], wave equations in nonlinear elastodynamics [42], and models for liquid crystals [48], to name just a few.
The concept of relative entropies has also been employed in the justification of singular limits of PDEs, see for example the work of Yau [90] on the hydrodynamic limit of the Ginzburg-Landau lattice model, the works of Bardos, Golse, and Levermore [15], Saint-Raymond [79,80], and Golse and Saint-Raymond [58] on the derivation of the Euler equation and the incompressible Navier-Stokes equation from the Boltzmann equation, the work of Brenier [20] on the Euler limit of the Vlasov-Poissson equation, and the works of Serfaty [83] and Duerinckx [45] on mean-field limits of interacting particles. In the context of numerical analysis, it may also be used to derive a posteriori estimates for model simplification errors [53,56].
Jerrard and Smets [65] have used a relative entropy ansatz to establish a weakstrong uniqueness principle for the evolution of curves in R 3 by binormal curvature flow. Their relative entropy may be regarded as the analogue for curves of the interfacial energy contribution to our relative entropy (i. e. the terms σ´R d 1−ξ(·, T )· ∇χu(·,T ) |∇χu(·,T )| d|∇χ u (·, T )| + σ´R d 1 − θ T d|V T | S d−1 in (11) below). It has subsequently been used by Jerrard and Seis [64] to prove that the evolution of solutions to the Euler equation with near-vortex-filament initial data is governed by binormal curvature flow, as long as a strong solution to the latter (without self-intersections) exists and as long as the vorticity remains concentrated along some curve.
One of the key challenges in the derivation of our result is the development of a notion of relative entropy which provides strong enough control of the interfacial error. The key idea to control the error between an interface ∂{χ u (·, t) = 1} and a smoothly evolving interface I v (t) = ∂{χ v (·, t) = 1} by a relative entropy is to introduce a vector field ξ which is an extension of the unit normal of I v (t), multiplied with a cutoff. The interfacial contribution σ´R d 1 − ξ(·, T ) · ∇χu(·,T ) |∇χu(·,T )| d|∇χ u (·, T )| to the relative entropy then controls the interface error in a sufficiently strong way, see Section 3 for details.
However, in the case of different viscosities µ + = µ − of the two fluids, the velocity gradient of the strong solution ∇v at the interface will be discontinuous. This necessitates an additional adaption of our relative entropy: If one were to directly compare the velocity fields u and v of two solutions by the relative entropy, the difference of the viscous stresses µ(χ u )D sym u − µ(χ v )D sym v could not be estimated appropriately to derive a Gronwall-type estimate. We rather have to compare the velocity field u to an adapted velocity field v + w, where w is constructed in a way that the adapted velocity gradient ∇v + ∇w approximately accounts for the shifted location of the interface.
The approximate adaption of the interface of the strong solution to the higherorder approximation for the interface is distantly reminiscent of an ansatz by Leger and Vasseur [67] and Kang, Vasseur, and Wang [66], who establish L 2 contractions up to a shift for solutions to conservation laws close to a shock profile. However, it differs both in purpose and in the actual construction from [66,67]: The interfacial shift in [66,67] essentially serves the purpose of compensating the difference in the propagation speed of the shocks of the two solutions, while we need the higherorder approximation of the interface to compensate for the discontinuity in the velocity gradient at the interface. While the interfacial shift in [66] is given as the solution to an appropriately defined time-dependent PDE, in our case we obtain the interfacial shift by applying at any fixed time a suitable regularization operator to the interface of the weak solution near the interface of the strong solution.
1.5. Derivation of the model. Let us briefly comment on the derivation of the system of equations (1). We consider the flow of two viscous, immiscible, and incompressible fluids. Each fluid occupies a domain Ω + t resp. Ω − t , t ≥ 0, and the interface separating both phases will be denoted by I(t). In particular, R d = Ω + t ∪Ω − t ∪I(t) for every t ≥ 0. Within each of these domains Ω ± t , the evolution of the fluid velocity is modeled by means of the incompressible Navier-Stokes equations for a Newtonian fluid where v + t : Ω + t → R d and v − t : Ω − t → R d denote the velocity fields of the two fluids, p + t : Ω + t → R and p − t : Ω − t → R the pressure, ρ + , ρ − > 0 the densities of the two fluids, and µ + and µ − the shear viscosities. On the interface of the two fluids I(t) a no-slip boundary condition v + t = v − t is imposed. As the two velocities v + t and v − t are defined on complementary domains and coincide on the interface, this enables us to assimilate the two velocity fields into a single velocity . Note that the velocity field v inherits the incompressibility (1c) from the incompressibility of v + and v − (2b). We also assimilate the pressures p + t and p − t into a single pressure p, which however may be discontinuous across the interface.
Additionally, we assume that the evolution of the interface I(t) occurs only as a result of the transport of the two fluids along the flow. Denoting by n the outward unit normal vector field of the interface I(t) and by V n the associated normal speed of the interface, this gives rise to the equation This condition may equivalently be rewritten as the transport equation for the indicator function χ of the first fluid phase ∂ t χ + (v · ∇)χ = 0, see for example Remark 8 below for the (standard) arguments.
In order to assimilate the equations (2a) for the velocities v ± of the two fluids into the single equation (1b), a condition on the jump of the normal component of the stress tensor T = µ ± (∇v + ∇v T ) − ∇p Id at the interface I(t) is required. In the case of positive surface tension constant σ > 0 at the interface, the balance of forces at the interface reads where the right-hand side σH accounts for the surface tension force. Here, H denotes the mean curvature vector of the interface and [[f ]] denotes the jump in normal direction of a quantity f . In combination with (2a) and the no-slip boundary condition v + = v − on I(t), this yields the equation for the momentum balance (1b).

Main results
The main result of the present work is the derivation of a weak-strong uniqueness principle for varifold solutions to the free boundary problem for the Navier-Stokes equation for two immiscible incompressible fluids with surface tension: As long as a strong solution to the free boundary problem (1a)-(1c) exists, any varifold solution must coincide with it. In particular, the concept of varifold solutions developed by Abels [2] (see Definition 2 below for a precise definition) does not introduce an additional mechanism for non-uniqueness, at least as long as a classical solution exists. At the same time, the concept of varifold solutions of Abels allows for the construction of globally existing solutions [2], while any concept of strong solutions is limited to the absence of geometric singularities and therefore -at least in three spatial dimensions d = 3 -to short-time existence results.
Furthermore, we prove a quantitative stability result (5) for varifold solutions with respect to changes in the data: As long as a classical solution exists, any varifold solution with slightly perturbed initial data remains close to it.
Theorem 1 (Weak-strong uniqueness principle). Let d ∈ {2, 3}. Let (χ u , u, V ) be a varifold solution to the free boundary problem for the incompressible Navier-Stokes equation for two fluids (1a)-(1c) in the sense of Definition 2 on some time interval [0, T vari ). Let (χ v , v) be a strong solution to (1a)-(1c) in the sense of Definition 6 on some time interval [0, T strong ) with T strong ≤ T vari . Let the relative entropy E[χ u , u, V |χ v , v](t) be defined as in Proposition 9.
Then there exist constants C, c > 0 such that the stability estimate holds for almost every T ∈ [0, T strong ), provided that the initial relative entropy satisfies E[χ u , u, V |χ v , v](0) ≤ c. The constants c > 0 and C > 0 depend only on the data and the strong solution.
In particular, if the initial data of the varifold solution and the strong solution coincide, the varifold solution must be equal to the strong solution in the sense that hold almost everywhere for almost every t ∈ [0, T strong ), and the varifold is given for almost every t ∈ [0, T strong ) by The following notion of varifold solutions for the free boundary problem associated with the flow of two immiscible incompressible viscous fluids with surface tension has been introduced by Abels [2]. For the notion of an oriented varifold, see the section on notation just prior to Section 3.
Definition 2 (Varifold solution for the two-phase Navier-Stokes equation). Let a surface tension constant σ > 0, the densities and shear viscosities of the two fluids ρ ± , µ ± > 0, a finite time T vari > 0, a solenoidal initial velocity profile v 0 ∈ L 2 (R d ; R d ), and an indicator function of the volume occupied initially by the first fluid χ 0 ∈ BV(R d ) be given.
A triple (χ, v, V ) consisting of a velocity field v, an indicator function χ of the volume occupied by the first fluid, and an oriented varifold is called a varifold solution to the free boundary problem for the Navier-Stokes equation for two fluids with initial data (χ 0 , v 0 ) if the following conditions are satisfied: i) The velocity field v has vanishing divergence ∇ · v = 0 and the equation for the momentum balancê For the sake of brevity, we have used the abbreviations ρ(χ) ii) The indicator function χ of the volume occupied by the first fluid satisfies the weak formulation of the transport equation is satisfied for almost every T ∈ [0, T vari ), and the energy is a nonincreasing function of time. iv) The phase boundary ∂{χ(·, t) = 0} and the varifold V satisfy the compatibility conditionˆR for almost every T ∈ [0, T vari ) and every smooth function ψ ∈ C ∞ cpt (R d ).
Let us continue with a few comments on the relation between the varifold V t and the interface described by the indicator function χ(·, t).
denote the non-negative measure representing (at time t) the varifold associated to a varifold solution (χ, v, V ) to the free boundary problem for the incompressible Navier-Stokes equation for two fluids. The compatibility condition (6e) entails that |∇χ u (t)| is absolutely continuous with respect to |V t | S d−1 . Hence, we may define the Radon-Nikodym derivative for every f ∈ L 1 (R d , |∇χ(·, t)|) and almost every t ∈ [0, T vari ).
The compatibility condition between the varifold V t and the interface described by the indicator function χ(·, t) has the following consequence.
Remark 4. Consider a varifold solution (χ, v, V ) to the free boundary problem for the incompressible Navier-Stokes equation for two fluids. Let E t be the measurable set {x ∈ R d : χ(x, t) = 1}. Note that for almost every t ∈ [0, T vari ) this set is then a Caccioppoli set in R d . Let n(·, t) = ∇χ |∇χ| denote the measure theoretic unit normal vector field on the reduced boundary ∂ * E t . By means of the compatibility condition (6e) and the definition (7) we obtain for almost every t ∈ [0, T vari ) and |V t | S d−1 -almost every x ∈ R d .
In order to define a notion of strong solutions to the free boundary problem for the flow of two immiscible fluids, let us first define a notion of smoothly evolving domains.

Definition 5 (Smoothly evolving domains and surfaces).
Let Ω + 0 be a bounded domain of class C 3 and consider a family (Ω + t ) t∈[0,Tstrong) of open sets in R d . Let for every t ∈ [0, T strong ]. We say that Ω + t , Ω − t are smoothly evolving domains and that I(t) are smoothly evolving surfaces if we have , subject to the following conditions: Moreover, we assume that there exists r c ∈ (0, 1 2 ] with the following property: For all t ∈ [0, T strong ) and all x ∈ I(t) there exists a function g : B 1 (0) ⊂ R d−1 → R with ∇g(0) = 0 such that after a rotation and a translation, I(t) ∩ B 2rc (x) is given by the graph {(x, g(x)) : x ∈ R d−1 }. Furthermore, for any of these functions g the pointwise bounds |∇ m g| ≤ r −(m−1) c hold for all 1 ≤ m ≤ 3.
We have everything in place to give the definition of a strong solution to the free boundary problem for the Navier-Stokes equation for two fluids.
Definition 6 (Strong solution for the two-phase Navier-Stokes equation). Let a surface tension constant σ > 0, the densities and shear viscosities of the two fluids ρ ± , µ ± > 0, a finite time T strong > 0, a solenoidal initial velocity profile v 0 ∈ L 2 (R d ; R d ), and a domain Ω + 0 occupied initially by the first fluid be given. Let the initial interface between the fluids ∂Ω + 0 be a compact C 3 -manifold. A pair (χ, v) consisting of a velocity field v and an indicator function χ of the volume occupied by the first fluid with is called a strong solution to the free boundary problem for the Navier-Stokes equation for two fluids with initial data (χ 0 , v 0 ) if the volume occupied by the first fluid is a smoothly evolving domain and the interface I v (t) := ∂Ω + t is a smoothly evolving surface in the sense of Definition 5 and if additionally the following conditions are satisfied: i) The velocity field v has vanishing divergence ∇ · v = 0 and the equation for the momentum balancê H · η dS dt is satisfied for almost every T ∈ [0, T strong ) and every smooth vector field Here, H denotes the mean curvature vector of the interface I v (t). For the sake of brevity, we have used the abbreviations ρ(χ) := ρ + χ + ρ − (1 − χ) and µ(χ) ii) The indicator function χ of the volume occupied by the first fluid satisfies the weak formulation of the transport equation Before we state the main ingredient for the proof of Theorem 1, we proceed with two remarks on the notion of strong solutions. The first concerns the consistency with the notion of varifold solutions due to Abels [2]. Remark 7. Every strong solution (χ, v) to the free boundary problem for the incompressible Navier-Stokes equation for two fluids (1a)-(1c) in the sense of Definition 6 canonically defines a varifold solution in the sense of Definition 2. Indeed, we can define the varifold V by means of dV t = δ ∇χ |∇χ| d|∇χ|. Due to the regularity requirements on the family of smoothly evolving surfaces I(t), see Definition 5, it then followŝ Lemma 3.4]. Moreover, it follows from the regularity requirements of a strong solution that the velocity field v also satisfies the energy dissipation inequality (6c). This proves the claim.
The second remark concerns the validity of (3) in a strong sense for a strong solution, i.e., that the evolution of the interface I(t) occurs only as a result of the transport of the two fluids along the flow.
Remark 8. Let (χ, v) be a strong solution to the free boundary problem for the incompressible Navier-Stokes equation for two fluids (1a)-(1c) in the sense of Definition 6 on some time interval [0, T strong ). Let V n (x, t) denote the normal speed of the interface at x ∈ I v (t), i.e., the normal component of ∂ t Ψ(x, t) where Ψ : R d × [0, T strong ) → R d is the family of diffeomorphisms from the definition of a family of smoothly evolving domains (Definition 5). Furthermore, let ϕ ∈ C ∞ cpt (R d × (0, T strong )). Due to the regularity requirements on a family of smoothly evolving domains, see Definition 5, we obtain (see for instance [4,Theorem 2.6 On the other side, subtracting from the former identity the equation (10b) satisfied by the indicator function χ and making use of the incompressibility of the velocity field v we deduceˆT strong 0ˆIv(t) Since ϕ ∈ C ∞ cpt (R d × (0, T strong )) was arbitrary we recover the identity that is to say, the kinematic condition (3) of the interface being transported with the flow is satisfied in its strong formulation.
Our weak-strong uniqueness result in Theorem 1 relies on the following relative entropy inequality. The regime of equal shear viscosities µ + = µ − corresponds to the choice of w = 0 in the statement below. Note also that in this case the viscous stress term R visc disappears due to µ(χ u ) − µ(χ v ) = 0. Proposition 9 (Relative entropy inequality). Let d ≤ 3. Let (χ u , u, V ) be a varifold solution to the free boundary problem for the incompressible Navier-Stokes equation for two fluids (1a)-(1c) in the sense of Definition 2 on some time interval [0, T vari ). Let (χ v , v) be a strong solution to (1a)-(1c) in the sense of Definition 6 on some time interval [0, T strong ) with T strong ≤ T vari and let be a solenoidal vector field with bounded spatial derivative ∇w L ∞ < ∞. Suppose furthermore that for almost every t ≥ 0, for every x ∈ R d either x is a Lebesgue point of ∇w(·, t) or there exists a half-space H x such that x is a Lebesgue point for both ∇w(·, t)| Hx and ∇w(·, t)| R d \Hx .
For a point (x, t) such that dist(x, I v (t)) < r c , denote by P Iv(t) x the projection of x onto the interface I v (t) of the strong solution. Introduce the extension ξ of the unit normal n v of the interface of the strong solution defined by for some cutoff η with η(s) = 1 for s ≤ 1 2 r c and η ≡ 0 for s ≥ r c . LetV n (x, t) := (n(P Iv(t) x, t) · v(P Iv(t) x, t))n(P Iv(t) x, t) be an extension of the normal velocity of the interface of the strong solution I v (t) to an r c -neighborhood of I v (t). Let θ be the density θ t = d|∇χu(·,t)| d|Vt| S d−1 as defined in (7) and let β : R → R be a truncation of the identity with β(r) = r for |r| ≤ 1 2 , |β | ≤ 1, |β | ≤ C, and β (r) = 0 for |r| ≥ 1. Then the relative entropy is subject to the relative entropy inequality for almost every T ∈ (0, T strong ), where we made use of the abbreviations as well as

Moreover, we have abbreviated
as well as Notation. We use a∧b (respectively a∨b) as a shorthand notation for the minimum (respectively maximum) of two numbers a, b ∈ R.
Let Ω ⊂ R d be open. For a function u : Ω × [0, T ] → R, we denote by ∇u its distributional derivative with respect to space and by ∂ t u its derivative with respect to time. For p ∈ [1, ∞] and an integer k ∈ N 0 , we denote by L p (Ω) and W k,p (Ω) the usual Lebesgue and Sobolev spaces. In the special case p = 2 we use as usual H k (Ω) := W k,2 (Ω) to denote the Sobolev space. For integration of a function f with respect to the d-dimensional Lebesgue measure respectively the d − 1-dimensional surface measure, we use the usual notation´Ω f dx respectivelý I f dS. For measures other than the natural measure (the Lebesgue measure in case of domains Ω and the surface measure in case of surfaces I), we denote the corresponding Lebesgue spaces by L p (Ω, µ). The space of all compactly supported and infinitely differentiable functions on Ω is denoted by C ∞ cpt (Ω). The closure of C ∞ cpt (Ω) with respect to the Sobolev norm · W k,p (Ω) is W k,p 0 (Ω), and its dual will be denoted by W −1,p (Ω) where p ∈ [0, ∞] is the conjugated Hölder exponent of p, i.e. 1/p + 1/p = 1. For vector-valued fields, say with range in R d , we use the notation L p (Ω; R d ), and so on. For a Banach space X, a finite time T > 0 and a number p ∈ [1, ∞] we denote by L p ([0, T ]; X) the usual Bochner-Lebesgue space. If X itself is a Sobolev space W k,q , we denote the norm of L p ([0, T ]; X) as · L p t W k,q x . When writing L ∞ w ([0, T ]; X ) we refer to the space of bounded and weak- * measurable maps f : [0, T ] → X , where X is the dual space of X. By L p (Ω) + L q (Ω) we denote the normed space of all functions u : Ω → R which may be written as the sum of two functions v ∈ L p (Ω) and w ∈ L q (Ω). The space C k ([0, T ]; X) contains all k-times continuously differentiable and X-valued functions on [0, T ].
In order to give a suitable weak description of the evolution of the sharp interface, we have to recall the concepts of Caccioppoli sets as well as varifolds. To this end, let Ω ⊂ R d be open. We denote by BV(Ω) the space of functions with bounded variation in Ω. A measurable subset E ⊂ Ω is called a set of finite perimeter in Ω (or a Caccioppoli subset of Ω) if its characteristic function χ E is of bounded variation in Ω. We will write ∂ * E when referring to the reduced boundary of a Caccioppoli subset E of Ω; whereas n denotes the associated measure theoretic (inward pointing) unit normal vector field of ∂ * E. For detailed definitions of all these concepts from geometric measure theory, we refer to [49,30]. In case Ω has a C 2 boundary, we denote by H(x) the mean curvature vector at x ∈ ∂Ω. Recall that for a convex function g : R d → R the recession function g rec : R d → R is defined as g rec (x) := lim τ →∞ τ −1 g(τ x).
An oriented varifold is simply a non-negative measure V ∈ M(Ω×S d−1 ), where Ω ⊂ R d is open and S d−1 denotes the (d−1)-dimensional sphere. For a varifold V , we denote by |V | S d−1 ∈ M(Ω) its local mass density given by |V | S d−1 (A) := V (A × S d−1 ) for any Borel set A ⊂ Ω. For a locally compact separable metric space X we write M(X) to refer to the space of (signed) finite Radon-measures on X. If A ⊂ X is a measurable set and µ ∈ M(X), we let µ A be the restriction of µ on A. The k-dimensional Hausdorff measure on R d will be denoted by H k , whereas we write L d (A) for the d-dimensional Lebesgue measure of a Lebesgue measurable set A ⊂ R d .
Finally, let us fix some tensor notation. First of all, we use (∇v) ij = ∂ j v i as well as ∇ ·v = i ∂ i v i for a Sobolev vector field v : R d → R d . The symmetric gradient is denoted by D sym v := 1 2 (∇v + ∇v T ). For time-dependent fields v : R d × [0, T ) → R n we denote by ∂ t v the partial derivative with respect to time. Tensor products of vectors u, v ∈ R d will be given by 3. Outline of the strategy 3.1. The relative entropy. The basic idea of the present work is to measure the "distance" between a varifold solution to the two-phase Navier-Stokes equation (χ u , u, V ) and a strong solution to the two-phase Navier-Stokes equation (χ v , v) by means of the relative entropy functional is a suitable extension of the unit normal vector field of the interface of the strong solution and where w is a vector field that will be constructed below and that vanishes in case of equal viscosities µ + = µ − . More precisely, we choose ξ as for some cutoff η with η(s) = 1 for s ≤ 1 2 r c and η ≡ 0 for s ≥ r c , where P Iv(t) x denotes for each t ≥ 0 the projection of x onto the interface I v (t) of the strong solution and where the unit normal vector field n v of the interface of the strong solution is oriented to point towards {χ v (·, t) = 1}.
Rewriting the relative entropy functional in the form with the energy (6d), we see that we may estimate the time evolution of the relative entropy E χ u , u, V χ v , v (t) by exploiting the energy dissipation property (6c) of the varifold solution and by testing the weak formulation of the two-phase Navier-Stokes equation (6a) and (6b) against the (sufficiently regular) test functions v + w respectively 1 2 |v + w| 2 , ∇ · ξ, and β( dist ± (x,Iv(t)) rc ). As usual in the derivation of weak-strong uniqueness results by the relative entropy method of Dafermos [37] and Di Perna [44], in the case of equal viscosities µ + = µ − the goal is the derivation of an estimate of the form which implies uniqueness and stability by means of the Gronwall lemma and by the coercivity properties of the relative entropy functional discussed in the next section.
In the case of different viscosities µ + = µ − , we will derive a slightly weaker (but still sufficient) result of roughly speaking the form along with estimates on w which include in particular the bound 3.2. The error control provided by the relative entropy functional. The relative entropy functional (12) provides control of the following quantities (up to bounded prefactors): Velocity error control. The relative entropy E[χ u , u, V |χ v , v](t) controls the square of the velocity error in the L 2 norm R d |u(·, t) − v(·, t)| 2 dx at any given time t. In the case of equal viscosities, this is immediate from (12) by w ≡ 0, while in the case of different viscosities this follows by the estimaté |∇χu| d|∇χ u | which is a consequence of the construction of w and the choice of ξ, see below.
Interface error control. The relative entropy provides a tilt-excess type control of the error in the interface normal In particular, it controls the squared error in the interface normal The term also controls the total length respectively area (for d = 2 respectively d = 3) of the part of the interface I u which is not locally a graph over I v , see Figure 2. For example, in the region around the left purple half-ray the interface of the weak solution is not a graph over the interface of the weak solution. Furthermore, the term controls the length respectively area (for d = 2 respectively d = 3) of the part of the interface with distance to I v (t) greater than the cutoff length r c , as there we have ξ ≡ 0.
Denote the local height of the one-sided interface error by h + : I v (t) → R + 0 as measured along orthogonal rays originating on I v (t) (with some cutoff applied away from the interface I v (t) of the strong solution); denote by h − the corresponding height of the interface error as measured in the other direction. For example, in Figure 2 the quantity h + (x) for some base point x ∈ I v (t) would correspond to the accumulated length of the solid segments in each of the purple rays, the dotted segments not being counted. Note that the rays are orthogonal on I v (t). Then the tilt-excess type term in the relative entropy also controls the gradient of the one-sided interface error heightŝ Note that wherever I u (t) is locally a graph over I v (t) and is not too far away from I v (t), it must be the graph of the function h + − h − . Here, the graph of a function g over the curved interface I v (t) is defined by the set of points obtained by shifting the points of I v (t) by the corresponding multiple of the surface normal, i. e. {x + g(x)n v (x) : x ∈ I v (t)}. Varifold multiplicity error control. For varifold solutions, the relative entropy controls the multiplicity error of the varifold θt(x) corresponds to the multiplicity of the varifold), which in turn by the compatibility condition (6e) and the definition of θ t (see (7)) controls the squared error in the normal of the varifold Weighted volume error control. Furthermore, the error in the volume occupied by the two fluids weighted with the distance to the interface of the strong solution is controlled. Note that this term is the only term in the relative entropy which is not obtained by the usual relative entropy ansatz . We have added this lower-order term -as compared to the term´R d 1 − ξ · ∇χu |∇χu| d|∇χ u | which provides tilt-excess-type control -to the relative entropy in order to remove the lack of coercivity of the term´R d 1 − ξ · ∇χu |∇χu| d|∇χ u | in the limit of vanishing interface length of the varifold solution. Control of velocity gradient error by dissipation. By means of Korn's inequality, the dissipation term controls the L 2 -error in the gradient T 0ˆR d |∇u − ∇v − ∇w| 2 dx dt.
3.3. The case of equal viscosities. For equal viscosities µ + = µ − , one may choose w ≡ 0. As a consequence, the right-hand side in the relative entropy inequality -see Proposition 9 above -may be post-processed to yield the Gronwall-type estimate (13). The details are provided in Section 5.

3.4.
Additional challenges in the case of different viscosities. In the case of different viscosities µ 1 = µ 2 of the two fluids, even for strong solutions the normal derivative of the tangential velocity features a discontinuity at the interface: By the no-slip boundary condition, the velocity is continuous across the interface [v] = 0 and the same is true for its tangential derivatives [(t·∇)v] = 0. As a consequence of this, the discontinuity of µ(χ v ) across the interface and the equilibrium condition for the stresses at the interface [[µ(χ)t · (n · ∇)v + µ(χ)n · (t · ∇)v]] = 0 entail for generic data a discontinuity of the normal derivative of the tangential velocity t · (n · ∇)v across the interface.
As a consequence, it becomes impossible to establish a Gronwall estimate for the standard relative entropy (12) with w ≡ 0. To see this, consider in the twodimensional case d = 2 two strong solutions u and v with coinciding initial velocities u(·, 0) = v(·, 0) = u 0 (·), but slightly different initial interfaces χ v (·, 0) = χ {|x|≤1} and χ u (·, 0) = χ {|x|≤1−ε} for some ε > 0. The initial relative entropy is then of the order ∼ ε 2 . Suppose that (in polar coordinates) the initial velocity u 0 has a profile near the interface like Note that this velocity profile features a kink at the interface. As one verifies readily, as far as the viscosity term is concerned this corresponds to a near-equilibrium profile for the solution (χ v , v) (in the sense that the viscosity term is bounded). However, in the solution (χ u , u) the interface is shifted by ε and the profile is no longer an equilibrium profile. By the scaling of the viscosity term, the timescale within which the profile u 0 equilibrates in the annulus of width ε towards a nearaffine profile is of the order of ε 2 . After this timescale, the velocity u will have changed by about ε in a layer of width ∼ ε around the interface; at the same time, due to the mostly parallel transport at the interface the solution will not have changed much otherwise. As a consequence, the term´1 2 ρ(χ u )|u − v| 2 dx will be of the order of at least cε 3 after a time T ∼ ε 2 , while the other terms in the relative entropy are essentially the same. Thus, the relative entropy has grown by a factor of 1 + cε within a timescale ε 2 , which prevents any Gronwall-type estimate. At the level of the relative entropy inequality (see Proposition 9), the derivation of the Gronwall inequality is prevented by the viscosity terms, which read for w ≡ 0 The latter term prevents the derivation of a dissipation estimate: While it is formally quadratic in the difference of the two solutions (χ u , u) and (χ v , v), due to the (expected) jump of the velocity gradients ∇v and ∇u at the respective interfaces it is in fact only linear in the interface error.
The key idea for our weak-strong uniqueness result in the case of different viscosities is to construct a vector field w which is small in the L 2 norm but whose gradient compensates for most of the problematic term (µ(χ v ) − µ(χ u ))(∇v + ∇v T ). To be precise, it is only the normal derivative of the tangential component of v which may be discontinuous at the interface; the tangential derivatives are continuous by the no-slip boundary condition, while the normal derivative of the normal component is continuous by the condition ∇ · v = 0.
Let us explain our construction of the vector field w at the simple two-dimensional example of a planar interface of the strong solution I v = {(x, 0) : x ∈ R}. In this setting, we would like to set for y > 0 (where e x just denotes the first vector of the standard basis). Note that due to the bounded integrand, this vector field w + (x, y) is bounded by Ch + (x), i. e. it is bounded by the interface error. As we shall see in the proof, the time derivative of w + is also bounded in terms of other error terms. The tangential spatial derivative of this vector field ∂ x w + (x, y, t) is given (up to a constant factor) bý The normal derivative, on the other hand, is given by which (upon choosing c(µ + , µ − )) would precisely compensate the discontinuity of ∂ y (e x · v) in the region in which the interface of the weak solution is a graph of a function over I v . Note that our relative entropy functional provides a higher-order control of the size of the region in which the interface of the weak solution is not a graph over the interface of the strong solution.
However, with this choice of vector field w + (x, y, t), two problems occur: First, the vector field is not solenoidal. For this reason, we introduce an additional Helmholtz projection. Second -and constituting a more severe problem -, the vector field would not necessarily be (spatially) Lipschitz continuous (as the derivative contains a term with ∂ x h + (x) which is not necessarily bounded), which due to the surface tension terms would be required for the derivation of a Gronwall-type estimate. For this reason, we first regularize the height function h + by mollification on a scale of the order of the error. See Proposition 25 and Proposition 26 for details of our construction of the regularized height function. The actual construction of our compensation function w is performed in Proposition 27. We then derive an estimate in the spirit of (14) in Proposition 33.

Time evolution of geometric quantities and further coercivity
properties of the relative entropy functional 4.1. Time evolution of the signed distance function. In order to describe the time evolution of various constructions, we need to recall some well-known properties of the signed distance function. Let us start by introducing notation. For a family (Ω + t ) t∈[0,Tstrong) of smoothly evolving domains with smoothly evolving interfaces I(t) in the sense of Definition 5, the associated signed distance function is given by From Definition 5 of a family of smoothly evolving domains it follows that the family of maps Φ t : The signed distance function (resp. its time derivative) to the interface of the strong solution is then of class C 0 t C 3 x (resp. C 0 t C 2 x ) in the space-time tubular neighborhood t∈[0,Tstrong) im(Φ t ) × {t} due to the regularity assumptions in Definition 5. We also have the bounds and in particular for the mean curvature vector Moreover, the projection P I(t) x of a point x onto the nearest point on the manifold I(t) is well-defined and of class C 0 t C 2 x in the same tubular neighborhood. After having introduced the necessary notation we study the time evolution of the signed distance function to the interface of the strong solution. Because of the kinematic condition that the interface is transported with the flow, we obtain the following statement.
is a family of smoothly evolving domains and I v (t) := ∂Ω + t is a family of smoothly evolving surfaces in the sense of Defi- The time evolution of the signed distance function to the interface I v (t) is then given by for any t ∈ [0, T strong ] and any x ∈ R d with dist(x, I v (t)) ≤ r c , whereV n is the extended normal velocity of the interface given bȳ Moreover, the following formulas hold true for all (x, t) such that dist(x, I v (t)) ≤ r c . The gradient of the projection onto the nearest point on the interface I v (t) is given by In particular, we have the bound Proof. Recall that ∇ dist ± (x, I v (t)) for a point x ∈ I v (t) on the interface equals the inward pointing normal vector n v (x, t) of the interface I v (t). This also extends away from the interface in the sense that for all (x, t) such that dist(x, I v (t)) < r c , i. e. (21) holds. Hence, we also have the . Differentiating this representation of the projection onto the interface and using the fact that n v is a unit vector we obtain using also (28) Hence, we obtain in addition to (27) the formula On the other side, on the interface the time derivative of the signed distance function equals up to a sign the normal speed. In our case, the latter is given by the normal component of the given velocity field v evaluated on the interface, see Remark 8. This concludes the proof of (19). Moreover, (22) as well as (23) follow immediately from differentiating |∇ dist ± (x, I v (t))| 2 = 1. Finally, (25) and (26) follow immediately from (17) and In the above considerations, we have made use of the following result: Consider the auxiliary function g( Remark 11. Consider the situation of Lemma 10. We proved that The right hand side of this identity is of class L ∞ t W 2,∞ x , as the normal component n v (P Iv(t) ) · ∇v of the velocity gradient ∇v of a strong solution is continuous across the interface I v (t). To see this, one first observes that the tangential derivatives ((Id −n v (P Iv(t) ⊗ n v (P Iv(t) )∇)v are naturally continuous across the interface; one then uses the incompressibility constraint ∇ · v = 0 to deduce that n v (P Iv(t) · (n v (P Iv(t) · ∇)v is also continuous across the interface.

4.2.
Properties of the vector field ξ. The vector field ξ -as defined in Proposition 9 -is an extension of the unit normal vector field n v associated to the family of smoothly evolving domains occupying the first fluid of the strong solution. We now provide a more detailed account of its definition. The construction in fact consists of two steps. First, we extend the normal vector field n v to a (space-time) tubular neighborhood of the evolving interfaces I v (t) by projecting onto the interface. Second, we multiply this construction with a cutoff which decreases quadratically in the distance to the interface of the strong solution (see (35)).
is a family of smoothly evolving domains and I v (t) := ∂Ω + t is a family of smoothly evolving surfaces in the sense of Definition 5. Let η be a smooth cutoff function with η(s) = 1 for s ≤ 1 2 and η ≡ 0 for s ≥ 1. Define another smooth cutoff function ζ : R → [0, ∞) as follows: and ζ ≡ 0 for |r| > 1. Then, we define a vector field ξ : The definition of ξ has the following consequences.
Remark 13. Observe that the vector field ξ is indeed well-defined in the spacetime domain R d × [0, T strong ) due to the action of the cut-off function ζ; it also satisfies |ξ| ≤ 1 or, more precisely, the sharper inequality |ξ| ≤ (1−dist(x, I v (t)) 2 ) + . Furthermore, the extension ξ inherits its regularity from the regularity of the signed distance function to the interface I v (t). More precisely, it follows that the vector field ξ (resp. its time derivative) is of class x . This turns out to be sufficient for our purposes.
The time derivative of our vector field ξ is given as follows.
is a family of smoothly evolving domains and I v (t) := ∂Ω + t is a family of smoothly evolving surfaces in the sense of Defi- LetV n be the extended normal velocity of the interface (20). Then the time evolution of the vector field ξ from Definition 12 is given by in the space-time domain dist(x, I v (t)) < r c . Here, we made use of the abbreviation Proof. We start by deriving a formula for the time evolution of the normal vector field n v (P Iv(t) x, t) in the space-time tubular neighborhood dist(x, I v (t)) < r c . By (21), we may use the formula for the time evolution of the signed distance function from Lemma 10. More precisely, due to the regularity of the signed distance function to the interface of the strong solution and the regularity of the vector fieldV (Remark 11), we can interchange the differentiation in time and space to obtain Next, we show that the normal-normal component of ∇V n vanishes. Observe that by Remark 11 and (21) it holds Hence, by (21)- (24) and this formula we obtain = 0 as desired. In summary, we have proved so far that which holds in the space-time domain dist(x, I v (t)) < r c . However, applying the chain rule to the cut-off function r → ζ(r) from (29) together with the evolution equation (19) for the signed distance to the interface shows that the cut-off away from the interface is also subject to a transport equation: By the definition of the vector field ξ, see (30), and the product rule, this concludes the proof.

4.3.
Properties of the weighted volume term. We next discuss the weighted volume contribution´R d |χ u − χ v | dist(x, I v (t)) dx to the relative entropy in more detail.
Remark 15. Let β be a truncation of the identity as in Proposition 9. Let χ v ∈ L ∞ ([0, T strong ); BV(R d ; {0, 1})) be an indicator function such that Ω + t := {x ∈ R d : χ v (x, t) = 1} is a family of smoothly evolving domains, and I v (t) := ∂Ω + t is a family of smoothly evolving surfaces, in the sense of Definition 5. The map inherits the regularity of the signed distance function to the interface I v (t). More precisely, this map (resp. its time derivative) is of class is a family of smoothly evolving domains and I v (t) := ∂Ω + t is a family of smoothly evolving surfaces in the sense of Defi- LetV n be the extended normal velocity of the interface (20). Then the time evolution of the weight function β composed with the signed distance function to the interface I v (t) is given by the transport equation Proof. This is immediate from the chain rule and the time evolution of the signed distance function to the interface of the strong solution, see Lemma 10.

4.4.
Further coercivity properties of the relative entropy. We collect some further coercivity properties of the relative entropy functional E χ u , u, V χ v , v as defined in (11). These will be of frequent use in the post-processing of the terms occurring on the right hand side of the relative entropy inequality from Proposition 9. We start for reference purposes with trivial consequences of our choices of the vector field ξ and the weight function β.
Lemma 17. Consider the situation of Proposition 9. In particular, let β be the truncation of the identity from Proposition 9. By definition, it holds Let ξ be the vector field from Definition 12 with cutoff multiplier ζ as given in (29). By the choice of the cutoff ζ, it holds We will also make frequent use of the fact that for any unit vector b ∈ R d we have We also want to emphasize that the relative entropy functional controls the squared error in the normal of the varifold.
Lemma 18. Consider the situation of Proposition 9. We then havê for almost every t ∈ [0, T strong ).
Proof. Observe first that by means of the compatibility condition (6e) we havê which holds for almost every t ∈ [0, T strong ). In addition, due to (8) one obtainŝ for almost every t ∈ [0, T strong ). This in turn entails the following identitŷ which holds true for almost every t ∈ [0, T strong ). However, the functional on the right hand side controls the squared error in the normal of the varifold: |s − ξ| 2 ≤ 2(1 − s · ξ). This proves the claim.
We will also refer multiple times to the following bound. In the regime of equal shear viscosities µ + = µ − we may apply this result with the choice w = 0. In the general case, we have to include the compensation function w for the velocity gradient discontinuity at the interface.
for almost every T ∈ [0, T strong ) and all 0 < δ ≤ 1. The absolute constant C > 0 only depends on the densities ρ ± .
Proof. We first argue how to control the part away from the interface of the strong solution, i.e., outside of {(x, t) : dist(x, I v (t)) ≥ r c }. A straightforward estimate using Hölder's and Young's inequality yields ˆT 0ˆ{dist(x,Iv(t))≥rc} Note that by the properties of the truncation of the identity β, see Proposition 9, it follows that |β(dist ± (x, which is indeed a bound of required order. We proceed with the bound for the contribution in the vicinity of the interface of the strong solution. To this end, recall that we are equipped with a family of maps Φ t : (16). We then move on with a change of variables, the one-dimensional Gagliardo-Nirenberg-Sobolev interpolation inequality as well as Hölder's and Young's inequality to obtain the bound ˆT It thus suffices to derive an estimate for the L 2 -norm of the local interface error height in normal direction The proof of Proposition 25 below, where we establish next to the required L 2bound also several other properties of the local interface error height, shows that (see (58) This then concludes the proof.
We conclude this section with an L 2 tan L ∞ nor -bound for H 1 -functions on the tubular neighborhood around the evolving interfaces as well as a bound for the derivatives of the normal velocity of the interface of a strong solution in terms of the associated velocity field v, both of which will be used several times in the post-processing of the terms on the right hand side of the relative entropy inequality of Proposition 9.
Lemma 20. Consider the situation of Proposition 9. We have the estimatê Proof. Let f ∈ H 1 (−r c , r c ). The one-dimensional Gagliardo-Nirenberg-Sobolev inerpolation inequality then implies

WEAK-STRONG UNIQUENESS FOR TWO-PHASE FLOW WITH SHARP INTERFACE 29
From this we obtain together with Hölder's inequalitŷ This implies (40) and the associated change of variables, using also the bound (16).

Lemma 21.
Consider the situation of Proposition 9 and define the vector field where O = t∈(0,Tstrong) {x ∈ R d : dist(x, I v (t)) < r c } × {t} denotes the space-time tubular neighborhood of width r c of the evolving interface of the strong solution.
In particular, we have forV n (x, t) := V n (P Iv(t) x, t) the estimate Proof. The estimates (41) and (42) are a direct consequence of the regularity requirements on the velocity field v of a strong solution, see Definition 6, the pointwise bounds (17) and the representation of the normal vector field on the interface in terms of the signed distance function (21).

Weak-strong uniqueness of varifold solutions to two-fluid
Navier-Stokes flow: The case of equal viscosities In this section we provide a proof of the weak-strong uniqueness principle to the free boundary problem for the incompressible Navier-Stokes equation for two fluids (1a)-(1c) in the case of equal shear viscosities µ + = µ − . Note that in this case the problematic viscous stress term R visc in the relative entropy inequality (see Proposition 9) vanishes because of µ(χ u ) − µ(χ v ) = 0. In this setting, it is possible to choose w ≡ 0 which directly implies A visc = 0, A adv = 0, A dt = 0, A weightV ol = 0, and A surT en = 0. What remains to be done is a post-processing of the terms R surT en , R adv , R dt , and R weightV ol which remain on the right-hand side of the relative entropy inequality. 5.1. Estimate for the surface tension terms. We start by post-processing the terms related to surface tension R surT en .
Lemma 22. Consider the situation of Proposition 9. The terms related to surface tension R surT en are estimated by for any δ > 0.
Proof. We start by using (36) and (30) Recall from (37) that the squared error in the varifold normal is controlled by the relative entropy functional. Together with the bound from Lemma 19, (17) as well as (45) we get an estimate for the first four terms of R surT en R surT en for almost every T ∈ [0, T strong ) and all δ ∈ (0, 1]. To estimate the remaining two terms we decomposeV n − v as where the vector field V n is given by in the space-time domain {dist(x, I v (t)) < r c } (i. e. in contrast to V vecn , for V n the velocity v is evaluated not at the projection of x onto the interface, but at x itself). Note that it will not matter as to how V n and similar quantities are defined outside of the area {dist(x, I v (t)) < r c }, as the terms will always be multiplied by suitable cutoffs which vanish outside of {dist(x, I v (t)) < r c }. In the next two steps, we compute and bound the contributions from the two different parts in the decomposition (47) of the errorV n − v.
First step: Controlling the error V n − v. By definition of the vector field V n in (48), should be controlled by our relative entropy functional. However, the integrands enjoy a crucial cancellation To verify this cancellation, we first recall from (21) that ∇ dist ± (x, I v (t)) = n v (P Iv(t) x, t). We then start by rewriting Note that when the derivative hits the cutoff multiplier in the definition of ξ (see (30)), the resulting term on the right hand side of the last identity vanishes. Hence, we obtain together with (23) On the other side, another application of (23) yields Therefore, the desired cancellation (49) indeed holds true since by (23) the righthand side of the last computation remains unchanged after projecting via Id − n v ⊗ n v .
Second step: Controlling the errorV n − V n . It remains to control the contributions from the following two quantities: Note first that we can write Moreover, recall from (25) the formula for the gradient of the projection onto the nearest point on the interface I v (t). The definition of V n (see (48)) andV n (x) = V vecn (P Iv(t) x), the product rule, (21), (17), and (23) imply using the definition of ξ and the property |ξ| ≤ 1 where in the last step we have used also (25). Together Young's inequality and the coercivity properties of the relative entropy (35) and (36) we then immediately get the estimate To estimate the second term II, we start by adding zero and then use again (17) as well as (35) and (36) Using (23), we continue by computinĝ Hence, it follows from ζ (0) = 0 and |ζ | ≤ C as well as (43) that Third step: Summary. Inserting (49), (50), and (51) into (46) entails the bound This yields the desired estimate.

5.2.
Estimate for the remaining terms R adv , R dt , and R weightV ol . To bound the advection-related terms from the relative entropy inequality, the time-derivative related terms R dt , and the terms resulting from the weighted volume control term in the relative entropy , we use mostly straightforward estimates.
Lemma 23. Consider the situation of Proposition 9. The terms R adv , R dt , and R weightV ol are subject to the bounds and for any δ > 0.
Proof. To derive (52), we use a direct estimate for the second term in R adv as well as Lemma 19 for the first term. The bound (53) is derived similarly. Finally, we show estimate (54). Note that by definition we haveV n (x, t) = V n (P Iv(t) x, t). Hence, we obtain using the bound (43) as well as (34) and |β | ≤ C An application of Lemma 19 yields (54).

5.3.
The weak-strong uniqueness principle in the case of equal viscosities. We conclude our discussion of the case of equal shear viscosities µ + = µ − for the free boundary problem for the incompressible Navier-Stokes equation for two fluids (1a)-(1c) with the proof of the weak-strong uniqueness principle.
Proposition 24. Let d ≤ 3. Let (χ u , u, V ) be a varifold solution to the free boundary problem for the incompressible Navier-Stokes equation for two fluids (1a)-(1c) in the sense of Definition 2 on some time interval [0, T vari ) with initial data (χ 0 u , u 0 ). Let (χ v , v) be a strong solution to (1a)-(1c) in the sense of Definition 6 on some time interval [0, T strong ) with T strong ≤ T vari and initial data (χ 0 v , v 0 ). We assume that the shear viscosities of the two fluids coincide, i.e., µ + = µ − .
Then, there exists a constant C > 0 which only depends on the data of the strong solution such that the stability estimate holds. In particular, if the initial data of the varifold solution and the strong solution coincide, the varifold solution must be equal to the strong solution in the sense almost everywhere for almost every t ∈ [0, T strong ). Furthermore, in this case the varifold is given by for almost every t ∈ [0, T strong ).
Proof. Applying the relative entropy inequality from Proposition 9 with w = 0, using the fact that the problematic term R visc vanishes in the case of equal shear viscosities µ + = µ − , as well as using the bounds from (44), (52), (53) and (54), we observe that we established the following bound for almost every T ∈ [0, T strong ). An absorption argument along with a subsequent application of Gronwall's lemma then immediately yields the asserted stability estimate.
Consider the case of coinciding initial conditions, i.e., E[χ u , u, V |χ v , v](0) = 0. In this case, we deduce from the stability estimate that the relative entropy vanishes for almost every t ∈ [0, T strong ). From this it immediately follows that u(·, t) = v(·, t) as well as χ u (·, t) = χ v (·, t) almost everywhere for almost every t ∈ [0, T strong ).
The asserted representation of the varifold V of the varifold solution follows from the following considerations. First, we deduce |∇χ u (·, t)| = |V t | S d−1 for almost every t ∈ [0, T strong ) as a consequence of the fact that the density of the varifold satisfies θ t = d|∇χu(·,t)| d|Vt| S d−1 ≡ 1 almost everywhere for almost every t ∈ [0, T strong ). The remaining fact that the measure on S d−1 is given by δ nu(x,t) for |V t | S d−1 -almost every x ∈ R d for almost every t ∈ [0, T strong ) then follows from the control of the squared error in the normal of the varifold by the relative entropy functional, see (37). This concludes the proof.
6. Weak-strong uniqueness of varifold solutions to two-fluid Navier-Stokes flow: The case of different viscosities We turn to the derivation of the weak-strong uniqueness principle in the case of different shear viscosities of the two fluids. In this regime, we cannot anymore ignore the viscous stress term (µ(χ v ) − µ(χ u ))(∇v + ∇v T ). The key idea is to construct a solenoidal vector field w which is small in the L 2 -norm but whose gradient compensates for most of this problematic term, and then use the relative entropy inequality from Proposition 9 with this function. The precise definition as well as a list of all the relevant properties of this vector field are the content of Proposition 27.
A main ingredient for the construction of w are the local interface error heights as measured in orthogonal direction from the interface of the strong solution (see Figure 2). For this reason, we first prove the relevant properties of the local heights of the interface error in Proposition 25. However, in order to control certain surfacetension terms in the relative entropy inequality, we actually need the vector field w to have bounded spatial derivatives. To this aim, we perform an additional regularization of the height functions. This will be carried out in detail in Proposition 26 by a (time-dependent) mollification. After all these preparations, in Section 6.4-6.8 we then perform the post-processing of the additional terms A visc , A dt , A adv , and A surT en in the relative entropy inequality from Proposition 9. Based on these bounds, in Section 6.9 we finally provide the proof of the stability estimate and the weak-strong uniqueness principle for varifold solutions to the free boundary problem for the incompressible Navier-Stokes equation for two fluids (1a)-(1c) from Theorem 1. For the family (Ω + t ) t∈[0,Tstrong) of smoothly evolving domains of the strong solution, the associated signed distance function is given by From Definition 5 of a family of smoothly evolving domains it follows that the family of maps Φ t : Here, n v (·, t) denotes the normal vector field of the interface I v (t) pointing inwards {x ∈ R d : χ v (x, t) = 1}. The signed distance function (resp. its time derivative) to the interface I v (t) of the strong solution is then of class C 0 t C 3 x (resp. C 0 t C 2 x ) in the space-time tubular neighborhood t∈[0,Tstrong) im(Φ t ) × {t} due to the regularity assumptions in Definition 5. Moreover, the projection P Iv(t) x of a point x onto the nearest point on the manifold I v (t) is well-defined and of class C 0 t C 2 x in the same tubular neighborhood. Observe that the inverse of Φ t is given by is given by In Lemma 10, we computed the time evolution of the signed distance function to the interface I v (t) of a strong solution. Recall also the various relations for the projected inner unit normal vector field n v (P Iv(t) x, t) from Lemma 10, which will be of frequent use in subsequent computations. Finally, we remind the reader of the definition of the vector field ξ from Definition 12, which is a global extension of the inner unit normal vector field of the interface I v (t).
is a family of smoothly evolving domains and I v (t) := ∂Ω + t is a family of smoothly evolving surfaces in the sense of Definition 5. Let ξ be the extension of the unit normal vector field n v from Definition 12.
Let θ : [0, ∞) → [0, 1] be a smooth cutoff with θ ≡ 0 outside of [0, 1 2 ] and θ ≡ 1 in [0, 1 4 ]. For an indicator function χ u ∈ L ∞ ([0, T strong ]; BV(R d ; {0, 1})) and t ≥ 0, we define the local height of the one-sided interface error h + (·, t) : Similarly, we introduce the local height of the interface error in the other direction Then h + and h − have the following properties: a) (L 2 -bound) We have the estimates |h ± (x, t)| ≤ rc 2 and c) (Approximation property) The functions h + and h − provide an approximation of the set {χ u = 1} in terms of a subgraph over the set I v (t) by setting up to an error of such that in the domain t∈[0,Tstrong) (Ω + t ∪Ω − t )×{t} the second spatial derivatives of the vector field v exist and satisfy sup t∈[0,Tstrong) sup x∈Ω + If χ u solves the equation ∂ t χ u = −∇ · (χ u u) for another solenoidal vector field u ∈ L 2 ([0, T strong ]; H 1 (R d ; R d )), we have the following estimate on the time derivative of the local interface error heights , and whereh ± is defined as h ± but now with respect to the modified cut-offθ(·) = θ · 2 . Proof.
Step 1: Proof of the estimate on the L 2 -norm. The trivial estimate |h ± (x, t)| ≤ rc 2 follows directly from the definition of h ± . To establish the L 2estimate, let + (x) :=´r c 0 (1 − χ u )(x + yn v (x, t), t) dy. A straighforward estimate then gives Note that the term on the left hand side dominates |h + | 2 since we dropped the cutoff function. Hence, the desired estimate on the L 2 -norm of h + follows at once by a change of variables and recalling the fact that dist(Φ t (x, y), I v (t)) = y. The corresponding bound for h − then follows along the same lines.
Step 2: Proof of the estimate on the spatial derivative (57b). The definition (56) is equivalent to We compute for any smooth vector field η ∈ C ∞ cpt (R d ; R d ) (recall that Φ t (x, 0) = x and dist(Φ t (x, y), I v (t)) = y for any x ∈ I v (t) and any y with |y| ≤ r c ) where in the last step we have used ∇ dist ± (x, I v (t)) = n v (P Iv(t) x). This yields by another change of variables in the second integral, the fact that χ v (Φ t (x, y), t) = 1 for any y > 0, (17), (18), | det ∇Φ −1 t | ≤ C as well as by abbreviating n u = ∇χu |∇χu| U ∩Iv(t) for any Borel set U ⊂ R d . Recall that the indicator function χ u (·, t) of the varifold solution is of bounded variation in I := {x ∈ R d : dist ± (x, I v (t)) ∈ (−r c , r c )}. In particular, E + := {x ∈ R d : χ u > 0} ∩ I is a set of finite perimeter in I. Applying Theorem 35 in local coordinates the sections E + x = {y ∈ (−r c , r c ) : χ u (x + yn v (x, t)) > 0} are guaranteed to be one-dimensional Caccioppoli sets in (−r c , r c ) for H d−1 -almost every x ∈ I v (t). Note that whenever |n v · n u | ≤ 1 2 then 1 − n v · n u ≥ 1 2 , and therefore using also the co-area formula for rectifiable sets (see [10, (2.72) x+ynv(x,t): x∈U ∩Iv(t),y∈(−rc,rc),nv(x)·nu(x+ynv(x,t))≤ 1 2 } We now distinguish between different cases depending on x ∈ I v (t) up to H d−1measure zero. We start with the set of points By splitting the measure D tan x h + into a part which is absolutely continuous with respect to the surface measure on I v (t), for which we denote the density by ∇ tan h + , as well as a singular part D s h + , we obtain from (59) (note that the third integral in (59) does not contribute to this estimate by the definition of the set for every Borel set U ⊂ R d . Since U was arbitrary, we deduce that |∇ tan h + | is bounded on A 1 by the two integrands on the right hand side of the last inequality. Hence, we obtain The first term on the right hand side can be estimated as in the proof of the L 2bound for h ± . To bound the second term, we make the following observation. First, we may represent the one-dimensional Caccioppoli sets E + x as a finite union of disjoint intervals (see [10,Proposition 3.52]). It then follows from property iv) in Theorem 35 that ∂ * E + x ∩ (−r c , r c ) can only contain at most one point. Indeed, otherwise we would find at least one point y ∈ ∂ * E + x ∩ (−r c , r c ) such that n v (x) · n u (x+yn v (x, t)) < 0 which is a contradiction to the definition of A 1 . By another application of the co-area formula for rectifiable sets (see [10, (2.72)]) we therefore We now turn to the second case, namely the set of points A 2 := I v (t) \ A 1 . We begin with a preliminary computation. When splitting E + x into a finite family of disjoint open intervals as before, it again follows from property iv) in Theorem 35 that every second point y ∈ ∂ * E + x ∩ (−r c , r c ) has to have the property that n v (x) · n u (x+yn v (x, t)) < 0, i.e., |n v (x) − n u | ≤ 2 ≤ 2(1 − n v (x) · n u ). In particular, by another application of the co-area formula for rectifiable sets (see [10, (2.72)]) we obtain the bound Now, we proceed as follows. By definition of A 2 , either one of the three summands in (60) has to be ≥ 1 12 . We distinguish between two cases. If the third one is not, then this actually means that the set {ỹ ∈ (−r c , r c ) ∩ ∂ * E + x : n v (x) · n u (x+ỹn v (x, t)) ≤ 1 2 } is empty, i.e., the third summand has to vanish. Hence, either one of the first two summands in (60) has to be ≥ 1 8 . If the first one is not, we use that´r c 0 |χ u (Φ t (x, y), t) − χ v (Φ t (x, y), t)| dy ≤ r c and bound this by the second term and then (62). If the second one is not, then Now, we move on with the remaining case, i.e., that the third summand in (60) does not vanish. In other words, Taking finally U = A 2 in (59), the conclusions of the above case study together with the three estimates (62), (63) and (64) followed by another application of the co-area formula for rectifiable sets (see [10, (2.72)]) to further estimate the latter, then imply that The two estimates (61) and (65) thus entail the desired upper bound (57b) for the (tangential) gradient of h ± with ξ replaced by n v (P Iv(t) x). However, one may replace n v (P Iv(t) x) by ξ because of (36).
Step 3: Proof of the approximation property for the interface (57c). In order to establish (57c), we rewrite using the coordinate transform Φ t (recall that dist ± (Φ t (x, y), I v (t)) = y and that |h ± | ≤ r c ) In order to derive a bound for the first term on the right-hand side of (66), we distinguish between different cases depending on x ∈ I v (t) up to H d−1 -measure zero. We first distinguish between h + (x) ≥ rc 4 and h + (x) < rc 4 . In the former case, a straightforward estimate yields (recall (16)) ˆr which is indeed of required order after a change of variables. We now consider the other case, i.e., h + (x) < rc 4 . Recall that the indicator function χ u (·, t) of the varifold solution is of bounded variation in Applying Theorem 35 in local coordinates, the sections t)) > 0} are guaranteed to be one-dimensional Caccioppoli sets in (0, r c ) for H d−1 -almost every x ∈ I v (t). Hence, we may represent the one-dimensional section E + x for such x ∈ I v (t) as a finite union of disjoint intervals (see [10,Proposition 3.52]) (a m , b m ).
If K(x) = 0 then h + (x) = 0, and the inner integral in the first term on the right hand side of (66) vanishes for this x. If K(x) = 1 and a 1 = 0, then by definition of h + (x) we have (a 1 , b 1 ) = (0, h + (x)) (recall that we now consider the case h + (x) ≤ rc 4 ). Thus, again the inner integral in the first term on the right hand side of (66) vanishes for this x. Hence, it remains to discuss the case that there is at least one non-empty interval in the decomposition of E + x , say (a, b), such that a ∈ (0, r c ). From property iv) in Theorem 35 it then follows that Hence, we may bound Gathering the bounds from the different cases together with the estimate in (67), we therefore obtain by the co-area formula for rectifiable sets (see [10, (2.72)]) together with the change of variables Φ t (x, y) which is by (36) as well as (35) indeed a bound of desired order. Moreover, performing analogous estimates for the second term on the right-hand side of (66) and estimating the third term on the right-hand side of (66) trivially, we then get which is precisely the desired estimate (57c).
Step 4: Proof of estimate on the time derivative (57d). To bound the time derivative, we compute using the weak formulation of the continuity equation ∂ t χ u = −∇ · (χ u u) and abbreviating I + (t) := {x ∈ R d : dist ± (x, I v (t)) ∈ [0, r c )} (recall that the boundary ∂I + (t) = I v (t) moves with normal speed n Recall from (25) the formula for the gradient of the projection onto the nearest point on the interface I v (t). Recalling also the definitions of the extended (48) respectively (20), we also have −ˆI Adding this formula to the above formula for d rc ), and using the fact that rc ) = 1 for any t and any x ∈ I v (t). (17), the corresponding estimate (41) for the gradient of V n as well as the formula (25) for the gradient of P Iv(t) . Because of (21) and the equation (32) for the time evolution of the normal vector, we thus get the bounds . Taking all of these bounds together, ). As a consequence, we get Moreover, we may compute Since n v · ∇η = 0 holds on the interface I v (t) by assumption, we obtain from (70) −ˆI In what follows, we will by slight abuse of notation use ∇ tan g(x) as a shorthand for (Id − n v (P Iv(t) x) ⊗ n v (P Iv(t) x))∇g(x) for scalar fields as well as (∇ tan · g)(x) instead of (Id − n v (P Iv(t) x) ⊗ n v (P Iv(t) x)) : ∇g(x) for vector fields. Let us also abbreviate P tan x := (Id − n v (P Iv(t) x) ⊗ n v (P Iv(t) x)). Note that by assumption (∇η)(P Iv(t) x) = (∇ tan η)(P Iv(t) x). Moreover, it follows from (22), (23) and (21) that n v (P Iv(t) x) · d dt (n v (P Iv(t) x)) = 0. Hence, we may rewrite with an integration by parts (recall the notation P tan (x) = (Id −n v ⊗ n v )(P Iv(t) x, t)) = −ˆI Using from (23) and (21) that the spatial partial derivatives of the extended normal vector field are orthogonal to the gradient of the signed distance function, the same argument also shows that It follows from (25) as well as (23) and (21) that (n v (P Iv(t) x)·∇)P Iv(t) x = 0. Hence, we obtain Since the domain of integration is I + (t), we may write From this and the fact n v (P Iv(t) ) · ∇P Iv(t) (x) = 0, we deduce by another integration by parts that (where |F | ≤ r −1 = −ˆI Hence, plugging in (73), (72) and (74), (71) into (68) and using the estimates This yields by the change of variables Φ t (x, y) and a straightforward estimate Using finally the Sobolev embedding to bound the L ∞ -norm of η on the interface (which is either one-or two-dimensional; note that the constant in the Sobolev embedding may be bounded by Cr −1 c for our geometry), we infer from this estimate the desired bound (57d), using also (36) and (35). This concludes the proof.

6.2.
A regularization of the local height of the interface error. In order to modify our relative entropy to compensate for the velocity gradient discontinuity at the interface, we need regularized versions of the local heights of the interface error h + and h − which in particular have Lipschitz regularity. To this aim, we fix some function e(t) > 0 and basically apply a mollifier on scale e(t) to the local interface error heights h + and h − at each time. These regularized versions h + e(t) and h − e(t) of the local interface error heights then have the following properties: dS(x) .

(75)
Then h + e(t) and h − e(t) have the following properties: a) (H 1 -bound) If the interface error terms from the relative entropy are bounded byˆR we have the Lipschitz estimate |∇h ± e(t) (·, t)| ≤ Cr −2 c , the global bound |∇ 2 h ± e(t) (·, t)| ≤ Ce(t) −1 r −4 c , and the bound b) (Improved approximation property) The functions h + e(t) and h − e(t) provide an approximation for the interface of the weak solution such that in the domain t∈[0,Tstrong) (Ω + t ∪Ω − t )×{t} the second spatial derivatives of the vector field v exist and satisfy sup t∈[0,Tstrong) sup x∈Ω + If χ u solves the equation ∂ t χ u = −∇ · (χ u u) for another solenoidal vector field u ∈ L 2 ([0, T strong ]; H 1 (R d ; R d )), we have the following estimate on the time derivative of h ± e(t) : for any smooth test function η ∈ C ∞ cpt (R d ) with n v · ∇η = 0 on the interface I v (t), and whereh ± is defined as h ± but now with respect to the modified cut-off function θ(·) = θ · 2 . Proof. Proof of a). In order to estimate the spatial derivative ∇h ± e(t) , we compute using the fact that ∇ x θ |x−x| (note that all of the subsequent gradients are to be understood in the tangential sense on the manifold I v (t)) Introduce the convex function G(p) := |p| 2 for |p| ≤ 1, 2|p| − 1 for |p| ≥ 1.
Proof of b). We start with a change of variables to estimate (recall (16)) By adding zero and using (57c) we therefore obtain Observe that one can decompose A straightforward estimate in local coordinates then yieldŝ Using (57b) and summing with respect to k ∈ N, we get the desired estimate (76c). Proof of c). Note that Abbreviating As in the argument for (79), one checks that´I v (t) |θ |( |x−x| e(t) ) dS(x) ≤ Ce(t) d−1 . Using the lower bound from (79), the proof for the standard L p -inequality for convolutions carries over and we obtain η e L p (Iv(t)) ≤ C η L p (Iv(t)) as well aŝ for any p ≥ 1. As a consequence of (57d) and these considerations, we deduce where in the last step we have used the simple equality and the bounds (18) and (79). Recall from the transport theorem for moving hypersurfaces (see [77]) that we have for any (70)), we then compute for This together with another application of (84) and the fact that n v · ∇η = 0 on the interface where F e,θ (t) : dS(x) .
Observe that we haveˆI By the choice of the cutoff θ, we see that for every given x ∈ I v (t) the kernel F e,θ (t) is supported in B e(t)/2 (x) ∩ I v (t). Moreover, the exact same argumentation which led to the upper bound in (79) (we only used the support and upper bound for θ as well as e(t) ≤ r c ) shows that the kernel F e,θ satisfies the upper bound for any 1 ≤ p < ∞. We next intend to rewrite the function F e,θ (x, x) for fixed x as the divergence of a vector field. By the property (87), we may consider Neumann problem for the (tangential) Laplacian with right hand side F e,θ (·, x) in some neighborhood (of scale e(t)) of the point x. To do this we first rescale the setup, i.e., we consider the kernel F 1 (x, x) := F e,θ (e(t)x, e(t)x) forx, x ∈ e(t) −1 I v (t). By scaling and the fact that F e,θ is supported on scale e(t)/2, it follows that F 1 (·, x) has zero average on e(t) −1 I v (t) ∩ B 1 (x) for every point x ∈ e(t) −1 I v (t) and that We fix x ∈ e(t) −1 I v (t) and solve on e(t) −1 I v (t) ∩ B 1 (x) the weak formulation of the equation −∆ tañ xF 1 (·, x) = F 1 (·, x) with vanishing Neumann boundary condition. More precisely, we requireF 1 (·, x) to have vanishing average on e(t) −1 I v (t) ∩ B 1 (x) (note that in the weak formulation the curvature term does not appear because it gets contracted with the tangential derivative of the test function). By elliptic regularity and (89), it follows We now rescale back to I v (t) and defineF e,θ (x, x) := e(t) 2F 1 (e(t) −1x , e(t) −1 x) for x ∈ I v (t) andx ∈ I v (t) ∩ B e(t) (x). For fixed x ∈ I v (t),F e,θ (·, x) has vanishing average on I v (t) ∩ B e(t) (x) and solves −∆ tañ xF e,θ (·, x) = F e,θ (·, x) on I v (t) ∩ B e(t) (x) with vanishing Neumann boundary condition. We finally introduce F e,θ (x, x) := ∇ tañ xF e,θ (x, x) for x ∈ I v (t) andx ∈ I v (t) ∩ B e(t) (x). It then follows from scaling, (90) as well as e(t) < r c that ∇x · F e,θ (x, x) = F e,θ and We now have everything in place to proceed with estimating the term ˆI To this end, we will make use of (85) and estimate term by term. Because of (18), (79), η e L p (Iv(t)) ≤ C η L p (Iv(t)) , the estimatê the Lipschitz property |V n (x) − V n (x)| ≤ ||∇v|| L ∞ |x −x|, and the fact that θ(s) = 0 for s ≥ 1, the first four terms on the right-hand side of (85) are straightforward to estimate and result in the bound To estimate the fifth term, we first apply Fubini's theorem and then perform an integration by parts (recall that we imposed vanishing Neumann boundary conditions) which entails because of the above considerations Using (91) as well as the lower bound from (79) we see that the second term can be estimated by a term of the form (92). For the first term, note that by the properties of F e,θ we may interpret the integral in brackets as the mollification of ∇h ± on scale e(t). Applying the argument which led to (80) (for this, we only need the upper bound (91) for F e,θ , a lower bound as in (79) is only required for θ) we observe that one can bound this term similar to ∇h ± e(t) (·, t) L 2 (Iv(t)) . We therefore obtain the bound ˆI Hence, combining (81) with these estimates for the fourth term from (85) as well as (92) and (82), we obtain the desired estimate on the time derivative. This concludes the proof.
6.3. Construction of the compensation function w for the velocity gradient discontinuity. We turn to the construction of a compensating vector field, which shall be small in the L 2 -norm but whose associated viscous stress µ(χ u )D sym w shall compensate for (most of) the problematic viscous term (µ(χ u ) − µ(χ v ))D sym v appearing on the right hand side of the relative entropy inequality from Proposition 9 in the case of different shear viscosities. Before we state the main result of this section, we introduce some further notation. Let h + e(t) be defined as in Proposition 26. We then denote by P h + e(t) the downward projection onto the graph of h + e(t) , i.e., for all (x, t) such that dist(x, I v (t)) < r c . Note that this map does not define an orthogonal projection. Analogously, one introduces the projection Then there exists a solenoidal vector field w ∈ L 2 ([0, T strong ]; H 1 (R d )) such that w is subject to the estimateŝ where the vector field W is given by with the symmetric gradient defined by D sym v := 1 2 (∇v + ∇v T )), as well as the estimateŝ and where the vector fields g andĝ are subject to the bounds whereh ± is defined as h ± but now with respect to the modified cut-off function θ(·) = θ · 2 , see Proposition 25. Furthermore, w may be taken to have the regularity Proof.
Step 1: Definition of w. Let η be a cutoff supported at each t ∈ [0, T strong ) in the set I v (t) + B rc/2 with η ≡ 1 in I v (t) + B rc/4 and |∇η| ≤ Cr −1 c , Define the vector field W as given in (95) and set (making use of the notation a ∧ b = min{a, b} and a ∨ b = max{a, b}) as well as For this choice, we have (note that this directly implies the last claim about the regularity of w, namely ∇w(·, t) ∈ W 1,∞ (R d \ (I v (t) ∪ I h + e (t) ∪ I h + e (t))) for almost every t) as well as Moreover, note that (104) entails by the definition of the vector field W ∇ · w + (x, t) Analogous formulas and properties can be derived for w − . The function w + + w − would then satisfy our conditions, with the exception of the solenoidality ∇ · w = 0. For this reason, we introduce the (usual) kernel It is immediate that ∇ · w = 0.
Observe that this also implies by (95) From this, Theorem 34, and the fact that ∇θ is a singular integral kernel subject to the assumptions of Theorem 34, we deducê Combining the estimates (109) and (111) with the corresponding inequalities for w − and θ * ∇ · w − , we deduce our estimate (94).
Now, let R > 1 be big enough such that I v (t) + B rc ⊂ B R (0) for all t ∈ [0, T strong ). We then estimate with an integration by parts and Theorem 34 applied to the singular integral operator ∇θ By Young's inequality for convolutions, (110), (112) and (113) we then obtain Together with the respective estimates for w − and θ * (∇ · w − ), this implies (93). The estimate (96) follows directly from (102) and the estimates (111) and (114) on the H 1 -norm of θ * (∇ · w + ) as well as the definition of w − and the analogous estimates for θ * (∇ · w − ).
Step 4: L 2 L ∞ -estimate for ∇w. By making use of the precise formula (104) for ∇w + and the definition of the vector field W in (95), we immediately get To estimate the contribution from |∇(θ * (∇ · w + ))| we use the same dyadic decomposition as in (117). We start with the terms in the range k = log e 2 (t) , . . . , 0. Let x ∈ I v (t) and y ∈ (−r c , r c ) be fixed. We abbreviatex := x+yn v (x, t). Denote by F x the tangent plane of the interface I v (t) at the point x. Let Φ Fx : F x ×R → R d be the diffeomorphism given by Φ Fx (x,ŷ) :=x+ŷn Fx (x). We start estimating using the change of variables Φ Fx , the bound |∇θ k (x)| ≤ Cχ 2 k−1 ≤|x|≤2 k+1 |x| −d , as well as the fact thatx + yn Fx (x) =x + yn v (x, t) is exactly the point on the ray originating fromx ∈ F x in normal direction which is closest tox Note that the right hand side is independent of y. Hence, we may estimate with Minkowski's inequality The inner integral is to be understood in the Cauchy principal value sense. To proceed we use the L 2 -theory for singular operators of convolution type, the precise formula (106) for ∇ · w + as well as (17) and (26) An application of (76a) and the assumption E[χ u , u, V |χ v , v](t) ≤ e 2 (t) finally yields We move on with the contributions in the range k = 1, . . . , ∞. Note that by (119) we may directly infer from (76a) and the assumption . Moreover, the contributions estimated in (121) and (122) result in a bound of the form (recall that e(t) < r c ) Note that when summing the respective bounds from (126) and (127) over the relevant range k = −∞, . . . , log e 2 (t) − 1, we actually gain a factor e(t), i.e., the contributions estimated in (126) and (127) then directly yield a bound of the form Finally, the contribution from (125) may be estimated as follows. Let x ∈ I v (t), y ∈ [−r c , r c ] and denote by Fx the tangent plane to the manifold {dist ± (x, I v (t)) = h + e(t) (P Iv(t) x)} at the nearest point tox = x + yn v (x, t). In light of (125), we start estimating for k ≤ log e 2 (t) − 1 by using Jensen's inequality, the bound |∇h + e(t) | ≤ Cr −2 c from Proposition 26, as well as the fact that |x −x| ≥ |x −x| for allx ∈ I v (t) (since x = P Iv(t)x is the closest point tox on the interface I v (t)) ˆFx |∇h + e(t) (x)| 2 |x −x| d−1 dS(x).
Since this bound does not depend anymore on y ∈ [−r c , r c ], we may estimate the contributions from (125) using Minkowski's inequality as well as once more the L 2 -theory for singular operators of convolution type to reduce everything to the H 1 -bound (76a) for the local interface error heights. All in all, the contributions from (125) are therefore bounded by The asserted bound (98) then finally follows from collecting the estimates (129), (130), (131), (132), (133) and (134) together with the analogous bounds for ∇w − and ∇(θ * ∇ · w − ).
We now aim to make use of (76d) to further estimate the second term in the right hand side of (135). To establish the corresponding L 2 -resp. L 4 3 -contributions, we first need to perform an integration by parts in order to use (76d). The resulting curvature term as well as all other terms which do not appear in the third term of (135) can be directly bounded by a term whose associated L 2 -norm is controlled by Cr −1 with the corresponding L 2 -bound In both bounds, we add and subtract the compensation function w and therefore obtain together with (96) and (40) |h ± e(t) | 2 + |∇h ± e(t) | 2 dS (139) Analogous estimates may be derived for w − . We therefore proceed with the terms related to θ * ∇ · w ± . First of all, note that the singular integral operator (θ * ∇·) satisfies (see Theorem 34) θ * ∇ ·ĝ . We want to exploit the fact that the vector field V n has bounded derivatives up to second order, see (41) and (42). Moreover, the kernel ∇ 2 θ(x −x) ⊗ (x − x) gives rise to a singular integral operator of convolution type, as does ∇θ. To see this, we need to check whether its average over S d−1 vanishes. We write for every 0 < r < 1. Passing to the limit r → 1 shows that ∇F , and therefore also ∇ 2 θ(x) ⊗ x, have vanishing average on S d−1 . We may now compute (where the integrals are well defined in the Cauchy principal value sense due to the above considerations) for almost every We then estimate using Young's inequality for convolutions and As a consequence, we obtain from (141), Young's inequality for convolutions, (112) as well as (42) |h + e(t) | 2 dS.
6.4. Estimate for the additional surface tension terms. Having established all the relevant properties of the compensating vector field w in Proposition 27, we can move on with the post-processing of the additional terms in the relative entropy inequality from Proposition 9. To this end, we start with the additional surface tension terms given by A precise estimate for these terms is the content of the following result.
Lemma 28. Let the assumptions and notation of Proposition 27 be in place. In particular, we assume that there exists a C 1 -function e : [0, T strong ) → [0, r c ) such that the relative entropy is bounded by E[χ u , u, V, |χ v , v](t) ≤ e 2 (t). Then the additional surface tension terms A surT en are bounded by a Gronwall-type term Proof. We estimate term by term in (145). A straightforward estimate for the first two terms using also the coercivity property (37) yields Making use of (17), a change of variables Φ t , Hölder's and Young's inequality, (96), (39), (76a) as well as the coercivity property (34) the term III may be bounded by rc] |w(x+yn v (x, t))| 2 dS dt For the term IV , we first add zero, then perform an integration by parts which is followed by an application of Hölder's inequality to obtain By definition of ξ, see (30), recall that Recalling also (94), (95) and (111) as well as making use of (76c), (17), (26), (76a) and finally the coercivity property (34) the term (IV ) a from (149) is estimated by as well as (76a) |h ± e(t) | 2 dS dt (151) To estimate the term (IV ) c from (149), we again make use of the definition of , (17), Hölder's and Young's inequality, (96) as well as (76a) which yields the following bound Hence, taking together the bounds from (150), (151) and (152) we obtain In order to estimate the term V , we argue as follows. In a first step, we split R d into the region I v (t) + B rc near to and the region R d \ (I v (t) + B rc ) away from the interface of the strong solution. Recall then that the indicator function χ u (·, t) of the varifold solution is of bounded variation in I v (t) + B rc . In particular, Applying Theorem 35 in local coordinates, the sections are guaranteed to be one-dimensional Caccioppoli sets in (−r c , r c ), and such that all of the four properties listed in In other words, we distinguish between those sections which consist of at most one interval and those which consist of at least two intervals. It also turns out to be useful to further keep track of whether n v · n u ≤ 1 2 or n v · n u ≥ 1 2 holds.
We then obtain by Young's and Hölder's inequality as well as the fact that due to Definition 12 the vector field ξ is supported in To estimate V a from (154), we use the co-area formula for rectifiable sets (see [10, (2.72)]), (98), Hölder's inequality and the coercivity property (36) which together yield (we abbreviate in the first line F (x, y, t) : It remains to bound the term V b from (154). To this end, we make use of the fact that it follows from property iv) in Theorem 35 that every second point y ∈ ∂ * E + x ∩ (−r c , r c ) has to have the property that n v (x) · n u (x+yn v (x, t)) < 0, i.e., 1 ≤ 1 − n v (x) · n u (x+yn v (x, t)). We may therefore estimate with the help of the co-area formula for rectifiable sets (see [10, (2.72)]) and the bound (97) All in all, we obtain from the assumption E[χ u , u, V |χ v , v](t) ≤ e 2 (t) as well as (154), (155), (156) and (97) Hence, we deduce from the bounds (147), (148), (153), (157) as well as (97) the asserted estimate for the additional surface tension terms. 6.5. Estimate for the viscosity terms. In contrast to the case of equal shear viscosities µ + = µ − , we have to deal with the problematic viscous stress term given by (µ(χ v ) − µ(χ u ))(∇v + ∇v T ). We now show that the choice of w indeed compensates for (most of) this term in the sense that the viscosity terms from Proposition 9 may be bounded by a Gronwall-type term.
Lemma 29. Let the assumptions and notation of Proposition 27 be in place. In particular, we assume that there exists a C 1 -function e : [0, T strong ) → [0, r c ) such that the relative entropy is bounded by Then, for any δ > 0 there exists a constant C > 0 such that the viscosity terms R visc + A visc may be estimated by Proof. We argue pointwise for the time variable and start by adding zero We start by estimating the first four terms. Note that µ( from (76b) we see that ).
Hence, we may rewrite Carrying out an analogous computation for IV , using again the definition of the smoothed approximation for χ u from (76b) and using (94) as well as (95), we then get the bound |h ± e(t) | 2 +|∇h ± e(t) | 2 dS Plugging in the estimates (76a) and (76c), we obtain by Young's inequality for every δ ∈ (0, 1). To estimate the last two terms V and V I in (160), we may rewrite making use of the definition (95) of the vector field W and abbreviating where in the penultimate step we have used the fact that ∇ · (u − v − w) = 0, and in the last step we added zero. This yields after an integration by parts .
As a consequence of (94), (76a), (17) and the global Lipschitz estimate |∇h ± e (·, t)| ≤ Cr −2 c from Proposition 26, we obtain ˆR By a change of variables Φ t , (16), (40), (76a) and an application of Young's and Korn's inequality, the latter two terms may be further estimated by |h + e(t) | 2 + |∇h + e(t) | 2 dS for every δ ∈ (0, 1]. In total, we obtain the bound where δ ∈ (0, 1) is again arbitrary. Analogously, one can derive a bound of the same form for the last term V I in (160). Together with the bounds from (161) as well as (162) this concludes the proof. 6.6. Estimate for terms with the time derivative of the compensation function. We proceed with the estimate for the terms from the relative entropy inequality of Proposition 9 which are related to the time derivative of the compensation function w.
Lemma 30. Let the assumptions and notation of Proposition 27 be in place. In particular, we assume that there exists a C 1 -function e : [0, T strong ) → [0, r c ) such that the relative entropy is bounded by Then, for any δ > 0 there exists a constant C > 0 such that A dt may be estimated by Proof. To estimate the terms involving the time derivative of w we make use of the decomposition of ∂ t w + (v · ∇)w from (99): Employing the bounds (57a), (57b) and the assumption Making use of (76a), the bound for the vector fieldĝ from (100), the Gagliardo- Now, by an application of Young's and Korn's inequality for all the terms on the right hand side of (166) which include an L 2 -norm of the gradient of u − v − w (in the case d = 3 we use a where δ ∈ (0, 1) is arbitrary. This gives the desired bound for the L 4 3 -contribution of ∂ t w + (v · ∇)w. Concerning the L 2 -contribution, we estimate using (57a), (76a), the bound for g L 2 from (101) as well as the assumption E[χ u , u, V |χ v , v](t) ≤ e(t) 2 Hence, by another application of Young's and Korn's inequality, we may bound where δ ∈ (0, 1] is again arbitrary. All in all, (167) and (169) therefore imply the desired bound. 6.7. Estimate for the additional advection terms. We move on with the additional advection terms from the relative entropy inequality of Proposition 9 A precise estimate is the content of the following result.
Lemma 31. Let the assumptions and notation of Proposition 27 be in place. In particular, we assume that there exists a C 1 -function e : [0, T strong ) → [0, r c ) such that the relative entropy is bounded by E[χ u , u, V, |χ v , v](t) ≤ e 2 (t). Then the additional advection terms A adv may be bounded by a Gronwall-type term Proof. A straightforward estimate yields Making use of (93), (97) as well as (76a) immediately shows that the desired bound holds true.
6.8. Estimate for the additional weighted volume term. It finally remains to state the estimate for the additional weighted volume term from the relative entropy inequality of Proposition 9 A weightV ol :=ˆT Lemma 32. Let the assumptions and notation of Proposition 27 be in place. In particular, we assume that there exists a C 1 -function e : [0, T strong ) → [0, r c ) such that the relative entropy is bounded by E[χ u , u, V, |χ v , v](t) ≤ e 2 (t). Then the additional weighted volume term A weightV ol may be bounded by a Gronwall term Proof. We may use the exact same argument as in the derivation of the estimate for the term III from the additional surface tension terms A surT en , see (148).
6.9. The weak-strong uniqueness principle with different viscosities. Before we proceed with the proof of Theorem 1, let us summarize the estimates from the previous sections in the form of a post-processed relative entropy inequality. The proof is a direct consequence of the relative entropy inequality from Proposition 9 and the bounds (44) Let ξ be the extension of the inner unit normal vector field n v of the interface I v (t) from Definition 12. Let w be the vector field contructed in Proposition 27. Let β be the truncation of the identity from Proposition 9, and let θ be the density θ t = d|∇χu(·,t)| d|Vt| S d−1 . Let e : [0, T strong ) → (0, r c ] be a C 1 -function and assume that the relative entropy Then the relative entropy is subject to the estimate for almost every T ∈ [0, T strong ). Here, C > 0 is a constant which is structurally of the form C = Cr −22 depending on the various norms of the velocity field of the strong solution, the regularity parameter r c of the interface of the strong solution, and the physical parameters ρ ± , µ ± , and σ.
We have everything in place to to prove the main result of this work.
Proof of Theorem 1. The proof of Theorem 1 is based on the post-processed relative entropy inequality of Proposition 33. It amounts to nothing but a more technical version of the upper bound valid for all solutions of the differential inequality d dt E(t) ≤ CE(t)| log E(t)|. However, it is made more technical by the more complex right-hand side (33) in the relative entropy inequality (which involves the anticipated upper bound e(t) 2 ) and the smallness assumption on the relative entropy E[χ u , u, V |χ v , v](t) needed for the validity of the relative entropy inequality.
We start the proof with the precise choice of the function e(t) as well as the necessary smallness assumptions on the initial relative entropy. We then want to exploit the post-processed form of the relative entropy inequality from Proposition 33 to compare E[χ u , u, V |χ v , v](t) with e(t).
Let C > 0 be the constant from Proposition 33 and choose δ > 0 such that δ < 1 6(C+1) . Let ε > 0 (to be chosen in a moment, but finally we will let ε → 0) and consider the strictly increasing function Note that e 2 (0) = E[χ u , u, V |χ v , v](0) + ε which strictly dominates the relative entropy at the initial time. To ensure the smallness of this function, let us choose c > 0 small enough such that whenever we have E[χ u , u, V |χ v , v](0) < c and ε < c, it holds that for all t ∈ [0, T strong ). This is indeed possible since the condition in (176) is equivalent to 1 2 log(E[χ u , u, V |χ v , v](0) + ε) < e T strong δ log( 1 3C ∧ r c ). For technical reasons to be seen later, we will also require c > 0 be small enough such that whenever E[χ u , u, V |χ v , v](0) < c and ε < c. We proceed with some further computations. We start with d dt This in particular entails After these preliminary considerations, let us consider the relative entropy inequality from Proposition 9. Arguing similarly to the derivation of the relative entropy inequality in Proposition 9 but using the energy dissipation inequality in its weaker form E[χ u , u, V |χ v , v](T ) ≤ E[χ u , u, V |χ v , v](τ ) for a. e. τ ∈ [0, T ], we may deduce (upon modifying the solution on a subset of [0, T strong ) of vanishing measure) for all τ ∈ [0, T strong ). Now, consider the set T ⊂ [0, T strong ) which contains all τ ∈ [0, T strong ) such that lim sup T ↓τ E[χ u , u, V |χ v , v](T ) > e 2 (τ ). Note that 0 ∈ T . We define Since E[χ u , u, V |χ v , v](0) < e 2 (0) and e 2 is strictly increasing, we deduce by the same argument which established (180) that T * > 0. Hence, we can apply Proposition 33 at least for times T < T * (with τ = 0). However, by the same argument as before the relative entropy inequality from Proposition 9 shows that (T ) may be bounded by means of the post-processed relative entropy inequality. Hence, we obtain using also (175) and (178) We compare this to the equation (179) for e 2 (t) (with τ = 0 and T = T * ). Recall that e 2 (0) strictly dominates the relative entropy at the initial time. Because of (177), the second term on the right hand side of (181) is dominated by one third of the right hand side of (179). Because of (176) and the choice δ < 1 6(C+1) the same is true for the other two terms on the right hand side of (181). In particular, we obtain using also (180) lim sup which contradicts the defining property of T * . This concludes the proof since the asserted stability estimate as well as the weak-strong uniqueness principle is now a consequence of letting ε → 0.

Derivation of the relative entropy inequality
Proof of Proposition 9. We start with the following observation. Since the phasedependent density ρ(χ v ) depends linearly on the indicator function χ v of the volume occupied by the first fluid, it consequently satisfieŝ for almost every T ∈ [0, T strong ) and all ϕ ∈ C ∞ cpt (R d × [0, T strong )). By approximation, the equation holds for all ϕ ∈ W 1,∞ (R d × [0, T strong )). Testing this equation is a smooth vector field, we then obtain for almost every T ∈ [0, T strong ). Note that the velocity field v of a strong solution has the required regularity to justify the preceding step. Next, we subtract from (183) the equation for the momentum balance (10a) of the strong solution evaluated with a test function η ∈ C ∞ cpt (R d × [0, T strong ); R d ) such that ∇ · η = 0. This shows that the velocity field v of the strong solution satisfies H · η dS dt which holds for almost every T ∈ [0, T strong ) and all η ∈ C ∞ cpt (R d × [0, T strong ); R d ) such that ∇ · η = 0. The aim is now to test the latter equation with the field u − v − w. To this end, we fix a radial mollifier φ : R d → [0, ∞) such that φ is smooth, supported in the unit ball and´R d φ dx = 1. For n ∈ N we define φ n (·) := n d φ(n ·) as well as u n := φ n * u and analogously v n and w n . We then test (184) with the test function u n − v n − w n and let n → ∞. Since the traces of u n , v n and w n on I v (t) converge pointwise almost everywhere to the respective traces of u, v and w, we indeed may pass to the limit in the surface tension term of (184). Hence, we obtain the identity which holds true for almost every T ∈ [0, T strong ).
In the next step, we test the analogue of (182) for the phase-dependent density ρ(χ u ) of the varifold solution with the test function 1 2 |v + w| 2 and obtain for almost every T ∈ [0, T strong ). Recall also from the definition of a varifold solution that we are equipped with the energy dissipation inequalitŷ which holds for almost every T ∈ [0, T strong ). Finally, we want to test the equation for the momentum balance (6a) of the varifold solution with the test function v + w. Since the normal derivative of the tangential velocity of a strong solution may feature a discontinuity at the interface, we have to proceed by an approximation argument, i.e., we use the mollified version v n + w n as a test function. Note that v n resp. w n are elements of L ∞ ([0, T strong ); C 0 (R d )). Hence, we may indeed use v n + w n as a test function in the surface tension term of the equation for the momentum balance (6a) of the varifold solution. However, it is not clear a priori why one may pass to the limit n → ∞ in this term.
To argue that this is actually possible, we choose a precise representative for ∇v resp. ∇w on the interface I v (t). This is indeed necessary also for the velocity field of the strong solution since the normal derivative of the tangential component of v may feature a jump discontinuity at the interface. However, by the regularity assumptions on v, see Definition 6 of a strong solution, and the assumptions on the compensating vector field w, for almost every t ∈ [0, T strong ) every point x ∈ R d is either a Lebesgue point of ∇v (respectively ∇w) or there exist two half spaces H 1 and H 2 passing through x such that x is a Lebesgue point for both ∇v| H1 and ∇v| H2 (respectively ∇w| H1 and ∇w| H2 ). In particular, by the L ∞ bounds on ∇v and ∇w the limit of the mollifications ∇v n respectively ∇w n exist at every point x ∈ R d and we may define ∇v respectively ∇w at every point x ∈ R d as this limit.
Recall then that we have chosen the mollifiers φ n to be radially symmetric. Hence, the approximating sequences ∇v n resp. ∇w n converge pointwise everywhere to the precise representation as chosen before. Since both limits are bounded, we may pass to the limit n → ∞ in every term appearing from testing the equation for the momentum balance (6a) of the varifold solution with the test function v n + w n .
Moreover, collecting all advection terms on the right hand side of (185), (186), and (188) as well as adding zero gives the contribution Next, we may rewrite those terms on the right hand side of (185), (186), and (188) which contain a time derivative as follows Furthermore, the terms related to surface tension on the right hand side of (185) and (188) are given by RHS surT en = σˆT H · w dS dt.
We proceed by rewriting the surface tension terms. For the sake of brevity, let us abbreviate from now on n u = ∇χu |∇χu| . Using the incompressibility of v and adding zero, we start by rewriting Next, by means of the compatibility condition (6e) we can write σˆT 0ˆIv(t) n u · (n u · ∇)v dS dt − σˆT Moreover, the compatibility condition (6e) also ensures that −σˆT whereas it follows from (8) σˆT Using that the divergence of ξ equals the divergence of n v (P Iv(t) x) on the interface of the strong solution (i. e. H = −(∇ · ξ)n v ; see Definition 12, i.e., the cutoff function does not contribute to the divergence on the interface), that the latter quantity equals the scalar mean curvature (recall that n v = ∇χv |∇χv| points inward) as well as once more the incompressibility of the velocity fields v resp. u we may also rewrite −σˆT The preceding five identities together then imply that Following the computation which led to (197) we also obtain the identity Using the fact that w is divergence-free, we may also rewrite − σˆT 0ˆR d n u · (n u · ∇)w d|∇χ u | dt Appealing once more to the fact that ξ = n v on the interface I v of the strong solution (see Definition 12) and ∇ · w = 0, we obtain σˆT 0ˆIv(t) The last three identities together with (197) and (196) in total finally yield the following representation of the surface tension terms on the right hand side of (185) and (188) It remains to collect the viscosity terms from the left hand side of (185), (187) and (188). Adding also zero, we obtain In particular, as an intermediate summary we obtain the following bound making already use of the notation of Proposition 9: Taking the bound (189) together with the identities from (190) to (195) as well as (198) and (199) yieldŝ The aim of the next step is to use σ(∇ · ξ) (see Definition 12) as a test function in the transport equation (6b) for the indicator function χ u of the varifold solution. For the sake of brevity, we will write again n u = ∇χu |∇χu| . Plugging in σ(∇ · ξ) and integrating by parts yields −σˆR d n u (·, T ) · ξ(·, T ) d|∇χ u (·, T )| + σˆR = σˆT 0ˆR d n u · Id−n v (P Iv(t) x) ⊗ n v (P Iv(t) x) (∇v) T ξ d|∇χ u | dt which holds for almost every T ∈ [0, T strong ). Next, we study the quantity RHS tilt := σˆT 0ˆR d n u · (v · ∇)ξ d|∇χ u | dt + σˆT 0ˆR d χ u (u · ∇)(∇ · ξ ) dx dt (202) + σˆT 0ˆR d n u · Id−n v (P Iv(t) x) ⊗ n v (P Iv(t) x) (∇v) T · ξ d|∇χ u | dt.
Due to the regularity of v resp. ξ as well as the incompressibility of the velocity field v we get Exploiting the fact that ξ(x) = n v (P Iv(t) x)ζ(x) and n v (P Iv(t) x) only differ by a scalar prefactor, namely the cut-off multiplier ζ(x) which one can shift around, it turns out to be helpful to rewrite σˆT 0ˆR d n u · Id−n v (P Iv(t) x) ⊗ n v (P Iv(t) x) (∇v) T · ξ d|∇χ u | dt = σˆT 0ˆR d ξ · (Id−n v (P Iv(t) x) ⊗ n v (P Iv(t) x))n u · ∇)v d|∇χ u | dt (204) = σˆT 0ˆR d ξ · n u − (n v (P Iv(t) x) · n u )n v (P Iv(t) x) · ∇ v d|∇χ u | dt Hence, by using (203) and (204) we obtain

Appendix
Theorem 34 (Boundedness of singular integral operators of convolution type in L p ). Let d ≥ 2, p ∈ (1, ∞), and let K : S d−1 → R be a function of class C 1 with vanishing average. Let f ∈ L p (R d ) and define where the integral is understood in the Cauchy principal value sense. Then there exists a constant C > 0 depending only on d, p, and K such that We also state a non-trivial result from geometric measure theory on properties of one-dimensional sections of Caccioppoli sets.
Theorem 35 ([30, Theorem G]). Consider a set G of finite perimeter in R d , denote by ν G = (ν G x1 , . . . , ν G x d−1 , ν G y ) ∈ R d the associated measure theoretic inner unit normal vector field of the reduced boundary ∂ * G, and let χ * G be the precise representative of the bounded variation function χ G . Then for Lebesgue almost every x ∈ R d−1 the one-dimensional sections G x := {y ∈ R : (x, y) ∈ G} satisfy the following properties: i) G x is a set of finite perimeter in R, χ G (x, ·) = χ * G (x, ·) Lebesgue almost everywhere in G x , ii) (∂ * G) x = ∂ * G x , iii) ν G y (x, y) = 0 for all y ∈ R such that (x, y) ∈ ∂ * G, and iv) lim y→y + 0 χ * G (x, y) = 1 and lim y→y − 0 χ * G (x, y) = 0 whenever ν G y (x, y 0 ) > 0, and vice versa if ν G y (x, y 0 ) < 0. In particular, for every Lebesgue measurable set M ⊂ R d−1 there exists a Borel measurable subset M G ⊂ M such that L d−1 (M \ M G ) = 0 and the four properties stated above are satisfied for all y ∈ M G .
To bound the L 4 -norm of the interface error heights h ± in the case of a twodimensional interface, we employ the following optimal Orlicz-Sobolev embedding.  Then for any weakly differentiable function u decaying to 0 at infinity in the sense {|u(x)| > s} < ∞ for all s > 0, the following estimate holds true: The application of the optimal Orlicz-Sobolev embedding to our setting is stated and proved next. for e(t) ≤ s ≤ 1, 2s − 1 for s ≥ 1.
Proof. Let U ⊂ R 2 be an open and bounded set and consider u ∈ C 1 cpt (U ) such that u L ∞ ≤ 1. For the sake of brevity, let us suppress for the moment the dependence on the variable t ∈ [0, T ). The idea is to apply the optimal Orlicz-Sobolev embedding provided by the preceding theorem with respect to the convex function A e . Observe first that A e indeed satisfies all the assumptions. Moreover, since d = 2 we compute (H(r)) 2 =ˆr for r ≤ e, 1 + log r e for e ≤ r ≤ 1, 1 + log 1 e + r−1 2 + 1 4 log(2r − 1) for r ≥ 1. As a consequence, we get for y ≤ 1, = e exp(y 2 − 1) for 1 ≤ y ≤ 1 + log 1 e , ≥ (y 2 − 1 − log 1 e ) + 1 for y ≥ 1 + log 1 e , ≤ 2(y 2 − 1 − log 1 e ) + 1 for y ≥ 1 + log 1 e . This in turn entails We then deduce from Theorem 36, d = 2, u L ∞ ≤ 1, the bound exp(s 2 ) ≥ 1 2 s 4 for all s ≥ 0 as well as the bound s 2 − log 1 e ≥ which is precisely what is claimed. Note that since u is continuously differentiable, the singular part in the definition of A e (Du) vanishes. In a next step, we want to extend to smooth functions u on the manifold I(t). By assumption, we may cover I(t) with a finite family of open sets of the form U (x i ) := I(t) ∩ B 2rc (x i ), x i ∈ I(t), such that U (x i ) can be represented as the graph of a function g : B 1 (0) ⊂ R 2 → R with |∇g| ≤ 1 and |∇ 2 g| ≤ r −1 c . We fix a partition of unity {ϕ i } i subordinate to this finite cover of I(t). Note that |∇ϕ i | ≤ Cr −1 c . Note also that the cardinality of the open cover is uniformly bounded in t. Hence, we proceed with deriving the desired bound only for one uϕ, where ϕ = ϕ i is supported in U = U (x i ). Abbreviatingũ = u • g andφ = ϕ • g, we obtain from the previous step Using the bounds A e (t +t) ≤ CA e (t) + CA e (t) and A e (λt) ≤ C(λ + λ 2 )A e (t), which hold for all λ > 0 and all t,t ≥ 0, as well as the product and chain rule we compute A e |u|(g(x)) dx + CˆB
By definition of A e we can further estimatê
Definition 5 implies a lower bound of cr c for the length of any connected component of I(t)) and such that at any point x ∈ I(t) there are at most two i with η i (x) > 0. Treating by abuse of notation the function η i u as if defined on a real interval I = (a, b), we then write η i (x)u(x) =ˆx a η i (y)u (y) + η i (y)u(y) dy  From this we infer the desired estimate by approximation.