De Giorgi’s inequality for the thresholding scheme with arbitrary mobilities and surface tensions

We provide a new convergence proof of the celebrated Merriman–Bence–Osher scheme for multiphase mean curvature flow. Our proof applies to the new variant incorporating a general class of surface tensions and mobilities, including typical choices for modeling grain growth. The proof is based on the minimizing movements interpretation of Esedoḡlu and Otto and on De Giorgi's general theory of gradient flows. Under a typical energy convergence assumption we show that the limit satisfies a sharp energy-dissipation relation.


Introduction
The thresholding scheme is a highly efficient computational scheme for multiphase mean curvature flow (MCF) which was originally introduced by Merriman, Bence, and Osher [27,28]. The main motivation for MCF comes from metallurgy, where it models the slow relaxation of grain boundaries in polycrystals [31]. Each "phase" in our mathematical jargon corresponds to a grain, i.e., a region of homogeneous crystallographic orientation. The effective surface tension $\sigma_{ij}(\nu)$ and the mobility $\mu_{ij}(\nu)$ of a grain boundary depend on the mismatch between the lattices of the two adjacent grains $\Omega_i$ and $\Omega_j$ and on the relative orientation of the grain boundary, given by its normal vector $\nu$. It is well known that for small mismatch angles, the dependence on the normal can be neglected [32]. The effective evolution equations then read

$$V_{ij} = -\mu_{ij}\sigma_{ij}H_{ij} \quad \text{along the grain boundary } \Sigma_{ij}, \tag{1}$$

where $V_{ij}$ and $H_{ij}$ denote the normal velocity and mean curvature of the grain boundary $\Sigma_{ij}$, respectively, together with Herring's angle condition along triple junctions,

$$\sigma_{ij}\nu_{ij} + \sigma_{jk}\nu_{jk} + \sigma_{ki}\nu_{ki} = 0, \tag{2}$$

which is a balance-of-forces condition and simply states that triple junctions are in local equilibrium; here $\nu_{ij}$ denotes the unit normal of $\Sigma_{ij}$ pointing from $\Omega_i$ into $\Omega_j$ (Fig. 1). We refer the interested reader to [17] for more background on the modeling.

Efficient numerical schemes make it possible to carry out large-scale simulations that give insight into relevant statistics like the average grain size or the grain boundary character distribution, as an alternative to studying corresponding mean field limits as in [4,18]. The main obstruction to directly discretizing the dynamics (1)-(2) is the ubiquity of topological changes in the network of grain boundaries, for example the vanishing of grains. Thresholding instead handles such topological changes naturally. The scheme is a time discretization which alternates between the following two operations: (i) convolution with a smooth kernel; (ii) thresholding. The second step is a simple pointwise operation, and the first step can be implemented efficiently using the Fast Fourier Transform.
One of the main objectives of our analysis is to rigorously justify this intriguingly simple scheme in the presence of such topological changes.
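The two operations (i) and (ii) can be illustrated by a minimal sketch of the classical two-phase scheme on the unit torus. This is not the authors' implementation: it assumes the standard heat kernel $G_t(z) = (4\pi t)^{-d/2}e^{-|z|^2/(4t)}$, whose periodization has Fourier coefficients $e^{-4\pi^2|k|^2t}$, and the helper names `heat_multiplier`, `mbo_step`, and `disk` are hypothetical.

```python
import numpy as np

def heat_multiplier(n, t):
    """Fourier multiplier exp(-4 pi^2 |k|^2 t) of the heat semigroup on [0,1)^2.
    Assumption: standard heat kernel normalization (solution operator of u_t = Laplace u)."""
    k = np.fft.fftfreq(n, d=1.0 / n)              # integer frequencies
    kx, ky = np.meshgrid(k, k, indexing="ij")
    return np.exp(-4.0 * np.pi**2 * (kx**2 + ky**2) * t)

def mbo_step(chi, h):
    """One step of two-phase thresholding with time step h."""
    n = chi.shape[0]
    # (i) convolution with the heat kernel, done as a multiplication in Fourier space
    u = np.real(np.fft.ifft2(heat_multiplier(n, h) * np.fft.fft2(chi)))
    # (ii) thresholding at the level 1/2
    return (u > 0.5).astype(float)

def disk(n, radius):
    """Characteristic function of a centered disk, sampled at cell centers."""
    x = (np.arange(n) + 0.5) / n
    X, Y = np.meshgrid(x, x, indexing="ij")
    return ((X - 0.5)**2 + (Y - 0.5)**2 < radius**2).astype(float)

if __name__ == "__main__":
    chi = disk(128, 0.3)
    print("initial area fraction:", chi.mean())
    for _ in range(5):
        chi = mbo_step(chi, h=4e-3)
    print("area fraction after 5 steps:", chi.mean())
```

The time step in this sketch is deliberately chosen so that the interface moves by more than one grid cell per step; for much smaller $h$ on the same grid the discrete interface may get stuck, which is a discretization effect and not a feature of the scheme analyzed here.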
The basis of our analysis is the underlying gradient-flow structure of (1)-(2), which means that the solution follows the steepest descent in an energy landscape. More precisely, the energy is the total interfacial area weighted by the surface tensions $\sigma_{ij}$, and the metric tensor is the $L^2$-product on normal velocities, weighted by the inverse mobilities $\frac{1}{\mu_{ij}}$. One can read off this structure from the inequality

$$\frac{d}{dt}\sum_{i<j}\sigma_{ij}\,\mathcal{H}^{d-1}(\Sigma_{ij}) \le -\sum_{i<j}\frac{1}{\mu_{ij}}\int_{\Sigma_{ij}} V_{ij}^2\,d\mathcal{H}^{d-1},$$

which is valid for sufficiently regular solutions to (1)-(2). In the seminal work [8], Esedoḡlu and Otto showed that the efficient thresholding scheme respects this gradient-flow structure, as it may be viewed as a minimizing movements scheme in the sense of De Giorgi. More precisely, they show that each step in the scheme is equivalent to solving a variational problem of the form

$$\Sigma^n \in \arg\min_{\Sigma}\Big\{E_h(\Sigma) + \frac{1}{2h}\,d_h^2(\Sigma,\Sigma^{n-1})\Big\}, \tag{3}$$

where $E_h(\Sigma)$ and $d_h(\Sigma,\Sigma^{n-1})$ are proxies for the total interfacial energy of the configuration $\Sigma$ and the distance of the configuration $\Sigma$ to the one at the previous time step $\Sigma^{n-1}$, respectively. Since the work of Jordan, Kinderlehrer, and Otto [15], the importance of the formerly often neglected metric in such gradient-flow structures has been widely appreciated. Also in the present work, the focus lies on the metric, which in the case of MCF is well known to be completely degenerate [29]. This explains the proxy for the metric appearing in the related well-known minimizing movements scheme for MCF by Almgren, Taylor, and Wang [1], and Luckhaus and Sturzenhecker [25]. This remarkable connection between the numerical scheme and the theory of gradient flows has the practical implication that it made clear how to generalize the algorithm to arbitrary surface tensions $\sigma_{ij}$. From the point of view of numerical analysis, (3) means that thresholding behaves like the implicit Euler scheme and is therefore unconditionally stable. The variational interpretation of the thresholding scheme of course has implications for the analysis of the algorithm as well.
It allowed Otto and one of the authors to prove convergence results in the multiphase setting [19,20], which lies beyond the reach of the more classical viscosity approach based on the comparison principle implemented in [3,9,13]. This variational viewpoint has also turned out to be useful in different frameworks, such as MCF in higher codimension [23] or the Muskat problem [14]. The only downside of the generalization [8] is the somewhat unnatural choice of effective mobilities $\mu_{ij} = \frac{1}{\sigma_{ij}}$. Only recently, Salvador and Esedoḡlu [33] have presented a strikingly simple way to incorporate a wide class of mobilities $\mu_{ij}$ as well. Their algorithm is based on the fact pointed out in [7] that although the same kernel appears in the energy and the metric, each term only uses certain properties of the kernel, which can be tuned independently: starting from two Gaussian kernels $G_\gamma$ and $G_\beta$ of different width, they find a positive linear combination $K_{ij} = a_{ij}G_\gamma + b_{ij}G_\beta$ whose effective mobility and surface tension match the given $\mu_{ij}$ and $\sigma_{ij}$, respectively. It is remarkable that this algorithm retains the same simplicity and structure as the previous ones [8,28]. We refer to Sect. 2 for the precise statement of the algorithm.
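To make the tuning of the two Gaussians concrete, the following sketch solves, for each pair $i \neq j$, a $2\times 2$ linear system for $a_{ij}$ and $b_{ij}$. The normalization used here is an assumption and is not taken from the text above: a Gaussian of width $\theta$ is assumed to contribute $\sqrt{\theta}/\sqrt{\pi}$ to the effective surface tension and $1/(\sqrt{\pi}\sqrt{\theta})$ to the effective inverse mobility. The function name `kernel_coefficients` is hypothetical.

```python
import numpy as np

def kernel_coefficients(sigma, mu, gamma, beta):
    """Coefficients (a_ij, b_ij) of K_ij = a_ij G_gamma + b_ij G_beta.

    ASSUMED matching conditions (normalization is an assumption of this sketch):
        sigma_ij  = (a_ij * sqrt(gamma) + b_ij * sqrt(beta)) / sqrt(pi)
        1 / mu_ij = (a_ij / sqrt(gamma) + b_ij / sqrt(beta)) / sqrt(pi)
    Diagonal entries are set to zero and the diagonal of mu is never used.
    """
    sigma = np.asarray(sigma, dtype=float)
    mu = np.asarray(mu, dtype=float)
    N = sigma.shape[0]
    M = np.array([[np.sqrt(gamma), np.sqrt(beta)],
                  [1.0 / np.sqrt(gamma), 1.0 / np.sqrt(beta)]]) / np.sqrt(np.pi)
    a = np.zeros((N, N))
    b = np.zeros((N, N))
    for i in range(N):
        for j in range(N):
            if i != j:
                rhs = np.array([sigma[i, j], 1.0 / mu[i, j]])
                a[i, j], b[i, j] = np.linalg.solve(M, rhs)  # unique since gamma > beta
    return a, b

if __name__ == "__main__":
    sigma = np.array([[0.0, 1.0], [1.0, 0.0]])
    mu = np.array([[0.0, 2.0], [2.0, 0.0]])
    a, b = kernel_coefficients(sigma, mu, gamma=3.0, beta=1.0)
    print(a[0, 1], b[0, 1])
```

Under this assumed normalization, both coefficients are positive exactly when $\beta < \sigma_{ij}\mu_{ij} < \gamma$, so the linear combination is a positive one as soon as $\gamma$ is large and $\beta$ is small.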
In the present work, we prove the first convergence result for this new general scheme. The main novelty here is that the proof applies in the full generality of this new scheme incorporating arbitrary mobilities. Furthermore, this is the first proof of De Giorgi's inequality in the multiphase case. We exploit the gradient-flow structure and show that under the natural assumption of energy convergence, any limit of thresholding satisfies De Giorgi's inequality, a weak notion of multiphase mean curvature flow. This assumption is inspired by the fundamental work of Luckhaus-Sturzenhecker [25] and has appeared in the context of thresholding in [19,20]. We expect it to hold true before the onset of singularities such as the vanishing of grains. Furthermore, at least in the simpler two-phase case, it can be verified for certain singularities [5,6]. We would in fact expect this assumption to be true generically, which however seems to be a difficult problem in the multiphase case.
The present work fits into the theory of general gradient flows even better than the two previous ones [19,20] and crucially depends on De Giorgi's abstract framework, cf. [2]. This research direction was initiated by Otto and the first author and appeared in the lecture notes [21]. There, De Giorgi's inequality is derived for the simple model case of two phases. Here, we complete these ideas and use a careful localization argument to generalize this result to the multiphase case. A further particular novelty of our work is that for the first time, we prove the convergence of the new scheme for arbitrary mobilities [33].
Our proof rests on the fact that thresholding, like any minimizing movements scheme, satisfies a sharp energy-dissipation inequality, which we refer to as (4). It involves the piecewise constant interpolation in time of our approximation, another, intrinsic interpolation in terms of the variational scheme, cf. Lemma 3, and the metric slope $|\partial E_h|$ of $E_h$, cf. (33).
Our main goal is to pass to the limit in (4) and obtain the sharp energy-dissipation relation for the limit, which in the simple two-phase case formally reads

$$\sigma\,\mathcal{H}^{d-1}(\Sigma(T)) + \frac{1}{2}\int_0^T\!\!\int_{\Sigma(t)}\Big(\frac{1}{\mu}V^2 + \mu\sigma^2H^2\Big)\,d\mathcal{H}^{d-1}\,dt \le \sigma\,\mathcal{H}^{d-1}(\Sigma(0)). \tag{5}$$

To this end, one needs sharp lower bounds for the terms on the left-hand side of (4). While the proof of the lower bound on the metric slope of the energy is a straightforward generalization of the argument in [21], the main novelty of the present work lies in the sharp lower bound for the distance term, which bounds the discrete dissipation from below by the $\frac{1}{\mu}$-weighted $L^2$-norm of the normal velocity appearing in (5). This requires us to work on a mesoscopic time scale $\tau \sim \sqrt{h}$, which is much larger than the microscopic time-step size $h$ and which is natural in view of the parabolic nature of our problem. It is remarkable that De Giorgi's inequality (5) in fact characterizes the solution of MCF under additional regularity assumptions. Indeed, if $\Sigma(t)$ evolves smoothly, this inequality can be rewritten as

$$\frac{1}{2}\int_0^T\!\!\int_{\Sigma(t)}\frac{1}{\mu}\,(V + \mu\sigma H)^2\,d\mathcal{H}^{d-1}\,dt \le 0$$

and therefore $V = -\mu\sigma H$. For expository purposes, we focused here on the vanilla two-phase case. In the multiphase case, the resulting inequality implies both the PDEs (1) and the balance-of-forces conditions (2), cf. Remark 1. An optimal energy-dissipation relation like the one here also plays a crucial role in the recent weak-strong uniqueness result for multiphase mean curvature flow by Fischer, Hensel, Simon, and one of the authors [10]. There, a new dynamic analogue of calibrations is introduced and uniqueness is established in the following two steps: (i) any strong solution is a calibrated flow and (ii) every calibrated flow is unique in the class of weak solutions. In fact, Hensel and the first author recently showed in [11] that (a slightly weaker version of) De Giorgi's inequality is sufficient for weak-strong uniqueness. De Giorgi's general strategy, which we are implementing here, is also related to the approaches by Sandier and Serfaty [34] and Mielke [30].
They provide sufficient conditions for gradient flows to converge, in the same spirit as $\Gamma$-convergence of energy functionals implies the convergence of minimizers. In the dynamic situation it is clear that one needs conditions on both energy and metric in order to verify such a convergence.

There has been continuous interest in MCF in the mathematics literature, so we only point out some of the most relevant recent advances. We refer the interested reader to the introductions of [19] and [22] for further related references. The existence of global solutions to multiphase MCF has only been established recently by Kim and Tonegawa [16], who carefully adapt Brakke's original construction and show in addition that phases do not vanish spontaneously. For the reader who wants to become familiar with this topic, we recommend the recent notes [37]. Another approach to understanding the long-time behavior of MCF is to restart strong solutions after singular times. This amounts to solving the Cauchy problem with non-regular initial data, such as planar networks of curves with quadruple junctions. In this two-dimensional setting, this has been achieved by Ilmanen, Neves, and Schulze [12] by gluing in self-similarly expanding solutions for which it is possible to show that the initial condition is attained in some measure theoretic way. Most recently, using a similar approach of gluing in self-similar solutions, but also relying on blow-ups from geometric microlocal analysis, Lira, Mazzeo, Pluda, and Saez [24] were able to construct such strong solutions, prove stronger convergence towards the initial (irregular) network of curves, and classify all such strong solutions.
The rest of the paper is structured as follows. In Sect. 2 we recall the thresholding scheme for arbitrary mobilities introduced in [33], show its connection to the abstract framework of gradient flows, and record the direct implications of this theory. We state and discuss our main results in Sect. 3. Section 4 contains the localization argument in space, which will play a crucial role in the proofs which are gathered in Sect. 5. Finally, in the short "Appendix", we record some basic facts about thresholding.

Setup and the modified thresholding scheme
Here and in the rest of the paper, $[0,1)^d$ denotes the $d$-dimensional torus. Thus, when we deal with functions $u : [0,1)^d \to \mathbb{R}$, we always assume that they have periodic boundary conditions. In particular, they can be extended periodically to $\mathbb{R}^d$. In general, if $u$ is such a function and $f : \mathbb{R}^d \to \mathbb{R}$, then by $f * u$ we mean the convolution on $\mathbb{R}^d$ between $f$ and the periodic extension of $u$, i.e.,

$$(f * u)(x) := \int_{\mathbb{R}^d} f(z)\,u(x - z)\,dz,$$

whenever this expression makes sense.

The modified algorithm
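We start by describing the algorithm proposed by Salvador and Esedoḡlu [33]. Let the symmetric matrix $\sigma = (\sigma_{ij})_{ij} \in \mathbb{R}^{N\times N}$ of surface tensions and the symmetric matrix $\mu = (\mu_{ij})_{ij}$ of mobilities be given. In this work we define for notational convenience $\sigma_{ii} = \mu_{ii} = 0$. Let $\gamma > \beta > 0$ be given. Define the matrices $A$ and $B$ in terms of coefficients $a_{ij}$, $b_{ij}$ for $i \neq j$ and $a_{ii} = b_{ii} = 0$. The coefficients $a_{ij}$, $b_{ij}$ are uniquely determined as solutions of a linear system which matches the effective surface tension and the effective mobility of the kernel $a_{ij}G_\gamma + b_{ij}G_\beta$ to the given $\sigma_{ij}$ and $\mu_{ij}$.

Algorithm 1. The algorithm introduced by Salvador and Esedoḡlu [33] produces the partition at time $(n+1)h$ from the partition at time $nh$ in three steps:
(1) Convolution step: convolve the characteristic function of each phase with the scaled Gaussian kernels of widths $\gamma$ and $\beta$.
(2) For any $i = 1, \dots, N$ form the comparison functions, i.e., the linear combinations of these convolutions over the phases $j \neq i$ with coefficients $a_{ij}$ and $b_{ij}$.
(3) Thresholding step: define $\Omega_i^{n+1}$ as the set of points at which the $i$-th comparison function is the smallest.

We will assume the following:
(13) The coefficients $a_{ij}$, $b_{ij}$ satisfy the strict triangle inequality.
(14) The matrices $A$ and $B$ are positive definite on $(1, \dots, 1)^\perp$.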
In particular, for $v \in (1, \dots, 1)^\perp$ we can define the norms $|v|_A$ and $|v|_B$ induced by $A$ and $B$. We remark that we need the matrices $A$, $B$ to be positive definite on $(1, \dots, 1)^\perp$ to guarantee that the functional defined in (28) is a distance; see the comment following (28) below.
Observe that condition (13) is always satisfied if we choose $\gamma$ large and $\beta$ small, provided the surface tensions and the inverse mobilities satisfy the strict triangle inequality. Such a choice can always be achieved since $\gamma > \beta > 0$ are arbitrary. For the second condition (14), we have the following result of Salvador and Esedoḡlu [33].

Lemma 1
Let the matrix $\sigma$ of the surface tensions and the matrix $\frac{1}{\mu}$ of the inverse mobilities (with the diagonal entries of the inverse set to zero) be negative definite on $(1, \dots, 1)^\perp$. Let $\gamma > \beta$ satisfy the corresponding bounds in terms of $s_i$ and $m_i$, where $s_i$ and $m_i$ are the nonzero eigenvalues of $J\sigma J$ and $J\frac{1}{\mu}J$, respectively, and the matrix $J$ has components $J_{ij} = \delta_{ij} - \frac{1}{N}$. Then $A$ and $B$ are positive definite on $(1, \dots, 1)^\perp$.
In particular, if we choose $\gamma$ large enough and $\beta$ small enough, condition (14) on the matrices $A$, $B$ is satisfied provided the matrices $\sigma$ and $\frac{1}{\mu}$ are negative definite on $(1, \dots, 1)^\perp$. By a classical result of Schoenberg [35] this is the case if and only if $\sqrt{\sigma_{ij}}$ and $1/\sqrt{\mu_{ij}}$ are $\ell^2$-embeddable. In particular, this holds for the choice of Read-Shockley surface tensions and equal mobilities.
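The hypothesis of Lemma 1 can be checked numerically through the eigenvalues of $JSJ$ with $J_{ij} = \delta_{ij} - \frac{1}{N}$. The following is a minimal sketch; the function name `is_negative_definite_on_mean_zero` and the tolerance are illustrative choices, not part of the original text.

```python
import numpy as np

def is_negative_definite_on_mean_zero(S, tol=1e-10):
    """Check whether the symmetric matrix S is negative definite on (1,...,1)^perp.

    J = I - (1/N) * ones projects onto (1,...,1)^perp, so J S J always has the
    eigenvalue 0 (eigenvector (1,...,1)). S is negative definite on the
    complement iff the remaining N-1 eigenvalues are negative, i.e. iff the
    second-largest eigenvalue of J S J is negative.
    """
    S = np.asarray(S, dtype=float)
    N = S.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N
    eigenvalues = np.linalg.eigvalsh(J @ S @ J)   # sorted in ascending order
    return bool(eigenvalues[-2] < -tol)

if __name__ == "__main__":
    equal_tensions = np.ones((3, 3)) - np.eye(3)          # sigma_ij = 1 for i != j
    print(is_negative_definite_on_mean_zero(equal_tensions))
    skewed = np.array([[0.0, 1.0, 1.0],
                       [1.0, 0.0, 10.0],
                       [1.0, 10.0, 0.0]])
    print(is_negative_definite_on_mean_zero(skewed))
```

For equal surface tensions one has $J\sigma J = -J$, so the test succeeds; the second matrix violates the triangle inequality for $\sqrt{\sigma_{ij}}$ and fails the test, in line with Schoenberg's characterization.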
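For $1 \le i \neq j \le N$ we consider the kernels $K_{ij} := a_{ij}G_\gamma + b_{ij}G_\beta$, where, for a given $t > 0$, we define $G_t = G_t^{(d)}$ as the heat kernel in $\mathbb{R}^d$, i.e.,

$$G_t^{(d)}(z) := \frac{1}{(4\pi t)^{d/2}}\,e^{-\frac{|z|^2}{4t}}. \tag{18}$$

If the dimension $d$ is clear from the context, we suppress the superscript $(d)$ in (18). We recall here some basic properties of the heat kernel.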
We observe that the kernels $K_{ij}$ are positive, with positive Fourier transform $\hat{K}_{ij}$, provided $\gamma > \max_{i,j}\sigma_{ij}\mu_{ij}$ and $\beta < \min_{i,j}\sigma_{ij}\mu_{ij}$. In particular, assuming that (1) $\sigma_{ij}$ and $\frac{1}{\mu_{ij}}$ satisfy the strict triangle inequality, and (2) $\sigma$ and $\frac{1}{\mu}$ are negative definite on $(1, \dots, 1)^\perp$, we can always achieve the conditions posed on $A$, $B$ and the positivity of the kernels $K_{ij}$ by choosing $\gamma$ large and $\beta$ small. Given any $h > 0$ we define the scaled kernels $K_{ij}^h(z) := h^{-d/2}K_{ij}(z/\sqrt{h})$; then the first and the second step in Algorithm 1 may be compactly rewritten as convolutions of the phases with the kernels $K_{ij}^h$. For later use, we also introduce a further kernel $K$ together with its rescaling $K^h$.
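The three steps of the modified scheme (convolution, comparison functions, thresholding) can be sketched as follows for a partition of the two-dimensional unit torus. This is an illustrative sketch under stated assumptions, not the authors' code: the Gaussians are taken to be heat kernels with Fourier multipliers $e^{-4\pi^2|k|^2t}$, the scaled kernel of width $\theta$ is taken to be the heat kernel at time $\theta h$, the coefficient matrices `a`, `b` (zero diagonal) are supplied by the caller, and the names `thresholding_step` and `stripes` are hypothetical.

```python
import numpy as np

def heat_multiplier(n, t):
    """Fourier multiplier of the heat kernel at time t on [0,1)^2 (assumed normalization)."""
    k = np.fft.fftfreq(n, d=1.0 / n)
    kx, ky = np.meshgrid(k, k, indexing="ij")
    return np.exp(-4.0 * np.pi**2 * (kx**2 + ky**2) * t)

def thresholding_step(chi, a, b, gamma, beta, h):
    """One step for a partition chi of shape (N, n, n) with chi.sum(axis=0) == 1."""
    N, n, _ = chi.shape
    chi_hat = np.fft.fft2(chi)                     # FFT over the last two axes
    # Step 1: convolutions with the two scaled Gaussians
    phi_gamma = np.real(np.fft.ifft2(heat_multiplier(n, gamma * h) * chi_hat))
    phi_beta = np.real(np.fft.ifft2(heat_multiplier(n, beta * h) * chi_hat))
    # Step 2: comparison functions psi_i = sum_j (a_ij phi_gamma_j + b_ij phi_beta_j);
    # the zero diagonal of a and b removes the term j == i
    psi = np.einsum("ij,jxy->ixy", a, phi_gamma) + np.einsum("ij,jxy->ixy", b, phi_beta)
    # Step 3: each point is assigned to the phase with the smallest comparison function
    winner = np.argmin(psi, axis=0)
    return np.stack([(winner == i).astype(float) for i in range(N)])

def stripes(n, N):
    """Partition of the torus into N vertical stripes of equal width."""
    x = (np.arange(n) + 0.5) / n
    labels = np.repeat(np.floor(x * N).astype(int)[:, None], n, axis=1)
    return np.stack([(labels == i).astype(float) for i in range(N)])

if __name__ == "__main__":
    N, n = 3, 48
    chi = stripes(n, N)
    coeff = np.ones((N, N)) - np.eye(N)            # equal coefficients, for illustration only
    new = thresholding_step(chi, coeff, coeff, gamma=2.0, beta=0.5, h=1e-3)
    print("flat interfaces stationary:", np.array_equal(new, chi))
```

Flat interfaces have zero mean curvature, so a partition into equal stripes should be left unchanged by one step; this gives a quick sanity check of the sketch.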

Connection to De Giorgi's minimizing movements
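The first observation is that Algorithm 1 has a minimizing movements interpretation. To explain this, let us introduce the class

$$\mathcal{A} := \Big\{\chi : [0,1)^d \to \{0,1\}^N \ \Big|\ \sum_{i=1}^N \chi_i = 1\Big\}$$

and its relaxation

$$\mathcal{M} := \Big\{u : [0,1)^d \to [0,1]^N \ \Big|\ \sum_{i=1}^N u_i = 1\Big\}.$$

For $\chi \in \mathcal{A}$ we write $\Omega_i := \{\chi_i = 1\}$. We denote by $\partial^*\Omega_i$ the reduced boundary of the set $\Omega_i$, and for any pair $1 \le i \neq j \le N$ we denote by $\Sigma_{ij} := \partial^*\Omega_i \cap \partial^*\Omega_j$ the interface between the sets. For $u \in \mathcal{M}$ we define the energy $E(u)$, and for $h > 0$ fixed we define the approximate energy $E_h(u)$. For $u, v \in \mathcal{M}$ and $h > 0$ we also define the distance $d_h(u, v)$ by (28), where the semigroup property (22) and the symmetry (20) are used to derive the last equality in (28).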
We also point out that Hence the assumptions on A and B guarantee that d h defines a distance on M (and on A).

Lemma 2 The pair $(\mathcal{M}, d_h)$ is a compact metric space. The function $E_h$ is continuous with respect to $d_h$. For every $n$, the function $\chi^n$ obtained from Algorithm 1 satisfies the minimality property (29).
Proof For $u, v \in \mathcal{M}$, definition (28) and the fact that $A$ and $B$ are positive definite imply that $d_h$ is a distance on $\mathcal{M}$. The fact that $(\mathcal{M}, d_h)$ is compact and $E_h$ is continuous is a consequence of the fact that $d_h$ metrizes the weak convergence in $L^2$ on $\mathcal{M}$; the interested reader may find the details of the reasoning in [21]. We are thus left with showing that $\chi^n$ satisfies (29).
then by the symmetry (20) of the Gaussian kernel and by the symmetry of both matrices A, B it is not hard to show that (·, ·) is symmetric. In particular we can write for any u ∈ M Thus (29) is equivalent to the fact that χ n minimizes (χ n−1 , u) among all u ∈ M. Since by (13) we see that χ n minimizes the integrand pointwise, and thus it is a minimizer for the functional.
The previous lemma allows us to apply the general theory of gradient flows in [2] to this particular problem. We record the key statement for our purposes in the following lemma, which will be applied to (M, d h ), where d h is the metric (28).

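Lemma 3 Let $(M, d)$ be a compact metric space and $E : M \to \mathbb{R}$ be continuous. Given $\chi^0 \in M$ and $h > 0$, consider a sequence $\chi^n$ such that each $\chi^n$ minimizes $\frac{1}{2h}d^2(u, \chi^{n-1}) + E(u)$ among all $u \in M$.
Then the energy-dissipation inequality (31) holds for all $t \in \mathbb{N}h$. Here $\chi(t)$ is the piecewise constant interpolation, $u(t)$ is the so-called variational interpolation, which for $n \in \mathbb{N}$ and $t \in ((n-1)h, nh]$ is defined by

$$u(t) \in \arg\min_{u \in M}\Big\{E(u) + \frac{1}{2(t - (n-1)h)}\,d^2(u, \chi^{n-1})\Big\},$$

and $|\partial E|(u)$ is the metric slope defined by

$$|\partial E|(u) := \limsup_{v \to u}\frac{(E(u) - E(v))_+}{d(u, v)}.$$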
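As a sanity check of the abstract statement, one can test a minimizing movements scheme on the simplest possible example: $M = \mathbb{R}$ with the Euclidean distance and $E(x) = \frac{1}{2}x^2$. The sketch below assumes the standard form of the discrete energy-dissipation inequality from [2], namely that $E(\chi^n) + \sum_k \big[\frac{1}{2h}d^2(\chi^k, \chi^{k-1}) + \frac{1}{2}\int_{(k-1)h}^{kh}|\partial E|^2(u(t))\,dt\big] \le E(\chi^0)$; the precise form of (31) is not reproduced here, and all function names are hypothetical.

```python
import numpy as np

def energy(x):
    return 0.5 * x**2

def slope(x):
    # metric slope of E(x) = x^2 / 2 on (R, |.|) equals |E'(x)| = |x|
    return np.abs(x)

def prox(x_prev, s):
    # unique minimizer of E(x) + |x - x_prev|^2 / (2 s), i.e. one implicit Euler step of size s
    return x_prev / (1.0 + s)

def energy_dissipation_defect(x0, h, n_steps, n_quad=2000):
    """Left-hand side minus right-hand side of the (assumed) discrete inequality."""
    x = x0
    dissipation = 0.0
    for _ in range(n_steps):
        x_new = prox(x, h)
        # variational interpolation u(s) = prox(x, s) for s in (0, h]; midpoint quadrature
        s = (np.arange(n_quad) + 0.5) * h / n_quad
        slope_term = np.sum(slope(prox(x, s))**2) * h / n_quad
        dissipation += 0.5 * (x_new - x)**2 / h + 0.5 * slope_term
        x = x_new
    return energy(x) + dissipation - energy(x0)

if __name__ == "__main__":
    print(energy_dissipation_defect(x0=1.0, h=0.1, n_steps=10))
```

For this quadratic energy a direct computation shows that the inequality holds with equality in every step, so the printed defect should vanish up to quadrature error. This illustrates why the inequality is called sharp.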

Statement of results
Our main result is the convergence of the modified thresholding scheme to a weak notion of multiphase mean curvature flow. More precisely, given an initial partition If χ 0 is a function of bounded variation, we denote by 0 Our main result is contained in the following theorem.
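Theorem 1 Given $\chi^0 \in \mathcal{A}$ such that $\nabla\chi^0$ is a bounded measure and a sequence $h \downarrow 0$, let $\chi^h$ be defined by (36). Assume that there exists $\chi$ such that the convergence (37) and the energy convergence assumption (38) hold. Then $\chi$ is a De Giorgi solution in the sense of Definition 1 below.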
The convergence assumption (38) is motivated by a similar assumption on the implicit time discretization in the seminal paper [25] by Luckhaus and Sturzenhecker, and has also appeared in previous work in the context of the thresholding scheme [19-21]. As of now, this assumption can be verified only in particular cases, such as before the first singularity [36] or for certain types of singularities, namely mean convex ones, meaning $H > 0$. This was shown for the implicit time discretization in [6], and a proof in the case of the thresholding scheme will appear in a forthcoming work by Fuchs and the first author.
Inspired by the general framework [2] and [34], generalizing the previous two-phase version [21], we propose the following definition for weak solutions in the case of multiphase mean curvature flow.

Definition 1
Given $\chi^0 \in \mathcal{A}$ such that $\nabla\chi^0$ is a bounded measure, a map $\chi$ is called a De Giorgi solution to the multiphase mean curvature flow with surface tensions $\sigma_{ij}$ and mobilities $\mu_{ij}$ provided the following three facts hold: (1) there exist mean curvatures in the weak sense of (39); (2) there exist normal velocities; (3) De Giorgi's inequality (40) holds.

Remark 1 Observe that inequality (40) together with the definition of the weak mean curvatures gives a notion of weak solution for the multiphase mean curvature flow incorporating both the dynamics $V_{ij} = -\sigma_{ij}\mu_{ij}H_{ij}$ and the Herring angle condition at triple junctions. In the corresponding computation for smooth solutions, $J$ denotes the rotation by ninety degrees in the normal plane to the triple junction. The argument combines (39) with the Herring angle condition and concludes after completing the square.

The following lemma establishes, next to a compactness statement, that our convergence can be localized in the space and time variables $x$ and $t$, but also in the variable $z$ appearing in the convolution.

Lemma 4
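We have the following:
(i) Compactness. Let $u^h$ be a sequence satisfying the bound (41) that is piecewise constant in time in the sense of (36). Such a sequence is precompact, and its limits satisfy the lower semicontinuity (42).
(ii) Localized convergence. Under the additional convergence assumptions of the lemma, as measures on $\mathbb{R}^d \times [0,1)^d \times (0,T)$ we have the weak convergences (44) and (45) for any $i \neq j$. Here $\nu_{ij}(\cdot, t)$ denotes the outer measure theoretic unit normal of $\Omega_i(t)$ restricted to the interface $\Sigma_{ij}(t)$.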
Here the convergence may be tested also with continuous functions which have polynomial growth in z ∈ R d .
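The next proposition is the main ingredient in the proof of Theorem 1. It establishes the sharp lower bound on the distance-term.

Proposition 1 Suppose that (37) and the conclusion of Lemma 4 (ii) hold. Assume also that the left hand side of (47) is finite. Then for every $1 \le i \le N$ there exists a normal velocity $V_i$ of $\chi_i$ in the sense that $\partial_t\chi_i = V_i|\nabla\chi_i|$ distributionally, and the sharp lower bound (47) holds.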
The final ingredient is the analogous sharp lower bound for the metric slope, which is the content of Proposition 2: it provides the mean curvatures in the sense of (39), and moreover the inequality (48) is true.

We will present the proofs of Theorem 1, Lemma 4, Proposition 1 and Proposition 2 in Sect. 5. Before doing that, we need a simple geometric measure theory construction.

Construction of suitable partitions of unity
In the sequel we will frequently want to localize on one of the interfaces. To do so, we need to construct a suitable family of balls on which the behavior of the flow is split into two majority phases and several minority phases. Hereafter we will ignore the time variable and consider a map χ : the reduced boundary of the set {χ i = 1} and by i j = ∂ * i ∩ ∂ * j the interface between phase i and phase j. Given a real number r > 0 and a natural number n ∈ N we define where the balls appearing in the definition are intended to be open. Observe that for any n ≥ 2 and any r > 0 the collection of balls in F r n is a covering of [0, 1) d with the property that any point Here 2B denotes the ball with center given by the center of B and twice its radius. Given l, p as above, denote by {B r m } an enumeration of E r and by {ρ m } a smooth partition of unity subordinate to {B r m }. Then the following result holds true (for a proof, see the "Appendix"). Lemma 5 Fix 1 ≤ l = p ≤ N . With the above construction the following two properties hold.

Proof of Theorem 1
Proof By Lemma 2, we can apply Lemma 3 on the metric space $(\mathcal{M}, d_h)$, so that we get inequality (31), whose right hand side is controlled by the consistency, cf. Lemma 7 in the "Appendix". Inequality (31) then yields that the sequence $\chi^h$ satisfies (41), so that Lemma 4 (i) applies to get that $\chi \in \mathcal{A}$ a.e., i.e., $\sum_i\chi_i = 1$, and, after extracting a subsequence, the convergence of $\chi^h$; the same limit is attained by the variational interpolations $u^h$. To see this, observe that (34) implies the estimate (54), where $C$ is a constant which depends on $N$, $A$, $B$ but not on $h$ and comes from the fact that all norms on $(1, \dots, 1)^\perp$ are comparable. Inequality (54) clearly implies that $K^h * u^h - K^h * \chi^h$ converges to zero in $L^2$. Observe that inequality (35) in particular yields (43). Recalling (158) in the "Appendix", we learn that $u^h - \chi^h$ converges to zero in $L^2$. This implies that we can apply Lemma 4 (ii) both to the sequence $u^h$ and to the sequence $\chi^h$. In particular, we may apply Proposition 1 for $\chi^h$ and Proposition 2 for $u^h$.

Now the proof follows the same strategy as the one in the two-phase case in [21]. For the sake of completeness, we sketch the argument here. First of all, Lemma 3 gives inequality (31) in the present setting. As test function $\eta$, we now choose $\eta(t) = \max\{\min\{\frac{T-t}{\tau}, 1\}, 0\}$ and obtain (56). Now it remains to pass to the limit as $h \downarrow 0$: to get (40) from inequality (56) one uses the lower semicontinuity (42) for the first term on the left hand side, the sharp bound (47) for the second term on the left hand side, the bound (48) for the last term on the left hand side, and finally the consistency Lemma 7 in the "Appendix" to treat the term on the right hand side. To get (40) it remains to pass to the limit $\tau \downarrow 0$.

Proof of Lemma 4
Proof Argument for (i). For the compactness, the arguments in [21] adapt to this setting with minor changes. The first observation is that, by inequality (158) in the "Appendix", the required compactness reduces to a modulus of continuity in time, i.e., it is sufficient to prove a bound on time shifts with a constant that is uniform in $h$. This can be done by applying word by word the argument in [21] once we show the estimate (57) for any pair $\chi, \tilde{\chi} \in \mathcal{A}$. Here the constant $C$ depends on $N$, $A$, $B$ but not on $h$.
To prove (57) we proceed as follows: let S ∈ R N ×N be a symmetric matrix which is positive definite on (1, ..., 1) ⊥ . Since any two norms on a finite dimensional space are comparable, there exists a constant C > 0 depending on S and N such that where | · | S denotes the norm induced by S. For a function u ∈ M write (K h * )u h for the function Then we calculate Then, by our assumption (14), S is positive definite on (1, ..., 1) ⊥ and after integration on [0, 1) d identity (59) becomes We now proceed to estimate the integral on the right hand side. By the choice of S and Jensen's inequality we have Using the triangle inequality and (156) in the "Appendix" we can estimate the right hand side to obtain the following inequality Observing that there is a constant C > 0 such that K i j ≤ C K jk we conclude that This proves (57) and closes the argument for the compactness.
We also have to prove (42), but this follows from (44) with u h replaced by χ h once we have shown that the limit χ is such that |∇χ| is a bounded measure, equiintegrable in time. Indeed one can check from the proof of (44) that the lower bound of (44) does not require the extra assumption (38). Thus one gets that where in the last two lines we used (44) and the definition of σ i j . To prove that the limit χ is such that |∇χ| is a bounded measure, equiintegrable in time one can proceed with an argument similar to the one used in [21] for the two-phase case. Observe that this only requires the weaker assumption (43).

Argument for (ii)
As mentioned in the previous paragraph, we already know that the limit χ is such that |∇χ| is a bounded measure, equiintegrable in time. We will prove (44). Then (45) easily follows by recalling that ν i j = −ν ji . A standard argument (to be found in [21]) which relies on the exponential decay of the kernel yields the fact that we can test convergences (44) with functions with at most polynomial growth in z provided we already have the result for bounded and continuous test functions, thus we focus on this case. Let Upon splitting ξ into the positive and the negative part, by linearity we may assume that 0 ≤ ξ ≤ 1. We can split (62) into the local lower bound and the global upper bound lim sup Indeed we can recover the limsup inequality in (62) by splitting ξ = 1 − (1 − ξ) and applying the local lower bound (63) to 1 − ξ . We first concentrate on the local lower bounds in the case where u h = χ, namely we will show By Fatou's lemma the claim is reduced to showing that for a.e. point t in time and every Fix a point t such that χ(·, t) ∈ BV ([0, 1) d , {0, 1} N ) and any z ∈ R d . In the sequel, we will drop those variables, so χ(x) = χ(x, t), ξ(x) = ξ(z, x, t). By approximation we may assume that ξ ∈ C ∞ ([0, 1) d ). Let ρ mi j be a partition of unity obtained by applying the construction of Sect. 4 to the function χ(x) on the interface i j . Let ν i be the outer measure theoretic normal of i (t). Then by Lemma 5 we have We now focus on estimating the argument of the last limit. Observe that Let , 1}) and that 1 − χ i = k =i χ k we can estimate the last item by Observe that for each m ∈ N, using also the consistency Lemma 7 Thus we obtain Inserting back into (67), recalling also Lemma 5 and the inequality (68), using Fatou's lemma, the fact that ρ mi j is a partition of unity and that 0 ≤ξ m ≤ 1 we obtain that and (65) follows letting go to zero. To derive inequality (63) we just apply Lemma 9 in the "Appendix".
To get the upper bound (64) we argue as follows. First of all, recall Assumption (38) which says lim sup Now, if we define which is a contradiction. Thus we have proved (64).

Proof of Proposition 1
Proof Since we assume that the left hand side of (47) is finite, in view of (28), upon passing to a subsequence we may assume that, in the sense of distributions, the limit defining the dissipation measure $\omega$ exists as a finite positive measure on $[0,1)^d \times (0,T)$. Here we denote by $\chi^h_l(\cdot - h)$ the time shift of the function $\chi^h_l$. We denote by $\tau$ a small fraction of the characteristic spatial scale, namely $\tau = \alpha\sqrt{h}$ for some $\alpha > 0$, which we think of as a small number. Given $1 \le l \le N$ we define $\delta\chi_l$ as the difference between $\chi_l$ and its time shift by $\tau$. We divide the proof into two parts: first we show that the normal velocities exist, and afterwards we prove the sharp bound. But first, let us state two distributional inequalities that will be used later, namely:
• In a distributional sense, (72) holds.
• There exists a constant $C > 0$ such that for any $1 \le i \le N$ and any $\theta \in \{\gamma, \beta\}$, in a distributional sense, (73) holds.
We observe that it suffices to prove (72); then (73) follows immediately. Indeed, recall that $A$ and $B$ are positive definite on $(1, \dots, 1)^\perp$. In particular, there exists a constant $C > 0$ such that for any $v \in (1, \dots, 1)^\perp$ one has $|v|_A^2 + |v|_B^2 \ge C|v|^2 \ge Cv_i^2$ for any $i \in \{1, \dots, N\}$.

Applying this to the vector
The claim then follows from the definition of ω, (72), the symmetry (20) and the semigroup property (22). Indeed it is sufficient to check that, in the sense of distributions To this aim, pick a test function η ∈ C ∞ c ([0, 1) d × (0, T )). Spelling out the definition of the norms | · | A and | · | B , the claim is proved once we show that and the same claim with a i j , γ replaced by b i j , β respectively. We concentrate on (76). Clearly, we are done once we show that for any i = j To show this, using the semigroup property (22) we rewrite the argument of the limit as for every function f for which this expression makes sense. We observe that by the boundedness of the measures 1 To prove this, spelling out the integrand, using the Cauchy-Schwarz inequality and recalling the scaling (21) we observe that Observe that by the compactness of χ h in L 2 ([0, 1) d × (0, T )), (81) is of order h, thus (80) indeed holds true. Now we can turn to the proof of (72), which is essentially already contained in the paper [21]. For the convenience of the reader we sketch the main ideas here. One reduces the claim to proving the following facts: lim sup lim sup Claim (82) was proved in the previous paragraph, while (83) and (84) are consequences of Jensen's inequality in the time variable for the convex functions | · | 2 A and | · | 2 B respectively. More precisely, assume without loss of generality that τ = N h for some N ∈ N, then by a telescoping argument and Jensen's inequality for | · | 2 A we get Recalling that N = α/ √ h we can rewrite the right hand side as This is an average of time shifts of α 2 1 all these time shifts are small, thus the average has the same distributional limit as α 2 1 A . This proves (83). The argument for (84) is similar. Existence of the normal velocities We now prove the existence of the normal velocities. Fix 1 ≤ i ≤ N and observe that for w ∈ {γ, β} we have which follows simply by observing that τ )). 
Using Jensen's inequality and the elementary identity (156) in the "Appendix" we obtain the estimate (86). Now observe that by testing (44) with $G_w/K_{ij}$ (which is bounded, and thus admissible), we learn the limit of its right hand side. Thus, if we divide (86) by $\sqrt{h}$ and let $h \downarrow 0$, using also (73), we obtain a bound in which $C$ is a constant which depends on $\gamma$, $\beta$, $N$, the mobilities and the surface tensions. If we divide by $\alpha$ and then let $\alpha \to 0$, we learn that $|\partial_t\chi_i|$ is absolutely continuous with respect to $|\nabla\chi_i|\,dt$; its density $V_i$ is the normal velocity of $\chi_i$ in the sense that $\partial_t\chi_i = V_i|\nabla\chi_i|$ in the sense of distributions. The optimal integrability $V_i \in L^2(\mathcal{H}^{d-1}|_{\partial^*\Omega_i(t)}(dx)\,dt)$ will be shown in the second part of the proof. Let us record for later use that with a similar reasoning we actually obtain the bound (90).

Sharp bound. For a given $1 \le i \le N$ we denote by $\delta\chi_i^+$ and $\delta\chi_i^-$ the positive and negative parts of $\delta\chi_i$, respectively, i.e., we set $\delta\chi_i^+ := (\chi_i - \chi_i(\cdot - \tau))_+$ and $\delta\chi_i^- := (\chi_i - \chi_i(\cdot - \tau))_-$. Before entering into the proof of the sharp bound, we need to prove the following property: for any $i \neq j$, in a distributional sense, the two limits in (91) vanish. We focus on the first limit, the second one being analogous. The first observation is that the limit $\lambda$ is a nonnegative bounded measure, which is absolutely continuous with respect to $\mathcal{H}^{d-1}|_{\Sigma_{ij}(t)}(dx)\,dt$. Indeed, spelling out the $z$-integral and using the properties of $\delta\chi$, we obtain an upper bound. By (44) in Lemma 4, as $h \downarrow 0$, the right hand side of this bound converges to a measure which is absolutely continuous with respect to $\mathcal{H}^{d-1}|_{\Sigma_{ij}(t)}(dx)\,dt$. To proceed, we rewrite the argument of the limit in (92) in terms of a density $\lambda^h(t, x, z)$. Using the fact that $0 \le \chi_i \le 1$ and $\sum_l\chi_l = 1$ we obtain the inequalities (96) and (97). Here $C$ is a constant that does not depend on $h$.
Using inequality (96) on the domain $\{\nu_0 \cdot z \le 0\}$ and inequality (97) on the domain $\{\nu_0 \cdot z \ge 0\}$ we obtain Observe that for any $1 \le k \le N$ we have lim sup This can be seen by showing that which can be shown to be true by testing with an admissible test function, and putting the spatial shift $z$ on it. Thus recalling (44) and (90), we obtain that Since we already know that $\lambda$ is absolutely continuous with respect to $\mathcal{H}^{d-1}|_{\Sigma_{ij}(t)}(dx)\,dt$, the same bound holds true if we replace the right hand side with its absolutely continuous part with respect to $\mathcal{H}^{d-1}|_{\Sigma_{ij}(t)}(dx)\,dt$. Observing that for $k \neq i, j$, by Lemma 6 in the "Appendix", the measures $\mathcal{H}^{d-1}|_{\partial^*\Omega_k(t)}(dx)\,dt$ and $\mathcal{H}^{d-1}|_{\Sigma_{ij}(t)}(dx)\,dt$ are mutually singular, this yields (94).
By a separability argument, we see that the null set on which (101) does not hold can be chosen so that it is independent of the choice of $\nu_0$. If we select $\nu_0 = \nu_{ij}(x, t)$ this yields $\theta \le 0$ almost everywhere with respect to $\mathcal{H}^{d-1}|_{\Sigma_{ij}(t)}(dx)\,dt$. Since we already know that $\lambda$ is nonnegative, this gives $\lambda = 0$.
Before getting the sharp bound, we check that for any $i \neq j$ we have $V_i = -V_j$ a.e. with respect to To see this, we start by observing that if $\xi \in C_c^\infty([0,1)^d \times (0,T))$, thanks to the fact that $\sum_{k \neq i} \chi_k = 1 - \chi_i$, we get Choosing , by a separability argument, we obtain that for a.e. $t$ and every $g \in C^\infty([0,1)^d)$ Pick $t$ such that (103) holds. Let $g \in C^\infty([0,1)^d)$ and let $\{\rho_m\}$ be a partition of unity obtained by the construction of Sect. 4 applied to the function $\chi(\cdot, t)$ on the interface $\Sigma_{ij}(t)$. Then Passing to the limit $r \downarrow 0$ in (104) we get by Lemma 5 that Since this identity holds for any $g \in C^\infty([0,1)^d)$, a density argument gives x. In other words Integrating in time yields that $V_i = -V_j$ a.e. with respect to $\mathcal{H}^{d-1}|_{\Sigma_{ij}(t)}(dx)\,dt$.

We now proceed with the derivation of the sharp lower bound. Define $c_{ij} := \int K_{ij}(z)\,dz$. Then we have Now we rewrite the terms in the second parenthesis using $-ab = a_+ b_- + a_- b_+ - a_+ b_+ - a_- b_-$ and then, adding and subtracting the contributions of the minority phases, we obtain Now the main idea is to split the integral of $K_{ij}$ in the definition of $c_{ij}$ into two parts. More precisely, by the symmetry (20), for any $\nu_0 \in S^{d-1}$ and any $V_0 > 0$ we have Substituting into (108) and dividing by $\sqrt{h}$ we obtain 2 0≤ν 0 ·z≤αV 0 We will be interested in bounding the lim inf of the left hand side. Observe that the distributional limit of the last five terms is non-positive. Indeed, the limits of the first four terms vanish distributionally by property (91), while the last term is bounded from above by which vanishes distributionally in the limit $h \downarrow 0$ by property (91). We thus obtain that the lim inf of the left hand side of (110) is bounded from above by lim inf For the last term we use the sharp bound (72), relating this term to our dissipation measure $\omega$. We would like to get a good bound for the other terms. This cannot be done naively as before, since we want the bound to be sharp.
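The sign splitting used above is elementary. As a brief sketch, write $a = a_+ - a_-$ and $b = b_+ - b_-$, with $a_\pm, b_\pm \ge 0$ denoting the positive and negative parts. The second display records the half-space decomposition of $c_{ij}$ suggested by the symmetry (20). Its exact form is an assumption here, inferred from the domains $\{0 \le \nu_0 \cdot z \le \alpha V_0\}$ and $\{\nu_0 \cdot z > \alpha V_0\}$ that are used later in the proof.

```latex
% Product of positive and negative parts (a = a_+ - a_-, b = b_+ - b_-):
\begin{align*}
-ab &= -(a_+ - a_-)(b_+ - b_-) \\
    &= a_+ b_- + a_- b_+ - a_+ b_+ - a_- b_- .
\end{align*}
% Assumed form of the splitting of c_{ij}; it uses only that K_{ij}(z) = K_{ij}(-z)
% and that the hyperplane {\nu_0 \cdot z = 0} is a Lebesgue null set:
\begin{align*}
c_{ij} = \int K_{ij}(z)\,dz
  &= 2\int_{\{\nu_0 \cdot z \ge 0\}} K_{ij}(z)\,dz \\
  &= 2\int_{\{0 \le \nu_0 \cdot z \le \alpha V_0\}} K_{ij}(z)\,dz
   + 2\int_{\{\nu_0 \cdot z > \alpha V_0\}} K_{ij}(z)\,dz .
\end{align*}
```

The thin slab $\{0 \le \nu_0 \cdot z \le \alpha V_0\}$ is the part whose contribution is compared with $|\partial_t \chi_i| + |\partial_t \chi_j|$, while the remaining half space is handled through the estimate (112).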
We claim that lim sup Here $C$ is a constant that depends on $\gamma$, $\beta$, $A$, $B$, but not on $h$. Assume for the moment that (112) is true and let us conclude the argument in this case. Using (112) and (72) we obtain in the sense of distributions on $[0,1)^d \times (0,T)$. Observe also that the left hand side of (113) is an upper bound for $\int_{0 \le \nu_0 \cdot z \le \alpha V_0} K_{ij}(z)\,dz\,(|\partial_t \chi_i| + |\partial_t \chi_j|)$, thus the inequality still holds true if the left hand side is replaced by this term. Remember that $\omega_k^{ac}$ is absolutely continuous We now disintegrate the measure $\omega$, i.e., we find a Borel family $\omega_t$, $t \in (0,T)$, of positive measures on $[0,1)^d$ such that $\omega = \omega_t \otimes dt$. Having said this, it is not hard to see that (113) holds in a disintegrated version, i.e., we have for Lebesgue a.e. $t \in (0,T)$ Here $\nu_0 \in S^{d-1}$ and $V_0 \in (0,\infty)$ are arbitrary: indeed, even if the set of points in time for which (114) holds is a priori dependent on $\nu_0$ and $V_0$, a standard separability argument allows us to conclude that we can get rid of this dependence. Fix a point $t$ in time such that (114) holds. In what follows, we drop the time variable $t$, which is fixed, so for example observe that by definition of $V_{ij}$ and by using the fact that Let us relabel $\nu_0$, $V_0$ and $\xi$ to make clear that they may depend on the pair $i, j$. Thus ν Let $\{\rho_m\}$ be a partition of unity obtained using the construction of Sect. 4 applied to the function $\chi(\cdot, t)$ on the interface $\Sigma_{ij}(t)$. Use the above inequality with $\xi_{ij}$ replaced by $\rho_m \xi_{ij}$ and sum over $m$ and $i, j$ to get where we have set Observe that i< j m∈N because $\{\rho_m\}$ is a partition of unity. Moreover by Lemma 5 we get Putting things together we obtain that for any ν We now claim that by approximation the above inequality is valid for any simple function $\xi_{ij} \ge 0$. To see this, it is clear that we can concentrate on $\xi_{ij} = w_{ij} \mathbf{1}_{B_{ij}}$, where $B_{ij} \subset [0,1)^d$ are Borel and $w_{ij} \ge 0$.
Observe that by the dominated convergence theorem, the family With this in place one can use an approximation argument to replace the vector $\nu_0^{ij}$ with the $\mathcal{H}^{d-1}$-measurable vector-valued function $\nu_{ij}$, obtaining the following inequality: Now divide by $\alpha^2$ and send $\alpha$ to zero. Record the following limits, which can be computed by spelling out the definition of $K_{ij}$ and recalling the symmetry property (20) and the factorization property (23) for the heat kernel Then if we insert back into (121) we obtain i< j Taking the limit $m \to +\infty$, using the monotone convergence theorem we obtain i< j or, in other words, Recall that $\mu_{ij} = \mu_{ji}$, thus the inequality above may be rewritten as If we now integrate in time we learn by the monotone convergence theorem that $V_{ij} \in L^2(\mathcal{H}^{d-1}|_{\Sigma_{ij}(t)}(dx)\,dt)$ and that the sharp bound (47) is satisfied.
Proof of (112)

To prove (112) we proceed in several steps. First of all, we claim that the first eight terms may be substituted by To show this, observe that we may replace the implicit $z$-integrals in the convolution in the first eight terms by twice the integrals over the half space $\{\nu_0 \cdot z \ge 0\}$ instead of $\mathbb{R}^d$. This is clearly true once we observe that, in the sense of distributions and that similar identities hold exchanging the roles of $i$, $j$ and $+$, $-$, respectively. That (129) holds is not difficult to show. Indeed, multiplying the argument in both the limits by a test function $\xi \in C_c^\infty([0,1)^d \times (0,T))$ and integrating over space-time, one observes that since the kernel is even, the argument of the second limit is just a spatial shift by $z$ of the first one. By translation invariance the spatial shift may be put onto the test function, and thanks to the scaling of the kernel one obtains the claim. We may thus substitute the first eight terms of the left hand side of (112) with twice the same terms with the integration with respect to $z$ on the half space $\{\nu_0 \cdot z \ge 0\}$. If we rely again on the fact that $\delta\chi_i^+ \in \{0, 1\}$, by identity (156) in the "Appendix" we obtain (128), as claimed. Now we need two inequalities for the integrand. First note that the integrand is a mixed space-time second-order finite difference. We claim that The second follows from the triangle inequality. To show the first one, observe that and that similarly Summing up the two inequalities we get Similar bounds hold for the remaining terms in (130). We now split the integral (128) into the domains of integration $\{0 \le \nu_0 \cdot z \le \alpha V_0\}$ and $\{\nu_0 \cdot z > \alpha V_0\}$. On the first one we use the first inequality in (130) for the integrand. Recalling identity (156) and inequality (157) in the "Appendix", and using the fact that $\sum_k \chi_k = 1$, we obtain 2 0≤ν 0 ·z≤αV 0 With this in place, we observe that which can be seen by testing (44) with , which is of polynomial growth in $z$.
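To conclude (142) we just need to show that for any symmetric matrix $A \in \mathbb{R}^{d \times d}$ and any unit vector $\nu$ we have Using the definition of the kernel $K_{ij}$ it suffices to show that
$$\int Az \cdot \nabla G_w(z)\,(\nu \cdot z)_+\,dz = -\frac{\sqrt{w}}{\sqrt{\pi}}\,(\operatorname{tr} A + \nu \cdot A\nu), \qquad w \in \{\gamma, \beta\}.$$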
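The identity can be checked by a direct computation. The following is only a sketch, under the assumption (not restated in this section) that $G_w$ is the heat kernel normalized as $G_w(z) = (4\pi w)^{-d/2} \exp(-|z|^2/(4w))$, so that $\nabla G_w(z) = -\frac{z}{2w} G_w(z)$. After a rotation one may take $\nu = e_1$; the trace is invariant under this rotation and $\nu \cdot A\nu$ becomes $A_{11}$. The symbol $G_w^{1}$ for the one-dimensional kernel is notation introduced only for this sketch.

```latex
% Assumption: G_w(z) = (4 \pi w)^{-d/2} \exp(-|z|^2 / (4w)), and \nu = e_1 after rotation.
% G_w^{1} denotes the one-dimensional kernel of the same variance (hypothetical notation).
\begin{align*}
\int Az \cdot \nabla G_w(z)\,(z_1)_+\,dz
  &= -\frac{1}{2w} \sum_{k,l} A_{kl} \int z_k z_l\,(z_1)_+\,G_w(z)\,dz .
\end{align*}
% Terms with k \neq l vanish because the integrand is odd in one of the variables.
% One-dimensional moments of G_w^{1}(s) = (4 \pi w)^{-1/2} e^{-s^2/(4w)}:
\begin{align*}
\int_{\mathbb{R}} s^2\,G_w^{1}(s)\,ds = 2w, \qquad
\int_0^\infty s\,G_w^{1}(s)\,ds = \frac{\sqrt{w}}{\sqrt{\pi}}, \qquad
\int_0^\infty s^3\,G_w^{1}(s)\,ds = \frac{4w\sqrt{w}}{\sqrt{\pi}} .
\end{align*}
% Diagonal terms: k \neq 1 contributes 2w \sqrt{w}/\sqrt{\pi}, k = 1 contributes 4w \sqrt{w}/\sqrt{\pi}.
\begin{align*}
-\frac{1}{2w}\Big[ (\operatorname{tr} A - A_{11})\,\frac{2w\sqrt{w}}{\sqrt{\pi}}
                   + A_{11}\,\frac{4w\sqrt{w}}{\sqrt{\pi}} \Big]
  = -\frac{\sqrt{w}}{\sqrt{\pi}}\,(\operatorname{tr} A + A_{11})
  = -\frac{\sqrt{w}}{\sqrt{\pi}}\,(\operatorname{tr} A + \nu \cdot A\nu) .
\end{align*}
```

With a different normalization of $G_w$ the prefactor $\sqrt{w}/\sqrt{\pi}$ would change accordingly, which is why the normalization is flagged as an assumption.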
Step 5. Passage to the limit in $\delta d_h(\cdot, u)\xi$. We claim that To prove this, we observe that the terms which do not involve the Hessian $\nabla^2 K^h_{ij}$ are all $O(\sqrt{h})$. For example, to prove that spell out the integral in the convolution, use the fact that ∇ K h , use the fact that ∇ · ξ(x, t)ξ(x − √ hz, t) is bounded and test (44) with $\nabla K_{ij}/K_{ij}$. The other terms can be treated similarly. For the terms involving the Hessian of the kernel, we split the claim into The proof of (146) is similar to the argument for (144). In fact, while the additional derivative on the kernel gives an additional factor $\frac{1}{\sqrt{h}}$, we gain a factor $\sqrt{h}$ by the Lipschitz estimate and such that for any ξ ∈ L 2 (H d−1 | i, j i j (t) (dx)dt) lim inf h↓0

Some inequalities
Here we gather some elementary inequalities which are used frequently.

Lemma 10 Let $a, b, a', b' \in \{0, 1\}$, then the following inequalities hold:
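As a sketch of the kind of identity meant here, and assuming that (156) is the standard formula $|a - b| = a(1-b) + (1-a)b$ for $a, b \in \{0, 1\}$ (the precise statement is an assumption), the computation uses only $|a-b| \in \{0,1\}$, $a^2 = a$ and $b^2 = b$:

```latex
% Assumed form of identity (156) for a, b \in \{0, 1\}:
\begin{align*}
|a - b| = |a - b|^2 &= a^2 - 2ab + b^2 \\
                    &= a - 2ab + b \\
                    &= a(1 - b) + (1 - a)b .
\end{align*}
```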
Proof The first identity follows by expanding $|a - b| = |a - b|^2$. The second one is proved in [21]. For the sake of completeness, we reproduce the proof here. There are two cases. In the first one we have

Proof The proof of (158) is contained in the proof of Lemma 3 in [21] for the two-phase case when $K^h$ is the scaled version of the Gaussian with variance 1. The same proof may be adapted to our setting because we still have monotonicity of the energy (Lemma 8) and we can prove, essentially by the use of Jensen's inequality, that