Variational approach to regularity of optimal transport maps: general cost functions

We extend the variational approach to regularity for optimal transport maps initiated by Goldman and the first author to the case of general cost functions. Our main result is an $\epsilon$-regularity result for optimal transport maps between H\"older continuous densities slightly more quantitative than the result by De Philippis-Figalli. One of the new contributions is the use of almost-minimality: if the cost is quantitatively close to the Euclidean cost function, a minimizer for the optimal transport problem with general cost is an almost-minimizer for the one with quadratic cost. This further highlights the connection between our variational approach and De Giorgi's strategy for $\epsilon$-regularity of minimal surfaces.


Contents

Introduction
An $L^\infty$ bound on the displacement
Almost-minimality with respect to Euclidean cost
Eulerian point of view
Harmonic Approximation
One-step improvement
Proof of $\epsilon$-regularity
$\epsilon$-regularity for almost-minimizers
Partial regularity: Proof of Corollary
Appendix A. Some technical lemmata
Appendix B. The change of coordinates Lemma
Appendix C. Some aspects of Campanato's theory
Appendix D. An $L^\infty$ bound on the displacement for almost-minimizing transport maps
References

Introduction

In this paper, we give an entirely variational proof of the $\epsilon$-regularity result for optimal transportation with a general cost function $c$ and Hölder continuous densities, as established by De Philippis and Figalli [DF ]. This provides a keystone to the line of research started in [GO ] and continued in [GHO ]: In [GO ], the variational approach was introduced and $\epsilon$-regularity was established in case of a Euclidean cost function $c(x, y) = \frac{1}{2}|x - y|^2$, see [GO , Theorem . ]. In [GHO ], among other things, the argument was extended to rougher densities, which required a substitute for McCann's displacement convexity; this generalization is crucial here.
One motivation for considering this more general setting is the study of optimal transportation on Riemannian manifolds with cost function given by $\frac{1}{2} d^2(x, y)$, where $d$ is the Riemannian distance. In this context, an $\epsilon$-regularity result is of particular interest because, compared to the Euclidean setting, even though $c$ is a compact perturbation of the Euclidean case, there are other mechanisms creating singularities like curvature. Indeed, under suitable convexity conditions on the support of the target density, the so-called Ma-Trudinger-Wang (MTW) condition on the cost function $c$, a strong structural assumption, is needed to obtain global smoothness of the optimal transport map, see [MTW ] and [Loe ]. Since in the most interesting case of cost $\frac{1}{2} d^2(x, y)$, the MTW condition is quite restrictive$^{1}$, and does not have a simple interpretation in terms of geometric properties of the manifold$^{2}$, it is highly desirable to have a regularity theory without further conditions on the cost function (and on the geometry of the support of the densities).

The outer loop of our argument is similar to that of [DF ]: a Campanato iteration on dyadically shrinking balls that relies on a one-step improvement lemma, which in turn relies on the closeness of the solution to that of a simpler problem with a high-level interior regularity theory. The main differences are:
• In [DF ], the simpler problem is the Monge-Ampère equation with constant right-hand side and Dirichlet boundary data coming from the convex potential; for us, the simpler problem is the Poisson equation with Neumann boundary data coming from the flux in the Eulerian formulation of optimal transportation [BB ].
• In [DF ], the comparison relies on the maximum principle; in our case, it relies on the fact that the density/flux pair in the Eulerian formulation is a minimizer$^{3}$ given its own boundary conditions.
• In [DF ], the interior regularity theory appeals to the $\epsilon$-regularity theory for the Monge-Ampère equation [FK ], which itself relies on Caffarelli's work [Caf ]; in our case, it is just inner regularity of harmonic functions.
Loosely speaking, the Campanato iteration in [DF ] relies on freezing the coefficients, whereas here, it relies on linearizing the problem (next to freezing the coefficients). In the language of nonlinear elasticity, we tackle the geometric nonlinearity (which corresponds to the nonlinearity inherent to optimal transport) alongside the material nonlinearity (which corresponds to the cost function $c$). As a consequence of this, we achieve $C^{2,\alpha}$-regularity in a single Campanato iteration, whereas [DF ] proceeds in three rounds of iterations, namely first $C^{1,1-}$, then $C^{1,1}$, and finally $C^{2,\alpha}$. Another consequence of this approach via linearization is that we instantly arrive at an estimate that has the same homogeneities as for a linear equation (meaning that the Hölder semi-norm of the second derivatives is estimated by the Hölder semi-norm of the densities and not a nonlinear function thereof). Likewise, we obtain the natural dependence on the Hölder semi-norm of the mixed derivative of the cost function$^{4}$.

When it comes to this dependence on the cost function $c$, we observe a similar phenomenon as for boundary regularity: regarding how the regularity of the data and the regularity of the solution are related, optimal transportation seems to be better behaved than its linearization, as we shall explain now$^{5}$: Assuming unit densities for the sake of the discussion, the Euler-Lagrange equation can be expressed on the level of the optimal transport map $T$ as the fully nonlinear (and $x$-dependent) elliptic system given by $\det \nabla T = 1$ and $\operatorname{curl}_x(\nabla_x c(x, T(x))) = 0$. Since the latter can be re-phrased by imposing that the matrix $\nabla_{xy} c(x, T(x)) \nabla T(x)$ is symmetric, the Hölder norm of $\nabla T$ lives indeed on the same footing as the Hölder norm of the mixed derivative $\nabla_{xy} c$. The linearization around $T(x) = x$ on the other hand is given by the elliptic system $\operatorname{div} \delta T = 0$ and $\operatorname{curl}_x(\nabla_x c(x, x) + \nabla_{xy} c(x, x)\, \delta T(x)) = 0$, which has divergence-form character$^{6}$. Here Hölder control of $\nabla_{xy} c(x, x)$ matches with Hölder control of only $\delta T$, and not its gradient.

$^{1}$ It is violated whenever a Riemannian sectional curvature is negative at any point of the manifold, see [Loe ].
$^{2}$ See [KM ] for the concept of cross-curvature and its relation to the MTW condition.
$^{3}$ In fact, an almost minimizer.
$^{4}$ The regularity of the cost function enters our result in a more nonlinear way, too: Its qualitative regularity on the $C^2$-level determines the energy and length scales below which the linearization regime kicks in.
$^{5}$ We refer to [MO ] for a discussion of this phenomenon in the study of boundary regularity.
Our approach is analogous to De Giorgi's strategy for the $\epsilon$-regularity of minimal surfaces, foremost in the sense that it proceeds via harmonic approximation. In fact, our strategy is surprisingly similar to Schoen & Simon's variant [SS ] of that regularity theory:
• Both approaches rely on the fact that the configuration is minimal given its own boundary conditions (Dirichlet boundary conditions there [SS , ( )], flux boundary conditions here [GO , ( . )]), see [SS , p. ] and [GO , Proof of Proposition . , Step ]; the Euler-Lagrange equation does not play a role in either approach.
• Both approaches have to cope with a mismatch in description between the given configuration and the harmonic construction (non-graph vs. graph there, time-dependent flux vs. time-independent flux here) which leads to an error term that luckily is superlinear in the energy, see [SS , ( )] and [GO , Proof of Proposition . , Step ]. On a very high-level description, this super-linearity may be considered as coming from the next term in a Taylor expansion of the nonlinearity, which in the right topology can be seen via lower-dimensional isoperimetric principles, see [SS , Lemma ] and [GO , Lemma . ]. (However, there is no direct analogue of the Lipschitz approximation here.)
• Both approaches have to establish an approximate orthogonality that allows one to relate the distance between the minimal configuration and the construction in an energy norm by the energy gap, see [SS , p. .] and [GO , Proof of Proposition . , Step ] in the simple setting or rather [GHO , Lemma . ] in our setting; it thus ultimately relies on some (strict) convexity, see [SS , ( )].
• In order to establish this approximate orthogonality, both approaches have to smooth out the boundary data (there by simple convolution, here in addition by a nonlinear approximation), see [SS , ( ),( ),( )] and [GHO , Proposition . ,( . ), ( . )].
• In view of this, both approaches have to choose a good radius for the cylinder (in the Eulerian space-time here) on which the construction is carried out, see [SS , p. ] and [GHO , Section . . ].
The advantages of a variational approach become particularly apparent in this paper, when we pass from a Euclidean cost function to a more general one: We may appeal to the concept of almost minimizers, which is well-established for minimal surfaces.$^{7}$ In our case, this simple concept means that, on a given scale, we interpret the minimizer (always with respect to its own boundary conditions) of the problem with $c$ as an approximate minimizer of the Euclidean problem. This allows us to directly appeal to the Euclidean harmonic approximation [GHO , Theorem . ]. Incidentally, while dealing with Hölder continuous densities like in [GO ] and not general measures as in [

We now address a somewhat hidden, but quite important additional difficulty that has to be overcome when passing from a Euclidean to a general cost functional$^{8}$: While the Kantorovich formulation of optimal transportation has the merit of being convex, it is so in a very degenerate way; the Benamou-Brenier formulation on the contrary uncovers an almost strict$^{9}$ convexity. The variational approach to regularity capitalizes on this strict convexity. However, this Eulerian reformulation seems naturally available only in the Riemannian case, and its strict convexity seems apparent only in the Euclidean setting. This is one of the reasons to appeal to the concept of almost minimizer, since it allows us to pass from a general cost function to the Euclidean one. However, for configurations that are not exact minimizers of the Euclidean cost functional, the Lagrangian cost $\int |x - y|^2 \,\mathrm{d}\pi$ and the cost $\int \frac{1}{\rho} |j|^2$ of their Eulerian description ( . ) are in general different: While the Eulerian cost is always dominated by the Lagrangian one, the inequality is typically strict$^{10}$.
Hence the prior work [GO , GHO , MO ] on the variational approach used the Euler-Lagrange equation in a somewhat hidden way, namely in terms of the coincidence of Eulerian and Lagrangian cost. Luckily, the discrepancy between the two functionals can be controlled for almost minimizers, see Lemma . .
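To make the gap between Lagrangian and Eulerian cost concrete, here is a minimal worked example of our own in one dimension. It is only an illustrative sketch and not necessarily the example referred to in the text; the sets $A_t$, $B_t$ and the overlap length $\ell(t)$ are notation introduced for this illustration, and we use the unnormalized costs $\int |x-y|^2\,\mathrm{d}\pi$ and $\int \frac{1}{\rho}|j|^2$.

```latex
% Two unit intervals swapping places (hypothetical illustration).
% Marginals: \mu = \lambda = Lebesgue measure on [0,1] \cup [2,3].
% Coupling \pi: x \mapsto x+2 on [0,1] and x \mapsto x-2 on [2,3] (not optimal).
\begin{align*}
  \text{Lagrangian cost:}\quad
    & \int |x-y|^2 \,\mathrm{d}\pi = 2^2 \cdot 1 + 2^2 \cdot 1 = 8. \\
  \text{Positions at time } t:\quad
    & A_t = [2t,\, 1+2t], \qquad B_t = [2-2t,\, 3-2t], \\
    & \rho_t = \mathbf{1}_{A_t} + \mathbf{1}_{B_t}, \qquad
      j_t = 2\,\mathbf{1}_{A_t} - 2\,\mathbf{1}_{B_t}. \\
  \text{Overlap length:}\quad
    & \ell(t) := |A_t \cap B_t| = \max\{0,\, 1 - |2 - 4t|\}, \qquad
      \int_0^1 \ell(t)\,\mathrm{d}t = \tfrac14. \\
  \text{Eulerian cost:}\quad
    & \int_0^1 \!\! \int \frac{|j_t|^2}{\rho_t}\,\mathrm{d}x\,\mathrm{d}t
      = \int_0^1 4\,\bigl(2 - 2\ell(t)\bigr)\,\mathrm{d}t
      = 8 - 8 \cdot \tfrac14 = 6 \;<\; 8.
\end{align*}
```

On the overlap $A_t \cap B_t$ the two opposite velocities cancel in the flux ($j_t = 0$ while $\rho_t = 2$), which is exactly where the Eulerian description loses cost compared to the Lagrangian one.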
Main results. Let $X, Y \subset \mathbb{R}^d$ be compact. We assume that the cost function $c : X \times Y \to \mathbb{R}$ satisfies: Let $\rho_0, \rho_1 : \mathbb{R}^d \to \mathbb{R}$ be two probability densities, with $\operatorname{Spt} \rho_0 \subseteq X$ and $\operatorname{Spt} \rho_1 \subseteq Y$. It is well-known that under (an even milder regularity assumption than) condition (C ), the optimal transportation problem $\inf_{\pi} \int_{X \times Y} c(x, y) \,\mathrm{d}\pi$, where the infimum is taken over all couplings $\pi$ between the measures $\rho_0 \,\mathrm{d}x$ and $\rho_1 \,\mathrm{d}y$, admits a solution $\pi$, which we call a $c$-optimal coupling. For $R > 0$ we define the set $\#_R := (B_R \times \mathbb{R}^d) \cup (\mathbb{R}^d \times B_R)$, which is quite natural in the context of optimal transportation, because it allows for a symmetric treatment of the transport problem: it is suitable to describe all the mass that gets transported out of $B_R$, and all the mass that is transported into $B_R$. For $\alpha \in (0, 1)$ we write $[\nabla_{xy} c]_{\alpha, R}$ for the $C^{0,\alpha}$-semi-norm of the mixed derivative $\nabla_{xy} c$ of the cost function in the cross $\#_R$, and denote by Fixing $\rho_0(0) = \rho_1(0) = 1$, we think of the densities as non-dimensional objects. This means that $\pi(\#_R) = \int_{B_R} (\rho_0 + \rho_1) - \pi(B_R \times B_R)$ has units of (length)$^d$, so that the Euclidean transport energy $\int_{\#_R} \frac{1}{2} |x - y|^2 \,\mathrm{d}\pi$ has dimensionality (length)$^{d+2}$, which explains the normalization by $R^{-(d+2)}$ in assumption ( . ) and in the definition ( . ) of $\mathcal{E}$ below, making it a non-dimensional quantity$^{11}$.
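The dimensional count above can be checked by an explicit rescaling. The following is a short sketch under our own notation: the dilation $S_R$, the rescaled coupling $\hat\pi$ and the rescaled densities $\hat\rho_j$ are names introduced here for illustration only, and $\#_R$ denotes the cross $(B_R \times \mathbb{R}^d) \cup (\mathbb{R}^d \times B_R)$.

```latex
% Rescaling to unit scale (hypothetical notation: S_R, \hat\pi, \hat\rho_j).
\begin{align*}
  & S_R(x,y) := \Bigl(\tfrac{x}{R}, \tfrac{y}{R}\Bigr), \qquad
    \hat\pi := R^{-d}\,(S_R)_\# \pi . \\
  % The factor R^{-d} keeps the densities non-dimensional:
  & \text{marginals of } \hat\pi:\quad
    \hat\rho_j(\hat x)\,\mathrm{d}\hat x
    \ \text{ with } \ \hat\rho_j(\hat x) = \rho_j(R\hat x), \qquad
    \hat\rho_j(0) = \rho_j(0) = 1 . \\
  % The energy on the unit cross picks up exactly R^{-(d+2)}:
  & \int_{\#_1} |\hat x - \hat y|^2 \,\mathrm{d}\hat\pi
    = R^{-d} \int_{\#_R} \Bigl|\tfrac{x-y}{R}\Bigr|^2 \mathrm{d}\pi
    = \frac{1}{R^{d+2}} \int_{\#_R} |x-y|^2 \,\mathrm{d}\pi .
\end{align*}
```

In other words, the normalization by $R^{-(d+2)}$ is what makes the energy at scale $R$ agree with the energy of the rescaled coupling at scale $1$.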
We stress that the implicit constant in ( . ) is independent of the cost $c$. The scale $R_0$ below which our $\epsilon$-regularity result holds has to be such that $B_{2R_0} \subseteq X \cap Y$ and such that the qualitative $L^\infty/L^2$ bound (Lemma . ) holds. We note that the dependence of $R_0$ on $c$ and the implicit dependence on $c$ in the smallness assumption ( . ) are only through the qualitative information (C )-(C ), see Remark . and Lemma . for details. Note also that, without appealing to the well-known result that the solution of ( . ) is a deterministic coupling $\pi = (\mathrm{Id} \times T)_\# \rho_0$, this structural property of the optimal coupling is an outcome of our iteration.
Remark . . Under the same assumptions as in Theorem . , in particular only asking for the one-sided energy $\frac{1}{R^{d+2}} \int_{B_{4R} \times \mathbb{R}^d} |x - y|^2 \,\mathrm{d}\pi$ to be small in ( . ), we can also prove the existence of a function $T^* \in C^{1,\alpha}(B_R)$ such that $(\mathbb{R}^d \times B_R) \cap \operatorname{Spt} \pi \subseteq \{(T^*(y), y) : y \in B_R\}$, with the same estimate on the semi-norm of $\nabla T^*$. This follows from the symmetric nature of the assumptions (C )-(C ), of the normalization conditions on the densities and the cost, and of the smallness assumption ( . ). We refer the reader to the proof of Theorem . (Step ) to see how ( . ) entails smallness of a symmetric version of the Euclidean transport energy, as defined in ( . ), at a smaller scale.
As in [DF ], Theorem . leads to a partial regularity result for a $c$-optimal transport map $T$, that is, a map such that the $c$-optimal coupling between $\rho_0$ and $\rho_1$ is of the form $\pi = (\mathrm{Id} \times T)_\# \rho_0$. The existence of such a map is a classical result in optimal transportation under assumptions (C )-(C ) on the cost, as well as its particular structure, namely the fact that it derives from a potential. More precisely, there exists a $c$-convex function $u : X \to \mathbb{R}$ such that $T(x) = T_u(x) := \text{c-exp}_x(\nabla u(x))$, where the $c$-exponential map is well-defined in view of (C ) and (C ) via Recall that a function $u : X \to \mathbb{R}$ is $c$-convex if there exists a function : Note that by assumption (C ) and the boundedness of , the function $u$ is semi-convex, i.e., there exists a constant $C$ such that $u + C|x|^2$ is convex. Hence, by Alexandrov's Theorem (see, for instance, [EG , Theorem . ], or [Vil , Theorem . ]), $u$ is twice differentiable at a.e. $x \in X$. For more details on $c$-convexity and its connection to optimal transport and Monge-Ampère equations we refer to [Vil , Chapter ] and [Fig , Section . ].

Before stating the partial regularity result, let us mention that our $L^2$-based assumption on the smallness of the Euclidean energy of the forward transport is not more restrictive than the $L^\infty$-based assumption of closeness of the Kantorovich potential $u$ to $\frac{1}{2}|\cdot|^2$ in [DF , Theorems . & . ]. However, the assumption on $u$ is not invariant under transformations of $c$ and $u$ that preserve optimality, whereas the optimal transport map $T$, and hence our assumption on the energy $R^{-d-2} \int_{B_{4R}} |x - T(x)|^2 \rho_0(x) \,\mathrm{d}x$, are unaffected. For that reason we additionally have to fix $\nabla_{xx} c(0, 0) = 0$, and $\nabla_{yy} c(0, 0) = 0$ in the following corollary$^{13}$, and ask for $[\nabla_{xy} c]_{\alpha, 4R}$ to be small.

$^{11}$ Notice that we are using a different convention here than in [GHO ], since it is more natural to work with the non-dimensional energy in our context.
$^{12}$ An assumption of the form $f \ll 1$ means that there exists $\epsilon > 0$, typically only depending on the dimension and Hölder exponents, such that if $f \le \epsilon$, then the conclusion holds. We write $\ll_\Lambda$ to indicate that $\epsilon$ also depends on the parameter $\Lambda$. The symbols $\sim$, $\gtrsim$ and $\lesssim$ indicate estimates that hold up to a global constant $C$, which typically only depends on the dimension and Hölder exponents. For instance, $f \lesssim g$ means that there exists such a constant with $f \le C g$. $f \sim g$ means that $f \lesssim g$ and $g \lesssim f$.
Hence, in this result we think of the cost as being close to $-x \cdot y$, which is not necessarily the case in Theorem . .
As recently pointed out in [Gol ] for the quadratic case, the variational approach is flexible enough to also obtain $\epsilon$-regularity for optimal transport maps between merely continuous densities.
The modifications presented in [Gol ] can be combined with our results to prove an $\epsilon$-regularity result for the class of general cost functions considered above. This will be the content of a separate note [PR ].
Finally, Theorem . can also be applied to optimal transportation on a Riemannian manifold $M$ with cost given by the square of the Riemannian distance function $d$: if $\rho_0, \rho_1 \in C^{0,\alpha}(M)$ are two probability densities, locally bounded away from zero and infinity on $M$, then the optimal transport map $T : M \to M$ sending $\rho_0$ to $\rho_1$ for the cost $c = \frac{d^2}{2}$ is a $C^{1,\alpha}$-diffeomorphism outside two closed sets Σ , Σ ⊂ $M$ of measure zero. See [DF , Theorem . ] for details.
Strategy of the proofs. In this section we sketch the proof of the $\epsilon$-regularity Theorem . . As in [GO , GHO ] one of the key steps is a harmonic approximation result, which can be obtained by an explicit construction and (approximate) orthogonality on an Eulerian level.
$L^\infty$ bound on the displacement. A crucial ingredient to the variational approach is a local $L^\infty/L^2$-estimate on the level of the displacement. More precisely, given a scale $R$, it gives a pointwise estimate on the non-dimensionalized displacement $\frac{y - x}{R}$ in terms of the (non-dimensionalized) Euclidean transport energy which amounts to a squared $L^2$-average of the displacement. While this looks like an inner regularity estimate in the spirit of the main result, Theorem . , it is not. In fact, it is rather an interpolation estimate with the $c$-monotonicity of $\operatorname{Spt} \pi$ providing an invisible second control next to the energy. This becomes most apparent in the simple context of [GO , Lemma . ] where monotonicity morally amounts to a (one-sided) $L^\infty$-control of the gradient of the displacement. The interpolation character of the estimate still shines through in the fractional exponent $\frac{2}{d+2} \in (0, 1)$ on the $L^2$-norm.
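The origin of the exponent $\frac{2}{d+2}$ can be seen in a simplified model computation. The following is only a heuristic sketch under assumptions of our own (Euclidean cost, unit density, a monotone map $T$, and an energy measured on $B_{2R}$); the names $u$, $x_0$, $e$, $z$, $M$ are introduced for this illustration and the argument is not the proof given in the paper.

```latex
% Heuristic in the model case (hypothetical setting and notation).
% u(x) := T(x) - x, with T monotone: (T(x)-T(x')) \cdot (x-x') \ge 0.
\begin{align*}
  & \text{Monotonicity in terms of } u:\quad
    (u(x') - u(x_0)) \cdot z \;\ge\; -|z|^2, \qquad x' = x_0 + z. \\
  & \text{Suppose } u(x_0) = MR\,e,\ |e| = 1,\ M \le 1,\ x_0 \in B_R,
    \text{ and take } \bigl| z - \tfrac{MR}{4} e \bigr| \le \tfrac{MR}{16}: \\
  & \qquad u(x_0) \cdot z \ge \tfrac{3}{16}(MR)^2, \qquad
    |z| \le \tfrac{5}{16} MR, \\
  & \qquad u(x') \cdot z \ge \tfrac{3}{16}(MR)^2 - \tfrac{25}{256}(MR)^2
      = \tfrac{23}{256}(MR)^2
    \;\Longrightarrow\; |u(x')| \ge \tfrac{23}{80}\, MR . \\
  & \text{Hence, with } \mathcal{E} := \frac{1}{R^{d+2}} \int_{B_{2R}} |u|^2 \,\mathrm{d}x: \\
  & \qquad \mathcal{E} \;\ge\; \frac{1}{R^{d+2}}\, |B_1|
      \Bigl(\tfrac{MR}{16}\Bigr)^{d} \Bigl(\tfrac{23}{80}\, MR\Bigr)^{2}
      \;\sim\; M^{d+2}
    \;\Longrightarrow\;
    M \;\lesssim\; \mathcal{E}^{\frac{1}{d+2}}
      = \bigl(\mathcal{E}^{\frac12}\bigr)^{\frac{2}{d+2}} .
\end{align*}
```

The one-sided control coming from monotonicity forces a large displacement at one point to persist on a ball whose radius is proportional to that displacement, and integrating over this ball produces the power $d + 2$.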
Following [GHO ], we here allow for general measures $\mu$ and $\lambda$; the natural local control of these data on the energy scale is given by which measures locally at scale $R > 0$ the distance from given measures $\mu$ and $\lambda$ to the Lebesgue measure, where is the quadratic Wasserstein distance between and d .$^{15}$ Notice that if $\mu = \rho_0 \,\mathrm{d}x$ and $\lambda = \rho_1 \,\mathrm{d}y$ with Hölder continuous probability densities such that $\frac{1}{2} \le \rho_j \le 2$ on $B_R$, $j = 0, 1$, see Lemma A. in the appendix. The new aspect compared to [GHO , Lemma . ] is the general cost function $c$. Not surprisingly, it turns out that the result still holds provided $c$ is close to Euclidean and that the closeness is measured in the non-dimensional $C^2$-norm. We stress the fact that this closeness is not required on the entire "cross" $\#_{5R}$, cf. ( . ), but only on the "finite cross" This is crucial, since only this smallness is guaranteed by the finiteness of the $C^{2,\alpha}$-norm, cf. ( . ) below. This sharpening is a consequence of the qualitative hypotheses (C )-(C ).
Proposition . . Assume that the cost function $c$ satisfies (C )-(C ), and let $\pi \in \Pi(\mu, \lambda)$ be a coupling with $c$-monotone support.
For all $\Lambda < \infty$ and for all $R > 0$ such that and for which and we have that

$^{15}$ We use the convention that $W^2(\mu, \lambda) = \inf_{\pi \in \Pi(\mu, \lambda)} \int \frac{1}{2} |x - y|^2 \,\mathrm{d}\pi$ in this paper.
$^{16}$ Whenever there is no room for confusion, we will drop the dependence of $\mathcal{D}$ on the measures $\mu$ and $\lambda$.

Remark . . A close look at the proof of Proposition . actually tells us that if $\mathcal{E}_{6R}$ is replaced in ( . ) by the one-sided energy, that is, if we assume $\ll 1$ and if ( . ) is replaced by the one-sided inclusion then we still get a one-sided $L^\infty$ bound in the form of . This observation will be useful in the proof of Theorem . to relate the one-sided energy in ( . ) to the full energy in Proposition . .
Note that due to assumption ( . ) Proposition . might appear rather useless: indeed, one basically has to assume a (qualitative) $L^\infty$ bound in the sense that there is a constant $\Lambda < \infty$ such that if $x \in B_{5R}$, then $y \in B_{\Lambda R}$, in order to obtain the $L^\infty$ bound ( . ). However, as we show in Lemma . , due to the global assumptions (C )-(C ) alone, there exists a scale $R_0 > 0$ and a constant $\Lambda_0 < \infty$ such that ( . ) holds. Moreover, in the Campanato iteration used to prove Theorem . , which is based on suitable affine changes of coordinates, the qualitative $L^\infty$ bound ( . ) is reproduced in each step of the iteration (with a constant $\Lambda$ that after the first step can be fixed throughout the iteration, e.g. $\Lambda = 27$ works).
Almost-minimality with respect to Euclidean cost. One of the main new contributions of this work is showing that the concept of almost-minimality, which is well-established in the theory of minimal surfaces, can lead to important insights also in optimal transportation. The key observation is that if $c$ is quantitatively close to Euclidean cost, then a minimizer of ( . ) is almost-minimizing for the quadratic cost.
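The first, purely algebraic step behind this observation can be sketched as follows. We assume two measures $\mu$, $\lambda$ with finite second moments, a $c$-optimal coupling $\pi \in \Pi(\mu, \lambda)$ and an arbitrary competitor $\tilde\pi \in \Pi(\mu, \lambda)$; this notation is ours, and the sketch stops before the quantitative estimate of the remainder, which is where the closeness of $c$ to the Euclidean cost has to enter.

```latex
% First step towards almost-minimality (sketch; \tilde\pi is any competitor).
\begin{align*}
  \int \tfrac12 |x-y|^2 \,\mathrm{d}(\pi - \tilde\pi)
    &= \int (-x \cdot y) \,\mathrm{d}(\pi - \tilde\pi)
       && \text{(same marginals: the } |x|^2, |y|^2 \text{ terms cancel)} \\
    &= \int c \,\mathrm{d}(\pi - \tilde\pi)
       + \int \bigl(-x \cdot y - c\bigr) \,\mathrm{d}(\pi - \tilde\pi) \\
    &\le \int \bigl(c(x,y) + x \cdot y\bigr) \,\mathrm{d}(\tilde\pi - \pi)
       && \text{($c$-optimality of } \pi\text{)} .
\end{align*}
% Consequently:
\[
  \int \tfrac12 |x-y|^2 \,\mathrm{d}\pi
  \;\le\; \int \tfrac12 |x-y|^2 \,\mathrm{d}\tilde\pi
  \;+\; \int \bigl(c(x,y) + x \cdot y\bigr) \,\mathrm{d}(\tilde\pi - \pi) .
\]
```

The remainder involves only the difference between $c$ and the bilinear cost $-x \cdot y$, integrated against the difference of two couplings with identical marginals, so any part of $c(x,y) + x \cdot y$ that depends on $x$ alone or on $y$ alone drops out.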
One difficulty in applying the concept of almost-minimality is that we are dealing with local quantities, for which local minimality (being minimizing with respect to its own boundary condition) would be the right framework to adopt.
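Lemma . . Let $\pi \in \Pi(\mu, \lambda)$ be a $c$-optimal coupling between the measures $\mu$ and $\lambda$. Then for any Borel set $\Omega \subseteq \mathbb{R}^d \times \mathbb{R}^d$ the coupling $\pi_\Omega := \pi \llcorner \Omega$ is $c$-optimal given its own marginals, i.e. $c$-optimal between the measures $\mu_\Omega$ and $\lambda_\Omega$ defined via $\mu_\Omega(A) := \pi_\Omega(A \times \mathbb{R}^d)$ and $\lambda_\Omega(A) := \pi_\Omega(\mathbb{R}^d \times A)$ for any Borel measurable $A \subseteq \mathbb{R}^d$.
$^{17}$ The assumption of Theorem . is scaled to $6R$ here to match the scale on which smallness of $\mathcal{E}$ and $\mathcal{D}$ is assumed in both statements.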
This lemma allows us to restrict any $c$-optimal coupling to a "good" set, where particle trajectories are well-behaved in the sense that they satisfy an $L^\infty$ bound. In particular, we have the following corollary:

Corollary . . Let $\pi \in \Pi(\mu, \lambda)$ be a $c$-optimal coupling between the measures $\mu$ and $\lambda$ with the property that there exists $M \le 1$ such that for all $(x, y) \in \#_R \cap \operatorname{Spt} \pi$ we have $|x - y| \le MR$. Then the coupling $\pi_R := \pi \llcorner \#_R$ is $c$-optimal between the measures $\mu_R$ and $\lambda_R$ as defined in ( . ) and we have that $\operatorname{Spt} \mu_R, \operatorname{Spt} \lambda_R \subseteq B_{2R}$ (in particular $\operatorname{Spt} \pi_R \subseteq B_{2R} \times B_{2R}$), $\mu_R = \mu$ and $\lambda_R = \lambda$ on $B_R$, and $\mu_R \le \mu$, $\lambda_R \le \lambda$.

One of the main observations now is that $c$-optimal couplings of the type considered in Corollary . are almost-minimizers of the Euclidean transport cost. The following assumptions ( . ) and ( . ) should be read as properties satisfied by the marginal measures $\mu$ and $\lambda$ of the restriction of a $c$-optimal coupling to a finite cross on which the $L^\infty$ bound ( . ) holds. Moreover, one of the marginals should be close to the Lebesgue measure in the sense that ( ) .

Proposition . . Let $\mu$ and $\lambda$ be two measures such that
for some $R > 0$. Let $\pi \in \Pi(\mu, \lambda)$ be a $c$-optimal coupling between the measures $\mu$ and $\lambda$. Then $\pi$ is almost-minimizing for the Euclidean cost, in the sense that for any $\tilde\pi \in \Pi(\mu, \lambda)$ we have that where for some constant $C$ depending only on $d$.
The above statement is most naturally formulated in terms of couplings, that is, in the Kantorovich framework. However, the way (almost-)minimality enters in the proof of the harmonic approximation result (see Theorem . below), it is needed in the Eulerian picture, where the construction of a competitor is done.
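The Eulerian side of optimal transportation. Given a coupling $\pi \in \Pi(\mu, \lambda)$ between measures $\mu$ and $\lambda$, we can define its Eulerian description, i.e. the density-flux pair $(\rho_t, j_t)$ associated to the coupling $\pi$ by
$$\int \zeta_t \,\mathrm{d}\rho_t := \int \zeta_t((1-t)x + ty) \,\mathrm{d}\pi, \qquad \int \xi_t \cdot \mathrm{d}j_t := \int \xi_t((1-t)x + ty) \cdot (y - x) \,\mathrm{d}\pi$$
for $t \in [0, 1]$ and for all test functions $\zeta \in C^0(\mathbb{R}^d \times [0, 1])$ and fields $\xi \in C^0(\mathbb{R}^d \times [0, 1])^d$. It is easy to check that $(\rho_t, j_t)$ is a distributional solution of the continuity equation $\partial_t \rho_t + \nabla \cdot j_t = 0$, that is, for any $\zeta \in C^1(\mathbb{R}^d \times [0, 1])$ there holds
$$\int_0^1 \int \bigl( \partial_t \zeta_t \,\mathrm{d}\rho_t + \nabla \zeta_t \cdot \mathrm{d}j_t \bigr) \,\mathrm{d}t = \int \zeta_1 \,\mathrm{d}\lambda - \int \zeta_0 \,\mathrm{d}\mu.$$
For brevity, we will often write $(\rho, j) := (\rho_t \,\mathrm{d}t, j_t \,\mathrm{d}t)$. Being divergence-free in $(t, x)$, the density-flux pair $(\rho, j)$ admits internal (and external) traces on $\partial(B_R \times (0, 1))$ for any $R > 0$, see [CF ] for details, i.e., there exists a measure on We also introduce the time-averaged measure on $\partial B_R$ defined via Similarly, defining the measure $\bar{j} := \int_0^1 \mathrm{d}j(\cdot, t)$, it is easy to see that $\nabla \cdot \bar{j} = \mu - \lambda$ and that therefore $\bar{j}$ admits internal and external traces, which agree for all $R > 0$ with $|\bar{j}|(\partial B_R) = \mu(\partial B_R) = \lambda(\partial B_R) = 0$, and the internal trace agrees with . Note that we have the duality [San , Proposition . ] which immediately implies the subadditivity of ( , also holds for any open set ⊆ $\mathbb{R}^d$. From the inequality $\xi \cdot (y - x) - \frac{1}{2} |\xi|^2 \le \frac{1}{2} |y - x|^2$, which is true for any $x, y, \xi \in \mathbb{R}^d$, the duality formula ( . ) immediately implies that the Eulerian cost of the density-flux pair $(\rho, j)$ corresponding to a coupling $\pi$ via ( . ) is always dominated by the Lagrangian cost of $\pi$, i.e.
$$\int \frac{1}{\rho} |j|^2 \le \int |x - y|^2 \,\mathrm{d}\pi.$$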
We stress that this inequality is in general strict, see the example in Footnote . Contrary to the case of quadratic cost $c(x, y) = \frac{1}{2} |x - y|^2$, or, equivalently, $c(x, y) = -x \cdot y$, given an optimal coupling $\pi$ for the cost $c$, the density-flux pair $(\rho, j)$ associated to $\pi$ in the sense of ( . ) is not optimal for the Benamou-Brenier formulation [BB ] of optimal transportation, i.e., where the continuity equation and boundary conditions are understood in the weak sense ( . ), see [Vil , Chapter ] for details.
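The pointwise inequality behind the domination of the Eulerian by the Lagrangian cost is just a completed square. For the reader's convenience we spell it out, writing $\xi$ for the test vector (a naming choice made for this illustration):

```latex
% Completing the square, for arbitrary x, y, \xi \in \mathbb{R}^d.
\[
  \tfrac12 |y-x|^2 - \Bigl( \xi \cdot (y-x) - \tfrac12 |\xi|^2 \Bigr)
  \;=\; \tfrac12 \bigl| (y-x) - \xi \bigr|^2 \;\ge\; 0 ,
\]
% with equality if and only if \xi = y - x.
```

Equality requires the test field to coincide with the displacement $y - x$ along every trajectory, which is impossible at points where trajectories with different velocities overlap; this is the mechanism by which the inequality between the two costs can become strict.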
As another consequence, while displacement convexity guarantees in the Euclidean case that the Eulerian density satisfies $\rho \le 1$ (up to a small error), cf. [GO , Lemma . ], in our case $\rho$ is in general merely a measure. This complication is already present in [GHO ] and led to important new insights in dealing with marginals that are not absolutely continuous with respect to Lebesgue measure in the Euclidean case, upon which we are also building in this work.
The Eulerian version of the almost-minimality Proposition . can then be obtained via the following lemma:

Lemma . . Let $\pi \in \Pi(\mu, \lambda)$ be a coupling between the measures $\mu$ and $\lambda$ with the property that there exists a constant $\Delta < \infty$ such that for any $\tilde\pi \in \Pi(\mu, \lambda)$, and let $(\rho, j)$ be its Eulerian description defined in ( . ). Then

The harmonic approximation result. The main ingredient in the proof of Theorem . is the harmonic approximation result, which states that if a coupling between two measures supported on a ball (say of radius $7R$ for some $R > 0$) satisfies the $L^\infty$ bound of Proposition . globally on its support and is almost-minimizing with respect to the Euclidean cost, then the displacement $y - x$ is quantitatively close to a harmonic gradient field $\nabla\Phi$ in $\#_R$. This is actually a combination of a harmonic approximation result in the Eulerian picture (Theorem . ) and Lemma . , which allows us to transfer the Eulerian information back to the Lagrangian framework.
Theorem . (Harmonic approximation). Let $R > 0$ and $\mu$, $\lambda$ be two measures with the property that Let further $\pi \in \Pi(\mu, \lambda)$ be a coupling between the measures $\mu$ and $\lambda$, such that: ( ) $\pi$ satisfies a global $L^\infty$-bound, that is, there exists a constant $M \le 1$ such that is the Eulerian description of $\pi$ as defined in ( . ), then there exists a constant $\Delta < \infty$ such that for any Eulerian competitor, i.e. any pair of measures $(\tilde\rho, \tilde{j})$ satisfying Then for every $0 < \tau \ll 1$, there exist $\epsilon_\tau > 0$ and $C, C_\tau < \infty$ such that, provided$^{18}$ the following holds: There exists a radius $R_* \in (3R, 4R)$ such that if $\Phi$ is the solution, unique up to an additive constant, of$^{19}$ then$^{20}$ and

$^{18}$ Note that by assumption ( . ) the coupling $\pi$ is supported on $B_{7R} \times B_{7R}$, so that by means of the estimate $\mathcal{E}_{6R}(\pi) \lesssim \mathcal{E}_{7R}(\pi) = \int |x - y|^2 \,\mathrm{d}\pi$ the smallness assumption of the Euclidean energy of $\pi$ on scale $6R$ could be replaced by a smallness assumption on the global Euclidean energy of $\pi$. However, since $\mathcal{D}$ behaves nicely under restriction only on average, the applicability of the harmonic approximation result derived in [GHO ] becomes more apparent in the form of assumption ( . ).

From the Eulerian version of the harmonic approximation Theorem . we can also obtain a Lagrangian version via almost-minimality:

Lemma . . Let $R > 0$ and let $\pi \in \Pi(\mu, \lambda)$ be a coupling between the measures $\mu$ and $\lambda$, such that ( ) $\pi$ satisfies a global $L^\infty$-bound, that is, there exists a constant $M \le 1$ such that Then for any smooth function $\Phi$ there holds

One-step improvement and Campanato iteration. With the harmonic approximation result at hand, we can derive a one-step improvement result, which roughly says that if the coupling $\pi$ is quantitatively close to $(\mathrm{Id} \times \mathrm{Id})_\# \rho_0$ on some scale $R$, expressed in terms of the estimate $\ll 1$, and the fact that the (qualitative) $L^\infty$ bound on the displacement ( . ) holds, then on a smaller scale $\theta R$, after an affine change of coordinates, it is even closer to $(\mathrm{Id} \times \mathrm{Id})_\# \rho_0$. This is the basis of a Campanato iteration to obtain the existence of the optimal transport map and its $C^{1,\alpha}$ regularity.
We start with the affine change of coordinates and its properties:

Lemma . . Let $\pi \in \Pi(\mu, \lambda)$ be an optimal transport plan with respect to the cost function $c$ between the measures (d ) Given a non-singular matrix ∈ $\mathbb{R}^{d \times d}$ and a vector ∈ $\mathbb{R}^d$, we perform the affine change of coordinates$^{21}$ where = −∇ (0, ) and = 1 ( ) so that in particular $\hat\rho_0(0) = \hat\rho_1(0) = 1$ and from which it follows that $\nabla_{xy} \hat{c}(0, 0) = -\mathbb{I}$, then the coupling is an optimal coupling between the measures (d ) = 0 ( ) d and (d ) = 1 ( ) d with respect to the cost function .

$^{19}$ We recall that since the boundary flux * (as defined in ( . )) is a measure, equation ( . ) has to be understood in the distributional sense, and that =
$^{20}$ We refer to ( . ) for how to understand the left-hand side of ( . ).
$^{21}$ We use the notation $A^{-*} = (A^*)^{-1}$, where $A^*$ is the transpose of $A$, and ∇ 2 ( , ) =

In the change of variables we perform, the role of is to ensure that we get a normalized cost, i.e. $\nabla_{xy} \hat{c}(0, 0) = -\mathbb{I}$, while and det in ( . ) are needed for to define a transportation plan between the new densities. We refer the reader to Appendix B for a proof of this lemma.
( . ) Moreover, we have the inclusion

Let us give a rough sketch of how the one-step improvement result can now be iterated: In a first step, the qualitative bound on the displacement is obtained from the global assumptions (C )-(C ) on the cost function, see Lemma . . This yields an initial scale $R_0 > 0$ below which the cost function is close enough to the Euclidean cost function for ( . ) to hold. We may therefore apply Proposition . , so that after an affine change of coordinates the energy inequality ( . ) holds, the transformed densities and cost function are again normalized at the origin, optimality is preserved, and the qualitative $L^\infty$ bound ( . ) holds for the new coupling. We can therefore apply the one-step improvement Proposition . again, going to smaller and smaller scales. Together with Campanato's characterization of Hölder regularity, this yields the claimed existence and $C^{1,\alpha}$ regularity of $T$.

The details of the above parts of the proof of our main Theorem . are explained in the sections below, with a full proof of Theorem . in Section . The proof of Corollary . is essentially a combination of the ideas in [GO ] and [DF ], and is given for the convenience of the reader in Section .
We conclude the introduction with a comment on the extension of the results presented above to general almost-minimizers with respect to Euclidean cost in the following sense:

Definition . (Almost-minimality w.r.t. Euclidean cost (on all scales)). A coupling $\pi \in \Pi(\mu, \lambda)$ is almost-minimal with respect to Euclidean cost if there exists $r_0 > 0$ and $\Delta_\cdot$ : We will restrict our attention to almost-minimizers in the class of deterministic transport plans coming from a Monge map $T$, i.e. $\pi = \pi_T = (\mathrm{Id}, T)_\# \mu$, and call a transport map $T$ almost-minimizing with respect to Euclidean cost if for all $r \le r_0$ and $x_0 \in \operatorname{int} X$ there holds for all $\tilde{T}$ such that $\tilde{T}_\# \mu = \lambda$ and $\operatorname{graph} \tilde{T} = \operatorname{graph} T$ outside $(B_r(x_0) \times \mathbb{R}^d) \cup (\mathbb{R}^d \times B_r(x_0))$.

In this situation, we get the following generalization$^{23}$ of Theorem . , whose proof will be sketched in Section .

Acknowledgements
We thank Jonas Hirsch for pointing out and explaining the reference [SS ] and other works related to minimal surfaces. We also thank Georgiana Chatzigeorgiou for carefully reading a previous version of this article and for valuable comments, which led to the identification of a more serious error that has been fixed in the current version. We also thank the referee for very helpful comments, in particular for suggesting to include the more general statement on almost-minimizing transport maps.
MP thanks the Max Planck Institute for Mathematics in the Sciences and TR thanks the University of Toulouse for their kind hospitality. MP was partially supported by the Projects MESA (ANR--CE -) and EFI (ANR--CE -) of the French National Research Agency (ANR).
An $L^\infty$ bound on the displacement
In this section we establish an $L^\infty$ bound on the displacement for transference plans ∈ Π( , ) with $c$-monotone support, that is, provided that the transport cost is small, the marginals , are close to the Lebesgue measure, and the cost function is close to the Euclidean cost function. In Lemma . we use the $c$-monotonicity ( . ) combined with the qualitative hypotheses (C )-(C ) in conjunction with compactness to obtain a more qualitative version of the $L^\infty/L^2$-bound, which just expresses finite expansion. In Proposition . this qualitative $L^\infty/L^2$ bound in form of ( . ) is upgraded to the desired quantitative version under the scale-invariant smallness assumption ( . ). The latter is a consequence of the quantitative smallness hypothesis 2 [∇ ] 2 , 1, as we pointed out in Remark . . In both steps, we need to ensure that there are sufficiently many points in Spt close to the diagonal.
This is formulated in Lemma A. , which does not rely on monotonicity.
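For orientation, the following worked computation recalls what $c$-monotonicity reduces to in the model case of the quadratic cost. It is only a sketch under the assumption that $c$-monotonicity of a set is meant in the usual two-point sense, $c(x,y) + c(x',y') \le c(x,y') + c(x',y)$ for all pairs of points $(x,y)$, $(x',y')$ in the set; the symbols $x, x', y, y'$ are introduced here for illustration only.

```latex
% Two-point c-monotonicity for the Euclidean cost c(x,y) = \tfrac12 |x-y|^2.
\begin{align*}
  \tfrac12 |x-y|^2 + \tfrac12 |x'-y'|^2
    &\le \tfrac12 |x-y'|^2 + \tfrac12 |x'-y|^2 \\
  \iff\quad - x\cdot y - x'\cdot y'
    &\le - x\cdot y' - x'\cdot y
    && \text{(the squared norms cancel)} \\
  \iff\quad (x - x')\cdot(y - y') &\ge 0 .
\end{align*}
```

In other words, for the quadratic cost the condition is classical monotonicity of the support, which is the property that a cost close to the Euclidean one inherits up to a small error.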
We make the additional assumption that is normalized in such a way that Note that since ∈ 4 and 1, we have Spt ⊆ 2 ( − ) ⊂ 5 . Integrating inequality ( . ) against the measure ( ) (d d ), it follows that ( . ) Note that by ( . ) the integral on the left-hand side of inequality ( . ) can be expressed as To estimate the latter integral, we recall the following result from [GHO , Lemma . ]: for any ∈ C ∞ ( ), By this estimate with = ( − ·) and using that ∼ 1 by assumption ( . ), we obtain with ( . ) that for some 0 < 1 to be fixed later. Hence, We now estimate each term on the right-hand side of inequality ( . ) separately:
1) For the first term we estimate Using again ( . ) and ∼ 1 for the first term on the right-hand side, estimate ( . ) with = | −·| 2 , and Young's inequality for the second term we obtain
2) For the second term on the right-hand side of ( . ) we use that Spt ⊆ 5 and ( . ), recalling also the definition ( . ) of E, to estimate
3) We may bound the integral in the third term on the right-hand side of ( . ) as for ( . ) by using ( . ), ∼ 1 and estimate ( . ) with = | − ·| to get
Inserting the estimates ( . ), ( . ), ( . ), and ( . ) into inequality ( . ) yields Since is arbitrary and ∼ 1, this turns into We first choose and the implicit constant in ( . ), which in view of ( . ) governs , so small that we may absorb the first term on the right-hand side into the left-hand side. We then choose to be a large multiple of $(E + D)^{\frac{1}{d+2}}$, so that also the second right-hand-side term in ( . ) can be absorbed. This choice of is admissible in the sense of 1 provided the implicit constant in ( . ) is small enough. This yields ( . ).
The next lemma shows that due to the global qualitative information on the cost function , that is, (C )-(C ), there is a scale below which we can derive a qualitative bound on the displacement.
It roughly says that there is a small enough scale after which the cost essentially behaves like Euclidean cost, with an error that is uniformly small due to compactness of the set × .
Lemma . . Assume that the cost function satisfies (C )-(C ) and let ∈ Π( , ) be a coupling with $c$-monotone support.
There exist Λ 0 < ∞ and 0 > 0 such that for all ≤ 0 for which we have the inclusion
Proof. We only prove the inclusion ( 5 × R ) ∩ Spt ⊆ 5 × Λ 0 ; the other inclusion (R × 5 ) ∩ Spt ⊆ Λ 0 × 5 follows analogously since the assumptions are symmetric in and .
Step (Use of $c$-monotonicity of Spt ). Let > 0 be such that ( . ) holds, in the sense that we may use Lemma A. (ii), and set We claim that there exists a constant < ∞, depending only on To show this, we use the $c$-monotonicity ( . ) of Spt . Notice that -monotonicity of Spt implies its -monotonicity: Inserting these two identities into inequality ( . ) gives Using the boundedness of C 2 ( × ) we estimate this expression further by where in the last step we estimated the integrals and used that Since the opening angle of ( , ) is 2 , we have It follows with ( . ) that there exists < ∞ such that |∇ ( , )| C 2 ( × ) (| | + | | + | |) ≤ .
Remark . . Note that if ( . ) is replaced by smallness of the one-sided energy, i.e., then Lemma A. still applies and we obtain the one-sided qualitative bound

Almost-minimality with respect to Euclidean cost
In this section we show that a minimizer of the optimal transport problem with cost function is an approximate minimizer for the problem with Euclidean cost function. However, in order to make full use of the Euclidean harmonic approximation result from [GHO , Proposition . ] on the Eulerian side, we have to be careful in relating Lagrangian and Eulerian energies. This is where the concept of almost-minimality shows its strength, since it provides us with the missing bound of Lagrangian energy in terms of its Eulerian counterpart.
Proof of Proposition . . First, let us observe that we may assume in the following that since otherwise there is nothing to show. By the support assumption ( . ) on and , the couplings and satisfy Together with ( . ) this implies that By the admissibility of , i.e. that and have the same marginals, we may write where is defined as in ( . ). Abbreviating optimality of with respect to the cost function implies that Using again the admissibility of , we may write Note that by the definition ( . ) of , the function satisfies Now, by ( . ), so that, using ( . ) and ( . ), it follows that By the estimate ) , and Hölder's inequality, we get

Eulerian point of view
The purpose of this section is to translate almost-minimality from the Lagrangian setting, as encoded by Proposition . , to the Eulerian setting so that it may be plugged into the proof of the harmonic approximation result [GHO , Proposition . ]. This is achieved by Lemma . below, which relates a (Lagrangian) coupling , which we think of as being an almost-minimizer, to its Eulerian description ( , ) (introduced in ( . )). The proof of Lemma . relies on the fact that the Eulerian cost is always dominated by the Lagrangian one, while the other inequality in general only holds for minimizers of the Euclidean transport cost. However, in the proof of Lemma . we can use almost-minimality of , together with the equality of Eulerian and Lagrangian energy for minimizers of the Euclidean transport cost (Remark . ) to overcome this nuisance.
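The statement that the Eulerian cost is always dominated by the Lagrangian one can be made concrete by a short computation. The sketch below is not taken from the text: it assumes the Eulerian description $(\rho_t, j_t)$ of a coupling $\pi$ is defined by testing along straight-line trajectories, in the spirit of [BB ], and that the Eulerian cost is understood through its usual dual formulation. The test functions $\zeta$, $\xi$ and the normalization by $\tfrac12$ are assumptions and may differ from the conventions fixed in ( . ).

```latex
% Assumed definition of the density/flux pair, for t in [0,1]:
%   \int \zeta \, d\rho_t = \int \zeta((1-t)x + t y) \, \pi(dx\,dy),
%   \int \xi \cdot dj_t   = \int \xi((1-t)x + t y) \cdot (y - x) \, \pi(dx\,dy).
% Assumed dual formulation of the Eulerian cost:
%   \int \frac{1}{2\rho_t} |j_t|^2
%     = \sup_{\xi} \Big\{ \int \xi \cdot dj_t - \int \tfrac12 |\xi|^2 \, d\rho_t \Big\}.
\begin{align*}
  \int \xi \cdot dj_t - \int \tfrac12 |\xi|^2 \, d\rho_t
    &= \int \Big( \xi(X_t)\cdot (y - x) - \tfrac12 |\xi(X_t)|^2 \Big) \, \pi(dx\,dy),
       \qquad X_t := (1-t)x + t y, \\
    &\le \int \tfrac12 |y - x|^2 \, \pi(dx\,dy),
\end{align*}
% using the pointwise inequality  a \cdot v - \tfrac12 |a|^2 \le \tfrac12 |v|^2.
% Taking the supremum over \xi and integrating in t over (0,1) gives
\begin{equation*}
  \int_0^1 \int \frac{1}{2\rho_t} |j_t|^2 \, dt
    \;\le\; \int \tfrac12 |x - y|^2 \, \pi(dx\,dy).
\end{equation*}
```

Equality requires that trajectories do not cross, which is why the reverse inequality is in general only available for minimizers of the Euclidean transport cost.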
The Eulerian version of almost-minimality also implies the following localized version, which will be needed for the harmonic approximation: Corollary . . Let ∈ Π( , ) be a coupling between the measures and with the property that there exists a constant Δ < ∞ such that for any ∈ Π( , ), and let ( , ) be its Eulerian description defined in ( . ). For any > 0 small enough, let be the inner trace of on × (0, 1) in the sense of ( . ), i.e.
Hence, by Lemma . and subadditivity of the Eulerian cost it follows that which implies ( . ).
Lemma . , in the form of the bound ( . ), allows us to relate Eulerian and Lagrangian side of the harmonic approximation result, which will be central in the application to the one-step-improvement Proposition . . The proof of this Lagrangian version is very similar to [GHO , Proof of Theorem . ], however, we stress again that since we are not dealing with minimizers of the Euclidean transport cost, one has to be careful when passing from Eulerian to Lagrangian energies.
Multiplying out the square and using the definition of ( , ) from ( . ), we may write Now note that ( . ) implies a local counterpart of ( . ): for any open set ⊆ R , Arguing with an open -neighborhood of and continuity of the right-hand side with respect to , one may show that ( . ) holds for any closed set , so that in particular which, together with ( . ), gives so that ( . ) follows from the identity .

Harmonic Approximation
In this section we sketch the proof of the (Eulerian) harmonic approximation Theorem . . As already noted in the introduction, the proof of Theorem . is done at the Eulerian level (as in [GO , GHO ]) by constructing a suitable competitor.
Proof of Theorem . . By scaling we may without loss of generality assume that = 1. Let ( , ) be the Eulerian description of the coupling ∈ Π( , ). The proof of the Eulerian version of the harmonic approximation consists of the following four steps, at the heart of which is the construction of a suitable competitor (Step ). Note that since we want to make the dependence on the parameter in the $L^\infty$-bound ( . ) precise, one actually has to look a bit closer into the proofs of the corresponding statements in [GHO ], since in their presentation of the results the estimate $(E_6 + D_6)^{\frac{1}{d+2}}$ is used. 27

Step (Passage to a regularized problem). Choose a good radius * ∈ (3, 4) for which the flux is well-behaved on * . Actually, since we are working with 2 -based quantities, to be able to get 2 bounds on ∇Φ, we would have to be able to estimate in 2 or at least in the Sobolev trace space 2( −1) . However, since the boundary fluxes are just measures and since for the approximate orthogonality (see Step ) a regularity theory is required up to the boundary, one first has to go to a solution (with ∫ * d = 0) of a regularized problem Δ = const in * and * · ∇ = on * , where const is the generic constant for which the equation has a solution, and is a regularization through rearrangement of with good 2 bounds (see [GHO , Section . . ] for details).
Using properties of the regularized flux and elliptic regularity, the error made by this approximation can be quantified as for any ∈ (0, 1), see [GHO , Proof of Proposition . ]. Note that by the definition of in ( . ) and assumption ( . ) we have that ∫ so that together with ( . ) we may estimate, using ≤ 1, ∫ for any ∈ (0, 1). Note that the error term is superlinear in E 6 + D 6 .

Step (Approximate orthogonality [GHO , Proof of Lemma . ]). For every 0 < 1 there The proof of ( . ) essentially relies on the representation formula The three error terms in the second line of this equality are then bounded as follows. The first term uses that and are close in Wasserstein distance. An estimate on the second term relies on the fact that and are close 28 . This bound relies on the choice of a good radius in Step and 2 estimates up to the boundary on ∇ . The bound on the third error term uses elliptic regularity theory and a restriction result for the Wasserstein distance, which implies that 2 * ( ∫ 1 0 , ) E 6 + D 6 . 29 This estimate actually requires a further regularization of , and by relying on interior regularity estimates explains why one has to go from * to the slightly smaller ball 2 in the estimate ( . ). A close inspection of [GHO , Proof of Lemma . ] shows that the term involving in these error estimates comes in product with a superlinear power of E 6 + D 6 as in Step , so that we may bound ≤ 1 and still be able to obtain an arbitrarily small prefactor in ( . ) by choosing E 6 + D 6 small enough.

Step (Construction of a competitor [GHO , Proof of Lemma . ]). For every 0 < 1 there exist > 0 and , < ∞ such that if E 6 + D 6 ≤ , then there exists a density-flux pair ( , ) satisfying ( . ) for = * , and such that

Step (Almost-minimality on the Eulerian level). Since the density-flux pair ( , ) satisfies ( . ) for = * , Corollary . implies that Combining the above steps, we have proved that for any 0 < 1 there exist > 0 and < ∞ such that if Φ is the solution of ( . ), then

27 We do not assume that this bound holds in assumption ( . ). However, in the one-step improvement Proposition . and the consecutive iteration to obtain the $\epsilon$-regularity Theorem . , this is of course an important ingredient, which holds in view of Proposition . .
28 Closeness here means closeness of ± and ± (the positive and negative parts of the measures) with respect to the geodesic Wasserstein distance on * .
29 We note that for the case of quadratic cost and Hölder continuous densities treated in [GO ] a bound on this term is easy due to McCann's displacement convexity, which implies that ≤ 1 up to a small error.

One-step improvement
The following proposition is a one-step improvement result, which will be the basis of a Campanato iteration in Theorem . . Note that the iteration is more complicated than in [GO ], because at each step we have to restrict the $c$-optimal coupling to the smaller cross where the $L^\infty$-bound holds to be able to apply the harmonic approximation result. As a consequence, we have to make sure that the qualitative bound ( . ) on the displacement (which is an important ingredient in obtaining the quantitative version of the $L^\infty$-bound ( . )) is propagated in each step of the iteration. 30 We start with the short proof of Lemma . , which is the starting point of each iteration step.
Given any coupling ∈ Π( Ω , Ω ), we can define := − Ω + . It is easy to see that is an admissible coupling between the measures and , hence by $c$-optimality of , we obtain by the additivity of the cost functional with respect to the transference plan that is, Ω is a $c$-optimal coupling between Ω and Ω .
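The restriction argument above can be checked by hand in a finite setting. The following is a minimal sketch, not part of the original argument: it replaces measures by finitely many unit point masses, uses the squared Euclidean distance as cost, and finds optimal plans by brute force over permutations. All names (`cost`, `optimal_assignment`, `keep`, the sample points) are hypothetical and chosen for illustration only.

```python
from itertools import permutations


def cost(x, y):
    # Squared Euclidean distance on integer points, so all sums are exact.
    return sum((a - b) ** 2 for a, b in zip(x, y))


def total_cost(xs, ys, sigma):
    # Cost of the plan that sends xs[i] to ys[sigma[i]].
    return sum(cost(xs[i], ys[j]) for i, j in enumerate(sigma))


def optimal_assignment(xs, ys):
    # Brute-force minimiser over all bijections (fine for a handful of points).
    return list(min(permutations(range(len(ys))),
                    key=lambda s: total_cost(xs, ys, s)))


xs = [(0, 0), (3, 1), (1, 4), (5, 5), (2, 2)]
ys = [(1, 0), (4, 4), (0, 3), (6, 2), (2, 5)]
sigma = optimal_assignment(xs, ys)

# Restrict the optimal plan to the source points with index in `keep`;
# the marginals of the restricted plan are sub_xs and sub_ys.
keep = [0, 2, 3]
sub_xs = [xs[i] for i in keep]
sub_ys = [ys[sigma[i]] for i in keep]
restricted_cost = sum(cost(xs[i], ys[sigma[i]]) for i in keep)

# Optimal cost of the sub-problem between the restricted marginals.
sub_optimal_cost = total_cost(sub_xs, sub_ys, optimal_assignment(sub_xs, sub_ys))
```

If the restricted plan were not optimal for its own marginals, swapping in a cheaper sub-plan would lower the total cost, which is exactly the competitor constructed in the proof; accordingly `restricted_cost` and `sub_optimal_cost` coincide.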
As a direct application and as a further preparation for the one-step-improvement we present the short proof of Corollary . .
30 Alternatively, one could devise an argument based on the fact that the qualitative $L^\infty$ bound only depends on the cost through its global properties (C )-(C ) and that the set of cost functions considered in the iteration is relatively compact.
Proof of Corollary . . Let ∈ Π( , ) be $c$-optimal. Then by Lemma . the coupling := || == is a $c$-optimal coupling between the measures and defined via for any ⊆ R Borel. In particular, we have that If ⊆ 2 , then since for any ( , ) ∈ Spt , by assumption we have that | − | ≤ ≤ , so × ∩ Spt = ∅. Hence, Spt ⊆ 2 . Similarly, if ⊆ , then which implies that = on . By symmetry, the same properties hold for .
We now give the proof of the one-step improvement Proposition . , which is the workhorse of the Campanato iteration.
Indeed, due to the $L^\infty$-bound ( . ) the marginal measures are supported on 7 if E + D 1 (such that ≤ 1). Furthermore, from Corollary . we have that = 0 d and = 1 d on 6 , as well as ≤ 0 d , which implies that since by assumption ( . ) we may assume that [ 0 ] ≤ 1.

Step (Almost-minimality of || == 6 and applicability of the harmonic approximation Theorem . ). We show next that the coupling || == 6 is an almost-minimizer of the Euclidean transport problem in the sense that for any ∈ Π( , ) there holds 1 2 where we also used that E 7 E. Note that in view of Lemma . the Eulerian description of || == 6 satisfies ( . ). Together with Step and Step this implies that the assumptions of Theorem . are fulfilled.

Step (Transformation of the displacement). We next show that for all
where the error is controlled by Indeed, by ( . ) we have and the second term, which will turn out to be an error term, can be bounded by We show next that 2 + ≈ + ∇Φ( + (1 − ) ), with an error that can be controlled. This relies on the fact that, where and Taylor approximation. Indeed, |∇ 3 Φ( ( + (1 − ) ))| | + (1 − ) | 2 ( . )
31 The first inequality follows from 1 ≥ 1 2 and the Lipschitz continuity of the function ↦ → 1 away from zero.
32 Note that we have not yet fixed .
Proof of $\epsilon$-regularity
To lighten the notation in this subsection, let us set We are now in the position to give the proof of our main $\epsilon$-regularity theorem, which we restate for the reader's convenience: Theorem . . Assume that (C )-(C ) hold and that 0 (0) = 1 (0) = 1, as well as ∇ (0, 0) = −I. Assume further that 0 is in the interior of × .
Proof of Theorem . . To simplify notation, we write E for E ( ) and H for H ( 0 , 1 , ). Note that since 0 (0) = 1 (0) = 1 and ,4 is small by ( . ), we may assume throughout the proof that 1 2 ≤ 0 , 1 ≤ 2 on 4 .
Step (Control of the full energy at scale 2 ). We show that under assumption ( . ) we can bound Indeed, from Remarks . and . , we know that ( . ) implies the $L^\infty$ bound Let us now prove that from which we get ∫ thus yielding ( . ). To this end, assume there exists ( , ) ∈ ( 4 × 2 ) ∩ Spt . Let also ∈ [ , ] ∩ 5 2 and ∈ R such that ( , ) ∈ Spt , see Figure . Then by ( . ), ( . ) and ( . ), we have From ( . ), recalling the definition ( . ) of and the fact that 1, we get, upon writing which together with ( . ) yields a contradiction, proving ( . ). In the following, Step -Step are devoted to proving that under the assumption the following Campanato estimate holds:
Figure . The definition of and in the proof of ( . ).
Step (Iterating Proposition . ). We now show that we can iterate Proposition . a finite number of times.
Moreover, from ( . ), ( . ), and ( . ), we have the estimate so that Similarly, from ( . ), ( . ) and ( . ), Let us now compute 1 By ( . ), we obtain 1 from which ( . ) follows, concluding the proof of ( . ).
Step (Spt is contained in the graph of a function within ×R ). We claim that ( . ) implies the existence of a function : → such that In the following, we abbreviate To prove the claim, fix 0 ∈ and notice that ( . ) implies that for any > 0 small enough, there
Step .A. It is easy to see that the infimum in ( . ) is attained at some = ( 0 ) and = ( 0 ). Analogous to [Cam , Lemma .IV] one can show that there exist a matrix 0 = 0 ( 0 ) and a vector 0 = 0 ( 0 ) such that → 0 and → 0 as → 0 (uniformly in 0 ) with rates and We refer the reader to Appendix C for a proof of the convergences and ( . ).
Step .B. We claim that 1 ∫ Indeed, we can split 1 ∫ By definition of , , we have 1 ∫ Using ( . ), 0 ≤ 2, and 0 ∈ , it follows that Finally, the last term in ( . ) is estimated by Letting → 0 in the above estimates proves the claim ( . ).
Step .C. By disintegration, there exists a family of measures { } ∈ on such that 1 ∫ Since the left-hand side of ( . ) tends to zero as → 0 by Step .B, it follows that if 0 is a Lebesgue point, we must have
Step .D. For any Lebesgue point 0 ∈ , define ( 0 ) := 0 ( 0 ) 0 + 0 ( 0 ). Then the previous Step .C shows that ×R = (Id × ) # 0 , that is ( . ).
Remark . . The deterministic structure of the $c$-optimal coupling, that is, the existence of such that = (Id × ) # 0 , is a classical result in optimal transportation. If we had used this result, the proof would have become shorter, as Step would not have been needed.
Before we give the proof of Corollary . , let us remark that one can show the following variant of our qualitative $L^\infty$ bound on the displacement (Lemma . ): Lemma . . Assume that the cost function satisfies (C )-(C ) and that ∇ (0, 0) = 0. Let be a $c$-convex function.
There exist Λ 0 < ∞ and 0 > 0 such that for all ≤ 0 for which Proof. Since is $c$-convex, it is differentiable a.e. For any ∈ 4 such that ∇ ( ) exists, let = c-exp (∇ ( )), that is, Let be defined as in ( . ). Then, using ∇ (0, 0) = 0, we have Being $c$-convex, the function , and therefore also the function ↦ → ( ) − 1 2 | | 2 , is semi-convex, which implies that see Lemma A. in the appendix. By the closeness assumption on and (C ) we may therefore bound Steps and of the proof of Lemma . then imply that there exist Λ 0 < ∞ and 0 > 0 (depending on only through assumptions (C )-(C )) such that for all ≤ 0 we have that is, ( . ) holds.
Proof of Corollary . . By Lemma . there exist Λ 0 < ∞ and 0 > 0, depending only on the qualitative assumptions (C )-(C ) on such that for all ≤ 0 for which ( . ) holds, we have We claim that 1 − which immediately implies that ( . ) In particular, it follows by Theorem . , that there exists a potentially smaller scale 0 ≤ 0 such that for all ≤ 0 for which ( . ) holds, we have that ∈ C 1, ( ) and ∇ satisfies the bound ( . ). Applying ( . ) once more, we see that ( . ) holds.
To prove the claim ( . ), we appeal to semi-convexity of the $c$-convex function (which implies semi-convexity of the function ↦ → ( ) − 1 2 | | 2 ), in particular Lemma A. , to bound It remains to estimate the latter term. To this end, notice that for a.e. ∈ 4 we have ∇ ( ) = −∇ ( , ( )), so that with the normalization assumption ∇ (0, 0) = 0 we may bound In view of ( . ), we may assume Using that by Lemma A. and the smallness assumption ( . ), we have This proves the claimed inequality ( . ).

$\epsilon$-regularity for almost-minimizers
In this section we give a sketch of the proof of Theorem . . One of the main differences compared to the situation of Theorem . is that our assumptions do not allow us to prove an $L^\infty$ bound on the displacement (which followed from ($c$-)monotonicity of Spt ). However, almost-minimality (on all scales) allows us to obtain an $L^p$ bound for arbitrarily large $p < \infty$.
Proposition . . Assume that 0 , 1 ∈ 0, with 0 (0) = 1 (0) = 1. Let be an almost-minimizing transport map from = 0 d to = 1 d with Δ ≤ 1. Assume further that is invertible. Then there exists a radius 1 = 1 ( 0 , 1 ) > 0 such that for any 6 ≤ 1 , implies that for any $p < \infty$, The scale 1 below which the result holds depends on the global Hölder semi-norms [ 0 ] and [ 1 ] of the densities and the condition 1 ⊂ Spt 0 ∩ Spt 1 .
The proof of Proposition . is given in Appendix D. Note that since −1 is also almost-minimizing, the $L^p$ bound for −1 follows from applying Proposition D. to −1 . The $L^p$ estimate (for arbitrarily large $p < \infty$) allows us to split the particle trajectories into two groups:
• good trajectories that satisfy an $L^\infty$ bound on the displacement, corresponding to starting points in the set where := (E 6 ( ) + D 6 ( , )) for some ∈ (0, 1 +2 ), that we fix in what follows, and
• bad trajectories that are too long, corresponding to
Due to the $L^p$ bound, the energy carried by bad trajectories is superlinearly small: By definition of E 6 ( ) and , from which we see that, given ∈ (0, 1 +2 ), we may choose $p$ large enough so that the exponent is larger than 1. In particular, for any > 0 we may bound 1 provided E 6 ( ) 1. Once the bad trajectories have been removed, the good trajectories can be treated as before. More precisely, if we restrict the coupling to the set G × (G), then the resulting coupling is still deterministic and almost-minimizing with respect to quadratic cost (given its own boundary conditions). In particular, since G × (G) ⊂ || ≤ . The harmonic gradient field allows us to define the affine change of coordinates from Lemma . with = e 2 with = ∇ 2 Φ(0) and = ∇Φ(0) satisfying ( . ) (and = I) to obtain a new coupling = between the measures and from the (full) coupling .
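The mechanism by which an $L^p$ bound makes the energy of the bad trajectories small is a Chebyshev-type estimate. The sketch below spells it out in generic notation that is not taken from the text: $T$ denotes the transport map, $\mu$ the source measure, $M > 0$ the threshold separating good from bad trajectories, and $p > 2$ the integrability exponent; these symbols are assumptions made for illustration.

```latex
% On the bad set B := \{ x : |T(x) - x| > M \} and for p > 2 we have
%   |T(x) - x|^{2-p} \le M^{2-p},   since 2 - p < 0.
\begin{align*}
  \int_{B} |T(x) - x|^2 \, \mu(dx)
    &= \int_{B} |T(x) - x|^{p} \, |T(x) - x|^{2-p} \, \mu(dx) \\
    &\le M^{2-p} \int |T(x) - x|^{p} \, \mu(dx).
\end{align*}
```

If $M$ is a small power of the energy and the $p$-th moment is controlled by a fixed power of the energy, then taking $p$ large makes the right-hand side an arbitrarily high power of the energy relative to $M^{p-2}$, which is the sense in which the contribution of bad trajectories is superlinear.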
We can now use the harmonic approximation result Theorem . together with the harmonic estimates ( . ) to bound 1 For the bad trajectories we use the estimate ( . ) together with the bound +2 E 6 ( ) 2−2 + 2 ( [ 0 ] 2 ,6 + [ 1 ] 2 ,6 ) + E 6 ( )
33 Using that the optimal coupling between G and ( G) is deterministic, so that one can appeal to almost-minimality within the class of deterministic couplings.
34 Note that there is a slight mismatch in the power in the $L^\infty$ bound between the definition of good trajectories and the setting of [GHO ]. However, one can convince oneself easily that the results of [GHO ] still apply.
It remains to show that the transformed coupling is still almost-minimal on all small scales in the sense of Definition . . To this end, let ≤ 1 , ( 0 , 0 ) ∈ Spt , and Note that where 0 = 0 and 0 = −1 0 + . Since is almost-minimizing, it follows that hence is almost-minimizing among deterministic couplings with rate Assuming that Δ = 2 , together with the bounds on and from ( . ) this gives , 1 ) Δ , in particular the rate Δ exhibits the same behaviour as the Hölder seminorm of ∇ in ( . ) and shows that the one-step-improvement can be iterated down to arbitrarily small scales, yielding the C 1, -regularity of in a ball with radius given by a fraction of .
As a corollary of Theorem . , we obtain a variational proof of partial regularity for optimal transport maps proved in [DF ]. The changes of variables used to arrive at a normalized situation are exactly the same as in [DF ] and the argument to derive partial regularity from $\epsilon$-regularity follows [GO ].
Because 0 and 1 are bounded and bounded away from zero, sends sets of measure 0 to sets of measure 0 so that | \ | = | \ | = 0. The goal is now to prove that and are open sets and that is a C 1, -diffeomorphism between and . Fix 0 ∈ , then by ( . ), 0 := ( 0 ) ∈ . Up to translation, we may assume that 0 = 0 = 0. Define so that ( ) = c-exp (∇ ( )), from which we know that is the $c$-optimal transport map from 0 to 1 . By Alexandrov's Theorem, there exists a symmetric matrix such that , we obtain 0 (0) = 1 (0). Up to dividing 0 and 1 by an equal constant, we may assume that 0 (0) = 1 (0) = 1. Moreover, with this change of variables, ( . ) turns into 1 Finally, is still C 2, and satisfies Assumptions (C )-(C ) and since 0 and 1 are bounded and bounded away from zero, 0 and 1 are C 0, , and we have Hence by ( . ) and ( . ), we may apply Theorem . to obtain that is C 1, in a neighborhood of zero. By Remark . , we also obtain that −1 is C 1, in a neighborhood of zero. Going back to the original map, this means that is a C 1, diffeomorphism between a neighborhood of 0 and the neighborhood ( ) of ( 0 ). In particular, × ( ) ⊆ × so that and are both open and by ( . ), is a global C 1, diffeomorphism between and .
Appendix A. Some technical lemmata
A. . Properties of the support of couplings. The following lemma is an important ingredient in the proofs of our $L^\infty$ bounds on the displacement of couplings with $c$-monotone support, Proposition . and Lemma . :  1, then the symmetric results hold, namely ( 2 × ) ∩ Spt ≠ ∅ and ( 7 × ( , )) ∩ Spt ≠ ∅ for all ∈ 5 and ∈ −1 .
Hence ( × 2 ) ∩ Spt ≠ ∅ provided that (E + 6 + D 6 ) 1 +2 .
The next lemma, which is quite elementary, relates the support of a measure and the support of its push forward under an affine transformation: Lemma A. . Let be a measure on R and set := # , where ( ) := + with ∈ R × invertible and ∈ R . Then ∈ Spt ⇔ −1 ( ) ∈ Spt .
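The support statement for push-forwards under affine maps is easy to test on a finitely supported measure, where the support is simply the set of atoms. The following minimal sketch is an illustration under that assumption and not part of the paper; the names `push_forward`, `apply_inverse`, `mu`, `A` and `b` are hypothetical, the dimension is fixed to two, and exact rational arithmetic is used so that membership tests are reliable.

```python
from fractions import Fraction


def push_forward(mu, A, b):
    # mu maps atoms (tuples) to positive masses; returns the image measure
    # under the affine map T(x) = A x + b.
    nu = {}
    for x, mass in mu.items():
        y = tuple(sum(A[i][k] * x[k] for k in range(len(x))) + b[i]
                  for i in range(len(b)))
        nu[y] = nu.get(y, 0) + mass
    return nu


def apply_inverse(y, A, b):
    # T^{-1}(y) = A^{-1}(y - b) for an invertible 2x2 matrix A.
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    assert det != 0, "A must be invertible"
    d = Fraction(1, det)
    inv = [[A[1][1] * d, -A[0][1] * d], [-A[1][0] * d, A[0][0] * d]]
    z = [y[0] - b[0], y[1] - b[1]]
    return tuple(inv[i][0] * z[0] + inv[i][1] * z[1] for i in range(2))


mu = {(0, 0): Fraction(1, 2), (1, 2): Fraction(1, 4), (-1, 3): Fraction(1, 4)}
A = [[2, 1], [0, 3]]
b = (3, -1)
nu = push_forward(mu, A, b)

# y is an atom of nu exactly when T^{-1}(y) is an atom of mu.
in_support = all(apply_inverse(y, A, b) in mu for y in nu)
outside = (0, 0)  # a point that is not an atom of nu
outside_preimage_in_mu = apply_inverse(outside, A, b) in mu
```

Because `Fraction` values compare and hash like equal integers, the preimages of atoms of `nu` are found in `mu`, while the preimage of a point outside the support is not.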
(D. )
Step (Use of almost-minimality at scale ). Let us denote the displacement by := − Id. We claim that if 1 is small enough, for any ball := 12 ( 0 ) ⊂ 1 and any set ⊂ ( 0 ), we have 36 To see this, we start from the following pointwise identity, that holds for any map Φ, which is (a deformation of) a standard identity used to derive ($c$-)monotonicity of an optimal transport map. As usual in optimal transportation, such a monotonicity property follows from
36 We use the notation ⨏ for the average of a function on a set, with respect to the Lebesgue measure.