Convergence Rates for Discretized Monge–Ampère Equations and Quantitative Stability of Optimal Transport

In recent works—both experimental and theoretical—it has been shown how to use computational geometry to efficiently construct approximations to the optimal transport map between two given probability measures on Euclidean space, by discretizing one of the measures. Here we provide a quantitative convergence analysis for the solutions of the corresponding discretized Monge–Ampère equations. This yields H1\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$H^{1}$$\end{document}-converge rates, in terms of the corresponding spatial resolution h, of the discrete approximations of the optimal transport map, when the source measure is discretized and the target measure has bounded convex support. Periodic variants of the results are also established. The proofs are based on new quantitative stability results for optimal transport maps, shown using complex geometry.


Introduction
The theory of optimal transport [47], which was originally motivated by applications to logistics and economics, has generated a multitude of applications ranging from meteorology and cosmology to image processing and computer graphics in more recent years [44,45]. This has led to a rapidly expanding literature on numerical methods to construct optimal transport maps, using an appropriate discretization scheme. From the PDE point of view, this amounts to studying discretizations of the second boundary value problem for the Monge-Ampère operator. The present paper is concerned with a particular discretization scheme, known as semi-discrete optimal transport in the optimal transport literature (see [6] and references therein for other discretization schemes, based on finite differences). This approach uses computational geometry to compute a solution to the corresponding discretized Monge-Ampère equation and exhibits remarkable numerical performance, using a damped Newton iteration [34,38]. The convergence of the iteration toward the discrete solution was recently settled in [31], and one of the main aims of the present paper is to establish quantitative convergence rates of the discrete solutions, as the spatial resolution h tends to zero.

Background
Throughout the paper, we fix open bounded domains X and Y in R n with Y assumed convex and a probability measure ν on Y with a density which is uniformly bounded from below: We recall that in the case when μ is a probability measure on X which is also absolutely continuous with respect to dx, i.e., μ ∈ P ac (X ), then a map T in L ∞ (X , Y ) is said to be a transport map (with source μ and target ν) if and T is said to be an optimal transport map (with respect to the Euclidean cost function |x − y| 2 ), denoted by T ν μ , if it realizes the infimum defining the Wasserstein L 2 -distance W 2 (μ, ν) [47]: where the infimum ranges over all transport maps. By Brenier's theorem [14], there exists a unique optimal transport map T ν μ and it has the characteristic property of being a gradient map: T ν μ = ∇φ (in the almost everywhere sense) for a convex function φ on X , called the potential of T ν μ (see [47] for further background on optimal transport theory). The potential φ is the unique (modulo an additive constant) convex solution to the corresponding second boundary value problem for the Monge-Ampère operator: The sub-gradient image of φ is contained in the closure of Y , (∂φ)(X ) ⊆Ȳ (1.2) and φ solves the equation (1.3) where the Monge-Ampère measure M A g is the probability measure on X defined by M A g (φ) := g(∇φ) det(∇ 2 φ)dx (1.4) when φ is C 2 -smooth and the general definition, due to Alexandrov, is recalled in Sect. 2.1. We will say that (X , Y , μ, ν) is regular if the corresponding solution φ is in C 2 (X ). By Cafferelli's regularity results [17][18][19], this is the case if X and Y are assumed strictly convex with C 2 -boundary and the densities of μ and ν are Hölder continuous and strictly positive onX andȲ , respectively. Then, φ defines a classical solution of the corresponding PDE, and the corresponding optimal transport map ∇φ yields a diffeomorphism between the closures of X and Y . In fact, as is well known, for any probability measure μ, there exists a solution (in the weak sense of Alexandrov) of the corresponding second boundary value problem, which is uniquely determined up to normalization (Lemma 2.2). In the sequel, it will be convenient to use the normalization condition that the integral of a solution over (X , dx) vanishes.

Discretization Using Semi-discrete Transport
A time-honored approach for discretizing Monge-Ampère equations, which goes back to the classical work of Alexandrov on Minkowski type problems for convex bodies and polyhedra, amounts to replacing the given probability measure μ with a sequence of discrete measures converging weakly toward μ (see [1,Thm 7.3.2 and Section 7.6.2] and [5,Section 17]). A standard way to obtain such a sequence is to first discretize X by fixing a sequence of "point clouds" (x 1 , . . . , x N ) ∈ X N and a dual tessellation of X with N cells (C i ) N i=1 . This means that the union of C i cover X , x i ∈ C i and the intersection of different cells have zero Lebesgue measure. For example, given a point cloud the corresponding Voronoi tessellation of X provides a canonical dual tessellation of X . The "spatial resolution" of the discretization is quantified by where diam(C i ), denotes the diameter of the cell C i . The corresponding discretization of the measure μ is then defined by setting where we have used the subindex h to emphasize that we are focusing on the limit when h → 0 (see also Sect. 3.4 for other discretizations). This discretization scheme corresponds, from the point of view of optimal transport, to the notion of semi-discrete optimal transport (since it corresponds to optimal transport between the "continuous" measure ν and the discrete measure μ h ; see [31,34] and Sect. 5.2). From the point of view of numerics, this kind of discretization scheme was first introduced in the different setting of the Dirichlet problem in [43].

Convergence Rates for Discretized Monge-Ampère Equations
Given a point-cloud on X with spatial resolution h, we denote by φ h the normalized convex solution to the corresponding Monge-Ampère equation 1.3 with right-hand side given by the discrete measure μ h . It follows from a standard general convexity argument that in the limit when h → 0 (and hence N → ∞), the functions φ h converge uniformly toward the solution φ, and as a consequence, the gradients of φ h converge weakly toward the gradient of φ (in the sense of distributions; see Proposition 2.3). But the argument gives no control on the rate of convergence (in terms of h or N −1 ) and the main purpose of the present work is to provide such a result: More precisely, if ∇ 2 φ ≥ C −1 0 I , then where δ is the constant in formula 1.1). The result thus holds more generally as long as φ is uniformly convex and δ > 0.
As a consequence, if C P denotes the constant in the L 2 -Poincaré inequality on X (i.e., u L 2 (X ) ≤ C P ∇u L 2 (X ) for u with zero average), then From the point of view of optimal transport theory, the previous theorem says that the optimal transport map ∇φ from μ to ν (defining a diffeomorphism between the closures of X and Y ) may be quantitatively approximated by the L ∞ -maps ∇φ h . As explained in Sect. 5, the gradient maps ∇φ h are piecewise constant on the convex hull of the corresponding point cloud (more precisely, ∇φ h is constant on the facets of the weighted Delaunay tessellation of R n induced by φ h ).
We will also establish the following universal bound which applies in the general case. It yields, in particular, a quantitative approximation of the corresponding optimal transport map if μ has a density. Theorem 1.2 (General case) Let X and Y be bounded domains in R n with Y assumed convex and ν a probability measure of the form 1.1. Then, for any given probability measure μ on X , where the constant C 2 only depends on upper bounds on the diameters of X and Y and on positive lower bounds on the volume V (Y ) of Y and δ(:= sup Y (ν/dy)).
The previous theorems are obtained as straightforward consequences of the following analytic inequalities of independent interest (see Theorem 3.5 for the precise statements). Let φ 0 and φ 1 be two convex functions on X whose sub-gradient images are equal toȲ (without of loss of generality the functions may be assumed to be smooth). Then, for a constant C 0 depending on a positive lower bound on ∇ 2 φ 0 . Moreover, in general, for a constant C independent of φ 0 and φ 1 . Using that (1.11) where W 1 denotes the Wasserstein L 1 -distance on P(X ) and d(Y ) the diameter of Y , Theorems 1.1, 1.2 are deduced by setting φ 0 = φ and φ 1 = φ h , and noting that the right-hand side in formula 1.11 is bounded from above by h.

Quantitative Stability for Optimal Transport Maps
The combination of the inequalities 1.9, 1.10 with the inequality 1.11 may be succinctly formulated as a quantitative stability result for optimal transport maps: for any μ 1 ∈ P ac (X ). More generally, if the potential φ of T ν μ 0 is uniformly convex, then the inequality above holds with of Y and C 1 is as in formula 1.7. Moreover, if μ 0 and μ 1 have densities in L 2 (X ), then where C P is the Poincaré constant on X .
Recall that here T ν μ denotes the L 2 -optimal transport map from μ to ν. The power 1/2 in the first inequality of the previous theorem is sharp, as explained in Sect. 4.2.
In the general case, we have: for any pair μ 0 , μ 1 ∈ P ac (X ). More precisely, c 2 only depends on upper bounds on the diameters of X and Y and on positive lower bounds on the volume V (Y ) of Y and δ(:= sup Y (ν/dy)) Moreover, if μ 0 and μ 1 have densities in L 2 (X ), then for a constant c 3 depending on the same quantities as the constant c 2 .
Since W 1 ≤ W p , where W p denotes the L p -Wasserstein distance, the previous theorems also hold with W 1 replaced by W p .
Note that Theorems 1.3 and 1.4 imply Theorems 1.1 and 1.2, respectively. Indeed, even though the Monge-Ampère measure of φ h is discrete and hence the push-forward (∇φ h ) * μ h is ill-defined, one can first apply Theorems 1.3, 1.4 to a regularization φ h of φ h (e.g., obtained by convolution) and then let → 0. Anyhow, as explained above, all theorems above will be deduced from the analytic inequalities 1.9, 1.10.

Relations to Computational Geometry
An important motivation for the present work comes from the recent result in [31], showing that the vector φ h := (φ h (x 1 ), . . . φ h (x N )) ∈ R N -which solves a discrete variant of the Monge-Ampère equation (with target Y )-may be effectively computed using a damped Newton iteration on R N , which converges globally at a linear rate toward φ h (and locally at a quadratic rate if g is Lipschitz continuous). The iteration is defined in terms of computational geometry, and the restriction of the solution φ h to the convex hull X h of the points {x 1 , . . . , x N } can then be recovered as the piecewise affine function defined by convex hull of the discrete graph of φ h . From this computational point of view, Theorems 1.1, 1.2 above yield quantitative convergence results for the corresponding discrete objects defined on X h in the "continuous limit" when h → 0. This is explained in Sect. 5.

The Periodic Setting
Now assume that μ is a given Z n -periodic measure on R n normalized so that its total mass on a (or equivalently any) fundamental region X is equal to one (e.g., X = [0, 1[ n ). We then consider the corresponding Monge-Ampère equation 1.3 for a convex function φ on R n subject to the condition that ∂φ be periodic (which replaces the second boundary condition 1.2). Such a convex function will be called quasiperiodic.
In this periodic setting, Theorem 1.1 still applies (with X = [0, 1[ n ) (see Sect. 6). In terms of optimal transport, the induced diffeomorphism T φ of the torus (R/Z) n transporting μ to the Riemannian volume form on the flat torus is optimal with respect to the cost function d(x, y) 2 where d denotes the Riemannian distance function on the flat torus [16,30].

Proofs by Complexification
The proofs of the key analytic inequalities 1.9 and 1.10 use a complexification argument to first deduce the special case when g is constant from well-known inequalities in Kähler geometry and pluripotential theory (due to Aubin [2] and Blocki [12], respectively). Then, a separate variational argument is used to reduce the case of a general g to the special case of a constant g. The universal dependence in Theorem 1.2 is obtained by exploiting that a convex body Y induces a canonical toric Kähler-Einstein metric, whose analytical properties are controlled by upper bounds on the diameter of Y and the inverse of the volume of Y (thanks to the estimates in [25,32]).
One may wonder whether the use of complex geometry is really necessary? In a nutshell, the complexification method has two advantages: • The exterior algebra of positive forms can be can be leveraged • By a compactification argument (involving toric varieties), one may perform integration by parts without boundary terms.
(the second point is not needed in the periodic setting). The first point can without doubt be replaced by a suitable linear algebra of mixed real Monge-Ampère operators and mixed discriminants, etc. However, at least to the author, this appears to make the calculations rather unwieldy (but see [33] for a real formalism mimicking the complex formalism). Moreover, it seems likely that the second point can be circumvented by an appropriate choice of cut-off functions in R n . Anyhow, an important merit of the general complexification method employed in this paper is that it also opens the door for direct applications of other results in complex geometry to the second boundary value problem for the real Monge-Ampère equation and optimal transport. This method is, in spirit, similar to Gromov's approach [28] to the Brunn-Minkowski and Alexander-Fenchel inequalities for convex bodies, which also exploits the complex geometry of toric varieties.

Comparison with Previous Results
There seem to be no prior results giving convergence rates (in any norms) for the functions φ h or the vectors φ h in the limit when h → 0 (even in the model case of a uniform grid on the torus). Another quantitative stability result for optimal transport maps, in the regular setting, has previously been obtained by Ambrosio (reported in [26] where it was shown to be sharp). Translated into the present notation the result in [26] shows that the inequality in Theorem 1.3 holds if the L ∞ -mappings T ν μ i are replaced by their inverse, i.e., by the optimal transport maps T μ i ν and the L 1 -Wasserstein distance W 1 is replaced with the (weaker) L 2 -Wasserstein distance W 2 (see [26,Cor 3.4] and its proof). In other words, while Theorems 1.3, 1.4 above concern stability of optimal transport maps with respect to variations of the source measure, the result [26,Cor 3.4] concerns variations with respect to the target measure. Since T μ ν = ∇φ * , where φ * denotes the Legendre transform of the solution φ to the second boundary value problem discussed above, the stability result in [26,Cor 3.4] can be used to obtain the same rates O(h 1/2 ) for the H 1 -norm of the difference of Legendre transforms φ * h − φ * . In the different setting of the Oliker-Prussner discretization of the Dirichlet problem for the Monge-Ampère equation [43] with g = 1 (where the second boundary condition 1.2 is replaced by the vanishing of the solution φ at the boundary of X ), convergence rates for L ∞ -norms were established in [42] at a rate O(h α ) under the assumption that the solution φ ∈ C 2,α (X ). The proofs are based on a combination of discrete Alexandroff L ∞ -estimates with Brunn-Minkowski inequalities (see also the recent work [22] where an optimal rate O(h 2 ) is shown, assuming bounds on the fourth-order derivatives of φ). Moreover, in [41] the L ∞ -rates in [42] and [22] were used to obtain rates for W 2 p -norms.

Comparison with Subsequent Developments
After the preprint version on ArXiv of the present paper had appeared, several interesting new developments have emerged. In [39,Theorem 2.3], the stability result for optimal transport maps in [26, Cor 3.4] (discussed above) was sharpened by replacing W 2 with W 1 , thus providing an analog of Theorem 1.3 when variations of the target measure are considered instead. The proof of [39, Theorem 2.3] is based on an inequality of the form 1.9, but with φ 0 and φ 1 replaced by their Legendre transforms φ * 0 and φ * 1 in the left-hand side (the constant also depends on a strict lower bound on ∇ 2 φ, i.e., an upper bound on ∇ 2 φ * ; see [39,Lemma 2.4]). The latter inequality is shown using an elegant convexity argument (see also [4] for a different Riemannian generalization of [26,Cor 3.4]). Moreover, for general μ, but with the density of ν assumed constant, an analog of the first inequality in Theorem 1.4 is established, when variations of the target measure are considered; see [39,Theorem 3.1]. The power of W 1 obtained in [39, Theorem 3.1] is equal to 2/15. Remarkably, the power is thus independent of the dimension n (which is important for the applications to machine learning considered in [39]). Moreover, an analog of the second inequality in Theorem 1.4 is obtained with the L 2 -norm in the right-hand side replaced by an L 1 -norm (i.e., the total variation distance) raised to the power 1/5. One is thus naturally led to ask whether there is a relation between the quantitative stability with respect to variations of the target and the source, respectively? The answer is affirmative, as follows from arguments in the appendix of [39]. However, when passing from one type of the inequalities, the argument unfortunately diminishes the exponent of W 1 ; it gets divided by (n + 2). (Hence, the exponent in Theorem 1.4 is larger than the one implied by [39,Theorem 3.1], as long as n ≤ 5.) This is briefly explained in Sect. 4.3.
In the very recent work [35], a variant of the H 1 -estimate in Theorem 1.1 is obtained (see [35,Cor 4.2,formula 4.8]). The main difference between the H 1 -estimate in Theorem 1.1 and the one in [35] is that in [35] the values in Y of ∇φ h , which are constant on the facets of the weighted Delaunay tessellation of X , induced by φ h , are replaced by the barycenters of the facets ∂φ h (x i ) of the dual tessellation of Y . Moreover, while the constant in the estimate in Theorem 1.1 depends on a strictly lower bound on ∇ 2 φ, the constant in [35] depends on an upper bound of ∇ 2 φ. The proof in [35] is based on a generalization of the quantitative stability result for the optimal transport map T μ ν in [26, Cor 3.4] (discussed above) to optimal transport plans. The results in [35] also hold for fully discrete approximation schemes (see [35,Thm 5.1]). See the end of Section 4 in [35] for a detailed comparison between Theorem 1.1 (and Theorem 5.4) and the results in [35].

Organization
We start in Sect. 2 with preliminaries from convex and complex analysis. Since the complex analytic side may not be familiar to some readers, a rather thorough presentation is provided. In Sect. 3, the theorems stated in the introduction are proven, starting with the special case when the density g is constant on Y and finally reducing to the special case. In Sect. 4, the relations to quantitative stability of optimal transport maps are spelled out. The relations to computational geometry are explained in Sect. 5.4, based on Proposition 5.1. In the final section, the periodic setting is considered.

Convex Analytic Notions
Given a convex function φ on R n taking values in ]−∞, ∞] (and not identically equal to ∞), we denote by ∂φ its subgradient, i.e., the set-valued function on R n defined by for any Borel subset B of R n . This yields a well-defined measure on R n . Indeed, introducing the Legendre transform of φ(x), i.e., the convex function φ * defined by the following formula holds: i.e., M A(φ) is the measure obtained as the push-forward of the Lebesgue measure dy under the L ∞ loc -map ∇φ * (the formula follows from point 2 and 3 below). We recall the following basic properties which hold for a given lower semicontinuous (lsc) convex function φ : R n →] − ∞, ∞]: The function φ is piecewise affine on R n if and only if it is the max of a finite number of affine functions, i.e., there exists y 1 , . . . , y M ∈ R n (assumed distinct) and c 1 , . . . , c M in R such that  A reference for point 1-5 is [47] and 6 could be taken as the definition of a convex piecewise affine function and then point 7 follows readily. As for point 8, it can be shown using the following observation: x is not in a 0-dimensional cell if and only if the convex hull C x (= ∂φ(x)) of the vectors y i corresponding to the facets F i containing x has dimension p < n (as can be seen by identifying the F i s with intersecting pieces of hyperplanes in the graph of φ in R n+1 and C x with the corresponding normal cone at (x, φ(x)). As a consequence, if x is not in a 0-dimensional cell, then there exists a whole neighborhood U of x having the latter property and hence (∂φ (as follows from properties 3 and 5 in the previous section). A leading role in the present paper will be played by the subspace C Y (R n ) + of C Y (R n ) consisting of the convex functions φ on R n with "maximal growth" in the sense that the reversed version of the previous inequality also holds: is a probability measure (by point 3 in the previous section). The converse is not true in general, but we will have great use the fact that if is a probability measure and moreover M A(φ) has compact support, then φ is in C Y (R n ) + , as follows from the following lemma: where C n,q only depends on n and q and d(Y ) and V (Y ) denote the diameter and volume of Y , respectively.
Proof Following the argument in the proof of [7, Prop 2.2], we denote by v := φ * is the Legendre transform of φ, which defines a convex function on the interior of Y . By the Sobolev inequality for the embedding [40,Thm 4.4]). In general, as explained in [7, Finally, the proof is concluded by observing that |∇v(y) (which follows directly from the fact that the transformation φ → φ * is decreasing and involutive).

The Second Boundary Value Problem for the Monge-Ampère Equation
We next recall some basic properties of the second boundary value problem for the Monge-Ampère equation, introduced in Sect. 1.1. We thus let X and Y be bounded domains in R n , with Y bounded and convex. Given a function g with support Y such that g ∈ L 1 (Y , dy) the corresponding "g-Monge-Ampère measure" gdy, (see [37, Section 3] for a more general setting involving a cost function c). In particular, if φ is smooth, then M A g (φ) is given by formula 1.4. In the case when gdy has unit integral we denote by ν the corresponding probability measure: ν := gdy Lemma 2.2 Given a probability measure μ with compact support contained in X , a solution φ X to the corresponding second boundary value problem 1.3, 1.2 exists and is uniquely determined (mod R). Moreover, φ X is equal to the restriction to X of φ R n (mod R).
Proof This result goes back to Alexandrov's classical work on Monge-Ampère equations, but for the convenience of the reader, we show here how to deduce the result from Brenier's theorem [14]. Given a closed subset F in R n denote by χ F the functions which is equal to 0 on F and ∞ on the complement of F. In the proof, we will refer to points 1-8 in Sect. 2.1. Set ψ X := (χ X + φ X ) * (where we have used that φ X extends uniquely to a Lipschitz continuous function on the closure of X ). Since μ and ν have the same mass, the second boundary condition 1.2 implies that ∂φ X (R n ) =Ȳ . Hence, ψ X is finite on Y (by point 3) and the MA-equation 1.3 is equivalent to By Brenier's uniqueness theorem [14], the latter equation determines the L ∞ -map ∇ψ X a.e ν. Since the support Y of ν is connected, the restriction ψ of ψ X to Y is thus uniquely determined (mod R). Now, by point 1 Hence, replacing X with R n reveals that the corresponding two solutions coincide on X (mod R) as desired. Also note that Brenier's existence theorem says that given μ and ν, there exists some convex function ψ on Y satisfying Eq. 2.5. Hence, defining φ X in terms of ψ as above yields the existence of a solution to the second boundary value problem.
The uniqueness property in the previous lemma implies the following qualitative stability result: Proposition 2.3 (Qualitative stability) Using the setup in the previous lemma let φ h be a solution to the second boundary value problem obtained by replacing μ with μ h where μ h is a family of probability measures on X converging weakly toward μ as h → 0. Then, ∇φ h converges weakly toward ∇φ X (more precisely, all the distributional derivatives of φ h converge weakly toward the distributional derivatives of φ X ).
Proof Let φ h be the convex continuous solution on R n normalized by φ h (x 0 ) = 0 for a fixed point x 0 ∈ X . Since ∂φ h ⊂ Y and Y is assumed bounded, it follows from the Arzela-Ascoli compactness theorem that there exists a subsequence φ h j converging uniformly to a convex function φ ∞ on X as j → ∞. By standard continuity properties of Monge-Ampère operators, it follows that M A ν (φ ∞ ) = μ (see [37,Corollary 3.1] for a more general result). But then we can apply the uniqueness property in the previous theorem to conclude that φ ∞ = φ, where φ is the unique solution to 1.3, 1.2 on R n satisfying φ(x 0 ) = 0 and hence the whole family φ h converges locally uniformly on R n toward φ. The last statement then follows directly from the definition of distributional derivatives.

Complex Analytic Notions
For the benefit of the reader lacking background in complex analysis and geometry, we provide a (hopefully user-friendly) recap of some complex analytic notions (see the book [24] for further general background and [7] for the case of toric varieties). Setting z := x + iy ∈ C n the space 1 (C n ) of all complex one-forms on C n decomposes as a sum of the two subspaces spanned by {dz i } and {dz i }, respectively. This induces a decomposition of the exterior algebra of all complex differential forms · (C n ) into forms of bidegree ( p, q), where p ≤ n and q ≤ n. Accordingly, the exterior derivative d decomposes as d = ∂ +∂, where and taking its complex conjugate defines the (0, 1)-form∂φ. In particular, defines a real (1, 1)-form (this normalization turns out to be useful, as illustrated by Example 2.4, 2.4). Such a smooth form ω is said to be positive (Kähler), written as ω ≥ 0 (ω > 0), if the corresponding Hermitian matrix is semi-positive (positive definite) at any point. Positivity can also be defined for general ( p, p)-forms, but for the purpose of the present paper it is enough to know that where the last inequality holds in the sense of measures.
loc is said to be psh if it is strongly upper semi-continuous and ω φ ≥ 0 holds in the weak sense of currents.

Example 2.4
The normalization used in the definition of ω φ ensures that when n = 1 the measure ω φ on C is the Dirac measure at 0 ∈ C in the case when φ = log |z| 2 .
More generally, given a real (1, 1)-form ω 0 a function u said to be ω 0 -psh if If ω 0 := ω φ 0 , this means that φ is psh if and only if u := φ − φ 0 is ω 0 -psh. When φ i for i = 1, . . . , p is psh and in L ∞ loc the positive closed ( p, p)-currents is defined by the local pluripotential theory of Bedford-Taylor. The current does not charge pluripolar subsets (i.e., sets locally contained in the −∞-locus of a psh function) and in particular not analytic subvarieties. Accordingly, the Monge-Ampère measure of a locally bounded psh function φ(z), is defined by the measure We also recall that if φ 0 and φ 1 are as above, then the positive current i∂

Metrics on Line Bundles and ! 0 -psh Functions
The local complex analytic notions above naturally extend to the global setting where C n is replaced by a complex manifold (since the decomposition 2.6 is invariant under a holomorphic change of coordinates). However, if X is compact, then all psh functions φ on X are constant (by the maximum principle). Instead, the role of a (say, smooth) psh function φ on C n is played by a positively curved metric on a line bundle L → X (using additive notations for metrics). To briefly explain this first recall that a line bundle L over a complex manifold X is, by definition, a complex manifold (called the total space of L) with a surjective holomorphic map π to X such that the fibers L x of π are one-dimensional complex vector spaces and such that π is locally trivial. In other words, any point x ∈ X admits a neighborhood U such that π : L → U is (equivariantly) isomorphic to the trivial projection U × C → U . Fixing such an isomorphism, holomorphic sections of L → U may be identified with holomorphic functions on U . In particular, the function 1 over U corresponds to a non-vanishing holomorphic section s U of L → U . Now, a smooth (Hermitian) metric · on the line bundle L is, by definition, a smooth family of Hermitian metrics on the onedimensional complex subspaces L x , i.e., a one-homogeneous function on the total space of the dual line bundle L * which vanishes precisely on the zero-section. Given a covering U i of X and trivializations of L → U i , a metric · on L may be represented by the following family of local functions φ U i on U i : (accordingly a metric on L is often, in additive notation, denoted by the symbol φ).
Even if the functions φ U i do not agree on overlaps, the (normalized) curvature form ω of the metric · is a well-defined closed two-form on X , locally defined by Singular metrics on L may be defined in a similar way. In particular, a singular metrics is said to have positive curvature if the local functions φ U i are psh, i.e., the corresponding curvature form ω defines a positive (1, 1)-current on X . The difference of two metrics, written as φ 1 − φ 2 in additive notation, is always a globally welldefined function on X (as a consequence, the curvature currents of any two metrics on L are cohomologous and represent the first Chern class c 1 (L) ∈ H 2 (X , Z)). Fixing a reference metric φ 0 and setting u := φ − φ 0 , this means that the space all metrics φ on L with positive curvature current may be identified with the space of all ω 0 -psh functions u on X .
comes with a tautological line bundle whose total space is C m+1 and the line over defining a smooth metric with strictly positive curvature form. Another T n -invariant metric on O(1) with positive curvature current, which is continuous, but not smooth, is the "max-metric" defined by the one-homogeneous psh function max{|Z 0 |, . . . , |Z n |}, which may be represented by Given any complex subvariety X P m , one obtains a line bundle L over X by restricting O(1) to X (and a smooth positively curved metric φ by restricting the Fubini-Study metric). If X is singular then, by Hironaka's theorem it admits a smooth resolution, i.e., a smooth compact manifold X with surjective and generically one-toone projection π to X . Pulling back L by π yields a line bundle L over X (endowed with a smooth metric φ ).

The Toric Variety Associated to a Moment Polytope Y
Let Y be a bounded closed convex polytope with non-empty interior and rational vertices. It determines a compact toric complex analytic variety X C and an ample line bundle L over X C . More precisely, this is the case if the rational polytope Y is replaced by the integer polytope kY for a sufficiently large positive integer k. But since scalings of Y will be harmless we may as well assume that k = 1. Then, X C may be defined as the closure of the image of the holomorphic (algebraic) embedding defined by C * n → P M−1 z → [z m 0 : · · · : z m M−1 ], (2.9) using multinomial notation, where m 0 , . . . , m M−1 label the integer vectors in Y . The line bundle L is then simply defined as the restriction to X C of the hyperplane line bundle O(1) on P M−1 . By construction, the following holds: • X C may be embedded in a complex projective space P N in such a way that L coincides with the restriction to X C of the hyperplane line bundle O(1) on P N . • The standard action of the real n-dimensional torus T n on C * n extends to a holomorphic action of X C which lifts to the line bundle L.
By the first point above, we can identify C * n with an open dense subset of X C , whose complement in X C is an analytic subvariety. Now, the key point is that the function φ(x) is in the class C Y (R n ) (Sect. Proof Since the lemma will play a key role in the proof of the main results, we recall, for the benefit of the reader, the simple proof. Without loss of generality, we can assume that m 0 := 0 is in Y . By construction, the "max metric" on O(1) → P M−1 restricts to a continuous (and in particularly bounded) positively curved metric on L → X C . The map 2.9 may be factored as C * n → C N → P M−1 . Hence, the standard trivialization of O(1) over the affine piece C N pulls back to give a trivialization of L over C * n , where the restricted max metric is represented by φ Y (x), when expressed in the logarithmic coordinates x on R n . Now, any other L ∞ -metric φ on L → X C satisfies φ − φ Y ∈ L ∞ (X C ) and, as a consequence, restricting to C * n and switching to the real logarithmic coordinate x shows that φ( Since C * n is dense in X C and u ≤ C, it admits a canonical upper semi-continuous (usc) extension to all of X C (namely, the smallest one). Since φ Y (z) extends to define a continuous metric on L → X C , this means that φ extends from C * n to a canonical usc metric on L → X C (in additive notation) which has a positive curvature current on C * n X C . But the complement X C −C * n is an analytic subvariety of X C and hence it follows from basic local extension properties of psh functions that the corresponding metric on L → X C has positive curvature current.
We note that for any φ as above the volume of Y . Indeed, since M A C (φ) does not charge analytic subvarieties, the integral can be restricted to C * n and then formula 2.13 can be invoked.

Complex vs Real Notions
If φ(z) = φ(x), i.e., φ is independent of the y-variable, then φ(z) is psh on C n if and only if φ(x) is convex on R n , as follows directly from the relation: the standard Kähler form on C n . The form ω 0 descends to the Abelian variety C n /(Z+ iZ).
The real torus T n acts transitively on the fibers of the map Log. Pulling back a convex function φ(x) on R n by Log yields a T n -invariant function on C * n that we will, abusing notation slightly, denote by φ(z). The function φ(x) is convex if and only if the corresponding function φ(z) on C * n is plurisubharmonic. This can be seen by identifying C * n with C n /iπ Z n (= R n + iπ T n ) using the multivalued logarithmic holomorphic coordinate 2 log z, and proceeding as in Sect. 2.3. We will have great use for the following lemma: Lemma 2.7 Let φ(x) be a finite convex function on R n and denote by φ(z) the corresponding T n -invariant psh function on C * n defined as the pull-back of φ(x) under the Log map. Similarly, if u is a difference of two finite convex functions on R n we denote by u(z) the corresponding function on C * n . Then, the following two identities of measures on R n hold: is assumed to be smooth and strictly convex, where g φ denotes the Riemannian metric on R n defined by the symmetric Hessian matrix ∇ 2 φ and ∇u is defined almost everywhere with respect to d x.
Proof These formulas are essentially well known (in particular, the first one), but for completeness a proof is provided. First consider two smooth functions φ(z) and u(z) defined on C n with holomorphic coordinate z. Assume that φ(x + iy) and u(x + iy) are independent of y. Then, and Indeed, without loss of generality, we may assume that φ(x) = λ i |x i | 2 (since it is enough to prove the formulas at a fixed point, which may be taken as x = 0 and since the formulas are invariant under z → Az for A ∈ SO(n)). Then, i and hence formula 2.14 follows directly. As for formula 2.15, it follows from noting that the term involving dz i ∧ dz i in ∂u ∧∂u only gives a nonzero contribution when it encounters the product of all terms in ∂∂φ not involving the index i. Indeed, this gives which proves formula 2.15. Then, by standard local approximation arguments, formula 2.14 extends to the case when φ is non-smooth and formula 2.15 to the case when u is non-smooth. Finally, consider C * n and denote by w i its standard holomorphic coordinates. This means that z i := 2 log w i is multivalued on C * n and the real part of z is equal to Log (w). But locally z defines holomorphic coordinates on C * n to which formula 2.14 and formula 2.15 apply. Moreover, we can identify dy/(4π) n with the standard invariant probability measure on the T n -fiber of the Log map from C * n to R n . Since the push-forward operation amounts to integration along the fibers, the formulas in the lemma thus follow from formula 2.14 and formula 2.15.

Proof of Theorems 1.1, 1.2
Let X and Y be open bounded domains in R n , with Y convex. Assume given a positive function g ∈ L 1 (Y , dy) such Recall that M A g (φ) denotes the "g-Monge-Ampère measure" of a given convex function φ, defined in Sect. 2.1.2. When g = 1, we simply write M A g = M A.

The Key Analytic Inequalities When g = 1
We first consider when g = 1, starting with the case of a uniformly convex φ 0 .
In order to prove this, we start with some preparations. First, by Lemma 2.7 holds for any smooth convex function φ on X . Next, we will use the same notation φ 0 and φ 1 for the canonical extensions to R n solving the corresponding second boundary problem on all of R n (as in Lemma 2.2). In general, if φ(x) is a convex function on R n , we will denote by φ(z) the corresponding T n -invariant plurisubharmonic function on C * n , obtained by pulling back φ to C * n using the log map (abusing notation slightly, as in Sect. 2.3.1). It will be enough to prove the following identity: Indeed, first applying formula 3.2 to φ = |x| 2 /2 a gives using that, by assumption, ω φ 0 ≥ C −1 0 ω φ . Finally, using that all the integrands in the rhs of formula 3.3 are non-negative will then conclude the proof. In order to prove formula 3.3 first note that as follows directly from the algebraic identity This means that if integration by parts can be justified, then the desired identity 3.3 follows. However, the non-compactness of C * n poses non-trivial difficulties, so instead of working directly on C * n we will use a compactification argument, which applies when Y is a rational polytope (the general case will then follow by approximation).

Compactification When Y is a Convex Polytope with Rational Vertices
Let X C be the compact toric complex analytic variety determined by the moment polytopeȲ and denote by L the corresponding ample line bundle over X C (Sect. 2.2.2). By Lemma 2.1, combined with Lemma 2.6, the T n -invariant psh functions φ 0 (z) and φ 1 (z) on C * n extend to define L ∞ -metrics on the line bundle L → X C with positive curvature currents ω φ 0 and ω φ 1 . In particular, φ 0 − φ 1 ∈ L ∞ (X C ). We claim that This is a well-known identity in Kähler geometry (in the smooth case) and global pluripotential theory (in the general singular L ∞ -case) and follows from the general integration by parts formula for psh functions in L ∞ loc in [13, Thm 1.14] (see [9, Formula 2.9]). But for the benefit of the reader, we provide the following alternative proof. First, assume that the metrics φ i on L are smooth (i.e., the restrictions to L → X of smooth metrics on O(1) → P N ). Expanding point-wise, as in formula 3.4 and using that the form := (ω φ 0 ) n− j ∧ (ω φ 1 ) j is closed formula 3.5 then follows from Stokes formula if X C is non-singular and from Stokes formula on a non-singular resolution of X C if X C is singular. To handle the general case, we invoke the general fact that any (possibly singular) metric on an ample line bundle L over a projective complex variety can be written as a decreasing limit of smooth metrics [15] (in fact, this can be shown by a simple direct argument in the present toric setting). Formula 3.5 then follows from the previous smooth case, combined with the continuity of expressions of the form appearing in formula 3.5 under decreasing limits (see [13,Prop 2.8] and its proof for much more general convergence results).

Conclusion of Proof of Proposition 3.1
Let Y be as in the previous step. Since the complex Monge-Ampère measures of locally bounded psh functions do not charge complex analytic subvarieties, formula 3.5 on C * n follows from formula 3.5 on X C . Since each term in the right-hand side above is non-negative, we deduce that By 2.13 and 3.2, this proves formula 3.5 when Y is a convex polytope with rational vertices. In the general case, we can write Y is an increasing limit of rational convex polytopes Y j . Then, the general case follows from the previous case and basic qualitative stability properties for the solution of the second boundary value problem for M A on X with respect to variations of the target domain Y .

The Key Inequality for General 0 and 1
We next turn to the key analytic inequality in the case of general φ 0 , but still with g = 1.

Proposition 3.2 Given bounded open domains X and Y with Y assumed convex,
assume that φ 0 and φ 1 are convex functions on X , such that the closures of the sub-gradient images (∂φ i )(X ) are equal to Y . Then, there exists a constant C only depending on X and Y such that

More precisely, the constant C only depends on upper bounds on the the diameters of X and Y , d(X ) and d(Y ) and a positive lower bound on the volume V (Y ) of Y
Proof It will be enough to show that for some positive constants A 1 and A 2 . In fact, setting we will show that C Y only depends on upper bounds on R(X ), R(Y ) and To this end we may, without loss of generality, assume that φ 0 and φ 1 are supnormalized as in Lemma 2.1. Fix a sequence of rational convex polytopes Y j increasing to Y and some (say smooth) functions uniformly on R n . Just as in the proof of Theorem 1.1, it will be enough to prove that the inequality in the proposition holds when Y is a rational convex polytope and φ Y is replaced by φ Y j (by letting j → ∞ in the end). We will identify, just as in the proof of Theorem 1.1 a convex function φ(x) on R n with a psh function φ(z) on C * n . Setting ω Y j := ω φ Y j , it will thus be enough to show that We will deduce this inequality from the following inequality of Blocki [12] (see also [10, Lemma A.1] for a more general inequality). Let (M, ω) be a compact Kähler manifold and u 0 and u 1 ω-psh functions on M in L ∞ (X ). Then, there exists a constant C M such that The constant C M only depends (in a continuous fashion) on upper bounds on the L ∞norms of u 0 and u 1 and the volume of ω. More generally, exactly the same proof as in [12] shows that the inequality holds more generally when ω is a semi-positive form with an L ∞ -potential and positive volume. In the present setting, we set originally defined on C * n . Just as in the proof of Theorem 1.1, the functions φ 0 , φ 1 and φ Y j may be identified with L ∞ -metrics on the ample line bundle L → M j over the toric variety M j determined by Y j . Moreover, the corresponding currents extend to positive currents on M j . Passing to a smooth resolution we may as well assume that M j is smooth (and L semi-ample). Applying the inequality 3.9 on M j , all that remains is to verify that the corresponding constants C M j are under control. Applying Lemma 2.1 and using that, by assumption, Y j is contained in Y , gives, for a fixed q > n, Since the probability measure The same estimate also holds (for the same reason) for u 1 . Finally, since the left-hand side in the inequality 3.6 is invariant under φ Y → φ Y + c for c ∈ R, it follows that the constant C Y appearing in 3.6 only depends on upper bounds on φ Y through the quantity A 3 defined by formula 3.8. Moreover, as explained above C Y also depends on an upper bound on the volume of ω Y j on M j . But (by formula 2.10) this equals n! times the volume of Y j , which is bounded from above by a constant times d(Y ) n . All in all this proves the inequality 3.6 with the desired control on the constant C Y .

The Dependence on X and Y
In particular, taking b to be the barycenter of Y we may as well assume that 0 is the barycenter of Y . Similarly, since both sides of inequality in Proposition 3.2 are invariant under φ i (x) → φ i (x + a) for any a ∈ R n , we may as well assume that 0 ∈ X . It will then be enough to show that the auxiliary convex function φ Y used in the proof above can be chosen so that the corresponding constants A 1 , A 2 and A 3 only depend on upper bounds the outer radius R(X ) and R(Y ) of X and Y (defined as in formula 3.7) and 1/V (Y ). Indeed, since 0 ∈ Y and 0 ∈ X we have that R(Y ) ≤ d(Y ) and R(X ) ≤ d(X ).
We will take φ Y to be the unique smooth and strictly convex function φ in C Y (R n ) + solving (the first equation geometrically means that φ(z) is a Kähler potential for a Kähler-Einstein metric on C * n and the second one ensures that φ is uniquely determined).
Since we have assumed that 0 is the barycenter of Y , it follows from [7] that such a solution indeed exists (and is uniquely determined). Moreover, by [32] the solution satisfies Thus, all that remains is to verify the bounds on A 2 and A 3 . To this end, first note that it follows directly from the defining equation for φ that it is enough to establish an upper bound on A 3 . This will be accomplished by the following:

where the constant A only depends on upper bounds on R(Y
Proof This proposition is implicitly contained in [25], but since it is not stated explicitly there we briefly explain how to extract the bound in the proposition from the estimates in [25,Section 3.3]. To this end, denote by v(y) the convex function on Y defined by the Legendre transform of φ (which is denoted by u in [25] and Y is denoted by P there). The equation for φ translates into log det(∇v) = x · ∇v − v (the lhs is denoted by L in [25] and the rhs by h, but the constant C appearing in [25, formula 22] vanishes in the present setting). The assumption that ∇φ(0) = 0 equivalently means that ∇v(0) = 0 which, in turn, means that the infimum of v on Y is attained at y = 0. Set Step 2 in [25, Section 3.3] yields a lower bound on m in terms of an upper bound on R(Y ) and a lower bound on the volume of Y .
Step 4 gives an upper bound on m in terms of upper bounds on 1/R 0 (Y ) and 1/V (Y ). This implies that where κ is bounded from above by a constant only depending on upper bounds on 1/R 0 (Y ) and 1/V (Y ) (see [25, formula 24]). But then it follows that, for any q > 0, the moment R n |x| q M A(φ)/V is bounded from above by a constant only depending on upper bounds on R(Y ), 1/R 0 (Y ) and 1/V (Y ). Hence, by Lemma 2.1, the inequality in the proposition holds if sup R n (φ −φ Y ) is added to the right-hand side of the inequality. But, since sup R n (φ − φ Y ) = m, which as explained above is bounded from above by upper bounds on 1/R 0 (Y ) and 1/V (Y ), this proves the proposition.
Finally, the desired bound on A 3 follows from the following elementary lemma show-

Lemma 3.4 Let Y be a convex body and denote by b Y the barycenter of Y .
Assume that Y is contained in a ball B R of radius R. Then, where c n denotes the volume of the unit-sphere ∂ B 1 in R n .
Proof Denote by d(y) the function on Y defined as the distance of y ∈ Y to ∂Y and set Y := {d ≥ } for a given > 0. Decomposing b Y as a convex combination of and using that the function d(y) is concave (since it can be expressed as the infimum over the affine functions defined by the distances to the supporting hyperplanes of Y ) and non-negative gives By the monotonicity of the surface area of convex bodies (i.e., the classical fact that V (∂ P) ≤ V (∂ Q) if the convex body P is included in the convex body Q [46]), we have, assuming that Y − Y is non-empty, for any such > 0. Optimizing over , i.e.,

The Key Analytic Inequalities for General g
We next turn to the case of a general density g with a strict positive lower bound δ on Y (formula 3.1).

Theorem 3.5 Let X and Y be bounded open domains X in R n and assume that Y is
convex and endowed with a probability measure ν satisfying 1.1. Let φ 0 and φ 1 be convex functions on X , such that the closures of the sub-gradient images (∂φ i )(X ) are equal to Y . If there exists a positive constant C 0 such that ∇ 2 φ 0 ≥ C −1 0 I (in the sense of distributions), then In general, there exists a constant C independent of φ 0 and φ 1 such that

More precisely, the constant C only depends on upper bounds on the the diameters of X and Y , d(X ) and d(Y ) and a positive lower bound on the volume V (Y ) of Y
Fix φ 0 ∈ C Y (R n ) + (the class defined in Sect. 2.1.1) and set μ 0 := M A g (φ 0 ). Consider the following functional on C Y (R n ) + : In order to prove Theorem 3.5, it will, by Propositions 3.1, 3.2 be enough to prove the following: Proposition 3.6 Setting δ := inf Y g we have To this end, consider the following auxiliary functional on C Y (R n ) + : To simplify the notation, we will write I 1 = I and J 1 = J . We start with a number of lemmas.
Lemma 3.7 The Gateaux differential of J g at φ is represented by −M A g (φ) + μ 0 , i.e., fixing φ 1 and φ 2 in C Y (R n ) + and setting φ t : Proof This is essentially well known in the literature of optimal transport, where J g appears as the Kantorovich functional (up to a change of sign and additive constant). A simple direct proof is given in [7].

Remark 3.8
The formula for the differential of J g implies that when g = 1 and Y is a rational convex polytope, then J g coincides with Aubin's J -functional defined with respect to the curvature form ω 0 corresponding to φ 0 (see [2] for the smooth setting and [9] for the general singular setting).
Using the previous lemma, we next establish the following Lemma 3.9 The following inequality holds: By the previous lemma, we have for any t > 0. Hence, using that J g (t) is convex, by the previous lemma. Since I g (0) = J g (0)(= 0), it follows that I g (t) ≥ J g (t) for all t ≥ 0 and in particular for t = 1, which concludes the proof.
We next show that J g is controlled from below by J : Proof We continue with the notation in the proof of the previous lemma. By a standard approximation argument we may as well assume that φ and φ 0 are smooth and strictly convex. Moreover, as before it will be enough to consider the case when Y is a rational polytope. First observe that it will be enough to prove the following claim: Indeed, by Lemma 3.7 the derivatives of both J g (t) and J (t) vanish at t = 0. As a consequence, if the claim holds, then dt for all t and since J g (0) = J (0) = 0 it follows that J g (t) ≥ δ J (t) for all t and in particular for t = 1, proving the lemma. In order to prove the claim above, it is enough to establish the following formula: To this end first consider a general curve is smooth on R n+1 . We complexify R n by C * n (as in Sect. 2.3.1) and R by C (in the usual way). Denote by (z, τ ) the corresponding complex coordinates on C * n × C. Denoting by π the natural projection from C * n × C to C * n , we have the following general formula: on C, where M A C ( ) denotes complex Monge-Ampère measure of the smooth function in C * n × C. This formula is a special case of the more general formula [11, formula 2.11]. Next, we note that if φ t is affine, then M A C ( ) ≤ 0. Indeed, a direct calculation gives Using formula 3.2 this concludes the proof of formula 1.2.

Conclusion of Proof of Proposition 3.6
Combing the previous lemmas with Lemma 3.9 gives I g ≥ J g ≥ δ J . Hence, the proof is concluded by noting that J ≥ I /(n + 1). In the complex setting this is a well-known inequality in the Kähler geometry literature which goes back to Aubin [2]. For a simple proof see [9, formula 2.7]. The present real setting then follows from a complexification and approximation argument, just as before.

Proof of Theorems 1.1, 1.2
First note that the following simple lemma holds, where d(Y ) denotes the diameter of Y : Lemma 3.11 Assume that φ 0 and φ 1 are convex functions on R n whose gradient images are contained inȲ . Then, Proof This follows directly from the Kantorovich-Rubinstein formula where the sup runs over all Lipschitz continuous functions on X with Lipschitz constant one (by taking u = (φ 1 − φ 0 )/d(Y )).
Applying the key inequalities in Theorem 3.5 to φ 0 = φ and φ 1 = φ h , and invoking the previous lemma, all that remains is thus to verify that To this end, we rewrite the integral appearing in the right-hand side of formula 3.12 as Since u ∈ Lip 1 (X ), we trivially have where the last inequality follows directly from the very definition of h. Hence, W 1 (μ 0 , μ 1 ) is bounded from above by h times the total mass of μ, which proves the inequality 3.13. Similarly, combining the analytic inequalities in Theorem 3.5 with Lemma 3.11 and the inequality 3.13, concludes the Proof of Theorems 1.1, 1.2

Alternative Discretization Schemes When Has a Density
Assume for simplicity that the domain X has been normalized to have unit volume and fix a sequence of point clouds (x 1 , ..x N ) on X . In the case when μ has a density f it may, from a computational point of view, by more convenient to consider discretizations of μ h of the formμ for appropriate weights w i (independent of μ). For example, a convenient choice is to take w i to be equal to the volume of the intersection with X of the Voronoi cell corresponding to x i . We claim that the result in Theorem 1.1 then still holds if f is Lipschitz continuous. Indeed, by the proof of Theorem 1.1, it is enough to show that which, in turn, will follow from But the latter inequality follows from the identity 3.14 applied to μ = dx. Similarly, if f is merely assumed to be in the Hölder space C α (X ) for α < 1, then the same argument reveals the an analog of Theorem 1.1 holds, with the rate Ch 1/2 replaced by Ch α/2 . Finally, in the case when the point cloud {x i } is equal to the intersection of X with the grid (ZN −1/n ) n and the distance between the point cloud and the boundary of X is of the order O(N −1/n ), then a slight variant of the previous argument reveals that the same results as above hold, with h := N −1/n , when all weights w i in formula 3.15 are taken to be equal to 1/N .

Quantitative Stability of Optimal Transport Maps
Denote by P ac (R n ) the space of all probability measures on R n , which are absolutely continuous with respect to Lebesgue measure. Given μ and ν in P ac (R n ) with compact support X and Y , respectively, there exist optimal transport maps T ν μ and T μ ν in L ∞ (X , Y ) and L ∞ (Y , X ) transporting μ to ν and ν to μ, respectively [14,47]. The maps are uniquely determined and inverses of each other almost everywhere with respect to μ and ν, respectively.

Proof of Theorems 1.3, 1.4
By Brenier's theorem [14,47], we can write T ν μ i = ∇φ i for a convex function φ i on X such that ∂φ i (X ) = Y . The first inequality in Theorem 1.3 then follows directly from the first inequality in Theorem 3.5, combined with the inequality in Lemma 3.11. To prove the second inequality in 1.3 first note that combining first inequality in Theorem 3.5 with the Cauchy-Schwartz inequalities yields Applying, the Poincaré inequality to φ 0 − φ 1 (whose integral with respect to (X , dx) may be assumed to vanish) gives φ 0 − φ 1 L 2 (X ) ≤ C P ∇φ 0 − ∇φ 1 L 2 (X ) . Hence, dividing both sides of the inequality 4.1 with ∇φ 0 − ∇φ 1 L 2 (X ) concludes the proof of Theorem 1.3. Finally, Theorem 1.4 is proved in the same way, but now instead using the second inequality in Theorem 3.5.

Sharpness of the Exponent 1/2 in the Uniformly Convex Case
We next show that the exponent 1/2 in the first inequality in Theorem 1.3 is optimal.
We take X and Y to both be equal to (]0, 1[) n ⊂ R n and ν = dy.
where the second term ensures uniform convexity. Denote by S the scaling map on R defined by x → x. Then, for > 0, Hence, using the dual representation 3.12 for W 1 , On the other hand, denoting by H (x) the sign of x, we have ∇φ − ∇φ 0 This proves the equality in the lemma when n = 1. Finally, for n > 1 we simply set Hence, the curve ψ satisfies the equality in the lemma on X = (]0, 1[) n . Now, assume that the first inequality in Theorem 1.3 holds with the exponent 1/2 replaced by α for some given α > 0 and with a constant c only depending on a strict power lower bound on the uniform convexity of the convex potential of T 0 . This implies that there exists a constant C 0 , only depending on φ 0 , such that for any convex functions φ i such that ∂φ (X ) = Y with φ 0 uniformly convex. Indeed, this follows from applying the inequality in question to regularizations of ∇φ 0 and ∇φ 1 . But then the previous lemma forces α ≤ 1/2, as desired.

Variations with Respect to the Source vs the Target Measure
The two different types of quantitative stability results discussed in Sects. 1.7, 5.1 can be related thanks to the following lemma, essentially contained in the proof of [39, Cor 2.6]: Lemma 4.2 Assume that there exist α 1 > 0 and C 1 such that for any μ 0 , μ 1 ∈ P ac (X ) Then, there exists A 1 > 0 such that for any μ 0 , μ 1 ∈ P ac (X ), Conversely, assume that there exist α 2 > 0 and A 2 > 0 such that for any μ 0 , μ 1 ∈ P ac (X ) Then, there exists a constant C 2 > 0 such that for any μ 0 , μ 1 ∈ P ac (X ) Proof Recall that, using the notation in the proof of Theorems 1.3, 1.4 above, we can express T ν μ 0 = ∇φ i and T μ 0 ν = ∇φ * i , where φ * denotes the Legendre transform of φ. The proof of the lemma is now obtained by combining the fact that the Legendre transform preserves the L ∞ -norm together with the following two well-known inequalities on a bounded Lipschitz domain ⊂ R n for Lipschitz continuous functions u 0 and u 1 with Lip constant one, assumed convex in the second inequality: together with the Poincaré inequality on . We recall that the first inequality for u 0 −u 1 above is a special case of the Gagliardo-Nirenberg inequalities (see [36,Lemma 5.2] for a direct simple proof) and the second one is shown in [21,Thm 22]).

Formulation in Terms of Computational Geometry
In this section, we show how to use Theorem 1.1 to obtain a quantitative approximation of the solution φ to the second boundary problem for the Monge-Ampère operator, using computational geometry, as developed in [3,4,29,31,34,38]. A comparison with the standard notation from computational geometry and semi-discrete optimal transport is provided in Sects. 5.1 and 5.2, respectively.
Assume given a discrete probability measure on R n and a convex bounded domain Y of unit volume. In order to facilitate the comparison with the setup considered in the previous sections (in particular Sect. 1.1.1), the discrete probability measure in question will be denoted by μ h : The index h is thus used a reminder that the measure μ h is discrete. Let φ h be the (finite) convex function on R n (uniquely determined up to an additive constant) by which we identify with a function on R n which is equal to φ h on {x 1 , . . . , x N } and ∞ elsewhere. We will denote by X h the bounded convex polytope defined as the convex hull of {x 1 , . . . , x N } : We will refer to the following equation for a vector φ ∈ R N as the discrete Monge-Ampère equation (associated with the discrete measure μ h on R n and the measure ν on Y ) : are the cells in the partition of R n induced by the piecewise affine function φ * (see point 6 and 7 in Sect. 2.1).
By the following proposition, the previous equation is solved by the vector φ h and moreover, the graph of φ h over X h is simply the convex hull of the discrete graph over • On the convex hull X h of {x 1 , . . . , x N } the convex function φ h is piecewise affine and given by where {y i } M i=1 runs over the vertices in the polyhedral decomposition of Y determined by the convex piecewise affine function ψ h on Y . Moreover, the graph of φ over X h is equal to the convex hull of the graph of the discrete function φ(x i ) over X h . • The following duality relations hold:

and the multi-valued maps ∂φ h and ∂ψ h , which are inverses to each other, yield a bijection between the facets and the vertices of the induced tessellations of X h and Y .
Proof This is without doubt essentially well known (and it is, as discussed below, closely related to results in computational geometry). But for completeness a simple analytic proof is provided using the basic properties of the Legendre transform recalled in Sect. 2.1, points 1-8. First observe that as follows directly from the defining property of φ h combined with point 4. Hence, for almost any y ∈ Y there exists x i such that y ∈ ∂φ(x i ). But then formula 5.2 on Y follows from point 5. Now formula 5.1 follows from the very definition To prove formula 5.3 on X h , it will by the previous argument (with φ h replaced by ψ h ) be enough to show that But then it follows, from general principles, that on R n Now, denote byφ the convex function on X h whose graph is the convex hull of the discrete graph of φ h . Equivalently, this means that Hence, χ Kφ ≤ φ h on R n and φ h ≤φ on X h which implies that φ h =φ on X h , i.e., the graph of φ h is the convex hull of the discrete graph of φ h , as desired.

Remark 5.2
If Y is a polytope, it follows from classical results that φ h is piecewise affine on all of R n (see (see [5,Section 17.2]). However, this is not true in general, as illustrated by the case when Y is the unit-ball and μ h = δ 0 , so that φ h (x) = |x|.
By the next lemma, the solution φ h to the discrete Monge-Ampère equation 5.1 is, in fact, the unique solution (mod R) :

Lemma 5.3 Equation 5.1 admits a unique solution in R n /R.
Proof This is a consequence of the results in [29,31]: the solutions are critical point of the Kantorovich functional on R N , which in [29,31] is shown to be strictly concave on the subset K + of R N /R defined as all φ such that Vol ν (F φ * i ∩Y ) > 0 for i = 1, . . . , N (using properties of graph Laplacians). But it may be illuminating to point out that the uniqueness result is also a consequence of the following general result (applied to X = {x 1 , . . . x N } and μ = μ h ). Let ν be a probability measure of the form g Y dy with g > 0 and μ a probability measure on R n with support compact X . Assume that φ is a function on X such that (∇(χ X +φ) * ) * ν = μ. Then, φ is uniquely determined (mod R) a.e. with respect to μ (and moreover, equal to the restriction to X of a convex function on R n with subgradient image in Y ). Indeed, setting ψ := (χ Y +φ) * the argument in the proof of Lemma 5.3 gives that the convex function P Y φ := (χ Y +ψ) * (= (χ Y +φ) * ) on R n is uniquely determined and satisfies M A g (P Y φ) = μ. But, by general principles, we have that (5.5) and hence P Y φ = φ on X , showing that φ is uniquely determined (mod R) on X . The vanishing 5.5 can, for example, by shown by noting that P Y φ is the upper envelope of all functions ϕ in C Y (R) dominated by φ. Then, 5.5 follows from a local argument using the maximum principle for the Monge-Ampère operator (just as in the more general case of the complex Monge-Ampère operator considered in [8,Prop 2.10]).
There has been extensive numerical work on constructing solutions to the discrete Monge-Ampère equation 5.1 on R N , based on computational geometry [3,4,31,34,38]. In particular, it was shown in [31] that the solution φ h ∈ R N is the limit point of a damped Newton iteration, which converges globally at a linear rate (and locally at quadratic rate if g is Lipschitz continuous). Performing one step in the iteration, φ (m) → φ (m+1) , amounts to computing the convex hull in R n+1 of the discrete graph over {x 1 , . . . x N } defined by φ (m) (or equivalently, the corresponding power diagram in R n and its intersection with Y ; see Sect. 5.1).
Here we show that Theorems 1.1, 1.2 yield a quantitative rate of convergence, in the limit when the spatial resolution h → 0, for the solutions φ h and the corresponding piecewise constant maps T h from X h to R n defined as follows. Consider the tessellation of X h by the convex polytopes F * j obtained by projecting to R n the facets of the "upper boundary" of the convex hull in R n+1 of the discrete graph h of the solution φ h ∈ R N . Then, T h maps F * j to the dual vertex y j ∈ R n . In geometric terms, y j is the projection to R n of the corresponding normal of the graph h in R n+1 (normalized so that its projection to R is −1).
We now come back to the setup introduced in Sect. 1.2, where T ν μ denotes the optimal L ∞ -map transporting μ to ν and h denotes the spatial resolution of the discretization μ h of μ.
where T ν μ is the optimal diffeomorphism transporting μ to ν. In the general case, when X and Y are bounded domains in R n with Y assumed convex and ν a probability measure of the form 1.1 the estimates above hold if h is replaced by h 1/2 (n−1) and then the constant C only depends on upper bounds on the diameters of X and Y and on a positive lower bound on δ(:= sup Y (ν/dy)).
Proof Combing Theorems 1.1, 1.2 with Proposition 5.1 and Lemma 5.3 immediately yields the theorem.

Comparison with Duality in Computational Geometry
Proposition 5.1 is closely related to the well-known duality in computational geometry between weighted Voronoi tessellations (also known as Laguerre tessellations and power diagrams) of R n and weighted Delaunay tessellations [3] of convex polytopes (see also [29] which is the definition of the cells in the weighted Voronoi tessellation of R n associated to the weighted points (x i ; w i ) N i=1 ) [3]. The corresponding weighted Delaunay tessellation is defined as the projection to X h of the polyhedral cell-complex in R n+1 defined by the convex hull of the graph of φ h . The duality in Proposition 5.1 implies that the corresponding sub-gradient map ∂φ h maps the facets F * i of the weighed Delaunay tessellation of X h to the vertices y i of the weighted Voronoi tessellation of R n . More generally, p-dimensional faces correspond to (n − p)-dimensional faces [3]. Remark 5.5 By the duality above, computing a convex hull of a discrete graph over N points in R n is equivalent to computing the corresponding weighted Voronoi tessellation (as emphasized in [3]). For example, when n = 2 the time-complexity is O (N log N ). Moreover, if g is constant and Y is a polytope, then the computation of the volumes of the corresponding cells has time-complexity O(N ) (since the cells are convex polytopes). More generally, efficient computation of the volume can be done if the density g is piecewise affine. However, the Newton iteration referred to above (as opposed to a steepest descent iteration) also requires computing the volumes of the intersections of the cells [34,38]. This has worst-case time-complexity O(N 2 ), but the experimental findings in [34,38] indicate much better near linear time complexity.

Comparison with Semi-Discrete Optimal Transport
While the L ∞ -map T := ∇φ is the optimal map pushing forward the measure μ ∈ P as (X ) to ν ∈ P ac (Y ), T : X → Y , T * : μ → ν the push-forward of the discrete measure μ h under the piece-wise constant map is not even well defined. However, the inverse (T h ) −1 of T h is a well-defined L ∞map from Y to X h , pushing forward ν to the discrete measure μ h on X h . In fact, (T h ) −1 is the optimal such map, as follows from standard results in the theory of semidiscrete optimal transport (indeed, T −1 h = ∇φ * h , where φ * h is the Legendre transform of φ h ). However, the notation used here differs from the standard notation adopted in the literature on semi-discrete optimal transport [31]. Indeed, in the latter case the measure ν is viewed as the source measure. Accordingly, the space that is here called X is called Y in [31] and the measure μ that is treated as a source measure here is viewed as the target measure in [31], where it is called ν. Moreover, the function −φ h (x) + |x| 2 /2 on X , in the present notation, corresponds to the function ψ in the notation of [31] (when the cost function is taken to be |x − y| 2 /2). The mismatch between the notation here and the one in [31] is a reflection of the fact that here the emphasis is put on the solutions of the Monge-Ampère equations (and their discretizations), rather than on semi-discrete optimal transport maps.

The Periodic Setting
Assume given a Z n -periodic measures μ and ν on R n normalized so that their total mass on a (or equivalently any) fundamental region is equal to one. We will assume that ν has a positive density g with positive lower bound δ > 0. We will identify μ and ν with probability measures on the torus M := ( R Z ) n . We endow M with its standard flat Riemannian metric induced from the Euclidean metric on R n and denote by dx the corresponding volume form on M. Setting gives a bijection between quasi-convex functions φ on R n (i.e., such that ∂φ is periodic) and functions u ∈ C 0 (M) which are quasi-convex in the sense that ∇ 2 u + I ≥ 0 holds in the weak sense of currents.
When u ∈ C 2 (M) and μ = f dx, the Monge-Ampère 1.3 for a quasi-convex function φ on R n is equivalent to the following equation for u : g(I + ∇u(x)) det(I + ∇ 2 u)dx = f dx We will say that (μ, ν) is regular if such a solution exists. This is the case if both f and g are Hölder continuous and strictly positive [20]. In the case of general (μ, ν) there always exists a weak solution, in the sense of Alexandrov, which is unique modulo an additive constant [16].
Let now μ h be a family of discrete measures on M, defined as in formula 1.5 (with h denoting the corresponding mesh norm). Then, the following analog of Theorem 1. The constant C only depends on f through a positive lower bound on (∇ 2 u + I ). The analog of Theorem 1.2 also holds with a universal constant C only depending on the dimension n and δ −1 .
Proof As before it is enough to consider the case when g = 1. Setting M C := (C/Z + iZ) n (which defines an Abelian variety) and identifying the convex function 4πφ(x) on R n with a metric on the theta line bundle L → M C [27] this is proved as in the proof of Theorem 1.1 (and is, in fact, considerably simpler as no extension and compactification argument is needed). In the language of ω 0 -psh functions, this equivalently means that the quasi-convex function u(x) on M is identified with the ω 0 -psh function 4π u(z) on M C , where ω 0 is the standard invariant Kähler form on M C , defined by formula 2.12. Then, formula 3.5 holds on M C with φ i replaced by u i , as follows directly from Stokes theorem on M C . This implies the general case by a regularization argument, which is particularly simple in this setting as one can use ordinary convolutions. Finally, the estimate of W 1 (μ 0 , μ h ) proceeds precisely as before, using that any quasi-convex function on M has Lipschitz constant bounded from above by √ n (see [30,Lemma 9]).
An analog of Theorem 5.4 also holds in the periodic setting. In fact, the situation is facilitated by the fact that the solution φ h is piecewise affine on all of R n (since the convex hull of the corresponding periodic point-cloud covers all of R n ). As a consequence, the discrete Monge-Ampère equation 5.1 may, in the periodic setting, be reformulated as follows. Identify a vector φ ∈ R N with a quasi-periodic discrete function on the periodic discrete subset N of R n determined by the point-cloud {x 1 , . . . , x N }. We will say that the discrete function φ is convex if its graph over N coincides with the boundary vertices of its convex hull in R n+1 . Then, the Monge-Ampère equation for the quasi-periodic function φ h on R n is (almost tautologically) equivalent to the following discrete Monge-Ampère equation for a discrete quasiperiodic convex function φ h : where the discrete Monge-Ampère operator M A g (φ)(x i ) is defined as M A g (Pφ){x i } where Pφ is the function on R n defined by the convex envelope of φ. Since Pφ is piecewise affine on R n , this means that M A g (φ)(x i ) is the volume (with respect to gdy) of the convex hull of the gradients of the affine functions representing Pφ close to x i (as in points 6-8 in Sect. 2.1).

Remark 6.2
The discrete Monge-Ampère equation above can be seen as a quasiperiodic variant of the Oliker-Prussner discretization of the Dirichlet problem for the Monge-Ampère operator (where g = 1 is assumed) [41][42][43].