Quantitative stability of barycenters in the Wasserstein space

Carlier, Guillaume; Delalande, Alex; Mérigot, Quentin

doi:10.1007/s00440-023-01241-5

Published: 25 October 2023

Volume 188, pages 1257–1286, (2024)

Probability Theory and Related Fields

Guillaume Carlier^1,2,
Alex Delalande^3 &
Quentin Mérigot^4,5

Abstract

Wasserstein barycenters define averages of probability measures in a geometrically meaningful way. Their use is increasingly popular in applied fields, such as image, geometry or language processing. In these fields, however, the probability measures of interest are often not accessible in their entirety and the practitioner may have to deal with statistical or computational approximations instead. In this article, we quantify the effect of such approximations on the corresponding barycenters. We show that Wasserstein barycenters depend in a Hölder-continuous way on their marginals under relatively mild assumptions. Our proof relies on recent estimates that quantify the strong convexity of the barycenter functional. Consequences regarding the statistical estimation of Wasserstein barycenters and the convergence of regularized Wasserstein barycenters towards their non-regularized counterparts are explored.

References

1. Agueh, M., Carlier, G.: Barycenters in the Wasserstein space. SIAM J. Math. Anal. 43(2), 904–924 (2011)
2. Ahidar-Coutrix, A., Le Gouic, T., Paris, Q.: Convergence rates for empirical barycenters in metric spaces: curvature, convexity and extendable geodesics. Probab. Theory Relat. Fields 177(1), 323–368 (2020)
3. Altschuler, J.M., Boix-Adsera, E.: Wasserstein barycenters can be computed in polynomial time in fixed dimension. J. Mach. Learn. Res. 22(44), 1–19 (2021)
4. Benamou, J.-D., Carlier, G., Cuturi, M., Nenna, L., Peyré, G.: Iterative Bregman projections for regularized transportation problems. SIAM J. Sci. Comput. 37(2), A1111–A1138 (2015)
5. Bigot, J., Cazelles, E., Papadakis, N.: Penalization of barycenters in the Wasserstein space. SIAM J. Math. Anal. 51(3), 2261–2285 (2019)
6. Bigot, J., Gouet, R., Klein, T., López, A.: Upper and lower risk bounds for estimating the Wasserstein barycenter of random measures on the real line. Electron. J. Stat. 12(2), 2253–2289 (2018)
7. Bigot, J., Klein, T.: Characterization of barycenters in the Wasserstein space by averaging optimal transport maps. ESAIM PS 22, 35–57 (2018)
8. Boissard, E., Le Gouic, T., Loubes, J.-M.: Distribution’s template estimate with Wasserstein metrics. Bernoulli 21(2), 740–759 (2015)
9. Brascamp, H.J., Lieb, E.H.: On extensions of the Brunn-Minkowski and Prékopa-Leindler theorems, including inequalities for log concave functions, and with an application to the diffusion equation. J. Funct. Anal. 22(4), 366–389 (1976)
10. Brenier, Y.: Polar factorization and monotone rearrangement of vector-valued functions. Commun. Pure Appl. Math. 44(4), 375–417 (1991)
11. Brezis, H.: Functional Analysis, Sobolev Spaces and Partial Differential Equations. Universitext, Springer, New York (2010)
12. Carlier, G., Eichinger, K., Kroshnin, A.: Entropic-Wasserstein barycenters: PDE characterization, regularity, and CLT. SIAM J. Math. Anal. 53(5), 5880–5914 (2021)
13. Carlier, G., Oberman, A., Oudet, E.: Numerical methods for matching for teams and Wasserstein barycenters. ESAIM M2AN 49(6), 1621–1642 (2015)
14. Chewi, S., Maunu, T., Rigollet, P., Stromme, A.J.: Gradient descent algorithms for Bures–Wasserstein barycenters. In: Abernethy, J., Agarwal, S. (eds.) Proceedings of Thirty Third Conference on Learning Theory. Proceedings of Machine Learning Research, vol. 125, pp. 1276–1304. PMLR, Berlin (2020)
15. Colombo, P., Staerman, G., Piantanida, P., Clavel, C.: Automatic text evaluation through the lens of Wasserstein barycenters. In: EMNLP 2021, Punta Cana, Dominican Republic (2021)
16. Cuturi, M., Doucet, A.: Fast computation of Wasserstein barycenters. In: Xing, E.P., Jebara, T. (eds.) Proceedings of the 31st International Conference on Machine Learning, volume 32(2) of Proceedings of Machine Learning Research, pp. 685–693. PMLR, Beijing, China (2014)
17. Delalande, A.: Nearly tight convergence bounds for semi-discrete entropic optimal transport. In: Camps-Valls, G., Ruiz, F.J.R., Valera, I. (eds.) Proceedings of the 25th International Conference on Artificial Intelligence and Statistics, volume 151 of Proceedings of Machine Learning Research, pp. 1619–1642. PMLR (2022)
18. Delalande, A.: Quantitative Stability in Quadratic Optimal Transport. Université Paris-Saclay, Theses (2022)
19. Delalande, A., Mérigot, Q.: Quantitative stability of optimal transport maps under variations of the target measure. Duke Math. J. (2022)
20. Dognin, P., Melnyk, I., Mroueh, Y., Ross, J., Dos Santos, C., Sercu, T.: Wasserstein barycenter model ensembling. In: International Conference on Learning Representations (2019)
21. Ekeland, I., Témam, R.: Convex Analysis and Variational Problems. Society for Industrial and Applied Mathematics, Philadelphia (1999)
22. Fournier, N., Guillin, A.: On the rate of convergence in Wasserstein distance of the empirical measure. Probab. Theory Relat. Fields 162(3), 707–738 (2015)
23. Ho, N., Nguyen, X., Yurochkin, M., Bui, H.H., Huynh, V., Phung, D.: Multilevel clustering via Wasserstein means. In: Precup, D., Teh, Y.W. (eds.) Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research, pp. 1501–1509. PMLR (2017)
24. Kim, Y.-H., Pass, B.: Wasserstein barycenters over Riemannian manifolds. Adv. Math. 307, 640–683 (2017)
25. Kitagawa, J., Mérigot, Q., Thibert, B.: Convergence of a Newton algorithm for semi-discrete optimal transport. J. Eur. Math. Soc. 21(9), 2603–2651 (2019)
26. Le Gouic, T., Loubes, J.-M.: Existence and consistency of Wasserstein barycenters. Probab. Theory Relat. Fields 168(3), 901–917 (2017)
27. Le Gouic, T., Paris, Q., Rigollet, P., Stromme, A.: Fast convergence of empirical barycenters in Alexandrov spaces and the Wasserstein space. J. Eur. Math. Soc. 25, 2229–2250 (2022)
28. Lian, X., Jain, K., Truszkowski, J., Poupart, P., Yu, Y.: Unsupervised multilingual alignment using Wasserstein barycenter. In: Bessiere, C. (ed.) Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI-20. International Joint Conferences on Artificial Intelligence Organization, pp. 3702–3708. Main track (2020)
29. McCann, R.J.: A convexity principle for interacting gases. Adv. Math. 128(1), 153–179 (1997)
30. Panaretos, V.M., Zemel, Y.: An Invitation to Statistics in Wasserstein Space. SpringerBriefs in Probability and Mathematical Statistics, Springer, Cham (2020)
31. Pass, B.: Optimal transportation with infinitely many marginals. J. Funct. Anal. 264(4), 947–963 (2013)
32. Peyré, G., Cuturi, M.: Computational optimal transport. Found. Trends Mach. Learn. 11(5–6), 355–607 (2019)
33. Rabin, J., Peyré, G., Delon, J., Bernot, M.: Wasserstein barycenter and its application to texture mixing. In: SSVM’11, pp. 435–446. Springer, Israel (2011)
34. Santambrogio, F.: Optimal Transport for Applied Mathematicians, vol. 55, pp. 58–63. Birkhäuser, New York (2015)
35. Santambrogio, F., Wang, X.-J.: Convexity of the support of the displacement interpolation: counterexamples. Appl. Math. Lett. 58, 152–158 (2016)
36. Shalev-Shwartz, S., Shamir, O., Srebro, N., Sridharan, K.: Learnability, stability and uniform convergence. J. Mach. Learn. Res. 11(90), 2635–2670 (2010)
37. Solomon, J., de Goes, F., Peyré, G., Cuturi, M., Butscher, A., Nguyen, A., Du, T., Guibas, L.: Convolutional Wasserstein distances: efficient optimal transportation on geometric domains. ACM Trans. Graph. 34(4), 1–11 (2015)
38. Srivastava, S., Li, C., Dunson, D.B.: Scalable Bayes via barycenter in Wasserstein space. J. Mach. Learn. Res. 19(8), 1–35 (2018)
39. Sturm, K.-T.: Probability measures on metric spaces of nonpositive curvature. Contemp. Math. 338, 01 (2003)
40. van der Vaart, A.W., Wellner, J.A.: Weak Convergence and Empirical Processes: With Applications to Statistics. Springer Series in Statistics, Springer, Berlin (1996)
41. Vapnik, V.: Principles of risk minimization for learning theory. In: Moody, J., Hanson, S., Lippmann, R.P. (eds.) Advances in Neural Information Processing Systems, vol. 4. Morgan-Kaufmann, Cambridge (1991)
42. Varadarajan, V.S.: On the convergence of sample probability distributions. Sankhyā Indian J. Stat. 19(1/2), 23–26 (1958)
43. Villani, C.: Optimal Transport: Old and New, vol. 338. Springer, Berlin (2008)
44. Weed, J., Bach, F.R.: Sharp asymptotic and finite-sample rates of convergence of empirical measures in Wasserstein distance. Bernoulli (2019)

Acknowledgements

The authors acknowledge the support of the Lagrange Mathematics and Computing Research Center and of the ANR (MAGA, ANR-16-CE40-0014). We thank Blanche Buet for interesting discussions related to this work.

Author information

Authors and Affiliations

Ceremade, Univ. Paris-Dauphine PSL, 75775, Paris, France
Guillaume Carlier
Mokaplan, Inria, Paris, France
Guillaume Carlier
Lagrange Mathematics and Computing Research Center, 75007, Paris, France
Alex Delalande
Laboratoire de mathématiques d’Orsay, CNRS, Université Paris-Saclay, 91405, Orsay, France
Quentin Mérigot
Institut universitaire de France, Paris, France
Quentin Mérigot

Authors

Guillaume Carlier
Alex Delalande
Quentin Mérigot

Contributions

All authors reviewed the manuscript.

Corresponding author

Correspondence to Alex Delalande.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A. Dual formulation for the Wasserstein barycenter problem

Proof of Proposition 1.1

Rather than proving the formulation of Proposition 1.1 directly, we show that

$$\begin{aligned} \min _{\mu \in \mathcal {P}(\Omega )} F_\mathbb {P}(\mu )&= \max \Bigg \{ \int _{\mathcal {P}(\Omega )} \langle \phi ^c_\rho | \rho \rangle \textrm{d}\mathbb {P}(\rho ) \mid (\phi _\rho )_\rho \in \textrm{L}^\infty (\mathbb {P}; W^{1, \infty }(\Omega )), \\ {}&\quad \int _{\mathcal {P}(\Omega )} \phi _\rho (\cdot ) \textrm{d}\mathbb {P}(\rho ) = 0 \Bigg \}, \end{aligned}$$

where for any $\rho \in \mathcal {P}(\Omega )$, $\phi _\rho ^c$ denotes the following c-transform of $\phi _\rho $: $\phi _\rho ^c(\cdot ) = \inf _{y \in \Omega } \frac{1}{2} \left\| \cdot -y\right\| ^2 - \phi _\rho (y)$. Such a formulation entails the result of Proposition 1.1 by the change of variable $(\psi _\rho )_\rho = \frac{\left\| \cdot \right\| ^2}{2} - (\phi _\rho )_\rho \in \textrm{L}^\infty (\mathbb {P}; W^{1, \infty }(\Omega ))$.

Duality. Let us first show that the value of $\min _{\mu \in \mathcal {P}(\Omega )} F_\mathbb {P}(\mu )$ is equal to the value of the following supremum:

$$\begin{aligned} \mathrm {(D)_\mathbb {P}}' := \sup \Bigg \{&\int _{\mathcal {P}(\Omega )} \langle \phi ^c_\rho | \rho \rangle \textrm{d}\mathbb {P}(\rho ) \mid (\phi _\rho )_\rho \in \textrm{L}^1(\mathbb {P}; \mathcal {C}(\Omega )), \quad \int _{\mathcal {P}(\Omega )} \phi _\rho (\cdot ) \textrm{d}\mathbb {P}(\rho ) = 0 \Bigg \}, \end{aligned}$$

where $\textrm{L}^1(\mathbb {P}; \mathcal {C}(\Omega ))$ denotes the set of $\mathbb {P}$-measurable and Bochner integrable mappings from $\mathcal {P}(\Omega )$ to the space $(\mathcal {C}(\Omega ), \left\| \cdot \right\| _\infty )$ of continuous functions from $\Omega $ to $\mathbb {R}$ equipped with the supremum norm. Introduce the functional $H: \mathcal {C}(\Omega ) \rightarrow \mathbb {R}$ defined for all $\varphi \in \mathcal {C}(\Omega )$ by

$$\begin{aligned} H(\varphi ) = \inf \Bigg \{&-\int _{\mathcal {P}(\Omega )} \langle \phi ^c_\rho | \rho \rangle \textrm{d}\mathbb {P}(\rho ) \mid (\phi _\rho )_\rho \in \textrm{L}^1(\mathbb {P}; \mathcal {C}(\Omega )), \quad \int _{\mathcal {P}(\Omega )} \phi _\rho (\cdot ) \textrm{d}\mathbb {P}(\rho ) = \varphi (\cdot ) \Bigg \}. \end{aligned}$$

Notice then that $\mathrm {(D)_\mathbb {P}}' = -H(0)$. On the other hand, notice that H has the following convex conjugate: for $\mu \in \mathcal {P}(\Omega )$,

$$\begin{aligned} H^*(\mu )&= \sup \left\{ \langle \varphi | \mu \rangle - H(\varphi ) \mid \varphi \in \mathcal {C}(\Omega ) \right\} \\&= \sup \Bigg \{ \langle \varphi | \mu \rangle + \int _{\mathcal {P}(\Omega )} \langle \phi ^c_\rho | \rho \rangle \textrm{d}\mathbb {P}(\rho ) \mid \varphi \in \mathcal {C}(\Omega ), (\phi _\rho )_\rho \in \textrm{L}^1(\mathbb {P}; \mathcal {C}(\Omega )),\\&\quad \int _{\mathcal {P}(\Omega )} \phi _\rho (\cdot ) \textrm{d}\mathbb {P}(\rho ) = \varphi (\cdot ) \Bigg \} \\&= \sup \left\{ \int _{\mathcal {P}(\Omega )} \langle \phi _\rho | \mu \rangle \textrm{d}\mathbb {P}(\rho ) + \int _{\mathcal {P}(\Omega )} \langle \phi ^c_\rho | \rho \rangle \textrm{d}\mathbb {P}(\rho ), \quad (\phi _\rho )_\rho \in \textrm{L}^1(\mathbb {P}; \mathcal {C}(\Omega )) \right\} \\&= \int _{\mathcal {P}(\Omega )} \left( \sup _{\phi _\rho \in \mathcal {C}(\Omega )} \langle \phi _\rho | \mu \rangle + \langle \phi ^c_\rho | \rho \rangle \right) \textrm{d}\mathbb {P}(\rho ) \\&= \int _{\mathcal {P}(\Omega )} \frac{1}{2} \textrm{W}_2^2(\mu , \rho ) \textrm{d}\mathbb {P}(\rho ), \end{aligned}$$

where we used the Kantorovich duality formula (see for instance [43]) to get to the last line. We thus have

$$\begin{aligned} \min _{\mu \in \mathcal {P}(\Omega )} F_\mathbb {P}(\mu ) = \inf _{\mu \in \mathcal {P}(\Omega )} H^*(\mu ) = - H^{**}(0). \end{aligned}$$

Therefore, showing that $\mathrm {(D)_\mathbb {P}}' = \min _{\mu \in \mathcal {P}(\Omega )} F_\mathbb {P}(\mu )$ amounts to showing that $H(0) = H^{**}(0)$. Since H is convex (by concavity of the c-transform operation), this will follow from the continuity of H at 0 for the supremum-norm over $\mathcal {C}(\Omega )$ (Proposition 4.1 of [21]). For this, we can first notice that H never takes the value $-\infty $: for any $\varphi \in \mathcal {C}(\Omega )$ and $(\phi _\rho )_\rho \in \textrm{L}^1(\mathbb {P}; \mathcal {C}(\Omega ))$ such that $\int _{\mathcal {P}(\Omega )} \phi _\rho (\cdot ) \textrm{d}\mathbb {P}(\rho ) = \varphi (\cdot )$, one has

$$\begin{aligned} \forall \rho \in \mathcal {P}(\Omega ),\quad -\phi _\rho ^c(x) = \sup _{y \in \Omega } \phi _\rho (y) - \frac{1}{2}\left\| x - y\right\| ^2 \ge \phi _\rho (0) - \frac{1}{2} \left\| x\right\| ^2. \end{aligned}$$

It follows that

$$\begin{aligned} H(\varphi ) \ge \varphi (0) - \int _{\mathcal {P}(\Omega )} \frac{M_2(\rho )}{2} \textrm{d}\mathbb {P}(\rho ) > -\infty . \end{aligned}$$

On the other hand, notice that H is bounded from above in a neighborhood of 0 in $\mathcal {C}(\Omega )$: for any $\varphi \in \mathcal {C}(\Omega )$ such that $\left\| \varphi \right\| _\infty \le 1$, one has $-\varphi ^c(x) \le 1$ for any $x \in \mathbb {R}^d$ so that

$$\begin{aligned} H(\varphi ) \le - \int _{\mathcal {P}(\Omega )} \langle (\varphi )^c | \rho \rangle \textrm{d}\mathbb {P}(\rho ) \le 1. \end{aligned}$$

A standard convex analysis result (Proposition 2.5 in [21]) then ensures that H is continuous at 0, so that $H(0) = H^{**}(0)$ and $\mathrm {(D)_\mathbb {P}}' = \min _{\mu \in \mathcal {P}(\Omega )} F_\mathbb {P}(\mu )$.

Restriction to $\textrm{L}^\infty (\mathbb {P}; W^{1, \infty }(\Omega ))$. We show here that we can run the supremum $\mathrm {(D)_\mathbb {P}}'$ only over $\textrm{L}^\infty (\mathbb {P}; W^{1, \infty }(\Omega ))$ instead of $\textrm{L}^1(\mathbb {P}; \mathcal {C}(\Omega ))$, that is

$$\begin{aligned} \mathrm {(D)_\mathbb {P}}'&= \sup \Bigg \{ \int _{\mathcal {P}(\Omega )} \langle \phi ^c_\rho | \rho \rangle \textrm{d}\mathbb {P}(\rho ) \mid (\phi _\rho )_\rho \in \textrm{L}^\infty (\mathbb {P}; W^{1, \infty }(\Omega )),\\ {}&\qquad \int _{\mathcal {P}(\Omega )} \phi _\rho (\cdot ) \textrm{d}\mathbb {P}(\rho ) = 0 \Bigg \}. \end{aligned}$$

Let $(\phi _\rho )_\rho \in \textrm{L}^1(\mathbb {P}; \mathcal {C}(\Omega ))$ be an admissible solution to $\mathrm {(D)_\mathbb {P}}'$, i.e. $(\phi _\rho )_\rho $ satisfies

$$\begin{aligned} \int _{\mathcal {P}(\Omega )} \phi _\rho (\cdot ) \textrm{d}\mathbb {P}(\rho ) = 0. \end{aligned}$$

(13)

Then we can build from $(\phi _\rho )_\rho $ another admissible solution $(\tilde{\phi }_\rho )_\rho $ that belongs to $\textrm{L}^\infty (\mathbb {P}; W^{1, \infty }(\Omega ))$ and that performs better at $\mathrm {(D)_\mathbb {P}}'$, i.e. that verifies

$$\begin{aligned} \int _{\mathcal {P}(\Omega )} \langle \tilde{\phi }^c_\rho | \rho \rangle \textrm{d}\mathbb {P}(\rho ) \ge \int _{\mathcal {P}(\Omega )} \langle \phi ^c_\rho | \rho \rangle \textrm{d}\mathbb {P}(\rho ). \end{aligned}$$

(14)

Indeed, introduce $(\hat{\phi }_\rho )_\rho := (\phi ^{cc}_\rho )_\rho $. Then for all $\rho \in \mathcal {P}(\Omega )$, $\hat{\phi }_\rho = \phi ^{cc}_\rho $ is obviously 2R-Lipschitz (as a c-transform) and satisfies $\hat{\phi }_\rho ^c = \phi _\rho ^c$ and $\hat{\phi }_\rho \ge \phi _\rho $ (as a double c-transform). Using then (13), one has that

$$\begin{aligned} \alpha (\cdot ):= \int _{\mathcal {P}(\Omega )} \hat{\phi }_\rho (\cdot ) \textrm{d}\mathbb {P}(\rho ) \ge 0, \end{aligned}$$

where $\alpha $ is also 2R-Lipschitz. Now denoting $\tilde{\phi }_\rho = \hat{\phi }_\rho - \alpha $ for all $\rho \in \mathcal {P}(\Omega )$, the mapping $(\tilde{\phi }_\rho )_\rho \in \textrm{L}^1(\mathbb {P}; \mathcal {C}(\Omega ))$ is admissible to $\mathrm {(D)_\mathbb {P}}'$ by construction and satisfies $\tilde{\phi }_\rho \le \hat{\phi }_\rho $ for all $\rho \in \mathcal {P}(\Omega )$, so that $\tilde{\phi }^c_\rho \ge \hat{\phi }^c_\rho = \phi ^c_\rho $ (using that the c-transform is order-reversing). For each $\rho \in \mathcal {P}(\Omega )$, up to subtracting $\tilde{\phi }_\rho (0)$ from $\tilde{\phi }_\rho $ (this operation leaves $(\tilde{\phi }_\rho )_\rho $ admissible to $\mathrm {(D)_\mathbb {P}}'$ and does not change its value), one can assume that $\tilde{\phi }_\rho (0) = 0$. Noticing that $\tilde{\phi }_\rho $ is 4R-Lipschitz by construction, we have the bound $\left\| \tilde{\phi }_\rho \right\| _{W^{1, \infty }(\Omega )} \le 4R(1+R)$. We have thus built an admissible $(\tilde{\phi }_\rho )_\rho \in \textrm{L}^\infty (\mathbb {P}; W^{1,\infty }(\Omega ))$ from an admissible $(\phi _\rho )_\rho \in \textrm{L}^1(\mathbb {P}; \mathcal {C}(\Omega ))$ that satisfies (14), which shows that we can run the supremum $\mathrm {(D)_\mathbb {P}}'$ only over $\textrm{L}^\infty (\mathbb {P}; W^{1, \infty }(\Omega ))$ instead of $\textrm{L}^1(\mathbb {P}; \mathcal {C}(\Omega ))$.
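The three properties of the double c-transform used in this step (it dominates the original potential, it has the same c-transform, and it is 2R-Lipschitz) can be checked numerically on a toy example. The sketch below is only an illustration and is not part of the paper: it assumes a one-dimensional setting where $\Omega $ is replaced by a finite grid on $[-R, R]$, and the names `c_transform`, `grid` and `phi` are hypothetical.

```python
# Illustrative sketch (assumption: Omega is replaced by a finite 1D grid on [-R, R]).
# c-transform convention of this appendix: phi^c(x) = min_y 0.5*|x - y|^2 - phi(y).

def c_transform(phi, grid):
    """Discrete c-transform of the values `phi` sampled on `grid`."""
    return [min(0.5 * (x - y) ** 2 - p for y, p in zip(grid, phi)) for x in grid]

R = 1.0
n = 41
grid = [-R + 2 * R * i / (n - 1) for i in range(n)]

# A deliberately rough potential with isolated bumps (hypothetical test data).
phi = [abs(y) ** 0.5 + (0.3 if i % 7 == 0 else 0.0) for i, y in enumerate(grid)]

phi_c = c_transform(phi, grid)      # phi^c
phi_cc = c_transform(phi_c, grid)   # hat(phi) = phi^{cc}
phi_ccc = c_transform(phi_cc, grid) # hat(phi)^c

tol = 1e-12
# hat(phi) >= phi pointwise (double c-transform dominates).
dominates = all(a >= b - tol for a, b in zip(phi_cc, phi))
# hat(phi)^c = phi^c (the objective value is unchanged).
same_transform = all(abs(a - b) <= tol for a, b in zip(phi_ccc, phi_c))
# hat(phi) is 2R-Lipschitz: its largest difference quotient on the grid is at most 2R.
step = grid[1] - grid[0]
max_slope = max(abs(phi_cc[i + 1] - phi_cc[i]) / step for i in range(n - 1))

print(dominates, same_transform, max_slope <= 2 * R + 1e-9)
```

On the grid the Lipschitz bound follows for the same reason as in the continuous case: each function $y \mapsto \frac{1}{2}|x - y|^2 - \phi ^c(x)$ has slope at most 2R on $[-R, R]$, and a minimum of such functions inherits the bound.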

Existence of a maximizer. It remains to show that the supremum in $\mathrm {(D)_\mathbb {P}}'$ can be replaced by a maximum. Let $\left( (\phi _\rho ^n)_\rho \right) _{n\ge 0}$ be a maximizing sequence for $\mathrm {(D)_\mathbb {P}}'$, and assume from what precedes that this sequence belongs to $\textrm{L}^\infty (\mathbb {P}; W^{1, \infty }(\Omega ))$ and satisfies for all $n\ge 0$ and $\rho \in \mathcal {P}(\Omega )$, $\left\| \phi ^n_\rho \right\| _{W^{1, \infty }(\Omega )} \le 4R(1+R)$. Further assume that this sequence verifies for all $n \ge 1$,

$$\begin{aligned} \int _{\mathcal {P}(\Omega )} \left\langle {\big (\phi ^n_\rho \big )^c}|{\rho }\right\rangle \textrm{d}\mathbb {P}(\rho ) \ge \mathrm {(D)_\mathbb {P}}' - \frac{1}{n}. \end{aligned}$$

(15)

For any $n \ge 0$, the mapping $(\rho , x) \mapsto \phi ^n_\rho (x)$ is bounded in $\textrm{L}^2(\mathbb {P}\otimes \lambda )$, uniformly in n, where $\lambda $ denotes the Lebesgue measure over $\Omega $. Therefore, by the Banach–Alaoglu theorem, the sequence $\left( (\phi _\rho ^n)_\rho \right) _{n\ge 0}$ (seen as a sequence in $\textrm{L}^2(\mathbb {P}\otimes \lambda )$) admits a weakly converging subsequence in $\textrm{L}^2(\mathbb {P}\otimes \lambda )$, which we do not relabel and whose weak limit in $\textrm{L}^2(\mathbb {P}\otimes \lambda )$ we denote by $(\phi ^\infty _\rho )_\rho $. Using now Mazur’s lemma [11, Corollary 3.8], we know that there exists a sequence of integers $(N_n)_{n \ge 0}$ and coefficients $((\lambda _{n,k})_{n \le k \le N_n})_{n \ge 0} \ge 0$ satisfying for all $n \ge 0$, $\sum _{k=n}^{N_n} \lambda _{n, k} = 1$ such that the sequence $\left( (\bar{\phi }_\rho ^n)_\rho \right) _{n\ge 0}$ defined for all $n \ge 0$ and $\rho \in \mathcal {P}(\Omega )$ by $\bar{\phi }_\rho ^n:= \sum _{k=n}^{N_n} \lambda _{n,k} \phi _\rho ^k$ converges strongly to $(\phi ^\infty _\rho )_\rho $ in $\textrm{L}^2(\mathbb {P}\otimes \lambda )$. By concavity of the c-transform operation and equation (15), we then have the bound

$$\begin{aligned} \int _{\mathcal {P}(\Omega )} \left\langle {\big (\bar{\phi }^n_\rho \big )^c}|{\rho }\right\rangle \textrm{d}\mathbb {P}(\rho )&\ge \sum _{k = n}^{N_n} \lambda _{n,k} \int _{\mathcal {P}(\Omega )} \left\langle {\big (\phi ^k_\rho \big )^c}|{\rho }\right\rangle \textrm{d}\mathbb {P}(\rho ) \nonumber \\&\ge \sum _{k = n}^{N_n} \lambda _{n,k} \left( \mathrm {(D)_\mathbb {P}}' - \frac{1}{k} \right) \nonumber \\&\ge \mathrm {(D)_\mathbb {P}}' - \frac{1}{n}. \end{aligned}$$

(16)

The sequence $\left( (\bar{\phi }_\rho ^n)_\rho \right) _{n\ge 0}$ is therefore also a maximizing sequence of $\mathrm {(D)_\mathbb {P}}'$ and it also satisfies for any $n \ge 0$ and $\rho \in \mathcal {P}(\Omega )$ the bound

$$\begin{aligned} \left\| \bar{\phi }^n_\rho \right\| _{W^{1, \infty }(\Omega )} \le 4R(1+R). \end{aligned}$$

(17)

Since the sequence $\left( (\bar{\phi }_\rho ^n)_\rho \right) _{n\ge 0}$ strongly converges to $(\phi ^\infty _\rho )_\rho $ in $\textrm{L}^2(\mathbb {P}\otimes \lambda )$, one can extract a subsequence (that we do not relabel) such that for $\mathbb {P}$-almost-every $\rho \in \mathcal {P}(\Omega )$, the sequence $(\bar{\phi }^n_\rho )_{n \ge 0}$ converges to $\phi ^\infty _\rho $ in $\textrm{L}^2(\lambda )$. Using (17) and the Arzelà–Ascoli theorem, we deduce that for $\mathbb {P}$-almost-every $\rho \in \mathcal {P}(\Omega )$, the sequence $(\bar{\phi }^n_\rho )_{n \ge 0}$ converges uniformly to $\phi ^\infty _\rho $ in $\mathcal {C}(\Omega )$ and that

$$\begin{aligned} \left\| \phi ^\infty _\rho \right\| _{W^{1, \infty }(\Omega )} \le 4R(1+R). \end{aligned}$$

In particular, $(\phi ^\infty _\rho )_\rho $ belongs to $\textrm{L}^\infty (\mathbb {P}; W^{1, \infty }(\Omega ))$ and we have the limit

$$\begin{aligned} 0 =\int _{\mathcal {P}(\Omega )} \bar{\phi }^n_\rho (\cdot ) \textrm{d}\mathbb {P}(\rho ) \xrightarrow [n \rightarrow \infty ]{} \int _{\mathcal {P}(\Omega )} \phi ^\infty _\rho (\cdot ) \textrm{d}\mathbb {P}(\rho ), \end{aligned}$$

so that $(\phi ^\infty _\rho )_\rho $ is admissible to $\mathrm {(D)_\mathbb {P}}'$. Finally, for $\mathbb {P}$-almost-every $\rho \in \mathcal {P}(\Omega )$, we have the limit

$$\begin{aligned} \left\langle {\big (\bar{\phi }^n_\rho \big )^c}|{\rho }\right\rangle \xrightarrow [n \rightarrow \infty ]{} \left\langle {\big (\phi ^\infty _\rho \big )^c}|{\rho }\right\rangle , \end{aligned}$$

(18)

so that by Lebesgue’s dominated convergence theorem and the bound (16),

$$\begin{aligned} \int _{\mathcal {P}(\Omega )} \left\langle {\big (\phi ^\infty _\rho \big )^c}|{\rho }\right\rangle \textrm{d}\mathbb {P}(\rho ) = \lim _{n \rightarrow +\infty } \int _{\mathcal {P}(\Omega )} \left\langle {\big (\bar{\phi }^n_\rho \big )^c}|{\rho }\right\rangle \textrm{d}\mathbb {P}(\rho ) = \mathrm {(D)_\mathbb {P}}', \end{aligned}$$

which proves that $(\phi ^\infty _\rho )_\rho \in \textrm{L}^\infty (\mathbb {P}; W^{1, \infty }(\Omega ))$ is a maximizer for $\mathrm {(D)_\mathbb {P}}'$.$\square $

Appendix B. Strong-convexity of $\mathcal {K}_\rho $ for measures with non-convex support

This section gathers examples of measures $\rho $ for which the strong convexity estimate (4) of Assumption 1.3 holds.

B.1 Measures with convex support

This result is mostly extracted from [19].

Proposition B.1

Let $\rho \in \mathcal {P}_{a.c.}(\Omega )$. Assume that $\textrm{spt}(\rho )$ is convex and that there exist $m_\rho , M_\rho \in (0, +\infty )$ such that $m_\rho \le \rho \le M_\rho $ on $\textrm{spt}(\rho )$. Let $\psi , \tilde{\psi } \in \mathcal {C}(\Omega )$. Then

$$\begin{aligned} \langle \psi - \tilde{\psi } | (\nabla \psi ^*)_\# \rho \rangle + C_{d, R, m_\rho , M_\rho } \mathbb {V}\textrm{ar}_{\rho }(\tilde{\psi }^* - \psi ^*) \le \mathcal {K}_\rho (\tilde{\psi }) - \mathcal {K}_\rho (\psi ), \end{aligned}$$

where $C_{d,R, m_\rho , M_\rho } = \left( e(d+1)2^{d+1} R \textrm{diam}(\textrm{spt}(\rho )) \left( \frac{M_\rho }{m_\rho } \right) ^2 \right) ^{-1}$.

Proof

We only present here a formal sketch of the proof, which heavily relies on computations done in Section 2 of [19]. Assuming that $\psi $ and $\tilde{\psi }$ are smooth enough (see Proposition 2.4 of [19]) and introducing for $t \in [0,1], \psi ^t = (1-t) \psi + t \tilde{\psi }$, Proposition 2.2 of [19] allows us to differentiate $\mathcal {K}_\rho (\psi ^t)$ with respect to t and to obtain:

$$\begin{aligned} \mathcal {K}_\rho (\tilde{\psi })&- \mathcal {K}_\rho (\psi ) = \frac{\textrm{d}}{\textrm{d}t} \mathcal {K}_\rho (\psi ^t) \Big \vert _{t=0} + \int _0^1 \int _0^s \frac{\textrm{d}^2}{\textrm{d}t^2} \mathcal {K}_\rho (\psi ^t) \textrm{d}t \textrm{d}s \nonumber \\&= \langle \psi - \tilde{\psi } | (\nabla \psi ^*)_\# \rho \rangle + \int _0^1 \int _0^s \int _\Omega \langle \nabla v(\nabla (\psi ^t)^*) | \textrm{D}^2 (\psi ^t)^* \cdot \nabla v(\nabla (\psi ^t)^*)\rangle \textrm{d}\rho \textrm{d}t \textrm{d}s, \end{aligned}$$

(19)

where $v = \tilde{\psi } - \psi $. Reasoning as in the proof of Proposition 2.4 of [19], the Brascamp–Lieb concentration inequality [9] and the log-concavity of the determinant, seen as a function on the set of symmetric positive definite matrices, ensure the following bound:

$$\begin{aligned} C_{R, m_\rho , M_\rho } \min (t, 1-t)^d 2 \mathbb {V}\textrm{ar}_{\frac{1}{2}(\mu + \tilde{\mu })}(\tilde{\psi } - \psi ) \le \int _\Omega \langle \nabla v(\nabla (\psi ^t)^*) | \textrm{D}^2 (\psi ^t)^* \cdot \nabla v(\nabla (\psi ^t)^*)\rangle \textrm{d}\rho , \end{aligned}$$

where $C_{R, m_\rho , M_\rho } = \left( e R \textrm{diam}(\textrm{spt}(\rho )) \left( \frac{M_\rho }{m_\rho } \right) ^2 \right) ^{-1}$, $\mu = (\nabla \psi ^*)_\# \rho $ and $\tilde{\mu } = (\nabla \tilde{\psi }^*)_\# \rho $. Back to (19), this leads to

$$\begin{aligned} \langle \psi - \tilde{\psi } | (\nabla \psi ^*)_\# \rho \rangle + C_{d, R, m_\rho , M_\rho } 2 \mathbb {V}\textrm{ar}_{\frac{1}{2}(\mu + \tilde{\mu })}(\tilde{\psi } - \psi ) \le \mathcal {K}_\rho (\tilde{\psi }) - \mathcal {K}_\rho (\psi ), \end{aligned}$$

where $C_{d,R, m_\rho , M_\rho } = \left( e(d+1)2^{d+1} R \textrm{diam}(\textrm{spt}(\rho )) \left( \frac{M_\rho }{m_\rho } \right) ^2 \right) ^{-1}$. We conclude using the convex analysis argument of Proposition 3.1 from [19], which directly ensures

$$\begin{aligned} \mathbb {V}\textrm{ar}_{\rho }(\tilde{\psi }^* - \psi ^*) \le 2 \mathbb {V}\textrm{ar}_{\frac{1}{2}(\mu + \tilde{\mu })}(\tilde{\psi } - \psi ). \end{aligned}$$

We get the general case (without the smoothness assumptions on $\psi $ and $\tilde{\psi }$) using the approximation arguments presented in Propositions 2.5 and 2.7 of [19].

B.2 Measures with connected union of convex sets as support

We extend Proposition B.1 to the case of a source measure $\rho $ with a possibly non-convex support. We will assume that $\textrm{spt}(\rho )$ can be written as a connected finite union of convex sets.

Proposition B.2

Let $\rho \in \mathcal {P}_{a.c.}(\Omega )$ be such that there exist $m_\rho , M_\rho \in (0, +\infty )$ verifying $m_\rho \le \rho \le M_\rho $ on $\textrm{spt}(\rho )$. Assume that $\textrm{spt}(\rho )$ is connected and that there exist $N\ge 1$ convex sets $(C_i)_{1 \le i \le N}$ in $\Omega $ such that $\textrm{spt}(\rho ) = \bigcup _{i=1}^N C_i$. Also assume that for any $i \ne j$ such that $C_i \cap C_j \ne \emptyset $, one has $\rho (C_i \cap C_j) > 0$. Then there exists a constant $c_\rho $ depending on $\rho $ such that for any $\psi , \tilde{\psi } \in \mathcal {C}(\Omega )$,

$$\begin{aligned} \langle \psi - \tilde{\psi } | (\nabla \psi ^*)_\# \rho \rangle + c_\rho \mathbb {V}\textrm{ar}_{\rho }(\tilde{\psi }^* - \psi ^*) \le \mathcal {K}_\rho (\tilde{\psi }) - \mathcal {K}_\rho (\psi ). \end{aligned}$$

Remark B.1

(Constant $c_\rho $ and Poincaré–Wirtinger constant of $\rho $) The constant $c_\rho $ of Proposition B.2 is not made precise in the statement. A look at the proof of this proposition only allows us to bound $c_\rho $ in terms of the second smallest eigenvalue $\lambda _2(L)$ of a weighted graph Laplacian L, that is built from the graph whose vertices are the convex sets $C_i$ and whose edge weights are the masses $\rho (C_i \cap C_j)$ that $\rho $ grants to the intersection of the convex sets $C_i$ and $C_j$. The constant $c_\rho $ then reads:

$$\begin{aligned} c_\rho = \left( e(d+1)2^{d+1} R^2 \left( \frac{M_\rho }{m_\rho } \right) ^2 \left( N^2 + \frac{ 2 N^3}{\lambda _2(L)} \right) \right) ^{-1}. \end{aligned}$$

The quantity $\lambda _2(L)$ is not explicit, but it can be linked to the weighted Cheeger constant of $\rho $, defined by

$$\begin{aligned} h(\rho ) = \inf _{A \subset \textrm{spt}(\rho )} \frac{ \left| \partial A\right| _\rho }{ \min (\rho (A), \rho (\textrm{spt}(\rho ) \setminus A)) }, \end{aligned}$$

where $\left| \partial A\right| _\rho = \int _{\partial A \cap \textrm{int}(\textrm{spt}(\rho ))} \rho (x) \textrm{d}\mathcal {H}^{d-1}(x)$ and where the infimum is taken over Lipschitz domains $A \subset \textrm{int}(\textrm{spt}(\rho ))$ with boundary of finite $\mathcal {H}^{d-1}$-measure. Quoting [25] (Lemma 5.3), this constant can in turn be linked to the $\textrm{L}^1$ Poincaré–Wirtinger constant $C_{PW}(\rho )$ of $\rho $. Indeed, $h(\rho )$ is positive whenever $\rho $ satisfies an $\textrm{L}^1$ Poincaré–Wirtinger inequality, i.e. whenever there exists a finite $C_{PW}(\rho ) > 0$ such that for every smooth function f on $\Omega $,

$$\begin{aligned} \left\| f - \mathbb {E}_\rho f\right\| _{\textrm{L}^1(\rho )} \le C_{PW}(\rho ) \left\| \nabla f\right\| _{\textrm{L}^1(\rho ; \mathbb {R}^d)}. \end{aligned}$$

The Poincaré–Wirtinger constant $C_{PW}(\rho )$ and the Cheeger constant $h(\rho )$ are then related by the inequality

$$\begin{aligned} h(\rho ) \ge \frac{2}{C_{PW}(\rho )}. \end{aligned}$$

Using ideas similar to the ones found in Section 5.2 of [25], the eigenvalue $\lambda _2(L)$ can be bounded in terms of the Cheeger constant of $\rho $, and thus in terms of $C_{PW}(\rho )$. We do not detail this comparison here but only report that $c_\rho $ may be written

$$\begin{aligned} c_\rho = \left( e(d+1)2^{d+1} R^2 \left( \frac{M_\rho }{m_\rho } \right) ^2 N \left( N + \frac{1}{2}\left( \frac{ M_\rho s_{d-1} R^{d-1} N^2 C_{PW}(\rho )}{ \varepsilon ^2 }\right) ^3 \right) \right) ^{-1}, \end{aligned}$$

where $s_{d-1}$ denotes the surface area of the unit sphere in $\mathbb {R}^d$ and

$$\begin{aligned} \varepsilon = \min \left( \min _{i, j \vert C_i \cap C_j \ne \emptyset } \rho (C_i \cap C_j), \min _i \rho \left( C_i \setminus \cup _{j \ne i} C_j \right) \right) > 0. \end{aligned}$$
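To get a feel for the magnitude of this constant, the last expression of $c_\rho $ can be evaluated numerically. The following is a minimal sketch that is not part of the original argument: the function names and the numerical values are hypothetical, and the surface area $s_{d-1}$ is computed with the standard formula $2\pi ^{d/2}/\Gamma (d/2)$.

```python
import math

def unit_sphere_area(d: int) -> float:
    """Surface area s_{d-1} of the unit sphere in R^d: 2 * pi^(d/2) / Gamma(d/2)."""
    return 2.0 * math.pi ** (d / 2) / math.gamma(d / 2)

def c_rho_from_poincare(d, R, M_rho, m_rho, N, C_PW, eps):
    """Evaluate the expression of c_rho in terms of C_PW(rho) reported in Remark B.1."""
    prefactor = math.e * (d + 1) * 2 ** (d + 1) * R ** 2 * (M_rho / m_rho) ** 2
    ratio = M_rho * unit_sphere_area(d) * R ** (d - 1) * N ** 2 * C_PW / eps ** 2
    return 1.0 / (prefactor * N * (N + 0.5 * ratio ** 3))

# Hypothetical values, chosen only for illustration.
c = c_rho_from_poincare(d=2, R=1.0, M_rho=2.0, m_rho=1.0, N=3, C_PW=1.0, eps=0.1)
print(f"c_rho = {c:.3e}")
```

With such values the constant is extremely small, which reflects the cubic dependence on $C_{PW}(\rho )/\varepsilon ^2$ in the expression above.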

Proof of Proposition B.2

Let us denote for now $f = \tilde{\psi }^* - \psi ^*$. We will first exploit a discrete Laplacian over $\mathcal {X}= \textrm{spt}(\rho )$ in order to upper bound $\mathbb {V}\textrm{ar}_\rho (f)$ by a sum of variances of $f$ w.r.t. probability measures supported over the convex sets $(C_i)_i$. We will then use Proposition B.1 to conclude.

For any $i \in \{1,\dots ,N\}$, we denote $\rho _i = \frac{1}{\rho (C_i)} \rho _{\vert C_i}$ and $m_i = \int _{C_i} f \textrm{d}\rho _i$. Then one has the following bound:

$$\begin{aligned} \mathbb {V}\textrm{ar}_\rho (f)&= \frac{1}{2} \int _{\mathcal {X}\times \mathcal {X}} (f(x) - f(y))^2 \textrm{d}\rho (x) \textrm{d}\rho (y) \nonumber \\&\le \frac{1}{2} \sum _{i,j} \int _{C_i \times C_j} (f(x) - f(y))^2 \textrm{d}\rho (x) \textrm{d}\rho (y) \nonumber \\&= \frac{1}{2} \sum _{i,j} \int _{C_i \times C_j} (f(x) - m_i + m_i - m_j + m_j - f(y))^2 \textrm{d}\rho (x) \textrm{d}\rho (y) \nonumber \\&= \left( \sum _i \rho (C_i) \right) \sum _i \int _{C_i} (f(x) - m_i)^2 \textrm{d}\rho (x) + \frac{1}{2} \sum _{i,j} (m_i - m_j)^2 \rho (C_i) \rho (C_j) \nonumber \\&= \left( \sum _i \rho (C_i) \right) \sum _i \rho (C_i) \mathbb {V}\textrm{ar}_{\rho _i}(f) + \frac{1}{2} \sum _{i,j} (m_i - m_j)^2 \rho (C_i) \rho (C_j). \end{aligned}$$

(20)
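The passage from the third to the fourth line of (20) uses the fact that the cross terms vanish when the square is expanded. As a spelled-out step (added here for convenience): since $m_i$ is the mean of $f$ w.r.t. $\rho _i$, one has $\int _{C_i} (f - m_i) \textrm{d}\rho = 0$, so that for any $i, j$,

$$\begin{aligned} \int _{C_i \times C_j} (f(x) - m_i + m_i - m_j + m_j - f(y))^2 \textrm{d}\rho (x) \textrm{d}\rho (y) ={}&\rho (C_j) \int _{C_i} (f - m_i)^2 \textrm{d}\rho + (m_i - m_j)^2 \rho (C_i) \rho (C_j) \\&+ \rho (C_i) \int _{C_j} (f - m_j)^2 \textrm{d}\rho . \end{aligned}$$

Summing over $i, j$ and using the symmetry between $i$ and $j$ gives the fourth line of (20).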

We now consider the graph $G = (\{C_i\}_{1 \le i \le N}, \{w_{ij}\}_{1 \le i,j \le N})$ with vertices $\{C_i\}_{1 \le i \le N}$ and weighted edges $\{w_{ij}\}_{1 \le i,j \le N}$ defined by

$$\begin{aligned} \forall i,j \in \{1, \dots , N\}, \quad w_{ij} = \rho (C_i \cap C_j). \end{aligned}$$

By construction, this graph has a single connected component. We introduce the weighted Laplacian matrix $L \in \mathbb {R}^{N\times N}$ of G as follows:

$$\begin{aligned} \forall i,j \in \{1, \dots , N\}, \quad L_{ij} = \left\{ \begin{array}{ll} \sum _{k \ne i} w_{ik} &{} \text{ if } i = j, \\ -w_{ij} &{} \text{ else. } \end{array} \right. \end{aligned}$$

Then $L$ is a symmetric and positive semi-definite matrix. Its null space consists of the constant vectors, and we denote by $\lambda _2(L)$ its second smallest eigenvalue, which is non-zero. Denoting $m = (m_i)_{1\le i \le N} \in \mathbb {R}^N$, we introduce $\bar{m} = \left( \frac{1}{N} \sum _i m_i\right) \mathbbm {1}_N \in \mathbb {R}^N$, the constant vector whose coordinates equal the mean of $m$ (we use $\mathbbm {1}_N = (1)_{1\le i \le N} \in \mathbb {R}^N$). Notice that $m - \bar{m}$ lies in the orthogonal complement of the null space of $L$, which ensures the following bound:
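As an illustration of these objects, the sketch below builds the Laplacian of a small graph and computes $\lambda _2(L)$ numerically. It is only a sketch under stated assumptions: the helper names and the overlap masses are hypothetical, NumPy is assumed to be available, and the diagonal entry of $L$ is taken as the sum of the off-diagonal weights, which is the convention under which constant vectors span the null space.

```python
import numpy as np

def weighted_laplacian(w: np.ndarray) -> np.ndarray:
    """Graph Laplacian of a symmetric weight matrix w; the diagonal of w is ignored."""
    off = w - np.diag(np.diag(w))          # keep only the edge weights w_ij, i != j
    return np.diag(off.sum(axis=1)) - off

def lambda_2(L: np.ndarray) -> float:
    """Second smallest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(L)[1])  # eigvalsh returns ascending eigenvalues

# Hypothetical overlap masses rho(C_i ∩ C_j) for a chain C_1 - C_2 - C_3.
w = np.array([[0.40, 0.10, 0.00],
              [0.10, 0.50, 0.05],
              [0.00, 0.05, 0.25]])
L = weighted_laplacian(w)
lam2 = lambda_2(L)

# For v orthogonal to the constants, ||v||^2 <= <v | L v> / lambda_2(L).
m = np.array([0.3, -1.2, 2.0])
v = m - m.mean()
print(lam2, v @ v, (v @ L @ v) / lam2)
```

The last line prints both sides of the spectral gap inequality used in the bound that follows; a small overlap mass between two sets drives $\lambda _2(L)$, and hence the final constant, towards zero.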

$$\begin{aligned} \frac{1}{2} \sum _{i,j} (m_i - m_j)^2 \rho (C_i) \rho (C_j)&\le N^2 \frac{1}{2} \sum _{i,j} (m_i - m_j)^2 \frac{1}{N^2} \nonumber \\&= N \left\| m - \bar{m} \right\| ^2 \nonumber \\&\le \frac{ N }{\lambda _2(L)} \langle m - \bar{m} | L \left( m - \bar{m}\right) \rangle \nonumber \\&= \frac{ N }{\lambda _2(L)} \sum _{i, j} w_{ij} (m_i^2 - m_i m_j) \nonumber \\&= \frac{ N }{\lambda _2(L)} \sum _{i, j} \frac{w_{ij}}{2} (m_i - m_j)^2. \end{aligned}$$

(21)
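The first inequality in (21) only uses $\rho (C_i) \rho (C_j) \le 1$, and the equality on the second line follows from expanding the square. As a spelled-out step (added here for convenience), writing $\mu = \frac{1}{N} \sum _i m_i$ for the mean of $m$ (a notation used only in this verification),

$$\begin{aligned} \frac{1}{2} \sum _{i,j} (m_i - m_j)^2&= \frac{1}{2} \sum _{i,j} \left( (m_i - \mu ) - (m_j - \mu ) \right) ^2 \\&= N \sum _i (m_i - \mu )^2 - \left( \sum _i (m_i - \mu ) \right) ^2 \\&= N \left\| m - \bar{m} \right\| ^2, \end{aligned}$$

since $\sum _i (m_i - \mu ) = 0$.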

But for any i, j such that $w_{ij}>0$, denoting $m_{i \cap j} = \frac{1}{\rho (C_i \cap C_j)} \int _{C_i \cap C_j} f \textrm{d}\rho $, one has

$$\begin{aligned} \frac{1}{2} (m_i - m_j)^2 \le (m_{i \cap j} - m_i)^2 + (m_{i \cap j} - m_j)^2. \end{aligned}$$

And for such i, j,

$$\begin{aligned} (m_{i \cap j} - m_i)^2&= \left( \frac{1}{\rho (C_i \cap C_j)} \int _{C_i \cap C_j} (f - m_i) \textrm{d}\rho \right) ^2 \\&\le \frac{1}{\rho (C_i \cap C_j)} \int _{C_i} (f - m_i)^2 \textrm{d}\rho \\&= \frac{\rho (C_i)}{w_{ij}} \mathbb {V}\textrm{ar}_{\rho _i}(f), \end{aligned}$$

where we used Jensen’s inequality and the fact that $C_i \cap C_j \subset C_i$. A similar bound can be shown for $(m_{i \cap j} - m_j)^2$, and plugging these into (21) yields

$$\begin{aligned} \frac{1}{2} \sum _{i,j} (m_i - m_j)^2 \rho (C_i) \rho (C_j)&\le \frac{ N }{\lambda _2(L)} \sum _{i} \sum _{j \vert C_i \cap C_j \ne \emptyset } \left( \rho (C_i) \mathbb {V}\textrm{ar}_{\rho _i}(f) + \rho (C_j) \mathbb {V}\textrm{ar}_{\rho _j}(f)\right) \\&\le \frac{ 2 N^2 }{\lambda _2(L)} \sum _{i} \rho (C_i) \mathbb {V}\textrm{ar}_{\rho _i}(f). \end{aligned}$$

Plugging this into (20) and using $\sum _i \rho (C_i) \le N$ yields

$$\begin{aligned} \mathbb {V}\textrm{ar}_\rho (f) \le \left( N + \frac{ 2 N^2}{\lambda _2(L)} \right) \sum _i \rho (C_i) \mathbb {V}\textrm{ar}_{\rho _i}(f). \end{aligned}$$

(22)
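Inequality (22) can be sanity-checked numerically on a toy example. The sketch below is not taken from the original text and all of its choices are hypothetical: $\rho $ is replaced by a uniform discrete measure on a grid of $[0, 1]$, the cover is made of two overlapping intervals, $f$ is an arbitrary test function, and the Laplacian diagonal is taken as the sum of off-diagonal weights. NumPy is assumed to be available.

```python
import numpy as np

# Toy setting (all values hypothetical): rho is the uniform measure on a grid of [0, 1],
# covered by the two overlapping intervals C_1 = [0, 0.6] and C_2 = [0.4, 1].
x = np.linspace(0.0, 1.0, 1001)
rho = np.full(x.size, 1.0 / x.size)
cover = [(x <= 0.6), (x >= 0.4)]
f = np.sin(5.0 * x) + x ** 2          # arbitrary test function
N = len(cover)

def variance(values, weights):
    """Variance of `values` under the probability measure weights / weights.sum()."""
    p = weights / weights.sum()
    mean = p @ values
    return float(p @ (values - mean) ** 2)

# Edge weights w_ij = rho(C_i ∩ C_j) and Laplacian (diagonal = sum of off-diagonal weights).
w = np.array([[rho[ci & cj].sum() for cj in cover] for ci in cover])
off = w - np.diag(np.diag(w))
L = np.diag(off.sum(axis=1)) - off
lam2 = np.linalg.eigvalsh(L)[1]

# Left- and right-hand sides of (22).
lhs = variance(f, rho)
rhs = (N + 2 * N ** 2 / lam2) * sum(rho[c].sum() * variance(f[c], rho[c]) for c in cover)
print(f"Var_rho(f) = {lhs:.4f} <= {rhs:.4f}")
```

On such an example the right-hand side is much larger than the left-hand side, which is consistent with the fact that the bound is not claimed to be sharp.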

Now recalling that $f = \tilde{\psi }^* - \psi ^*$, we have by Proposition B.1 for any $i\in \{1, \dots , N\}$ that

$$\begin{aligned} \langle \psi - \tilde{\psi } | (\nabla \psi ^*)_\# \rho _i\rangle + C_{d,R, m_\rho , M_\rho } \mathbb {V}\textrm{ar}_{\rho _i}(\tilde{\psi }^* - \psi ^*) \le \mathcal {K}_{\rho _i}(\tilde{\psi }) - \mathcal {K}_{\rho _i}(\psi ), \end{aligned}$$

where $C_{d,R, m_\rho , M_\rho }= \left( e(d+1)2^{d+1} R^2 \left( \frac{M_\rho }{m_\rho } \right) ^2 \right) ^{-1}$. Weighting this last inequality with $\rho (C_i)$ and summing over $i \in \{1, \dots , N\}$ yields

$$\begin{aligned} \langle \psi - \tilde{\psi } | (\nabla \psi ^*)_\# \rho \rangle + \frac{C_{d,R, m_\rho , M_\rho }}{N} \sum _{i=1}^N \rho (C_i) \mathbb {V}\textrm{ar}_{\rho _i}(\tilde{\psi }^* - \psi ^*) \le \mathcal {K}_{\rho }(\tilde{\psi }) - \mathcal {K}_{\rho }(\psi ). \end{aligned}$$

Using (22) finally gives

$$\begin{aligned} \langle \psi - \tilde{\psi } | (\nabla \psi ^*)_\# \rho \rangle + c_{\rho } \mathbb {V}\textrm{ar}_{\rho }(\tilde{\psi }^* - \psi ^*) \le \mathcal {K}_{\rho }(\tilde{\psi }) - \mathcal {K}_{\rho }(\psi ), \end{aligned}$$

where $c_\rho = \left( e(d+1)2^{d+1} R^2 \left( \frac{M_\rho }{m_\rho } \right) ^2 \left( N^2 + \frac{ 2 N^3}{\lambda _2(L)} \right) \right) ^{-1}$.
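The way the final constant is assembled can be sketched in a few lines: the constant of Proposition B.1 is divided by $N$ (from the weighted sum) and by the factor $N + 2N^2/\lambda _2(L)$ coming from (22). The function names and numerical values below are hypothetical and serve only as an illustration.

```python
import math

def prop_b1_constant(d, R, M_rho, m_rho):
    """C_{d,R,m_rho,M_rho} = (e (d+1) 2^(d+1) R^2 (M_rho/m_rho)^2)^(-1)."""
    return 1.0 / (math.e * (d + 1) * 2 ** (d + 1) * R ** 2 * (M_rho / m_rho) ** 2)

def c_rho(d, R, M_rho, m_rho, N, lam2):
    """c_rho = C_{d,R,m_rho,M_rho} / (N * (N + 2 N^2 / lam2))."""
    return prop_b1_constant(d, R, M_rho, m_rho) / (N * (N + 2 * N ** 2 / lam2))

# Hypothetical values for illustration.
d, R, M_rho, m_rho, N, lam2 = 2, 1.0, 2.0, 1.0, 3, 0.1
closed_form = 1.0 / (math.e * (d + 1) * 2 ** (d + 1) * R ** 2 * (M_rho / m_rho) ** 2
                     * (N ** 2 + 2 * N ** 3 / lam2))
print(c_rho(d, R, M_rho, m_rho, N, lam2), closed_form)
```

Both printed values agree up to rounding, matching the closed-form expression of $c_\rho $ stated above.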

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Carlier, G., Delalande, A. & Mérigot, Q. Quantitative stability of barycenters in the Wasserstein space. Probab. Theory Relat. Fields 188, 1257–1286 (2024). https://doi.org/10.1007/s00440-023-01241-5

Received: 03 March 2023
Revised: 27 September 2023
Accepted: 04 October 2023
Published: 25 October 2023
Issue Date: April 2024
DOI: https://doi.org/10.1007/s00440-023-01241-5
