Multiplicative Schrödinger problem and the Dirichlet transport


We consider an optimal transport problem on the unit simplex whose solutions are given by gradients of exponentially concave functions and prove two main results. First, we show that the optimal transport is the large deviation limit of a particle system of Dirichlet processes transporting one probability measure on the unit simplex to another by coordinatewise multiplication and normalization. The structure of our Lagrangian and the appearance of the Dirichlet process relate our problem closely to the entropic measure on the Wasserstein space as defined by von Renesse and Sturm in the context of Wasserstein diffusion. The limiting procedure is a triangular limit where we allow simultaneously the number of particles to grow to infinity while the ‘noise’ tends to zero. The method, which generalizes easily to many other cost functions, including the squared Euclidean distance, provides a novel combination of the Schrödinger problem approach due to C. Léonard and the related Brownian particle systems by Adams et al. which does not require Gamma-convergence. Second, we analyze the behavior of entropy along the paths of transport. The reference measure on the simplex is taken to be the Dirichlet measure with all zero parameters which relates to the finite-dimensional distributions of the entropic measure. The interpolating curves are not the usual McCann lines. Nevertheless we show that entropy plus a multiple of the transport cost remains convex, which is reminiscent of the semiconvexity of entropy along lines of McCann interpolations in negative curvature spaces. We also obtain, under suitable conditions, dimension-free bounds of the optimal transport cost in terms of entropy.


  1. Although we use the same notation \(\varDelta _n\), it is helpful to regard \(\varphi \) as a function on the primal simplex. See Remark 1 and compare with (15).

  2. As suggested by an anonymous referee, it would be nice to obtain sufficient conditions directly in terms of the distributions \(P_0\) and \(P_1\). This is an interesting problem (possibly related to analysis of the corresponding Monge–Ampère equation studied in Sect. 4.2) on its own and is left for future research. On the other hand, once the function \(\varphi \) is fixed, the transport map T is optimal for any \(P_0\) if we set \(P_1 = T_{\#} P_0\).

  3. We thank an anonymous referee for pointing this out.


  1. Adams, S., Dirr, N., Peletier, M.A., Zimmer, J.: From a large-deviations principle to the Wasserstein gradient flow: a new micro–macro passage. Commun. Math. Phys. 307(3), 791 (2011)

  2. Amari, S.: Information Geometry and Its Applications. Springer, Berlin (2016)

  3. Ambrosio, L., Gigli, N., Savaré, G.: Gradient Flows: In Metric Spaces and in the Space of Probability Measures. Springer, Berlin (2008)

  4. Brenier, Y.: Polar factorization and monotone rearrangement of vector-valued functions. Commun. Pure Appl. Math. 44(4), 375–417 (1991)

  5. Chang, J.T., Pollard, D.: Conditioning as disintegration. Stat. Neerl. 51(3), 287–317 (1997)

  6. Conforti, G.: A second order equation for Schrödinger bridges with applications to the hot gas experiment and entropic transportation cost. Probab. Theory Relat. Fields 174(1–2), 1–47 (2019)

  7. Cordero-Erausquin, D., McCann, R.J., Schmuckenschläger, M.: Prékopa–Leindler type inequalities on Riemannian manifolds, Jacobi fields, and optimal transport. Annales de la faculté des sciences de Toulouse 15(4), 613–635 (2006)

  8. Cover, T.M., Thomas, J.A.: Elements of Information Theory. Wiley, New York (2006)

  9. Ding, J., Zhou, A.: Eigenvalues of rank-one updated matrices with some applications. Appl. Math. Lett. 20(12), 1223–1226 (2007)

  10. Duong, M.H., Laschos, V., Renger, M.: Wasserstein gradient flows from large deviations of many-particle limits. ESAIM Control Optim. Calc. Var. 19(4), 1166–1188 (2013). Erratum at

  11. Egozcue, J.J., Pawlowsky-Glahn, V.: Simplicial Geometry for Compositional Data, vol. 264, no. 1, pp. 145–159. Geological Society, Special Publications, London (2006)

  12. Émery, M., Yor, M.: A parallel between Brownian bridges and gamma bridges. Publ. Res. Inst. Math. Sci. 40(3), 669–688 (2004)

  13. Erbar, M., Kuwada, K., Sturm, K.T.: On the equivalence of the entropic curvature-dimension condition and Bochner’s inequality on metric measure spaces. Invent. Math. 201(3), 993–1071 (2015)

  14. Erbar, M., Maas, J., Renger, D.R.M.: From large deviations to Wasserstein gradient flows in multiple dimensions. Electron. Commun. Probab. 20(89), 1–12 (2015)

  15. Feng, S.: Large deviations for Dirichlet processes and Poisson–Dirichlet distribution with two parameters. Electron. J. Probab. 12, 787–807 (2007)

  16. Fernholz, E.R.: Stochastic Portfolio Theory. Applications of Mathematics. Springer, Berlin (2002)

  17. Fournier, N., Guillin, A.: On the rate of convergence in Wasserstein distance of the empirical measure. Probab. Theory Relat. Fields 162(3–4), 707–738 (2015)

  18. Gangbo, W., McCann, R.J.: The geometry of optimal transportation. Acta Math. 177(2), 113–161 (1996)

  19. Horn, R., Johnson, C.: Matrix Analysis. Cambridge University Press, Cambridge (1990)

  20. Jordan, R., Kinderlehrer, D., Otto, F.: The variational formulation of the Fokker–Planck equation. SIAM J. Math. Anal. 29(1), 1–17 (1998)

  21. Khan, G., Zhang, J.: The Kähler geometry of certain optimal transport problems. Pure Appl. Anal. 2(2), 397–426 (2020)

  22. Léonard, C.: From the Schrödinger problem to the Monge–Kantorovich problem. J. Funct. Anal. 262(4), 1879–1920 (2012)

  23. Léonard, C.: A survey of the Schrödinger problem and some of its connections with optimal transport. Discrete Contin. Dyn. Syst. 34(4), 1533–1574 (2014)

  24. Lynch, J., Sethuraman, J.: Large deviations for processes with independent increments. Ann. Probab. 15(2), 610–627 (1987)

  25. McCann, R.J.: A convexity principle for interacting gases. Adv. Math. 128(1), 153–179 (1997)

  26. Mikami, T.: Monge’s problem with a quadratic cost by the zero-noise limit of \(h\)-path processes. Probab. Theory Relat. Fields 129(2), 245–260 (2004)

  27. Otto, F.: The geometry of dissipative evolution equations: the porous medium equation. Commun. Partial Differ. Equ. 26, 101–174 (2001)

  28. Pal, S.: Embedding optimal transports in statistical manifolds. Indian J. Pure Appl. Math. 48(4), 541–550 (2017)

  29. Pal, S.: Exponentially concave functions and high dimensional stochastic portfolio theory. Stoch. Process. Their Appl. 129(9), 3116–3128 (2019)

  30. Pal, S.: On the difference between entropic cost and the optimal transport cost. arXiv preprint arXiv:1905.12206 (2019)

  31. Pal, S., Wong, T.K.L.: The geometry of relative arbitrage. Math. Financ. Econ. 10, 263–293 (2016)

  32. Pal, S., Wong, T.K.L.: Exponentially concave functions and a new information geometry. Ann. Probab. 46(2), 1070–1113 (2018)

  33. Rockafellar, R.T.: Convex Analysis. Princeton Landmarks in Mathematics. Princeton University Press, Princeton (1997)

  34. Santambrogio, F.: Optimal Transport for Applied Mathematicians. Springer, Berlin (2015)

  35. Villani, C.: Topics in Optimal Transportation. Graduate Studies in Mathematics. American Mathematical Society, Providence (2003)

  36. Villani, C.: Optimal Transport: Old and New. Springer, Berlin (2008)

  37. von Renesse, M.K., Sturm, K.T.: Entropic measure and Wasserstein diffusion. Ann. Probab. 37(3), 1114–1191 (2009)

  38. Wong, T.K.L.: Optimization of relative arbitrage. Ann. Finance 11(3–4), 345–382 (2015)

  39. Wong, T.K.L.: Logarithmic divergences from optimal transport and Rényi geometry. Inf. Geom. 1(1), 39–78 (2018)

  40. Wong, T.K.L.: Information geometry in portfolio theory. In: Nielsen, F. (ed.) Geometric Structures of Information, pp. 105–136. Springer, Cham (2019)

  41. Wong, T.K.L., Yang, J.: Optimal transport and information geometry. arXiv preprint arXiv:1906.00030 (2019)


S. P. thanks Martin Huesmann for very useful discussions.

Author information

Corresponding author

Correspondence to Ting-Kam Leonard Wong.

Additional information

S. Pal’s research is supported by NSF Grant DMS-1612483. T.-K. L. Wong’s research is supported by NSERC Grant RGPIN-2019-04419.



Proof of Lemma 3

Let \(\theta _i = - \log p_i\) and \(\phi _i = - \log q_i\) for \(1 \le i \le n\). Then the cost function (2) takes the form

$$\begin{aligned} c(p, q) = \log \left( \frac{1}{n}\sum _{i = 1}^n e^{\theta _i - \phi _i}\right) - \frac{1}{n} \sum _{i = 1}^n (\theta _i - \phi _i). \end{aligned}$$

By the Cauchy–Schwarz inequality, we have

$$\begin{aligned} \begin{aligned} c(p, q)&\le \frac{1}{2} \left[ \log \left( \frac{1}{n}\sum _{i = 1}^n e^{2\theta _i}\right) - \frac{1}{n} \sum _{i = 1}^n (2 \theta _i) \right] \\&\quad + \frac{1}{2} \left[ \log \left( \frac{1}{n}\sum _{i = 1}^n e^{-2\phi _i}\right) - \frac{1}{n} \sum _{i = 1}^n (-2 \phi _i) \right] . \end{aligned} \end{aligned}$$


Since

$$\begin{aligned} \log \left( \frac{1}{n} \sum _{i = 1}^n e^{2\theta _i}\right) \le 2 \max _{1 \le i \le n} |\theta _i| \le 2 \sum _{i = 1}^n |\theta _i|, \end{aligned}$$

we have the estimate

$$\begin{aligned} c(p, q) \le \left( 1 + \frac{1}{n}\right) \sum _{i = 1}^n \left( |\theta _i| + |\phi _i|\right) . \end{aligned}$$

Integrating against any coupling \(R \in \varPi (P, Q)\) and replacing the constant (which is irrelevant) by 2 shows that the transport cost is finite whenever \(P, Q \in {\mathcal {L}}\). \(\square \)
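The pointwise estimate in this proof can be sanity-checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names, the dimension, and the Dirichlet sampling scheme are illustrative choices and not part of the paper. The cost is evaluated from the displayed formula in terms of \(\theta \) and \(\phi \), using \(e^{\theta _i - \phi _i} = q_i / p_i\).

```python
import numpy as np

def cost(p, q):
    """c(p, q) = log((1/n) sum_i q_i/p_i) - (1/n) sum_i log(q_i/p_i)."""
    ratio = q / p  # equals exp(theta_i - phi_i) with theta = -log p, phi = -log q
    return np.log(np.mean(ratio)) - np.mean(np.log(ratio))

def lemma3_bound(p, q):
    """(1 + 1/n) * sum_i (|theta_i| + |phi_i|)."""
    n = len(p)
    return (1 + 1 / n) * (np.abs(np.log(p)).sum() + np.abs(np.log(q)).sum())

# Hypothetical test data: random points of the open simplex (not from the paper).
rng = np.random.default_rng(0)
n = 5
pairs = [(rng.dirichlet(np.ones(n)), rng.dirichlet(np.ones(n))) for _ in range(1000)]

costs = np.array([cost(p, q) for p, q in pairs])
bounds = np.array([lemma3_bound(p, q) for p, q in pairs])
worst_gap = float(np.min(bounds - costs))  # should be non-negative
```

A check of this kind only probes the inequality on sampled points; it does not replace the argument via the Cauchy–Schwarz inequality.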

Proof of Theorem 1

Since \(P, Q \in {\mathcal {L}}\), by Proposition 3 we have \({\mathbf {C}}(P, Q) < \infty \). Since the cost function is continuous and bounded below, by general results of optimal transport (see for example [35, 36]), there exists an optimal coupling \(R^* \in \varPi (P, Q)\) solving the transport problem, and its support is c-cyclically monotone.

Let \(m \ge 1\) and let \(\{(p(s), q(s))\}_{s = 0}^{m - 1}\) be a sequence in the support of \(R^*\). By the c-cyclical monotonicity of \(R^*\), we have

$$\begin{aligned} \sum _{s = 0}^{m-1} \log \left( \frac{1}{n} \sum _{i = 1}^n \frac{q_i(s)}{p_i(s)} \right) \le \sum _{s = 0}^{m-1} \log \left( \frac{1}{n} \sum _{i = 1}^n \frac{q_i(s)}{p_i(s + 1)} \right) , \end{aligned}$$

where by convention \((p(m), q(m)) := (p(0), q(0))\). For each s let \(\pi (s) = q(s) \odot p(s)^{-1}\) and \(r(s) = p(s)^{-1}\). Rearranging, we have

$$\begin{aligned} \begin{aligned}&\sum _{s = 0}^{m-1} \log \left( \sum _{i = 1}^n \frac{q_i(s)/p_i(s)}{\sum _{k = 1}^n q_k(s)/p_k(s)} \frac{p_i(s)}{p_i(s + 1)} \right) \\&\quad = \sum _{s = 0}^{m-1} \log \left( \sum _{i = 1}^n \pi _i(s) \frac{r_i(s + 1)}{r_i(s)} \right) \ge 0. \end{aligned} \end{aligned}$$

Thus the (multi-valued) portfolio map

$$\begin{aligned} r \mapsto \{\pi = q \odot p^{-1} : p = r^{-1}, (p, q) \in \mathrm {supp}(R^*)\} \end{aligned}$$

induced by the optimal coupling is multiplicatively cyclical monotone in the sense of (14). (In [31, Proposition 12] we performed this argument using another coordinate system.)
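The rearrangement step uses only that the normalizing constants cancel around the cycle, so it holds as an identity for any cyclic sequence, not just for sequences in the support of \(R^*\). Below is a minimal numerical sketch of that identity. It assumes NumPy and assumes that \(\odot \) and \(p^{-1}\) denote coordinatewise multiplication and inversion followed by normalization to the simplex; the random sequences are hypothetical test data.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 6
p = rng.dirichlet(np.ones(n), size=m)  # rows are p(0), ..., p(m-1)
q = rng.dirichlet(np.ones(n), size=m)  # rows are q(0), ..., q(m-1)
p_next = np.roll(p, -1, axis=0)        # row s is p(s+1), with p(m) := p(0)

# Right-hand side minus left-hand side of the cyclical monotonicity inequality.
gap = float((np.log((q / p_next).mean(axis=1))
             - np.log((q / p).mean(axis=1))).sum())

# Portfolio coordinates (assumed normalized): pi(s) = q(s) ⊙ p(s)^{-1}, r(s) = p(s)^{-1}.
pi = (q / p) / (q / p).sum(axis=1, keepdims=True)
r = (1.0 / p) / (1.0 / p).sum(axis=1, keepdims=True)
r_next = np.roll(r, -1, axis=0)

# Sum over s of log(sum_i pi_i(s) r_i(s+1) / r_i(s)).
portfolio_sum = float(np.log((pi * r_next / r).sum(axis=1)).sum())
```

For random sequences the common value `gap` may have either sign; the inequality \(\ge 0\) is what c-cyclical monotonicity of the support adds.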

By [31, Proposition 4, Proposition 6], there exists an exponentially concave function \(\varphi \) on \(\varDelta _n\) such that if \(\varvec{\pi }\) is the portfolio map generated by \(\varphi \), \((p, q)\) is any pair in the support of \(R^*\) and \(\varphi \) is differentiable at \(r = p^{-1}\), then

$$\begin{aligned} \pi = q \odot p^{-1} = \varvec{\pi }(r). \end{aligned} \tag{78}$$

Rearranging, we have \(q = p \odot \varvec{\pi }(p^{-1})\) which is the image of p under the mapping (15). Since \(P \in {\mathcal {L}}_a\) is absolutely continuous and \(\varphi \) is differentiable almost everywhere, for P-a.e. values of p there is a unique element \(q \in \varDelta _n\) such that \((p, q) \in \mathrm {supp}(R^*)\) and (78) holds. This proves both (i) and (ii). \(\square \)

Proof of Proposition 2

First we show that \(P_t \in {\mathcal {L}}\) for all t. By Remark 3, for each p the trace of \(\{T_t(p)\}_{0 \le t \le 1}\) is a straight line in \(\varDelta _n\). It follows that for each i we have

$$\begin{aligned} \begin{aligned} |\log (T_t(p))_i |&\le \max \{ |\log p_i|, | \log (T_1(p))_i|\} \le |\log p_i| + | \log (T_1(p))_i|. \end{aligned} \end{aligned}$$

Since both \(P_0, P_1 \in {\mathcal {L}}\) by assumption, we have \(P_t \in {\mathcal {L}}\) as well.

Next we prove that \(P_t\) is absolutely continuous. For vectors a and b we let \(\frac{a}{b} = (\frac{a_i}{b_i})\) be the vector of component-wise ratios, and we use \(a \cdot b\) and \(\langle a, b \rangle \) interchangeably to denote the Euclidean dot product.

Let \(0< t < 1\) be given. Let \(\mathbf{w}_t(r) = \frac{\varvec{\pi }_t(r)}{r}\) be the vector of unnormalized weight ratios. Recall that \(q = T_t(p) = p \odot \varvec{\pi }_t(p^{-1}) = r^{-1} \odot \varvec{\pi }_t(r)\) and similarly for \(q'\). Then, by (16), we have

$$\begin{aligned} q = T_t(p) = \left( \frac{(\mathbf{w}_t(r))_i}{\sum _{j = 1}^n (\mathbf{w}_t(r))_j}\right) _{1 \le i \le n}. \end{aligned}$$

Thus, if we can prove that the distribution \({\tilde{P}}_t\) of \(\mathbf{w}_t(r)\) (where \(r = p^{-1}\) and \(p \sim P_0\)) is absolutely continuous, then \(P_t\) is absolutely continuous and we are done.

To this end, consider the quantity

$$\begin{aligned} \begin{aligned}&\left\langle \frac{\varvec{\pi }_t(r')}{r'} - \frac{\varvec{\pi }_t(r)}{r}, r' - r \right\rangle \\&\quad = (1 - t) \left\langle \frac{{\overline{e}}}{r'} - \frac{{\overline{e}}}{r}, r' - r \right\rangle + t \left\langle \frac{\varvec{\pi }_1(r')}{r'} - \frac{\varvec{\pi }_1(r)}{r}, r' - r \right\rangle \\&\quad = (1 - t) \left( 2 - {\overline{e}} \cdot \frac{r}{r'} - {\overline{e}} \cdot \frac{r'}{r} \right) + t \left( 2 - \varvec{\pi }_1(r') \cdot \frac{r}{r'} - \varvec{\pi }_1(r) \cdot \frac{r'}{r} \right) \\&\quad \le - (1 - t) \log \left( \left( {\overline{e}} \cdot \frac{r}{r'} \right) \left( {\overline{e}} \cdot \frac{r'}{r} \right) \right) - t \log \left( \left( \varvec{\pi }_1(r') \cdot \frac{r}{r'} \right) \left( \varvec{\pi }_1(r) \cdot \frac{r'}{r} \right) \right) . \end{aligned} \end{aligned} \tag{79}$$

In the last line we used the estimate \(\log (1 + x) \le x\).

By the multiplicative cyclical monotonicity of the portfolio maps (see (14)), we have

$$\begin{aligned} \left( {\overline{e}} \cdot \frac{r}{r'} \right) \left( {\overline{e}} \cdot \frac{r'}{r} \right) \ge 1, \quad \left( \varvec{\pi }_1(r') \cdot \frac{r}{r'} \right) \left( \varvec{\pi }_1(r) \cdot \frac{r'}{r} \right) \ge 1 \end{aligned}$$

for all \(r, r' \in \varDelta _n\). It follows from (79) and the Cauchy–Schwarz inequality that

$$\begin{aligned} \left\| \frac{\varvec{\pi }_t(r')}{r'} - \frac{\varvec{\pi }_t(r)}{r} \right\| \ge (1 - t) \frac{\log \left( \left( {\overline{e}} \cdot \frac{r}{r'} \right) \left( {\overline{e}} \cdot \frac{r'}{r} \right) \right) }{\Vert r' - r\Vert }, \quad r \ne r'. \end{aligned} \tag{80}$$

By (21), the right hand side of (80) equals

$$\begin{aligned} (1 - t) \frac{c(r, r') + c(r', r)}{\Vert r - r'\Vert }, \end{aligned} \tag{81}$$

which is positive for \(r \ne r'\). By the Taylor approximation (24), \(c(r, r') + c(r', r)\) is of order \(\Vert r - r'\Vert ^2\) when \(r \approx r'\), so (81) is of order \((1 - t) \Vert r - r'\Vert \) when \(r \approx r'\).
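Both facts used here, the identity \(c(r, r') + c(r', r) = \log \left( ({\overline{e}} \cdot \frac{r}{r'}) ({\overline{e}} \cdot \frac{r'}{r}) \right) \) and the quadratic order of the symmetrized cost, can be illustrated numerically. The sketch below assumes NumPy and assumes that \({\overline{e}} = (1/n, \ldots , 1/n)\) is the barycenter, so that \({\overline{e}} \cdot x\) is the arithmetic mean of \(x\); the base point and direction are hypothetical values chosen for illustration.

```python
import numpy as np

def cost(p, q):
    """c(p, q) = log((1/n) sum_i q_i/p_i) - (1/n) sum_i log(q_i/p_i)."""
    ratio = q / p
    return np.log(np.mean(ratio)) - np.mean(np.log(ratio))

# Hypothetical base point and tangent direction (sums to zero, unit norm).
r = np.array([0.1, 0.2, 0.3, 0.4])
direction = np.array([0.5, -0.5, 0.5, -0.5])

steps = [1e-2, 5e-3, 2.5e-3]  # values of ||r' - r||
sym_costs, log_products = [], []
for eps in steps:
    r_prime = r + eps * direction
    sym_costs.append(cost(r, r_prime) + cost(r_prime, r))
    # With e_bar the barycenter (assumption), e_bar . x is the mean of x.
    log_products.append(np.log(np.mean(r / r_prime) * np.mean(r_prime / r)))

# If the symmetrized cost is of order ||r' - r||^2, these ratios stay nearly constant.
scaled = [s / eps**2 for s, eps in zip(sym_costs, steps)]
```

The near-constant values in `scaled` are consistent with the second-order behavior invoked from (24), although a finite set of step sizes is of course only suggestive.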

From (80) and the previous observation, the mapping \(p \mapsto r = p^{-1} \mapsto \mathbf{w}_t(r)\) is one-to-one and its inverse is locally Lipschitz. Since \(P_0\) is absolutely continuous by assumption, we have that \({\tilde{P}}_t\), and hence \(P_t\), is absolutely continuous.

To prove the second claim, let \(\varvec{\pi }_t\) be the portfolio map at time t. By Lemma 2, we have

$$\begin{aligned} \begin{aligned} {\mathbf {C}}(P_0, P_t)&= {\mathbb {E}}_{p \sim P_0} \left[ H\left( {\overline{e}} \mid \varvec{\pi }_t(p^{-1})\right) \right] \\&= {\mathbb {E}}_{p \sim P_0} \left[ H\left( {\overline{e}} \mid (1 - t) {\overline{e}} + t \varvec{\pi }_1(p^{-1})\right) \right] . \end{aligned} \end{aligned}$$

By properties of the relative entropy (see for example [8, Theorem 2.7.2]) the quantity \(H\left( {\overline{e}} \mid (1 - t) {\overline{e}} + t \varvec{\pi }_1(p^{-1})\right) \) is smooth and convex in t, and is increasing and strictly convex whenever \(\varvec{\pi }_1(p^{-1}) \ne {\overline{e}}\). Since \(P_0 \ne P_1\) by assumption, the last condition holds on a set of positive probability under \(P_0\). This completes the proof of the proposition. \(\square \)
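The monotonicity and convexity in \(t\) can be seen on a small example. This is a minimal sketch, assuming NumPy, assuming \(H(a \mid b) = \sum _i a_i \log (a_i / b_i)\) is the usual relative entropy, and assuming \({\overline{e}}\) is the barycenter of the simplex; the vector `pi_1` is a hypothetical portfolio weight, not one taken from the paper.

```python
import numpy as np

def rel_entropy(a, b):
    """Relative entropy H(a | b) = sum_i a_i log(a_i / b_i) (assumed convention)."""
    return float(np.sum(a * np.log(a / b)))

n = 4
e_bar = np.full(n, 1.0 / n)            # barycenter (assumption)
pi_1 = np.array([0.1, 0.2, 0.3, 0.4])  # hypothetical value of pi_1(p^{-1}), not equal to e_bar

ts = np.linspace(0.0, 1.0, 101)
h = np.array([rel_entropy(e_bar, (1 - t) * e_bar + t * pi_1) for t in ts])

first_diff = np.diff(h)        # positive if h is increasing on the grid
second_diff = np.diff(h, n=2)  # positive if h is strictly convex on the grid
```

Averaging such curves over \(p \sim P_0\) preserves both properties, which is the content of the second claim.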

Proof of Lemma 7

Recall that

$$\begin{aligned} \mathbf{D}[q' : q] = \log (1 + \nabla \varphi (q) \cdot (q' - q) ) - (\varphi (q') - \varphi (q)). \end{aligned} \tag{82}$$

Since \(\log (1 + x) \le x\), we have the upper bound

$$\begin{aligned} \mathbf{D}[q' : q] \le \nabla \varphi (q) \cdot (q' - q) - (\varphi (q') - \varphi (q)) \end{aligned}$$

which is the Bregman divergence of \(\varphi \) (see [2, Chapter 1]). Let \(q \ne q'\). Applying Taylor’s theorem along the line segment \([q, q']\) from q to \(q'\), we have

$$\begin{aligned} \mathbf{D}[q' : q] \le \Vert q' - q \Vert ^2 (v^{\top } (-\nabla ^2 \varphi (q'')) v) \end{aligned}$$

for some \(q''\) on \([q, q']\) and \(v = \frac{q' - q}{\Vert q' - q \Vert }\). From the hypotheses we have

$$\begin{aligned} (v^{\top } (-\nabla ^2 \varphi (q'')) v) \le C_1, \end{aligned}$$

so the upper bound in (41) holds with \(\alpha ' = C_1\).

To derive a lower bound, let \(\Phi = e^{\varphi }\) and express (82) in the form

$$\begin{aligned} \begin{aligned} \mathbf{D}[q' : q]&= \log \left( \frac{\Phi (q) + \nabla \Phi (q) \cdot (q' - q)}{\Phi (q')}\right) \\&= - \log \left( \frac{\Phi (q) + \nabla \Phi (q) \cdot (q' - q) + \Vert q' - q \Vert ^2 (v^{\top } \nabla ^2 \Phi (q'') v)}{\Phi (q) + \nabla \Phi (q) \cdot (q' - q) } \right) \\&= - \log \left( 1 + \frac{(v^{\top } \nabla ^2 \Phi (q'') v)}{\Phi (q) + \nabla \Phi (q) \cdot (q' - q)} \Vert q' - q\Vert ^2\right) . \end{aligned} \end{aligned}$$

Again \(q''\) is some point on \([q, q']\) and v is as above. Using \(- \log (1 + x) \ge -x\), we have the bound

$$\begin{aligned} \mathbf{D}[q' : q] \ge \frac{C_2}{\Phi (q) + \nabla \Phi (q) \cdot (q' - q)} \Vert q' - q\Vert ^2. \end{aligned} \tag{83}$$

Since \(\Phi \) is non-negative and concave on \(\varDelta _n\), it is bounded above by some \(M > 0\). Since \(\Vert q' - q\Vert \le 1\) for \(q, q' \in \varDelta _n\), we have

$$\begin{aligned} \Phi (q) + \nabla \Phi (q) \cdot (q' - q) \le M + C_3. \end{aligned}$$

Plugging this into (83) gives the lower bound with \(\alpha = \frac{C_2}{M + C_3}\). \(\square \)
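The first step of the proof, bounding the L-divergence by the Bregman-type expression via \(\log (1 + x) \le x\), can be illustrated with a concrete exponentially concave function. The sketch below assumes NumPy and uses the hypothetical choice \(\varphi (q) = \frac{1}{n} \sum _i \log q_i\), for which \(e^{\varphi }\) is the geometric mean and hence concave; this example is chosen for illustration and is not claimed to satisfy the hypotheses of the lemma, which require bounds on the derivatives.

```python
import numpy as np

def phi(q):
    """Hypothetical example: phi(q) = (1/n) sum_i log q_i, so exp(phi) is the geometric mean."""
    return np.mean(np.log(q))

def grad_phi(q):
    """Gradient of phi viewed as a function on the positive orthant."""
    return 1.0 / (len(q) * q)

def l_divergence(q_new, q):
    """D[q' : q] = log(1 + grad phi(q) . (q' - q)) - (phi(q') - phi(q))."""
    return np.log1p(grad_phi(q) @ (q_new - q)) - (phi(q_new) - phi(q))

def bregman_upper(q_new, q):
    """Upper bound obtained from log(1 + x) <= x."""
    return grad_phi(q) @ (q_new - q) - (phi(q_new) - phi(q))

# Hypothetical test data: random pairs in the open simplex.
rng = np.random.default_rng(3)
samples = [(rng.dirichlet(np.ones(4)), rng.dirichlet(np.ones(4))) for _ in range(1000)]
d_vals = np.array([l_divergence(a, b) for a, b in samples])
b_vals = np.array([bregman_upper(a, b) for a, b in samples])
```

For this particular \(\varphi \), the quantity \(\mathbf{D}[q' : q]\) reduces to \(\log \left( \frac{1}{n} \sum _i q'_i / q_i\right) - \frac{1}{n} \sum _i \log (q'_i / q_i)\), which has the same form as the cost function used in the proof of Lemma 3.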

About this article

Cite this article

Pal, S., Wong, TK.L. Multiplicative Schrödinger problem and the Dirichlet transport. Probab. Theory Relat. Fields 178, 613–654 (2020).


  • Optimal transport
  • Exponentially concave function
  • Displacement interpolation
  • Schrödinger problem
  • Entropic measure
  • L-divergence
  • Large deviations
  • Dirichlet process

Mathematics Subject Classification

  • 60J75
  • 60G57
  • 60F10