Traversing the Schrödinger Bridge Strait: Robert Fortet’s Marvelous Proof Redux

Essid, Montacer; Pavon, Michele

doi:10.1007/s10957-018-1436-9

Published: 14 November 2018

Volume 181, pages 23–60 (2019)

Journal of Optimization Theory and Applications

Abstract

In the early 1930s, Erwin Schrödinger, motivated by his quest for a more classical formulation of quantum mechanics, posed a large deviation problem for a cloud of independent Brownian particles. He showed that the solution to the problem could be obtained through a system of two linear equations with nonlinear coupling at the boundary (Schrödinger system). Existence and uniqueness for such a system, which represents a sort of bottleneck for the problem, was first established by Fortet in 1938/1940 under rather general assumptions by proving convergence of an ingenious but complex approximation method. It is the first proof of what are nowadays called Sinkhorn-type algorithms in the much more challenging continuous case. Schrödinger bridges are also an early example of the maximum entropy approach and have been more recently recognized as a regularization of the important optimal mass transport problem. Unfortunately, Fortet’s contribution is by and large ignored in contemporary literature. This is likely due to the complexity of his approach coupled with an idiosyncratic exposition style and due to missing details and steps in the proofs. Nevertheless, Fortet’s approach maintains its importance to this day as it provides the only existing algorithmic proof, in the continuous setting, under rather mild assumptions. It can be adapted, in principle, to other relevant optimal transport problems. It is the purpose of this paper to remedy this situation by rewriting the bulk of his paper with all the missing passages and in a transparent fashion so as to make it fully available to the scientific community. We consider the problem in ${\mathbb {R}}^d$ rather than in ${\mathbb {R}}$ and use as much as possible his notation to facilitate comparison.
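
For readers more familiar with the discrete setting, the following is a minimal Python sketch of a Sinkhorn-type iteration for a finite analogue of the Schrödinger system. It is not Fortet's continuous scheme. The kernel `K`, the marginals `p` and `q`, the points `pts`, and the function name `sinkhorn` are illustrative assumptions that do not appear in the paper.

```python
import math

def sinkhorn(K, p, q, iters=200):
    """Alternately rescale phi and psi so that pi[i][j] = phi[i]*K[i][j]*psi[j]
    has row sums p and column sums q. K is assumed to have positive entries."""
    n, m = len(p), len(q)
    phi = [1.0] * n
    psi = [1.0] * m
    for _ in range(iters):
        # Enforce the column marginal: psi_j * sum_i K_ij phi_i = q_j
        psi = [q[j] / sum(K[i][j] * phi[i] for i in range(n)) for j in range(m)]
        # Enforce the row marginal: phi_i * sum_j K_ij psi_j = p_i
        phi = [p[i] / sum(K[i][j] * psi[j] for j in range(m)) for i in range(n)]
    return phi, psi

# Hypothetical data: three points on a line with Gaussian weights.
pts = [0.0, 0.5, 1.0]
K = [[math.exp(-(x - y) ** 2) for y in pts] for x in pts]
p = [0.2, 0.5, 0.3]  # prescribed initial marginal (assumed)
q = [0.4, 0.4, 0.2]  # prescribed final marginal (assumed)

phi, psi = sinkhorn(K, p, q)
pi = [[phi[i] * K[i][j] * psi[j] for j in range(3)] for i in range(3)]
row_sums = [sum(row) for row in pi]
col_sums = [sum(pi[i][j] for i in range(3)) for j in range(3)]
```

In this finite sketch the two scaling vectors play the role of the two unknown functions of the Schrödinger system, and the coupling at the boundary is the pair of marginal constraints checked by `row_sums` and `col_sums`.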

Notes

Let $\mathcal{V}$ be a metric space and $\mathcal{D}(\mathcal{V})$ be the set of probability measures defined on $\mathcal{B}(\mathcal{V})$, the Borel $\sigma$-field of $\mathcal{V}$. We say that a sequence $\{P_N\}$ of elements of $\mathcal{D}(\mathcal{V})$ converges weakly to $P\in \mathcal{D}(\mathcal{V})$, and write $P_N\Rightarrow P$, if $\int_{\mathcal{V}} f\,\mathrm{d}P_N\rightarrow \int_{\mathcal{V}} f\,\mathrm{d}P$ for every bounded, continuous function $f$ on $\mathcal{V}$.
The initial marginal of the prior measure, as long as $\rho _0(x){\mathrm{d}}x$ is at finite relative entropy from it, does not play any role in the optimization problem. Instead of $\rho _0(x){\mathrm{d}}x$, which is the standard case in control problems, another popular choice is Lebesgue measure so that the prior is an unbounded measure called stationary Wiener measure, see, e.g., [12].
Probability densities on ${\mathbb {R}}^n\times {\mathbb {R}}^n$ with marginals $\rho _0$ and $\rho _1$.
Remarkable analogies to quantum mechanics, which appear to me very worthy of reflection.
In this paper, the maximum or minimum of two functions will always be taken pointwise.
In Fortet’s paper, $H'_{|n}$ is denoted $H'_n$ [21, p. 88]. Unfortunately, the same notation is later used for another quantity [21, p. 90].
Fortet seems to imply by this proposition that $H$ and $H'$ cannot vanish at a point without vanishing everywhere. Although this is true for $H'$, see Proposition 5.8 below, it does not imply the same property for $H$.
The statement can be found on [21, p. 92]. The proof provided there, however, appears to be incorrect: it does not make use of hypothesis ($\star $), and it confuses $H_n'$ of the iteration (AS) with $H'_{|n}$ (also denoted by $H'_n$ by Fortet) defined in (18).

References

1. Jaynes, E.T.: Information theory and statistical mechanics. Phys. Rev. 106(4), 620 (1957)
2. Jaynes, E.T.: On the rationale of maximum-entropy methods. Proc. IEEE 70(9), 939–952 (1982)
3. Burg, J.P.: Maximum entropy spectral analysis. In: 37th Annual International Meeting, Society of Exploration Geophysicists, Oklahoma City, Okla., 31 Oct 1967 (1967)
4. Burg, J.P., Luenberger, D.G., Wenger, D.L.: Estimation of structured covariance matrices. Proc. IEEE 70(9), 963–974 (1982)
5. Dempster, A.P.: Covariance selection. Biometrics 28, 157–175 (1972)
6. Csiszár, I.: I-divergence geometry of probability distributions and minimization problems. Ann. Probab. 3, 146–158 (1975)
7. Csiszár, I.: Sanov property, generalized I-projection and a conditional limit theorem. Ann. Probab. 12, 768–793 (1984)
8. Csiszár, I., et al.: Why least squares and maximum entropy? An axiomatic approach to inference for linear inverse problems. Ann. Stat. 19(4), 2032–2066 (1991)
9. Mikami, T.: Monge’s problem with a quadratic cost by the zero-noise limit of h-path processes. Probab. Theory Relat. Fields 129(2), 245–260 (2004)
10. Mikami, T., Thieullen, M.: Duality theorem for the stochastic optimal control problem. Stoch. Process. Appl. 116(12), 1815–1835 (2006). https://doi.org/10.1016/j.spa.2006.04.014
11. Mikami, T., Thieullen, M.: Optimal transportation problem by stochastic optimal control. SIAM J. Control Optim. 47(3), 1127–1139 (2008)
12. Léonard, C.: A survey of the Schrödinger problem and some of its connections with optimal transport. Discrete Contin. Dyn. Syst. A 34(4), 1533–1574 (2014)
13. Léonard, C.: From the Schrödinger Problem to the Monge–Kantorovich Problem. arXiv preprint arXiv:1011.2564 (2010)
14. Chen, Y., Georgiou, T.T., Pavon, M.: On the relation between optimal transport and Schrödinger bridges: a stochastic control viewpoint. J. Optim. Theory Appl. 169(2), 671–691 (2016)
15. Peyré, G., Cuturi, M.: Computational Optimal Transport. arXiv preprint arXiv:1803.00567 (2018)
16. Cuturi, M.: Sinkhorn distances: lightspeed computation of optimal transport. In: Advances in Neural Information Processing Systems, pp. 2292–2300 (2013)
17. Benamou, J.D., Carlier, G., Cuturi, M., Nenna, L., Peyré, G.: Iterative Bregman projections for regularized transportation problems. SIAM J. Sci. Comput. 37(2), A1111–A1138 (2015)
18. Chen, Y., Georgiou, T.T., Pavon, M.: Optimal transport over a linear dynamical system. IEEE Trans. Autom. Control 62(5), 2137–2152 (2017)
19. Chen, Y., Georgiou, T., Pavon, M.: Entropic and displacement interpolation: a computational approach using the Hilbert metric. SIAM J. Appl. Math. 76(6), 2375–2396 (2016)
20. Fortet, R.: Résolution d’un système d’équations de M. Schrödinger. Comptes Rendus 206, 721–723 (1938)
21. Fortet, R.: Résolution d’un système d’équations de M. Schrödinger. J. Math. Pure Appl. IX, 83–105 (1940)
22. Beurling, A.: An automorphism of product measures. Ann. Math. 72, 189–200 (1960)
23. Jamison, B.: The Markov processes of Schrödinger. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 32(4), 323–331 (1975)
24. Zambrini, J.C.: Variational processes and stochastic versions of mechanics. J. Math. Phys. 27, 2307–2330 (1986)
25. Föllmer, H.: Random fields and diffusion processes. In: Hennequin, P.-L. (ed.) École d’Été de Probabilités de Saint-Flour XV-XVII, 1985–87, pp. 101–203. Springer, Berlin (1988)
26. Deming, W.E., Stephan, F.F.: On a least squares adjustment of a sampled frequency table when the expected marginal totals are known. Ann. Math. Stat. 11(4), 427–444 (1940)
27. Sinkhorn, R.: A relationship between arbitrary positive matrices and doubly stochastic matrices. Ann. Math. Stat. 35(2), 876–879 (1964)
28. Chizat, L., Peyré, G., Schmitzer, B., Vialard, F.X.: Scaling Algorithms for Unbalanced Transport Problems. arXiv preprint arXiv:1607.05816 (2016)
29. Schrödinger, E.: Über die Umkehrung der Naturgesetze. Sitzungsberichte der Preuss Akad. Wissen. Berlin. Phys. Math. Klasse 1, 144–153 (1931)
30. Schrödinger, E.: Sur la théorie relativiste de l’électron et l’interprétation de la mécanique quantique. Ann. Inst. H. Poincaré 2(4), 269–310 (1932)
31. Sanov, I.N.: On the Probability of Large Deviations of Random Variables, Technical report. Department of Statistics, North Carolina State University (1958)
32. Dudley, R.M.: Real Analysis and Probability. Cambridge University Press, Cambridge (2002)
33. Ellis, R.S.: Entropy, Large Deviations, and Statistical Mechanics. Springer, Berlin (2007)
34. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications. Corrected reprint of the second (1998) edition. Stochastic Modelling and Applied Probability, vol. 38. Springer, Berlin (2010)
35. Villani, C.: Topics in Optimal Transportation, vol. 58. American Mathematical Society, Providence (2003)
36. Wakolbinger, A.: Schrödinger bridges from 1931 to 1991. In: Proceedings of the 4th Latin American Congress in Probability and Mathematical Statistics, Mexico City, pp. 61–79 (1990)
37. Dai Pra, P.: A stochastic control approach to reciprocal diffusion processes. Appl. Math. Optim. 23(1), 313–329 (1991)
38. Dai Pra, P., Pavon, M.: On the Markov processes of Schrödinger, the Feynman–Kac formula and stochastic control. In: Realization and Modelling in System Theory, pp. 497–504. Springer, Berlin (1990)
39. Pavon, M., Wakolbinger, A.: On free energy, stochastic control, and Schrödinger processes. In: Modeling, Estimation and Control of Systems with Uncertainty, pp. 334–348. Springer, Berlin (1991)
40. Mikami, T.: Optimal transportation problem as stochastic mechanics. Sel. Pap. Probab. Stat. 227, 75–94 (2008)
41. Chen, Y., Georgiou, T.T., Pavon, M.: Optimal steering of a linear stochastic system to a final probability distribution, part I. IEEE Trans. Autom. Control 61(5), 1158–1169 (2016)
42. Chen, Y., Georgiou, T.T., Pavon, M.: Optimal steering of a linear stochastic system to a final probability distribution, part II. IEEE Trans. Autom. Control 61(5), 1170–1180 (2016)
43. Chen, Y., Georgiou, T.T., Pavon, M.: Fast cooling for a system of stochastic oscillators. J. Math. Phys. 56(11), 113302 (2015)
44. Benamou, J.D., Brenier, Y.: A computational fluid mechanics solution to the Monge–Kantorovich mass transfer problem. Numer. Math. 84(3), 375–393 (2000)
45. Birkhoff, G.: Extensions of Jentzsch’s theorem. Trans. Am. Math. Soc. 85(1), 219–227 (1957)
46. Bushell, P.: On the projective contraction ratio for positive linear mappings. J. Lond. Math. Soc. 2(2), 256–258 (1973)
47. Bushell, P.J.: Hilbert’s metric and positive contraction mappings in a Banach space. Arch. Ration. Mech. Anal. 52(4), 330–338 (1973)
48. Birkhoff, G.: Uniformly semi-primitive multiplicative processes. Trans. Am. Math. Soc. 104(1), 37–51 (1962)
49. Lemmens, B., Nussbaum, R.: Birkhoff’s Version of Hilbert’s Metric and Its Applications in Analysis. arXiv preprint arXiv:1304.7921 (2013)
50. Tsitsiklis, J., Bertsekas, D., Athans, M.: Distributed asynchronous deterministic and stochastic gradient optimization algorithms. IEEE Trans. Autom. Control 31(9), 803–812 (1986)
51. Sepulchre, R., Sarlette, A., Rouchon, P.: Consensus in Non-Commutative Spaces. arXiv preprint arXiv:1003.5653 (2010)
52. Bonnabel, S., Astolfi, A., Sepulchre, R.: Contraction and observer design on cones. In: 2011 50th IEEE Conference on Decision and Control and European Control Conference (CDC-ECC), pp. 7147–7151. IEEE (2011)
53. Reeb, D., Kastoryano, M.J., Wolf, M.M.: Hilbert’s projective metric in quantum information theory. J. Math. Phys. 52(8), 082201 (2011)
54. Lemmens, B., Nussbaum, R.: Nonlinear Perron–Frobenius Theory, vol. 189. Cambridge University Press, Cambridge (2012)
55. Georgiou, T.T., Pavon, M.: Positive contraction mappings for classical and quantum Schrödinger systems. J. Math. Phys. 56(3), 033301 (2015)
56. Franklin, J., Lorenz, J.: On the scaling of multidimensional matrices. Linear Algebra Appl. 114, 717–735 (1989)
57. Schmitzer, B.: Stabilized Sparse Scaling Algorithms for Entropy Regularized Transport Problems. arXiv preprint arXiv:1610.06519 (2016)
58. Galichon, A., Kominers, S.D., Weber, S.: The nonlinear Bernstein–Schrödinger equation in economics. In: International Conference on Networked Geometric Science of Information, pp. 51–59. Springer, Berlin (2015)

Acknowledgements

The authors thank Robert V. Kohn for useful suggestions. The second named author would also like to thank the Courant Institute of Mathematical Sciences of New York University for its hospitality during the time this paper was written. The authors finally wish to thank two anonymous reviewers for their very careful reading and for providing many general and specific comments and suggestions on how to improve the paper. The second named author was partly supported by the University of Padova Research Project CPDA 140897.

Author information

Authors and Affiliations

Courant Institute of Mathematical Sciences, New York University, 251 Mercer St., New York, NY, 10012, USA
Montacer Essid
Dipartimento di Matematica “Tullio Levi-Civita”, Università di Padova, via Trieste 63, 35121, Padua, Italy
Michele Pavon

Corresponding author

Correspondence to Michele Pavon.

Additional information

Communicated by Gabriel Peyré

Appendix

1.1 Proof of (33) from Theorem I

Let $Z = \{ y \in {\mathfrak {I}}^2{:}\,g(x_0,y) = 0 \} \subset {\mathfrak {I}}^2$ be the zero set of $g(x_0,\cdot )$.

Define $Z_k = \{ y \in {\mathfrak {I}}^2{:}\,g(x_0,y) < \frac{1}{k} \}$ for $k \in {\mathbb {N}}^*$. We have $Z_{k+1} \subset Z_k$, and $Z_k \downarrow Z$ as $k \rightarrow +\infty $.

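By assumption (H.vi), we know that $Z$ has Lebesgue measure 0. From the continuity of $g$ (H.iv), we also know that $Z$ is closed.

Hence, by continuity from above of the Lebesgue measure $m$ (which applies as soon as $m(Z_k) < +\infty $ for some $k$), $m(Z_k) \rightarrow 0$ as $k \rightarrow +\infty $.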
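
As a purely numerical illustration of the shrinking sublevel sets $Z_k$, the following Python sketch estimates $m(Z_k)$ on the unit interval. The function `g_at_x0` is a hypothetical one-dimensional stand-in for $y \mapsto g(x_0,y)$ with the single zero $y = 1/2$. It is not the kernel of the paper, and the helper name `sublevel_measure` is likewise an assumption.

```python
def g_at_x0(y):
    """Hypothetical stand-in for y -> g(x_0, y): continuous, nonnegative,
    vanishing only at y = 0.5 (a set of Lebesgue measure zero)."""
    return abs(y - 0.5)

def sublevel_measure(g, k, n_cells=100_000):
    """Midpoint-rule estimate of m({y in [0, 1] : g(y) < 1/k})."""
    h = 1.0 / n_cells
    count = sum(1 for i in range(n_cells) if g((i + 0.5) * h) < 1.0 / k)
    return count * h

ks = (2, 4, 8, 16, 32)
measures = [sublevel_measure(g_at_x0, k) for k in ks]
# For this particular stand-in, the exact value is min(1, 2/k).
```

For this choice the estimates decrease with $k$ and approach 0, which is the behavior $m(Z_k) \rightarrow 0$ used in the proof.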

Denote ${\mathfrak {I}}^2_k = {\mathfrak {I}}^2 \backslash Z_k$. Then we have ${\mathfrak {I}}^2_k \subset {\mathfrak {I}}^2_{k+1}$ and ${\mathfrak {I}}^2_k \uparrow {\mathfrak {I}}^2 \backslash Z$ as $k \rightarrow +\infty $.

Since

$$\begin{aligned} H_n'(x_0) = \int _{{\mathfrak {I}}^2} g(x_0,y) \frac{\omega _2(y)}{G(H_n,y)} {\mathrm{d}}y \rightarrow 0, \quad \text { as } n \rightarrow +\infty \end{aligned}$$

$\forall \epsilon >0$, we have for n large enough:

$$\begin{aligned} \int _{{\mathfrak {I}}^2} g(x_0,y) \frac{\omega _2(y)}{G(H_n,y)} {\mathrm{d}}y < \epsilon \end{aligned}$$

Fix $\epsilon >0, k \in {\mathbb {N}}^*$. We then have for n large enough:

$$\begin{aligned} 0 \le \int _{{\mathfrak {I}}^2_{k}} g(x_0,y) \frac{\omega _2(y)}{G(H_n,y)} {\mathrm{d}}y + \int _{{\mathfrak {I}}^2\backslash {\mathfrak {I}}^2_{k}} g(x_0,y) \frac{\omega _2(y)}{G(H_n,y)} {\mathrm{d}}y < \epsilon \end{aligned}$$

and in particular, since both integrands are nonnegative and $g(x_0,y) \ge \frac{1}{k}$ on ${\mathfrak {I}}^2_{k}$, the first integral yields:

$$\begin{aligned} 0 \le \int _{{\mathfrak {I}}^2_{k}} \frac{\omega _2(y)}{G(H_n,y)} {\mathrm{d}}y < k \epsilon \end{aligned}$$

This implies that, for each fixed $k$, the measure $\frac{\omega _2(y)}{G(H_n,y)} {\mathrm{d}}y$ converges weakly to 0 on ${\mathfrak {I}}^2_{k}$ as $n \rightarrow +\infty $; in fact, its total mass on ${\mathfrak {I}}^2_{k}$ tends to 0. Indeed, it is the case when evaluated on any step function with support included in ${\mathfrak {I}}^2_{k}$, and step functions are dense in the family of bounded continuous functions.

We would like the measure $\frac{\omega _2(y)}{G(H_n,y)} {\mathrm{d}}y$ to converge to 0 for any step function whose support $I$ is included in ${\mathfrak {I}}^2$, and not merely in ${\mathfrak {I}}^2_k$.

Pick a measurable subset $I \subset {{\mathfrak {I}}^2}$. Since ${\mathfrak {I}}^2$ is the disjoint union of ${\mathfrak {I}}^2_{k}$ and $Z_k$, we have:

$$\begin{aligned} \int _{{\mathfrak {I}}^2} \mathbb 1_{I}(y) \frac{\omega _2(y)}{G(H_n,y)} {\mathrm{d}}y = \int _{{\mathfrak {I}}^2_{k} \cap I} \frac{\omega _2(y)}{G(H_n,y)} {\mathrm{d}}y + \int _{Z_{k} \cap I} \frac{\omega _2(y)}{G(H_n,y)} {\mathrm{d}}y \end{aligned}$$

The first integral converges to 0 as $n \rightarrow +\infty $, since the measure $\frac{\omega _2(y)}{G(H_n,y)} {\mathrm{d}}y$ converges weakly to 0 on ${\mathfrak {I}}^2_{k}$.

As for the second integral, we have that $H_n \le H_1$, so $\frac{\omega _2(y)}{G(H_n,y)} \le \frac{\omega _2(y)}{G(H_1,y)}$ which implies:

$$\begin{aligned} \int _{Z_{k} \cap I} \frac{\omega _2(y)}{G(H_n,y)} {\mathrm{d}}y \le \int _{Z_{k} \cap I} \frac{\omega _2(y)}{G(H_1,y)} {\mathrm{d}}y \le \int _{Z_{k}} \frac{\omega _2(y)}{G(H_1,y)} {\mathrm{d}}y \end{aligned}$$

where the last inequality comes from $Z_{k} \cap I \subset Z_{k}$.

Condition ($\star $) states that $\int _{{\mathfrak {I}}^2} \frac{\omega _2(y)}{G(H_1,y)} {\mathrm{d}}y < +\infty $; thus, the measure $\frac{\omega _2(y)}{G(H_1,y)} {\mathrm{d}}y$ is finite and absolutely continuous with respect to the Lebesgue measure $m$ on ${\mathfrak {I}}^2$. This implies that the second integral converges to 0 as $k \rightarrow +\infty $, uniformly in $n$, since $m(Z_{k}) \rightarrow 0$ and the bound does not depend on $n$.

Hence, for any measurable $I \subset {\mathfrak {I}}^2$, $\int _{{\mathfrak {I}}^2} \mathbb 1_{I}(y) \frac{\omega _2(y)}{G(H_n,y)} {\mathrm{d}}y \rightarrow 0$ as $n \rightarrow +\infty $.
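
The two limits in this argument can be collected into a single estimate. The following is a sketch in the notation of this appendix, using only facts stated above: $g \ge 0$, $g(x_0,\cdot ) \ge \frac{1}{k}$ on ${\mathfrak {I}}^2_k$, and the domination that follows from $H_n \le H_1$. For every measurable $I \subset {\mathfrak {I}}^2$ and all $n, k$:

$$\begin{aligned} \int _{I} \frac{\omega _2(y)}{G(H_n,y)} {\mathrm{d}}y &\le k \int _{{\mathfrak {I}}^2_{k}} g(x_0,y) \frac{\omega _2(y)}{G(H_n,y)} {\mathrm{d}}y + \int _{I \cap Z_{k}} \frac{\omega _2(y)}{G(H_n,y)} {\mathrm{d}}y \\ &\le k\, H_n'(x_0) + \int _{Z_{k}} \frac{\omega _2(y)}{G(H_1,y)} {\mathrm{d}}y \end{aligned}$$

Given $\epsilon > 0$, one may first fix $k$ so that the last integral is smaller than $\epsilon /2$ (it does not depend on $n$), and then take $n$ large enough that $k\, H_n'(x_0) < \epsilon /2$.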

About this article

Cite this article

Essid, M., Pavon, M. Traversing the Schrödinger Bridge Strait: Robert Fortet’s Marvelous Proof Redux. J Optim Theory Appl 181, 23–60 (2019). https://doi.org/10.1007/s10957-018-1436-9

Received: 08 May 2018
Accepted: 03 November 2018
Published: 14 November 2018
Issue Date: 15 April 2019
DOI: https://doi.org/10.1007/s10957-018-1436-9
