A Gradient Descent Perspective on Sinkhorn

Léger, Flavien

doi:10.1007/s00245-020-09697-w

Published: 01 July 2020

Volume 84, pages 1843–1855, (2021)

Applied Mathematics & Optimization

Flavien Léger (ORCID: orcid.org/0000-0001-7756-1648)

Abstract

We present a new perspective on the popular Sinkhorn algorithm, showing that it can be seen as a Bregman gradient descent (mirror descent) of a relative entropy (Kullback–Leibler divergence). This viewpoint implies a new sublinear convergence rate with a robust constant.
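
For readers who want to see the iteration the abstract refers to, here is a minimal sketch of the standard Sinkhorn matrix-scaling loop in plain Python. It is not taken from the article: the names `sinkhorn`, `kl`, `cost`, `mu`, `nu`, `eps` and `n_iters` are illustrative assumptions, and the relative-entropy quantity recorded in `history` is only a convenient diagnostic of the marginal mismatch, not necessarily the exact functional analysed in the paper.

```python
import math

def kl(p, q):
    """Relative entropy KL(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def sinkhorn(cost, mu, nu, eps=1.0, n_iters=200):
    """Alternately rescale rows and columns of K = exp(-cost / eps).

    Returns the final plan and, per iteration, KL(mu || row marginal).
    All parameter names are hypothetical choices for this sketch.
    """
    n, m = len(mu), len(nu)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    v = [1.0] * m
    history = []
    for _ in range(n_iters):
        # Row step: pick u so that the row sums of diag(u) K diag(v) equal mu.
        u = [mu[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Column step: pick v so that the column sums equal nu.
        v = [nu[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
        # After the column step the row marginal is only approximately mu.
        rows = [u[i] * sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        history.append(kl(mu, rows))
    plan = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    return plan, history

# Small illustrative problem (made-up data).
cost = [[0.0, 1.0], [1.0, 0.0]]
mu = [0.5, 0.5]
nu = [0.3, 0.7]
plan, history = sinkhorn(cost, mu, nu)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(len(mu))) for j in range(len(nu))]
```

On this toy problem the column sums match `nu` after every column step by construction, while the row sums approach `mu` as the iterations proceed, which is what the recorded relative entropy measures.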

Acknowledgements

The author is grateful to Gabriel Peyré for helpful discussions.

Author information

Authors and Affiliations

École Normale Supérieure, Paris, France
Flavien Léger

Corresponding author

Correspondence to Flavien Léger.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Léger, F. A Gradient Descent Perspective on Sinkhorn. Appl Math Optim 84, 1843–1855 (2021). https://doi.org/10.1007/s00245-020-09697-w

Published: 01 July 2020
Issue Date: October 2021
DOI: https://doi.org/10.1007/s00245-020-09697-w
