A Gradient Descent Perspective on Sinkhorn

Applied Mathematics & Optimization

Abstract

We present a new perspective on the popular Sinkhorn algorithm, showing that it can be seen as a Bregman gradient descent (mirror descent) of a relative entropy (Kullback–Leibler divergence). This viewpoint implies a new sublinear convergence rate with a robust constant.
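
To make the objects in the abstract concrete, here is a minimal sketch of the standard Sinkhorn iteration on a small discrete problem. It is not taken from the article: the function names `kl` and `sinkhorn`, the toy cost, the marginals `mu` and `nu`, and the choice of regularization (kernel `exp(-cost)`) are all illustrative assumptions. The sketch records the relative entropy between the current first marginal and its target after each sweep, which is the kind of quantity a sublinear rate would bound; it does not implement or verify the article's mirror descent argument.

```python
import math

def kl(p, q):
    """Relative entropy KL(p | q) for positive vectors with equal total mass."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def sinkhorn(K, mu, nu, n_iters=200):
    """Alternately rescale the rows and columns of a positive matrix K.

    Returns the scaled plan and, for each sweep, KL(row marginal | mu)
    measured right after the column rescaling.
    """
    m, n = len(K), len(K[0])
    u = [1.0] * m
    v = [1.0] * n
    history = []
    plan = [row[:] for row in K]
    for _ in range(n_iters):
        # Row step: choose u so that the first marginal equals mu.
        u = [mu[i] / sum(K[i][j] * v[j] for j in range(n)) for i in range(m)]
        # Column step: choose v so that the second marginal equals nu.
        v = [nu[j] / sum(K[i][j] * u[i] for i in range(m)) for j in range(n)]
        plan = [[u[i] * K[i][j] * v[j] for j in range(n)] for i in range(m)]
        rows = [sum(r) for r in plan]
        history.append(kl(rows, mu))
    return plan, history

# Toy data (assumed for illustration): squared-distance cost on a few points.
x = [0.0, 1.0, 2.0]
y = [0.0, 1.0]
K = [[math.exp(-(xi - yj) ** 2) for yj in y] for xi in x]
mu = [0.5, 0.3, 0.2]
nu = [0.4, 0.6]

plan, history = sinkhorn(K, mu, nu)
cols = [sum(plan[i][j] for i in range(len(x))) for j in range(len(y))]
```

After each sweep the second marginal matches `nu` up to rounding, so the entries of `history` measure only the remaining mismatch in the first marginal.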

Acknowledgements

The author is grateful to Gabriel Peyré for helpful discussions.

Author information

Corresponding author

Correspondence to Flavien Léger.

About this article

Cite this article

Léger, F. A Gradient Descent Perspective on Sinkhorn. Appl Math Optim 84, 1843–1855 (2021). https://doi.org/10.1007/s00245-020-09697-w

  • DOI: https://doi.org/10.1007/s00245-020-09697-w
