Gradient Flows on Graphons: Existence, Convergence, Continuity Equations

Published in: Journal of Theoretical Probability

Abstract

Wasserstein gradient flows on probability measures have found a host of applications in various optimization problems. They typically arise as the continuum limit of exchangeable particle systems evolving by some mean-field interaction involving a gradient-type potential. However, in many problems, such as in multi-layer neural networks, the so-called particles are edge weights on large graphs whose nodes are exchangeable. Such large graphs are known to converge to continuum limits called graphons as their size grows to infinity. We show that the Euclidean gradient flow of a suitable function of the edge weights converges to a novel continuum limit given by a curve on the space of graphons that can be appropriately described as a gradient flow or, more technically, a curve of maximal slope. Several natural functions on graphons, such as homomorphism functions and the scalar entropy, are covered by our setup, and these examples are worked out in detail.
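To fix ideas, the following is a minimal schematic of the objects the abstract refers to; the notation (\(F_n\), \(w_{ij}\)) is illustrative, and the precise scaling and regularity assumptions are those of the paper and are not reproduced here. For a symmetric array of edge weights \(W^{(n)} = (w_{ij})_{1 \le i < j \le n}\) and a suitable function \(F_n\) of those weights, the Euclidean gradient flow is the ODE system

\[
\dot{w}_{ij}(t) = -\frac{\partial F_n}{\partial w_{ij}}\bigl(W^{(n)}(t)\bigr), \qquad 1 \le i < j \le n,
\]

and the convergence statement concerns the step-function graphons induced by \(W^{(n)}(t)\) as \(n \to \infty\). A representative functional covered by the setup is the homomorphism density of a fixed finite simple graph \(H\) in a graphon \(W \colon [0,1]^2 \to [0,1]\),

\[
t(H, W) = \int_{[0,1]^{V(H)}} \prod_{\{i,j\} \in E(H)} W(x_i, x_j) \prod_{i \in V(H)} \mathrm{d}x_i,
\]

whose limiting dynamics are described in the paper as a curve of maximal slope on graphon space.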


Data Availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Code Availability

The code used during the current study is available from the corresponding author on reasonable request.


Acknowledgements

Many thanks to Persi Diaconis, Apoorva Khare and Stefan Steinerberger for helpful conversations and references and to the PIMS Kantorovich Initiative for facilitating this collaboration. The authors are listed in alphabetical order.

Funding

This research is partially supported by the following grants. Pal is supported by NSF Grant No. DMS-2052239 and a PIMS CRG (PIHOT). Pal and Oh are supported by NSF Grant No. DMS-2134012. Oh is supported by NSF Grant No. CCF-2019844.

Author information


Contributions

The authors are arranged in alphabetical order.

Corresponding author

Correspondence to Soumik Pal.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical Approval

Not relevant to the content of this article.

Consent to Participate

Not relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Oh, S., Pal, S., Somani, R. et al. Gradient Flows on Graphons: Existence, Convergence, Continuity Equations. J Theor Probab (2023). https://doi.org/10.1007/s10959-023-01271-8


  • DOI: https://doi.org/10.1007/s10959-023-01271-8
