Gradient Flows on Graphons: Existence, Convergence, Continuity Equations

Oh, Sewoong; Pal, Soumik; Somani, Raghav; Tripathi, Raghavendra

doi:10.1007/s10959-023-01271-8

Published: 03 July 2023

Journal of Theoretical Probability

Abstract

Wasserstein gradient flows on probability measures have found a host of applications in various optimization problems. They typically arise as the continuum limit of exchangeable particle systems evolving by some mean-field interaction involving a gradient-type potential. However, in many problems, such as in multi-layer neural networks, the so-called particles are edge weights on large graphs whose nodes are exchangeable. Such large graphs are known to converge to continuum limits called graphons as their size grows to infinity. We show that the Euclidean gradient flow of a suitable function of the edge weights converges to a novel continuum limit given by a curve on the space of graphons that can be appropriately described as a gradient flow or, more technically, a curve of maximal slope. Several natural functions on graphons, such as homomorphism functions and the scalar entropy, are covered by our setup, and the examples have been worked out in detail.
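The finite-graph dynamics described in the abstract can be illustrated with a small numerical sketch. It runs a projected Euclidean gradient descent on the edge weights of a symmetric matrix for the triangle homomorphism density, one of the function classes the abstract names. The function names, the step size, the clipping of weights to [0, 1], and the n**2 rescaling of the gradient are all illustrative assumptions, not the authors' construction.

```python
import numpy as np

def triangle_density(A):
    """Triangle homomorphism density of a symmetric weight matrix: trace(A^3) / n^3."""
    n = A.shape[0]
    return np.trace(A @ A @ A) / n**3

def triangle_density_grad(A):
    """Euclidean gradient of triangle_density, treating all n^2 entries as free.

    For symmetric A, d/dA_ij trace(A^3) = 3 (A^2)_ij.
    """
    n = A.shape[0]
    return 3.0 * (A @ A) / n**3

def projected_gradient_flow(A0, step=0.05, num_steps=200):
    """Explicit Euler steps of the gradient flow, clipped to keep weights in [0, 1].

    Assumption: the gradient is multiplied by n^2 so that the per-entry update
    stays of order one as n grows; this normalization is chosen for illustration.
    """
    A = A0.copy()
    n = A.shape[0]
    values = [triangle_density(A)]
    for _ in range(num_steps):
        A = np.clip(A - step * n**2 * triangle_density_grad(A), 0.0, 1.0)
        values.append(triangle_density(A))
    return A, values

# Hypothetical starting point: a random symmetric matrix with entries in [0, 1].
rng = np.random.default_rng(0)
n = 30
U = rng.uniform(size=(n, n))
A0 = (U + U.T) / 2.0
A_final, values = projected_gradient_flow(A0)
print(f"triangle density: start {values[0]:.4f}, end {values[-1]:.4f}")
```

Because every entry is nonnegative, the gradient is nonnegative as well, so each edge weight can only decrease and the recorded density values are non-increasing along the iteration. The update also preserves symmetry, which is the finite-size analogue of working with symmetric kernels.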

Data Availability

The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.

Code Availability

The code used during the current study is available from the corresponding author on reasonable request.

Acknowledgements

Many thanks to Persi Diaconis, Apoorva Khare and Stefan Steinerberger for helpful conversations and references and to the PIMS Kantorovich Initiative for facilitating this collaboration. The authors are listed in alphabetical order.

Funding

This research is partially supported by the following grants. Pal is supported by NSF Grant No. DMS-2052239 and a PIMS CRG (PIHOT). Pal and Oh are supported by NSF Grant No. DMS-2134012. Oh is supported by NSF Grant No. CCF-2019844.

Author information

Authors and Affiliations

Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, 98195, USA
Sewoong Oh & Raghav Somani
Department of Mathematics, University of Washington, Seattle, WA, 98195, USA
Soumik Pal & Raghavendra Tripathi

Contributions

The authors are arranged in alphabetical order.

Corresponding author

Correspondence to Soumik Pal.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical Approval

Not relevant to the content of this article.

Consent to Participate

Not relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Oh, S., Pal, S., Somani, R. et al. Gradient Flows on Graphons: Existence, Convergence, Continuity Equations. J Theor Probab (2023). https://doi.org/10.1007/s10959-023-01271-8

Received: 16 January 2023
Revised: 15 May 2023
Accepted: 09 June 2023
Published: 03 July 2023
DOI: https://doi.org/10.1007/s10959-023-01271-8
