Abstract
Curvature is a fundamental geometric characteristic of smooth spaces. In recent years different notions of curvature have been developed for combinatorial discrete objects such as graphs. However, the connections between such discrete notions of curvature and their smooth counterparts remain unclear. In particular, it is not rigorously known whether any notion of graph curvature converges to any traditional notion of curvature of smooth space. Here we prove that in proper settings the Ollivier–Ricci curvature of random geometric graphs in Riemannian manifolds converges to the Ricci curvature of the manifold. This is the first rigorous result linking curvature of random graphs to curvature of smooth spaces. Our results hold for different notions of graph distances, including the rescaled shortest path distance, and for different graph densities: the scaling of the average degree, as a function of the graph size, can range from nearly logarithmic to nearly linear.
1 Introduction
Curvature is a fundamental concept in the study of geometric spaces. It is a local parameter whose behavior often controls global phenomena on the manifold. In particular, bounds on the Ricci curvature are known to imply an array of properties, including diameter bounds, control of the spectrum, and sub-Gaussian decay of the heat kernel. If the curvature of some space is upper-bounded by a negative value, then such space has a boundary at infinity and some other universal characteristics of (coarsely) hyperbolic spaces. Unfortunately, most notions of curvature are applicable only to smooth continuous spaces, such as Riemannian and pseudo-Riemannian manifolds. While there exist some combinatorial notions of curvature [6, 11], none has the same power as their smooth counterparts. We refer to [25] for a general overview of discrete curvatures. The focus of this paper is graph curvature.
In [27,28,29], Yann Ollivier introduced a definition of curvature for general metric spaces as a discretization of the well-known Ricci curvature. Since this definition is applicable to any metric space, it is applicable to graphs in particular. Even though relatively recent, it has already proven to be quite influential and fruitful. In analysis of networks, Ollivier–Ricci curvature has been used, for example, to identify communities [36], analyze cancer cells [33], assess the fragility of financial networks [34] and robustness of brain networks [10], and to embed networks for machine learning applications [12]. Ollivier–Ricci curvature has also been analyzed for several types of (random) graphs including Erdős–Rényi random graphs [23]. Some general bounds for this curvature have also been established based on different graph properties [4, 16, 23]. These and other applications of Ollivier–Ricci curvature have also stimulated general interest in graph curvature, leading to the introduction and studies of many other notions of graph curvature [8, 17, 24, 37].
An interesting aspect of Ollivier–Ricci curvature (or any other notion of discrete curvature) is that it creates a bridge between geometry and discrete structures. For example, discrete curvatures play an important role in the field of manifold learning where the discrete objects are data points lying on some manifold, and the task is to learn from the data the properties of the manifold [1].
A related task is that of graph embedding: given a graph, find its embedding in a smooth space such that graph distances between nodes are approximated by distances in the space. Curvature has proven to be important for finding the right space to embed the graph into [12].
In addition to these classical applications, geometry has also proven to be an important and powerful concept for designing latent-space models of random graphs whose properties—such as degree distributions, clustering, distance distributions—closely resemble those of real-world networks [5, 14, 19, 20]. These relations between geometry and network properties inevitably lead to the question whether characteristics of latent geometries of networks can be inferred from discrete properties of graphs that represent these networks. Since curvature is a fundamental characteristic of geometry, it is a natural first candidate for uncovering latent geometry in networks. Hence, a proper notion of graph curvature is needed, a notion that would be known to converge to the true curvature of the geometric space underlying the graph, if it exists.
Quantum gravity is yet another area where convergence of graph curvature is of interest. Here one wants to find a discrete geometry that converges in the continuum limit to the geometry of physical spacetime. To this end, Ollivier–Ricci curvature and its variations have been extensively investigated recently [7, 18, 40].
Despite the interest in Ollivier–Ricci and related curvatures of discrete and combinatorial spaces, the fundamental question of convergence remains largely open. That is, does there exist a discrete notion of curvature that converges in some limit to a traditional notion of curvature of smooth spaces?
There are some positive results in this direction. One is for the convergence of an angle-defect-based notion of curvature of smooth triangulations of Riemannian manifolds [6]. Another one is a manifold learning method designed for consistent estimation of Ricci curvature of a submanifold in Euclidean space based on a point cloud sprinkled uniformly onto the submanifold [1]. Perhaps the closest result to ours is the one in [2, 3] where a discrete version of the d’Alembertian operator is defined for causal sets in 2- and 4-dimensional Lorentzian manifolds. This discrete d’Alembertian is then shown to converge to the traditional d’Alembertian in the continuum limit. To the best of our knowledge, there currently exist no general convergence results for truly combinatorial objects in general and random graphs in Riemannian manifolds in particular.
In this paper we study the question of convergence of Ollivier–Ricci curvature of graphs. We consider random geometric graphs whose nodes are a Poisson process in a Riemannian manifold and whose edges are formed only between nodes that lie within a given distance threshold from each other in the manifold. We show that as the size of such graphs tends to infinity, their Ollivier–Ricci curvature recovers the Ricci curvature of the underlying manifold. To the best of our knowledge, this is the first result that relates a discrete notion of curvature of graphs to the continuum version of curvature of their underlying geometry.
The remainder of the paper is structured as follows. In Sect. 2 we introduce the basic notations and definitions needed to present our main results. We present these results in Sect. 3, which ends with some general comments and outlook. We then provide a general overview of the proof strategy in the first half of Sect. 4. The second half of that section contains the proofs of the main results. Finally, Sect. 5 contains all the remaining details and proofs of intermediate results that are skipped in Sect. 4.
2 Notations and Definitions
2.1 Geometric Graphs
Given a metric space \(({\mathcal {X}},d)\), a countable node set \(X\subseteq {\mathcal {X}}\), and connection radius \(\varepsilon >0\), we define \(G(X,\varepsilon )\) as the graph whose nodes are all the elements in X. An edge between \(x,y\in X\) exists if and only if \(d(x,y)\le \varepsilon \). Since the nodes of G are points in the metric space, we will refer to them using x and y, instead of indices i and j, and write \(x\in G\) if x is a node of G.
We will also use \(G_{xy}\) to denote the indicator of an edge between x and y in G and define \(\mathcal {N}_x\) to be the neighborhood of node x, i.e.,
$$\begin{aligned} \mathcal {N}_x:=\{y\in X\setminus \{x\}:G_{xy}=1\}. \end{aligned}$$
Note that \(\mathcal {N}_x=X\cap \mathcal {B}\hspace{0.27771pt}(x;\varepsilon )\), where \(\mathcal {B}\hspace{0.27771pt}(x;\varepsilon )\) denotes the closed ball around \(x\in {\mathcal {X}}\) of radius \(\varepsilon \) with respect to the distance d, but excluding x.
2.2 Random Geometric Graphs
In this paper we consider graphs that are constructed by randomly placing points in the metric space \(\mathcal {X}\), according to a Poisson process. In order to analyze a notion of curvature on these graphs we need to impose some additional structure on \(\mathcal {X}\). More precisely, we will consider Riemannian manifolds as the spaces on which graphs are constructed. We briefly recall some notions of Riemannian geometry needed for the setup and refer the reader to [15, 30] for more details on the topic.
Formally, a Riemannian manifold is a pair \((\mathcal {M},g)\) where \(\mathcal {M}\) is a smooth manifold and for each \(x\in \mathcal {M}\), \(g_x\) is a smooth inner product on the tangent space \(T_x\mathcal {M}\) at x. This inner product induces a metric \(d_\mathcal {M}\), called the Riemannian metric. Since we are mainly interested in metric spaces, we denote a Riemannian manifold by the pair \((\mathcal {M},d_\mathcal {M})\).
Throughout the remainder of this paper we work with Riemannian manifolds that are smooth, connected, and compact. This allows us to integrate over points in the manifold and ensures that for any two points \(x,y\in \mathcal {M}\) there exists a shortest path (geodesic) in \(\mathcal {M}\) connecting x and y, whose length is \(d_\mathcal {M}(x,y)\). For any \(U\subseteq \mathcal {M}\) we will write \(\textrm{vol}_\mathcal {M}(U)=\int _U\textrm{d}\textrm{vol}_\mathcal {M}\) to denote the volume of U, where \(\textrm{vol}_\mathcal {M}\) is the Riemannian measure on \(\mathcal {M}\). With this setup we can define a random geometric graph on a Riemannian manifold in a way analogous to the classic random geometric graph in Euclidean space.
Definition 2.1
Let \(({\mathcal {M}},d_\mathcal {M})\) be a smooth, connected, and compact N-dimensional Riemannian manifold. Fix \(\varepsilon >0\) and consider a Poisson process \(\mathcal {P}_n\) on \(\mathcal {M}\) with intensity measure \((n/{\textrm{vol}_\mathcal {M}(\mathcal {M})})\,\textrm{d}\textrm{vol}_\mathcal {M}\). Then we define the random geometric graph \(\mathbb {G}_n(\varepsilon ):=G(\mathcal {P}_n,\varepsilon )\).
Remark 1
(conditions on the manifold) From a technical perspective, we only need the manifold to be smooth. This is because we will be working on shrinking neighborhoods of some fixed point \(x^*\in \mathcal {M}\). For a sufficiently small neighborhood U, we can always construct a volume form that is well defined on U and ensure that every two points in U are connected by a geodesic path. We could then fix a sufficiently small and compact neighborhood \({\mathcal {C}}\) of \(x^*\) and then consider a Poisson process on \({\mathcal {C}}\) with intensity measure \((n/{\textrm{vol}_\mathcal {M}(\mathcal {C})})\,\textrm{d}\textrm{vol}_\mathcal {M}\).
The only difference with the global setup is that we would need to frame everything in terms of sufficiently small neighborhoods and deal with possible boundary issues in our proofs. In the end, since curvature is a local property, these issues would vanish. Still, framing all results in this local setting would add additional technical layers to the proofs. For convenience, we therefore choose to present everything in terms of global and nice requirements on the manifold.
We shall next introduce a notion of curvature on random geometric graphs. Since curvature is inherently a local property, it makes sense to define curvature on graphs as a property of an edge. For our analysis we will take a more general approach and consider curvature between two fixed nodes in the graph that are connected by a path. We then analyze its behavior as the size of the graph tends to infinity.
For any \(x\in \mathcal {M}\), we write \(\mathbb {G}_n(x,\varepsilon ):=G(X_n,\varepsilon )\), where \(X_n=\{x\}\cup \mathcal {P}_n\). That is, \(\mathbb {G}_n(x,\varepsilon )\) is a random geometric graph with x added to the node set. Similarly, for any pair of points \((x,y)\in \mathcal {M}\) we write \(\mathbb {G}_n(x,y,\varepsilon ):=G(X_n^\prime ,\varepsilon )\), with \(X_n^\prime =\{x,y\}\cup \mathcal {P}_n\). We refer to both \(\mathbb {G}_n(x,\varepsilon )\) and \(\mathbb {G}_n(x,y,\varepsilon )\) as rooted random graphs.
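The construction of Definition 2.1 and of the rooted graphs above can be sketched in a few lines. The following is only an illustration under stated assumptions: the manifold is taken to be the unit sphere \(S^2\) with the great-circle distance (our choice, not one made in the paper), and all function names are hypothetical. Since the intensity measure is \((n/{\textrm{vol}_\mathcal {M}(\mathcal {M})})\,\textrm{d}\textrm{vol}_\mathcal {M}\), the number of points is Poisson with mean n and, given their number, the points are independent and uniform on the manifold.

```python
import math
import random


def sample_poisson(mean, rng):
    """Number of arrivals of a rate-`mean` Poisson process in the time window [0, 1]."""
    count, t = 0, rng.expovariate(mean)
    while t <= 1.0:
        count += 1
        t += rng.expovariate(mean)
    return count


def uniform_on_sphere(rng):
    """Uniform point on the unit sphere S^2, obtained by normalizing a Gaussian vector."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return tuple(c / norm for c in v)


def geodesic_distance(p, q):
    """Great-circle distance on the unit sphere (assumed manifold metric d_M)."""
    dot = sum(a * b for a, b in zip(p, q))
    return math.acos(max(-1.0, min(1.0, dot)))


def rooted_random_geometric_graph(n, eps, roots=(), seed=0):
    """Poisson(n) uniform points plus the given roots; edge iff d_M(x, y) <= eps."""
    rng = random.Random(seed)
    points = list(roots) + [uniform_on_sphere(rng) for _ in range(sample_poisson(n, rng))]
    neighbors = {i: set() for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if geodesic_distance(points[i], points[j]) <= eps:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return points, neighbors


x_star = (0.0, 0.0, 1.0)  # hypothetical root: the north pole
points, neighbors = rooted_random_geometric_graph(n=300, eps=0.3, roots=[x_star])
print(len(points), "nodes; degree of the root:", len(neighbors[0]))
```

The quadratic loop over node pairs is chosen for readability; for large n a spatial index would be preferable.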
2.3 Ollivier–Ricci Curvature on Graphs
The definition of Ollivier–Ricci curvature uses the Wasserstein metric (transportation distance), which we shall introduce next. Recall that a coupling between two probability measures \(\mu _1\) and \(\mu _2\) is a joint probability measure \(\mu \) whose marginals are \(\mu _1\) and \(\mu _2\).
Definition 2.2
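Let \(\mu _1\) and \(\mu _2\) be probability measures on a metric space \((\mathcal {X},d)\) and let \(\varGamma (\mu _1,\mu _2)\) denote the set of all couplings \(\mu \) between \(\mu _1\) and \(\mu _2\). Then the Wasserstein metric (Kantorovich–Rubinstein distance of order one) is given by
$$\begin{aligned} W_1(\mu _1,\mu _2)=\inf _{\mu \in \varGamma (\mu _1,\mu _2)}\int _{\mathcal {X}\times \mathcal {X}}d(x,y)\,\mu (\textrm{d}x,\textrm{d}y). \end{aligned}$$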
Remark 2
The following property of the Wasserstein distance will prove useful for us. Suppose two probability measures have support on \(U\subset \mathcal {X}\) and there exists a metric \({\tilde{d}}\) on \(\mathcal {X}\) that coincides with d on U. Then the Wasserstein metric \({\tilde{W}}_1(\mu _1,\mu _2)\) associated with metric \(\tilde{d}\) equals the original Wasserstein metric associated with d.
Let G be a graph. The definition of Ollivier–Ricci curvature on graphs relies on two ingredients, a metric on G and a family of probability measures, indexed by the vertices.
Definition 2.3
An Ollivier-triple \({\mathcal {G}}\) is a triple \((G,d_G,{\varvec{m}})\), where G is a graph, \(d_G\) a metric on G and \({\varvec{m}}=\{m_x\}_{x\in G}\) a family of probability measures on G for each node \(x\in G\).
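Given an Ollivier-triple \({\mathcal {G}}=(G,d_G,{\varvec{m}})\), we write \(W_1^{\mathcal {G}}\) for the Wasserstein metric with respect to the metric space \((G,d_G)\). We then define for any pair of nodes \(x,y\in G\) the associated Ollivier curvature as
$$\begin{aligned} \kappa \hspace{0.30548pt}(x,y;{\mathcal {G}})={\left\{ \begin{array}{ll} 1-\dfrac{W_1^{\mathcal {G}}(m_x,m_y)}{d_G(x,y)} &{} \text {if } x \text { and } y \text { are in the same connected component of } G,\\ 0 &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$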
Remark 3
1. The concept of Ollivier–Ricci curvature is not restricted to graphs and can be defined on any metric space equipped with a family of probability measures. A specific example is a Riemannian manifold \((\mathcal {M},d_\mathcal {M})\).
2. Note that a family \(\{m_x\}_{x\in G}\) of probability measures on G gives rise to a random walk on the graph. The transition probabilities are given by \(\mathbb {P}\hspace{0.33325pt}(x_{t+1}\in A\,|\,x_t=x)=m_x(A)\). So an Ollivier-triple consists of a graph, a metric and a random walk on the graph. However, since we will only use concepts related to the measures \(m_x\) we refrain from using any random walk terminology.
3. When \(d_G\) is the shortest path metric on G and \({\varvec{m}}\) corresponds to the uniform probability measures on the neighborhoods \(\mathcal {N}_x\), i.e., \(m_x(y)=G_{xy}/|\mathcal {N}_x|\), we are in the classic setting for Ollivier–Ricci curvature on graphs [16, 26, 31]. In this work, however, we shall use different combinations of metrics on graphs and probability measures to obtain our results. This is why we define Ollivier–Ricci curvature on graphs in a more general way.
4. We set \(\kappa \hspace{0.30548pt}(x,y;{\mathcal {G}})=0\) if the nodes are not in the same connected component because we work with random graphs, and this convention ensures that \(\kappa \hspace{0.30548pt}(x,y;{\mathcal {G}})\) is a real-valued random variable.
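To make the classic setting of Remark 3 concrete, the following sketch computes the Ollivier curvature of an edge in a small unweighted graph, with \(d_G\) the shortest path distance and \(m_x\) uniform on the neighbors of x. It uses the standard form of the Ollivier curvature, \(1-W_1(m_x,m_y)/d_G(x,y)\), with the convention that the curvature is 0 for nodes in different components. The solver is our own choice for illustration, not a method from the paper: since both measures are uniform, all masses can be scaled to integers, and the transport problem is solved exactly by successive shortest augmenting paths. All function names are hypothetical.

```python
from collections import deque
from fractions import Fraction
from math import inf


def hop_distances(adj, source):
    """Shortest path (hop count) distances from `source` by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def min_transport_cost(supply, demand, cost):
    """Minimal cost of moving integer `supply` onto integer `demand` (equal totals),
    via successive shortest augmenting paths found with Bellman-Ford."""
    n, m = len(supply), len(demand)
    s, t, size = n + m, n + m + 1, n + m + 2
    graph = [[] for _ in range(size)]  # edge = [head, capacity, cost, index of reverse edge]

    def add_edge(u, v, cap, c):
        graph[u].append([v, cap, c, len(graph[v])])
        graph[v].append([u, 0, -c, len(graph[u]) - 1])

    for i in range(n):
        add_edge(s, i, supply[i], 0)
        for j in range(m):
            add_edge(i, n + j, min(supply[i], demand[j]), cost[i][j])
    for j in range(m):
        add_edge(n + j, t, demand[j], 0)

    total, remaining = 0, sum(supply)
    while remaining > 0:
        dist, prev = [inf] * size, [None] * size
        dist[s] = 0
        for _ in range(size - 1):
            changed = False
            for u in range(size):
                if dist[u] == inf:
                    continue
                for k, (v, cap, c, _rev) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + c < dist[v]:
                        dist[v], prev[v], changed = dist[u] + c, (u, k), True
            if not changed:
                break
        push, v = remaining, t
        while v != s:  # bottleneck capacity along the augmenting path
            u, k = prev[v]
            push, v = min(push, graph[u][k][1]), u
        v = t
        while v != s:  # push flow and update residual capacities
            u, k = prev[v]
            graph[u][k][1] -= push
            graph[v][graph[u][k][3]][1] += push
            v = u
        total += push * dist[t]
        remaining -= push
    return total


def ollivier_curvature(adj, x, y):
    """Classic-setting curvature for distinct nodes x, y (0 if they are disconnected)."""
    if y not in hop_distances(adj, x):
        return Fraction(0)
    src, dst = sorted(adj[x]), sorted(adj[y])
    cost = [[hop_distances(adj, u)[v] for v in dst] for u in src]
    # Uniform masses 1/|src| and 1/|dst|, scaled by |src| * |dst| to integers.
    total = min_transport_cost([len(dst)] * len(src), [len(src)] * len(dst), cost)
    w1 = Fraction(total, len(src) * len(dst))
    return 1 - w1 / hop_distances(adj, x)[y]


complete_4 = {u: {v for v in range(4) if v != u} for u in range(4)}
cycle_6 = {u: {(u - 1) % 6, (u + 1) % 6} for u in range(6)}
print(ollivier_curvature(complete_4, 0, 1))  # 2/3
print(ollivier_curvature(cycle_6, 0, 1))     # 0
```

In the complete graph on four nodes the two measures differ only by mass 1/3 that must travel one hop, which gives curvature 2/3. In the 6-cycle every unit of mass travels exactly one hop, so the edge is flat.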
2.4 Curvature in Riemannian Manifolds
Our main results relate the standard Ricci curvature of a manifold to the Ollivier–Ricci curvature of the random geometric graph constructed on this manifold. For this, we briefly recall the definition of the Ricci curvature, see [15, 30].
In general, the curvature of a geometric space is intended as a local measure of how “different” a region of the space is from flat Euclidean space. Notions of curvature in Riemannian geometry are governed by the Riemannian curvature tensor R. Given an N-dimensional Riemannian manifold \(({\mathcal {M}},d_\mathcal {M})\), a point \(x\in {\mathcal {M}}\) and two vectors \(\textbf{v},\textbf{w}\in T_x{\mathcal {M}}\) (the tangent space at x), the Riemannian curvature tensor with respect to \(\textbf{v}\), \(\textbf{w}\) is a linear map \(R(\textbf{v},\textbf{w}):T_x\mathcal {M}\rightarrow T_x\mathcal {M}\), written as \(\textbf{u}\mapsto R(\textbf{v},\textbf{w})\textbf{u}\) and defined in terms of the Levi-Civita connection on the tangent bundle. It quantifies to what extent the manifold \(\mathcal {M}\) is not locally isometric to flat Euclidean space.
In this paper we will use the notion of curvature called Ricci curvature. For two vectors \(\textbf{v}\) and \(\textbf{w}\), the Ricci curvature \({\text {Ric}}\hspace{0.44434pt}(\textbf{v},\textbf{w})\) is defined, in terms of the Riemannian tensor, as the trace of the linear map
Given a point \(x\in {\mathcal {M}}\) and a unit vector \(\textbf{v}\in T_x{\mathcal {M}}\), we often refer to \({\text {Ric}}\hspace{0.44434pt}(\textbf{v},\textbf{v})\) as the Ricci curvature at x with respect to \(\textbf{v}\).
This Ricci curvature is related to another notion of curvature, called sectional curvature, which is defined as
where \(\langle \,{\cdot },\,{\cdot }\,\rangle \) denotes the inner product on the tangent space. One can show that \({\text {Ric}}\hspace{0.44434pt}(\textbf{v},\textbf{v})\) is obtained by averaging the sectional curvature \(K(\textbf{v},\textbf{w})\) over all unit vectors \(\textbf{w}\in T_x\mathcal {M}\).
In the remainder of this paper we will work with the Ricci curvature of a point x, with respect to some tangent vector \(\textbf{v}\). We note that it is not needed to understand the fine details behind curvature of Riemannian manifolds to understand all the details of the results or proofs.
3 Main Results
Here we state our results regarding the convergence of Ollivier–Ricci curvature of random geometric graphs on Riemannian manifolds. We note that if the manifold dimension is \(N=1\), then there is nothing to prove, so that we always assume that \(N\ge 2\).
We mainly consider two different distances on the graphs, leading to two different but related results. Although we consider different distances on graphs, we shall always consider uniform measures on balls of a certain radius. We shall clearly distinguish between the connection radius of the graph and the radius used for the uniform measures:
connection radius: \(\varepsilon _n\)
measure radius: \(\delta _n\)
The former is the connectivity distance threshold: if the distance between a pair of nodes in the manifold is below this threshold, then these nodes are connected by an edge in the graph. The latter radius is the radius of the ball (either in the graph or in the manifold) over which the uniform probability measure is distributed.
Let \(G_n=\mathbb {G}_n(\varepsilon _n)\) be a random geometric graph on \(\mathcal {M}\) and \(d_G\) a distance on \(G_n\). Then, for a node \(x\in G_n\), we define the graph ball of radius \(\lambda \) around x as
$$\begin{aligned} \mathcal {B}_G(x;\lambda ):=\{y\in G_n:d_G(x,y)\le \lambda \}. \end{aligned}$$
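Note that \(\mathcal {B}_G(x;\lambda )\) depends on the definition of the graph distance \(d_G\). For our results we consider Ollivier-triples \({\mathcal {G}}_n=(G_n,d_G,{\varvec{m}}^G)\), where \({\varvec{m}}^G\) are the uniform measures on \(\mathcal {B}_G(x;\delta _n)\), i.e.,
$$\begin{aligned} m_x^G(y)={\left\{ \begin{array}{ll} \dfrac{1}{|\mathcal {B}_G(x;\delta _n)|} &{} \text {if } y\in \mathcal {B}_G(x;\delta _n),\\ 0 &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$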
We reiterate that if \(\varepsilon _n=\delta _n\) and the graph metric \(d_G\) is the shortest path distance, then we are in the classical setting of Ollivier–Ricci curvature on graphs as considered in the past literature [16, 26, 31].
3.1 Graphs with Manifold Weighted Distance
Let \(G_n=\mathbb {G}_n(x^*,\varepsilon _n)\) be a random rooted graph on \(\mathcal {M}\). Then we define the manifold weighted graph distance \(d_G^w\) as the weighted shortest-path distance on \(G_n\) where each edge (u, v) is assigned weight \(d_\mathcal {M}(u,v)\), corresponding to the distance between the nodes on the manifold. Similarly to \(\mathcal {B}_G(x;\lambda )\), we denote by \({\mathcal {B}}_G^w(x;\lambda )\) the graph ball of radius \(\lambda \) with respect to \(d_G^w\) and let \({\varvec{m}}^{G,w}=(m_x^{G,w})_{x\in G}\) denote the uniform measures on the balls \({\mathcal {B}}_G^w(x;\delta _n)\). Finally, given a point \(x\in \mathcal {M}\) and a vector \(\textbf{v}\in T_x\mathcal {M}\), we say that another point \(y\in \mathcal {M}\) is at distance \(\delta \) in the direction of \(\textbf{v}\), if \(d_\mathcal {M}(x,y)=\delta \) and y lies on the geodesic starting at x in the direction of \(\textbf{v}\).
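As an illustration of the manifold weighted graph distance and the associated graph balls, the sketch below runs Dijkstra's algorithm on a tiny hand-picked point set. The assumptions are ours: the manifold is the unit sphere \(S^2\) with the great-circle distance, the five points and the radii are arbitrary toy values, the ball is taken to include its center, and all function names are hypothetical.

```python
import heapq
import math


def geodesic_distance(p, q):
    """Great-circle distance on the unit sphere (assumed manifold metric d_M)."""
    dot = sum(a * b for a, b in zip(p, q))
    return math.acos(max(-1.0, min(1.0, dot)))


def sphere_point(longitude, latitude=0.0):
    """Point of the unit sphere with the given longitude and latitude (radians)."""
    return (math.cos(latitude) * math.cos(longitude),
            math.cos(latitude) * math.sin(longitude),
            math.sin(latitude))


def weighted_graph_distances(points, eps, source):
    """Dijkstra from `source`; an edge exists iff d_M <= eps and has weight d_M."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v in range(len(points)):
            if v == u:
                continue
            w = geodesic_distance(points[u], points[v])
            if w <= eps and d + w < dist.get(v, math.inf):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def graph_ball(points, eps, center, radius):
    """Nodes within weighted graph distance `radius` of `center` (center included)."""
    dist = weighted_graph_distances(points, eps, center)
    return {v for v, d in dist.items() if d <= radius}


# Four points on the equator, 0.3 apart, plus one point off the equator.
points = [sphere_point(0.3 * k) for k in range(4)] + [sphere_point(0.45, 0.25)]
dist = weighted_graph_distances(points, eps=0.35, source=0)
for v in sorted(dist):
    print(v, round(dist[v], 4), round(geodesic_distance(points[0], points[v]), 4))
print(sorted(graph_ball(points, eps=0.35, center=0, radius=0.65)))
```

Along the equator the graph distance equals the manifold distance, because the intermediate nodes lie on the geodesic. For the off-equator node the path has to bend, so the graph distance strictly exceeds the manifold distance; in general the triangle inequality gives \(d_\mathcal {M}\le d_G^w\).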
Our first result shows that for certain combinations of connection radius \(\varepsilon _n\) and measure radius \(\delta _n\), the Ollivier–Ricci curvature on \(G_n\) converges to the Ricci curvature.
Theorem 1
Let \(N\ge 2\), \((\mathcal {M},d_\mathcal {M})\) be a smooth, connected, and compact N-dimensional Riemannian manifold, \(x^*\in {\mathcal {M}}\), and \(\textbf{v}\) a unit tangent vector at \(x^*\). Furthermore, let \(\varepsilon _n=\varTheta \hspace{0.33325pt}((\log n)^an^{-\alpha })\), \(\delta _n=\varTheta \hspace{0.33325pt}((\log n)^bn^{-\beta })\) (as \(n\rightarrow \infty \)) where the constants satisfy
$$\begin{aligned} 0<\beta \le \alpha ,\qquad \alpha +2\beta \le \frac{1}{N}, \end{aligned}$$
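and \(a\le b\) if \(\alpha =\beta \) and \(\min {\{a,a+2b\}}>2/N\) if \(\alpha +2\beta =1/N\). Let \(y_n^*\in \mathcal {M}\) be at distance \(\delta _n\) in the direction of \(\textbf{v}\) and \(G_n=\mathbb {G}_n(x^*,y_n^*,\varepsilon _n)\) be rooted random graphs on \(\mathcal {M}\). Then for the Ollivier-triple \({\mathcal {G}}_n^w=(G_n,d_{G_n}^w,{\varvec{m}}^{G,w})\), it holds
$$\begin{aligned} \lim _{n\rightarrow \infty }\mathbb {E}\left[ \biggl |\frac{2(N+2)\hspace{0.54993pt}\kappa (x^*,y_n^*;{\mathcal {G}}_n^w)}{\delta _n^2}-{\text {Ric}}\hspace{0.44434pt}(\textbf{v},\textbf{v})\biggr |\right] =0. \end{aligned}$$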
Theorem 1 relates two different quantities. The first is the Ollivier–Ricci curvature in the graph between the node \(x^*\) and another node \(y_n^*\) that is at distance \(\delta _n\) from \(x^*\) in the direction of vector \(\textbf{v}\). The second is the Ricci curvature of the manifold at \(x^*\) in the \(\textbf{v}\)-direction. The theorem says that if we properly rescale the former, it converges in expectation to the latter.
Remark 4
1. Note that Theorem 1 states that \(\delta _n^{-2}2(N+2)\hspace{0.7222pt}\kappa (x^*,y_n^*;{\mathcal {G}}_n^w)\) converges in the \(L^1\) sense to \({\text {Ric}}\hspace{0.44434pt}(\textbf{v},\textbf{v})\). In particular, this implies the concentration result
$$\begin{aligned} \lim _{n\rightarrow \infty }\mathbb {P}\left( \biggl |\frac{2(N+2)\hspace{0.54993pt}\kappa (x^*,y_n^*;{\mathcal {G}}_n^w)}{\delta _n^2}-{\text {Ric}}\hspace{0.44434pt}(\textbf{v},\textbf{v})\biggr |\ge \eta \right) =0,\ \ \quad \text {for all }\eta >0. \end{aligned}$$
2. Since \(\varepsilon _n,\delta _n\rightarrow 0\), both the connectivity and measure neighborhoods of \(x^*\) become smaller as n grows. Indeed, curvature is a local property, so that measuring it more accurately requires smaller regions.
3. While the connectivity neighborhood of \(x^*\) is shrinking, the expected number of \(x^*\)’s neighbors lying in it is growing with n. To see this, note that for large enough n the volume of the ball \(\mathcal {B}_\mathcal {M}(x;\varepsilon _n)\) around \(x\in \mathcal {M}\) can be approximated by that of the N-dimensional Euclidean ball. Hence, for any \(x\in \mathbb {G}_n(x^*,y_n^*,\varepsilon _n)\), as \(n\rightarrow \infty \),
$$\begin{aligned} \mathbb {E}\hspace{0.33325pt}[|\mathcal {N}_x|]=n{\text {vol}}_\mathcal {M}(\mathcal {B}_\mathcal {M}(x;\varepsilon _n))=\varTheta (n\varepsilon _n^N)=\varTheta \bigl ((\log n)^{aN}n^{1-\alpha N}\bigr ). \end{aligned}$$
The conditions of the theorem imply that \(\alpha \le \alpha +2\beta \le 1/N\), so that \(1-\alpha N\ge 0\). This means that the average degree diverges faster than logarithmically if \(\alpha N<1\). More generally, the conditions of Theorem 1 imply that the average degree always diverges faster than \((\log n)^2\).
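The volume approximation behind the degree scaling can be checked numerically in a case where the ball volume is known in closed form. The sketch below assumes, purely for illustration, that the manifold is the unit sphere \(S^2\) (so \(N=2\)), where a geodesic ball of radius r has area \(2\pi (1-\cos r)\) and the total area is \(4\pi \); with the normalized intensity of Definition 2.1 the expected degree is then \(n(1-\cos \varepsilon _n)/2\), to be compared with the Euclidean approximation \(n\varepsilon _n^2/4\). The exponents \(a=1\) and \(\alpha =0.4\) are arbitrary illustrative values and the function names are hypothetical.

```python
import math


def expected_degree_sphere(n, eps):
    """Exact expected number of Poisson points in a geodesic ball of radius eps on S^2."""
    return n * (1.0 - math.cos(eps)) / 2.0


def expected_degree_euclidean(n, eps):
    """Same quantity with the ball area replaced by the flat value pi * eps^2."""
    return n * eps ** 2 / 4.0


def connection_radius(n, a=1.0, alpha=0.4):
    """Illustrative choice eps_n = (log n)^a * n^(-alpha); here alpha < 1/N = 1/2."""
    return math.log(n) ** a * n ** (-alpha)


for n in (10 ** 3, 10 ** 5, 10 ** 7):
    eps = connection_radius(n)
    exact = expected_degree_sphere(n, eps)
    flat = expected_degree_euclidean(n, eps)
    print(n, round(eps, 4), round(exact, 1), round(exact / flat, 6))
```

The radius shrinks while the expected degree grows, and the ratio between the exact and the flat value approaches 1, which is the content of the \(\varTheta (n\varepsilon _n^N)\) estimate in this special case.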
If we consider the classic setting where the connection and measure radii are the same, \(\varepsilon _n=\delta _n\), then the following result is a direct consequence of Theorem 1.
Corollary 1
Let \(N\ge 2\), \((\mathcal {M},d_\mathcal {M})\) be a smooth, connected, and compact N-dimensional Riemannian manifold, \(x^*\in {\mathcal {M}}\), and \(\textbf{v}\) a unit tangent vector at \(x^*\). Furthermore, let \(\delta _n=\varTheta \hspace{0.33325pt}((\log n)^bn^{-\beta })\), with \(\beta \le 1/(3N)\) and \(b>2/N\) whenever \(\beta =1/(3N)\). Let \(y_n^*\in \mathcal {M}\) be at distance \(\delta _n\) in the direction of \(\textbf{v}\) and \(G_n=\mathbb {G}_n(x^*,y_n^*,\delta _n)\) be rooted random graphs on \(\mathcal {M}\). Then for the Ollivier-triple \({\mathcal {G}}_n^w=(G_n,d_{G_n}^w,{\varvec{m}}^{G,w})\), it holds
$$\begin{aligned} \lim _{n\rightarrow \infty }\mathbb {E}\left[ \biggl |\frac{2(N+2)\hspace{0.54993pt}\kappa (x^*,y_n^*;{\mathcal {G}}_n^w)}{\delta _n^2}-{\text {Ric}}\hspace{0.44434pt}(\textbf{v},\textbf{v})\biggr |\right] =0. \end{aligned}$$
While the conditions in this corollary imply that the average degree in \(\mathbb {G}_n(x^*,y_n^*,\delta _n)\) diverges faster than \(n^{2/3}\), Theorem 1 works for graphs where the average degree can be almost as small as \((\log n)^2\). The crucial component for establishing the curvature convergence in graphs with so much smaller average degree is to consider different connection and measure radii and let the connection radius decrease at a faster rate than the measure radius, i.e., \(\varepsilon _n\ll \delta _n\).
Remark 5
(extreme cases for convergence of curvature) Corollary 1 covers one set of extreme cases for the combination of \(a,b,\alpha \), and \(\beta \) from Theorem 1, where we take \(\beta \) to be as big as possible. This means that we compute the curvature using uniform probability measures on a set of nodes that is as small as possible. For the true extreme case, let \(\eta >0\) be arbitrarily small and define \(\beta =(1-\eta )/(3N)\) and \(b=(2+\eta )/N\). Then, to calculate the curvature, we need to compute the Wasserstein metric between uniform probability measures on neighborhoods that contain
$$\begin{aligned} \varTheta (n\delta _n^N)=\varTheta \bigl ((\log n)^{bN}n^{1-\beta N}\bigr )=\varTheta \bigl ((\log n)^{2+\eta }n^{(2+\eta )/3}\bigr ) \end{aligned}$$
nodes. The consequence, however, is that our graphs have average degree diverging at the same rate: \((\log n)^{2+\eta }n^{(2+\eta )/3}\).
In order to get graphs whose average degree diverges as slowly as possible, we need to consider another extreme case. Again let \(\eta >0\) be arbitrarily small. Now we define
For these choices we have that \(\alpha +2\beta =1/N\) and \(\min {\{a,a+2b\}}=a>2/N\) so that the result from Theorem 1 holds. In this case, the average degree scales as
which is almost logarithmic. However, we now need to compute the Wasserstein metric with respect to the uniform measure on a number of nodes that scales as
That is, in order to compute curvature on graphs with almost logarithmic average degree, we need to consider the uniform probability measure on almost the entire graph.
3.2 Graphs with Hop Count Distance
In the previous section we considered Ollivier–Ricci curvature of graphs on Riemannian manifolds, with graph edges weighted by manifold distances. These weights encode a lot of information about the manifold metric structure, so that one may feel not terribly surprised that we can recover manifold curvature from graph curvature using this information. The natural question is then if it is possible to prove convergence of Ollivier–Ricci curvature based on shortest path distances \(d_G^s\) in unweighted graphs. It turns out that this can be done under some slightly more restrictive conditions on the connection and measure radii.
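For this we define, for any random geometric graph \(G_n=\mathbb {G}_n(\varepsilon _n)\), the rescaled shortest path distance \(d_G^*(x,y)=\varepsilon _nd_G^s(x,y)\). Similar to the previous setting we let \(\mathcal {B}_G^*(x;\delta _n)\) denote the balls of radius \(\delta _n\) around \(x\in G_n\) with respect to the metric \(d_G^*\) and define the random walk measures
$$\begin{aligned} m_x^{G,*}(y)={\left\{ \begin{array}{ll} \dfrac{1}{|\mathcal {B}_G^*(x;\delta _n)|} &{} \text {if } y\in \mathcal {B}_G^*(x;\delta _n),\\ 0 &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$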
Theorem 2
Let \((\mathcal {M}, d_\mathcal {M})\) be a smooth, connected, and compact 2-dimensional Riemannian manifold, \(x^*\in {\mathcal {M}}\), and \(\textbf{v}\) a unit tangent vector at \(x^*\). Furthermore, let \(\varepsilon _n=\varTheta \hspace{0.33325pt}((\log n)^an^{-\alpha })\), \(\delta _n=\varTheta \hspace{0.33325pt}((\log n)^bn^{-\beta })\) where the constants satisfy
and \(a<3b\) if \(\alpha =3\beta \) and \(2a+3b>1\) if \(\alpha =(1-3\beta )/2\). Let \(y_n^*\in \mathcal {M}\) be at distance \(\delta _n\) in the direction of \(\textbf{v}\) and \(G_n=\mathbb {G}_n(x^*,y_n^*,\varepsilon _n)\) be rooted random graphs on \(\mathcal {M}\). Then for the Ollivier-triple \({\mathcal {G}}_n^*=(G_n,d_{G_n}^*,{\varvec{m}}^{G,*})\), it holds
Remark 6
1. Note that unlike Theorem 1, here we do not include any information on the distances between nodes on the manifold. This is because the distance \(d^*_{G_n}\) is simply the shortest path distance on the graph \(G_n\) rescaled by the connection radius \(\varepsilon _n\).
2. Observe that the theorem allows us to select an \(\alpha \) that is arbitrarily close to 1/2. In particular,
$$\begin{aligned} \mathbb {E}\hspace{0.33325pt}[|\mathcal {N}_x|]=\varTheta (n\varepsilon _n^2)=\varTheta \hspace{0.33325pt}((\log n)^{2a}n^{1-2\alpha })\le \varTheta \hspace{0.33325pt}((\log n)^{2a}n^{6\beta }). \end{aligned}$$
Hence by selecting a small \(\beta \) we have a discrete notion of curvature that converges on graphs with almost logarithmic average degree, without using any information on the manifold.
3. Theorem 2 currently only works in 2-dimensional manifolds. This is because the proof relies on results for the stretch (the fraction \(d_G/d_\mathcal {M}\)) for random geometric graphs in 2-dimensional Euclidean space [9]. Our proof techniques, however, immediately allow the results to be extended to higher dimensions, once similar types of stretch results for these spaces are obtained.
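The rescaled shortest path distance of this section uses only the graph and the connection radius. The sketch below computes it by breadth-first search and compares it with the manifold distance, i.e., it evaluates the stretch on a toy example. The assumptions are ours: the manifold is the unit sphere \(S^2\) with the great-circle distance, the four equatorial points are arbitrary, and the function names are hypothetical. The manifold distance is used here only to build the graph and to report the stretch; the distance \(d_G^*\) itself needs no manifold information beyond \(\varepsilon _n\).

```python
import math
from collections import deque


def geodesic_distance(p, q):
    """Great-circle distance on the unit sphere (assumed manifold metric d_M)."""
    dot = sum(a * b for a, b in zip(p, q))
    return math.acos(max(-1.0, min(1.0, dot)))


def rescaled_hop_distances(points, eps, source):
    """d*(source, v) = eps * (hop count from source to v) for all reachable v."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in range(len(points)):
            if v not in hops and geodesic_distance(points[u], points[v]) <= eps:
                hops[v] = hops[u] + 1
                queue.append(v)
    return {v: eps * h for v, h in hops.items()}


# Four points on the equator of the unit sphere, 0.3 apart; connection radius 0.35.
points = [(math.cos(0.3 * k), math.sin(0.3 * k), 0.0) for k in range(4)]
eps = 0.35
d_star = rescaled_hop_distances(points, eps, source=0)
for v in sorted(d_star):
    d_manifold = geodesic_distance(points[0], points[v])
    stretch = d_star[v] / d_manifold if v != 0 else 1.0
    print(v, round(d_star[v], 4), round(d_manifold, 4), round(stretch, 4))
```

Each hop covers manifold distance at most \(\varepsilon _n\), so \(d_G^*\ge d_\mathcal {M}\) always holds; how close the ratio is to 1 is exactly what the stretch results mentioned in Remark 6 control.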
3.3 Summary, Comments, Caveats, and Outlook
In summary, we have proven that upon proper rescaling, the Ollivier–Ricci curvature of random geometric graphs on a Riemannian manifold converges to the Ricci curvature of the underlying manifold.
Our first result, Theorem 1, establishes convergence of Ollivier–Ricci curvature for a wide range of connectivity and measure radii. In particular, it contains as a corollary the classical setting where both radii are the same, Corollary 1. The theorem does, however, require knowledge of pairwise distances between connected nodes in the manifold.
Our second result, Theorem 2, relaxes this requirement and establishes the same convergence without any knowledge of distances in the manifold. This does come at the price of slightly more restrictive conditions on the possible connection and measure radii. Still, as for the first result, the convergence holds all the way up to graphs whose average degree grows very slowly (almost logarithmically).
To the best of our knowledge, these are the first rigorous results on the convergence of a discrete notion of curvature of random combinatorial objects to a traditional continuum notion of curvature of smooth space.
While the classical setting for Ollivier–Ricci graph curvature uses probability measures (random walks) on balls of the same radius as the graph connection radius, in this paper we allow the radii to be different. This is an important generalization. In particular, we find that in order for the curvature to converge on graphs with almost logarithmic average degree, we need the probability measure radius to be much larger than the connection radius. This is intuitively expected because in order to “feel” any curvature in graphs with such a low density, we really need to consider large “mesoscopic” neighborhoods in them since otherwise all we could see is local “microscopic” Euclidean flatness. It would be interesting to see how this more general approach would generalize known results for the classical setting of Ollivier–Ricci curvature of graph families that have been investigated in the past, such as trees or Erdős–Rényi random graphs [4, 16].
In our recent numeric experiments [13], we have seen that in manifold-distance-weighted random geometric graphs, the Ollivier–Ricci curvature convergence holds even for graphs with constant average degree. Unfortunately, the proof techniques presented in this paper do not allow for a direct generalization to this setting. Therefore, other techniques are needed to (dis)confirm the convergence of Ollivier–Ricci curvature of graphs with constant average degree. We note that one definitely cannot expect Ollivier–Ricci curvature to converge in all possible graph sparsity settings. For example, we definitely need the giant component to exist to talk about any curvature convergence.
For the task of learning latent geometry in networks, our results can still be improved, particularly by removing the requirement to know the connection radius. When presented just with a truly unweighted realization of a random geometric graph, this radius first needs to be estimated. It would thus be interesting to see whether convergence still holds if we replace the true value of the connection radius with a consistent estimator, e.g. one based on the average degree. Here we expect the speed of curvature convergence (if any) to depend on the speed of estimator convergence in a possibly nontrivial way.
Finally, now that we have seen that Ollivier–Ricci curvature of random combinatorial discretizations of smooth spaces converges to their Ricci curvature, it would be interesting to investigate whether such convergence also holds for other popular notions of discrete curvature. Forman–Ricci curvature [37] appears to be a good next candidate for such investigation.
4 Proof Overview
Our main results in Theorems 1 and 2 follow from our more general result on the Ollivier–Ricci curvature convergence in graphs whose edges are always weighted by some weights. That is, we assume that all edges in our graphs always have some weights, assigned according to some scheme. For our general result it is not important what these weights or their assignment scheme are. What is important is that the graph distance \(d_G\) between node pairs is a good approximation of the manifold distance \(d_\mathcal {M}\) between the corresponding pair of points. Here by graph distance we mean any metric on the vertex set of the graph. To quantify how good this approximation is, we introduce the following definition.
Definition 4.1
Let \((\mathcal {M},d_\mathcal {M})\) be an N-dimensional Riemannian manifold and \(G_n=\mathbb {G}_n(x^*,\varepsilon _n)\) a rooted random graph on \(\mathcal {M}\). A graph distance \(d_G\) on \(G_n\) is said to be a \(\delta _n\)-good approximation of \(d_\mathcal {M}\) if \(d_\mathcal {M}\le d_G\) and the following holds (as \(n\rightarrow \infty \)): there exists a \(Q>3\) and \(\xi _n=o(\delta _n)\) such that with probability \(1-o(\delta _n^3)\),
holds for all \(u,v\in \mathcal {B}_\mathcal {M}(x^*;Q \delta _n) \cap G_n\).
Remark 7
(asymptotic expressions) Most of our results will deal with asymptotic relations, e.g. \(\xi _n=o(\delta _n)\). Unless stated otherwise, these asymptotic relations will always be understood as \(n\rightarrow \infty \).
There are several examples of shortest weighted path distances that are \(\delta _n\)-good approximations of the manifold distance. In this paper we consider two cases. In one, each edge (u, v) has weight equal to \(d_\mathcal {M}(u,v)\), while in the other case the weight is simply the connection radius \(\varepsilon _n\). An explicit example of the latter case is when the manifold is 2-dimensional and the connection and measure radii are given by \(\varepsilon _n=n^{-1/3}\) and \(\delta _n=n^{-1/9}\log n\), respectively. See Propositions 4 and 5 for more details.
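Recall that \(\mathcal {B}_G(x;\delta )\) denotes the set of nodes in the graph that are at graph distance at most \(\delta \) from x,
\[\mathcal {B}_G(x;\delta )=\{y\in G_n: d_G(x,y)\le \delta \},\]
and define
\[\lambda _n:=(\log n)^{2/N}n^{-1/N}. \tag{5}\]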
This \(\lambda _n\) will play the role of an additional radius, for extending the graph distance \(d_G\) to the manifold. In short, to define a distance between \(u,v\in \mathcal {M}\), we will connect u and v to all points of the graph within radius \(\lambda _n\) and then use the graph distance. The radius \(\lambda _n\) has been selected such that the expected number of nodes inside any ball \(\mathcal {B}_\mathcal {M}(x;\lambda _n)\) is of the order \(\varTheta \hspace{0.33325pt}((\log n)^2)\). Hence, the probability of observing no node of the graph inside any such ball is \(O(e^{-(\log n)^2})=o(n^{-1})\), which is sufficiently small. More details on the use of \(\lambda _n\) can be found in Sect. 5.1. Our general result is then as follows.
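As a quick numerical sanity check of this scaling, the following sketch evaluates \(\lambda _n=(\log n)^{2/N}n^{-1/N}\), the quantity \(n\lambda _n^N\) (which matches the expected number of nodes in a ball of radius \(\lambda _n\) only up to a manifold-dependent constant), and the ratios \(\lambda _n/\varepsilon _n\) and \(\lambda _n/\delta _n^3\) for the example radii \(\varepsilon _n=n^{-1/3}\), \(\delta _n=n^{-1/9}\log n\) with \(N=2\). The function names are illustrative and are not taken from the paper.

```python
import math

def lambda_n(n, N):
    """Auxiliary radius lambda_n = (log n)^(2/N) * n^(-1/N)."""
    return math.log(n) ** (2.0 / N) * n ** (-1.0 / N)

def example_radii(n):
    """Example radii for N = 2: eps_n = n^(-1/3), delta_n = n^(-1/9) * log n."""
    return n ** (-1.0 / 3.0), n ** (-1.0 / 9.0) * math.log(n)

def ratios(n, N=2):
    """Return (lambda_n / eps_n, lambda_n / delta_n^3); both should tend to 0."""
    lam = lambda_n(n, N)
    eps, delta = example_radii(n)
    return lam / eps, lam / delta ** 3

if __name__ == "__main__":
    for n in (1e6, 1e12, 1e24):
        lam = lambda_n(n, 2)
        # n * lam^N equals (log n)^2 by construction of lambda_n.
        print(n, n * lam ** 2, math.log(n) ** 2, ratios(n))
```

Both ratios decrease along this sequence of n, in line with the requirements \(\lambda _n=o(\varepsilon _n)\) and \(\lambda _n=o(\delta _n^3)\), although the first one decays slowly because of the logarithmic factor.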
Theorem 3
Let \(N\ge 2\), \((\mathcal {M},d_\mathcal {M})\) be a smooth, connected, and compact N-dimensional Riemannian manifold, \(x^*\in {\mathcal {M}}\), and \(\textbf{v}\) a unit tangent vector at \(x^*\). Furthermore, let \(\varepsilon _n\le \delta _n=o(1)\) be such that \(\lambda _n=o(\varepsilon _n)\) and \(\lambda _n=o(\delta _n^3)\). Let \(y_n^*\in \mathcal {M}\) be at distance \(\delta _n\) in the direction of \(\textbf{v}\), \(G_n=\mathbb {G}_n(x^*,y_n^*,\varepsilon _n)\) be rooted random graphs on \(\mathcal {M}\), and \(d_G\) a \(\delta _n\)-good approximation of \(d_\mathcal {M}\). Then, if we consider the Ollivier-triple \({\mathcal {G}}_n=(G_n,d_G,{\varvec{m}}^G)\),
\[\lim _{n\rightarrow \infty }\mathbb {E}\left[ \left| \frac{2(N+2)\,\kappa (x^*,y_n^*,{\mathcal {G}}_n)}{\delta _n^2}-\textrm{Ric}(\textbf{v},\textbf{v})\right| \right] =0.\]
Once we have established this general result, our main results in Theorems 1 and 2 follow if we can show that the considered graph distances are \(\delta _n\)-good approximations.
A key ingredient in the proof of Theorem 3 is the convergence result for Ollivier–Ricci curvature for uniform measures on Riemannian manifolds, proved in the seminal paper on the topic [28]. At a high level, our proof approximates the Ollivier–Ricci curvature of probability measures on the graph by that of probability measures on the manifold. Having obtained such an approximation with the required accuracy, we then apply the convergence result from [28].
Since Ollivier–Ricci curvature is defined by the Wasserstein metric on probability measures, our analysis focuses on approximating the Wasserstein metric of discrete probability measures on the graph by the Wasserstein metric of uniform probability measures on the manifold. This is done in three steps: 1) extend the graph distance \(d_G\) to a distance \({\widetilde{d}}_\mathcal {M}\) on the manifold such that the Wasserstein metric \({\widetilde{W}}_1\) with respect to this new distance is a good approximation of the Wasserstein metric \(W_1\) on the manifold, 2) show that the Wasserstein metric between the probability measure \(m_x^G\) on the graph and the discrete probability measure \(m_x^\mathcal {M}\) on the nodes within the ball \(\mathcal {B}_\mathcal {M}(x;\delta _n)\) is sufficiently small, and 3) show that the Wasserstein metric between the uniform measure on \(\mathcal {B}_\mathcal {M}(x;\delta _n)\) and the discrete probability measure \(m_x^\mathcal {M}\) is sufficiently small.
Remark 8
In all cases, sufficiently small means that the error terms are of smaller order than \(\delta _n^3\). This is because the Wasserstein metric is first divided by \(\delta _n\) to obtain the curvature, which is then divided by \(\delta _n^2\) to make it converge to the Ricci curvature.
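To make the role of the Wasserstein metric concrete, here is a minimal sketch that computes the curvature \(\kappa (x,y)=1-W_1(m_x,m_y)/d(x,y)\) on a small unweighted graph, assuming the standard Ollivier definition with \(m_x\) uniform on the closed graph ball around x. It is a brute-force illustration for tiny connected graphs only, not the method used in the proofs: each uniform measure is split into equal atoms, so an optimal coupling can be found among permutations of atoms. All function names are hypothetical.

```python
import itertools
import math
from collections import deque

def bfs_distances(adj, source):
    """Hop-count distances from source in an unweighted, connected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        a = queue.popleft()
        for b in adj[a]:
            if b not in dist:
                dist[b] = dist[a] + 1
                queue.append(b)
    return dist

def wasserstein_uniform(support_x, support_y, dist):
    """W_1 between the uniform measures on two finite sets, by brute force.

    Both measures are split into L = lcm(|A|, |B|) atoms of mass 1/L, so an
    optimal coupling can be taken to be a permutation of the atoms.
    """
    a, b = len(support_x), len(support_y)
    L = a * b // math.gcd(a, b)
    atoms_x = [p for p in support_x for _ in range(L // a)]
    atoms_y = [q for q in support_y for _ in range(L // b)]
    best = min(
        sum(dist[p][q] for p, q in zip(atoms_x, perm))
        for perm in itertools.permutations(atoms_y)
    )
    return best / L

def ollivier_curvature(adj, x, y, radius=1):
    """kappa(x, y) = 1 - W_1(m_x, m_y) / d(x, y) with uniform ball measures."""
    dist = {a: bfs_distances(adj, a) for a in adj}
    ball_x = [a for a in adj if dist[x][a] <= radius]
    ball_y = [a for a in adj if dist[y][a] <= radius]
    return 1.0 - wasserstein_uniform(ball_x, ball_y, dist) / dist[x][y]

cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
complete4 = {i: [j for j in range(4) if j != i] for i in range(4)}
print(ollivier_curvature(cycle6, 0, 1))     # 0.0
print(ollivier_curvature(complete4, 0, 1))  # 1.0
```

The flat value on the cycle and the positive value on the complete graph show the qualitative behavior of \(\kappa \); the factorial cost of the permutation search makes this approach unusable beyond very small balls.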
We proceed with explaining all ingredients and the three steps in more detail. We reiterate that unless stated otherwise, we will assume that \(\varepsilon _n\le \delta _n\) are two sequences converging to zero such that \(\lambda _n=o(\varepsilon _n)\) and \(\lambda _n=o(\delta _n^3)\).
4.1 Ollivier Curvature on Riemannian Manifolds
Let \((\mathcal {M},d_\mathcal {M})\) be a smooth, orientable, connected and compact N-dimensional Riemannian manifold. For \(x\in \mathcal {M}\) and \(\delta >0\), we write \(\mathcal {B}_\mathcal {M}(x;\delta )\subseteq \mathcal {M}\) to denote the closed ball of radius \(\delta \) around x, i.e., \(\mathcal {B}_\mathcal {M}(x;\delta )=\{y\in \mathcal {M}:d_\mathcal {M}(x,y)\le \delta \}\). Recall that \(\textrm{vol}_\mathcal {M}(\mathcal {B}_\mathcal {M}(x;\delta ))\)
denotes the volume of the ball \(\mathcal {B}_\mathcal {M}(x;\delta )\). Now fix \(\delta > 0\) and consider the uniform measure on balls of radius \(\delta \). That is, for \(x \in \mathcal {M}\) we take the probability measure \(\mu _x^\delta \) given by
\[\mu _x^\delta (A):=\frac{\textrm{vol}_\mathcal {M}(A\cap \mathcal {B}_\mathcal {M}(x;\delta ))}{\textrm{vol}_\mathcal {M}(\mathcal {B}_\mathcal {M}(x;\delta ))}\quad \text{for measurable } A\subseteq \mathcal {M}. \tag{6}\]
We will refer to \(\mu ^\delta _x\) as the uniform \(\delta \)-measure. The following result from [28] shows that for a uniform \(\delta \)-measure on a Riemannian manifold, the Ollivier curvature (properly rescaled) converges to the Ricci curvature as \({\delta \rightarrow 0}\).
Theorem 4
[28, Exam. 7] Let \((\mathcal {M},d_\mathcal {M})\) be a smooth complete N-dimensional Riemannian manifold, \(x\in {\mathcal {M}}\), and \(\textbf{v}\) a unit tangent vector at x. Let \(\delta >0\) and \(y_\delta \) be the point at distance \(\delta \) in the direction of \(\textbf{v}\). Then if we consider the Ollivier–Ricci curvature \(\kappa \) for the uniform \(\delta \)-measures given by (6),
\[\lim _{\delta \rightarrow 0}\frac{2(N+2)\,\kappa (x,y_\delta )}{\delta ^2}=\textrm{Ric}(\textbf{v},\textbf{v}).\]
Remark 9
The result in Theorem 4 clearly exhibits the local nature of curvature as it holds in the limit where the distance \(d_\mathcal {M}(x,y)=\delta \) between the two points goes to zero.
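As a worked illustration of the rescaling in Theorem 4 (our example, not one from the paper), take the unit round sphere \(\mathcal {M}=S^2\), so that \(N=2\) and \(\textrm{Ric}(\textbf{v},\textbf{v})=1\) for every unit tangent vector \(\textbf{v}\). The expansion in [28, Exam. 7] for two points at distance \(\delta \) then reads

\[\kappa (x,y_\delta )=\frac{\delta ^2}{2(N+2)}\,\textrm{Ric}(\textbf{v},\textbf{v})+O(\delta ^3)=\frac{\delta ^2}{8}+O(\delta ^3), \qquad \text{so that}\qquad \frac{2(N+2)\,\kappa (x,y_\delta )}{\delta ^2}=\frac{8\,\kappa (x,y_\delta )}{\delta ^2}\rightarrow 1 \quad (\delta \rightarrow 0).\]

In particular, the curvature itself vanishes like \(\delta ^2\), which is why it must be divided by \(\delta ^2\) before any limit can recover the Ricci curvature.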
Taking \(\delta = \delta _n\), \(x = x^*\), and \(y = y_n^*\) in the above theorem, we have that the rescaled Ollivier–Ricci curvature associated to the uniform \(\delta _n\)-measures converges to the Ricci curvature as \(n\rightarrow \infty \). The main strategy for proving Theorem 3 is to compare this “uniform” version of the curvature \(\kappa \) on the manifold to the discrete version on the graph. More precisely, we need to prove that
There are two complicating factors here. First, we have to deal with two Wasserstein metrics defined on two different spaces. Second, we have to compare discrete probability measures with continuous ones. We deal with the different Wasserstein metrics in the next section and with comparing the different measures in Sects. 4.3 and 4.4.
4.2 Extending the Graph Distance to the Manifold
In order to compare the two different Wasserstein metrics in (7) we extend the graph distance \(d_G\) to a distance \({\widetilde{d}}_\mathcal {M}\) defined on a sufficiently large part of \(\mathcal {M}\). In particular, we will consider the ball \(\mathcal {B}_\mathcal {M}(x^*;Q\delta _n)\), with \(Q>3\) from Definition 4.1. The extension is such that for any two nodes \(x,y\in G_n\), \(d_G(x,y)=\widetilde{d}_\mathcal {M}(x,y)\), so that \(W_1^G(m_{x^*}^G,m_{y_n^*}^G)\) can be replaced by the Wasserstein metric associated with \({\widetilde{d}}_\mathcal {M}\).
Recall the definition of \(\lambda _n\) from (5), \(\lambda _n=(\log n)^{2/N}n^{-1/N}\). Denote \(G_n=\mathbb {G}_n(x^*,y_n^*,\delta _n)\) and let \(U\subset {\mathcal {M}}\) be a countable set of points. Then we define the graph \(G_n(U)\) obtained from \(G_n\) by adding the points of U to the vertex set and connecting each \(u\in U\) to any other node \(x\in G_n \setminus U\) for which \(d_{{\mathcal {M}}}(x,u)\le \lambda _n/2\). After this, we assign to each new edge (u, x) the weight \(d_\mathcal {M}(x,u)(1+\xi _n^2)+\xi _n^3\), with \(\xi _n\) from Definition 4.1. We can then extend the graph distance to the manifold by defining \({\widetilde{d}}_{{\mathcal {M}}}(u,v)\) to be the graph distance \(d_G(u,v)\) computed in the extended graph \(G_n(\{u,v\})\) with the added weights. That is, \({\widetilde{d}}_{{\mathcal {M}}}(u,v)\) is the shortest weighted path distance in the extended graph \(G_n(\{u,v\})\), where the weights follow the same scheme as for the original graph.
Observe that if \(x,y\in G_n\) then \({\widetilde{d}}_{{\mathcal {M}}}(x,y)=d_G(x,y)\) so that the distance on nodes of \(G_n\) does not change and hence \({\widetilde{d}}_{{\mathcal {M}}}\) is a true extension of \(d_G\). In addition, by definition of the graph distance it immediately follows that \({\widetilde{d}}_{{\mathcal {M}}}(u,v)=0\) if and only if \(u=v\). Figure 1 shows an illustration of the extended distance.
It is important to note that this extended distance depends on the random graph \(G_n\). Therefore, it could happen that two added points \(u,v\in U\) are not connected in \(G_n(U)\), i.e., there does not exist a path from u to v in the extended graph. This happens if there are no nodes in \(\mathcal {B}_\mathcal {M}(u;\lambda _n/2)\) or in \(\mathcal {B}_\mathcal {M}(v;\lambda _n/2)\) or if none of the node pairs \((x,y)\in \mathcal {B}_\mathcal {M}(u;\lambda _n/2)\times \mathcal {B}_\mathcal {M}(v;\lambda _n/2)\) are connected by a path in \(G_n\). Therefore, to justify the definition of the extended manifold distance we need to make sure that, with sufficiently high probability, these situations do not occur.
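The construction of the extended distance can be sketched in code as follows. This is a minimal illustration under simplifying assumptions: the manifold distance is replaced by the Euclidean distance (`math.dist`), the two query points u and v are assumed not to be nodes of the graph, and the node labels are assumed to differ from the strings `"u"` and `"v"`. The names `extended_distance`, `coords`, `lam` and `xi` are ours. Each query point is attached to all nodes within distance \(\lambda _n/2\) with weight \(d(1+\xi _n^2)+\xi _n^3\), and the function returns infinity in the disconnected situations described above.

```python
import heapq
import itertools
import math

def shortest_path_length(adj, source, target):
    """Dijkstra on a weighted graph given as {node: {neighbor: weight}}."""
    tie = itertools.count()  # tie-breaker so the heap never compares node labels
    best = {source: 0.0}
    heap = [(0.0, next(tie), source)]
    while heap:
        d, _, a = heapq.heappop(heap)
        if a == target:
            return d
        if d > best.get(a, math.inf):
            continue  # stale heap entry
        for b, w in adj.get(a, {}).items():
            nd = d + w
            if nd < best.get(b, math.inf):
                best[b] = nd
                heapq.heappush(heap, (nd, next(tie), b))
    return math.inf  # target not reachable from source

def extended_distance(coords, adj, u, v, lam, xi, metric=math.dist):
    """Extended distance between two non-node points u and v.

    coords: {node: coordinates}; adj: {node: {neighbor: weight}}.
    The metric argument stands in for the manifold distance (an assumption).
    """
    if u == v:
        return 0.0
    ext = {a: dict(nbrs) for a, nbrs in adj.items()}  # copy, keep the input intact
    for label, p in (("u", u), ("v", v)):
        ext[label] = {}
        for a, q in coords.items():
            d = metric(p, q)
            if d <= lam / 2:  # attach the new point to nearby nodes only
                w = d * (1 + xi ** 2) + xi ** 3
                ext[label][a] = w
                ext.setdefault(a, {})[label] = w
    return shortest_path_length(ext, "u", "v")

# Three nodes on a line, edges weighted by their Euclidean length.
coords = {0: (0.0, 0.0), 1: (0.1, 0.0), 2: (0.2, 0.0)}
adj = {0: {1: 0.1}, 1: {0: 0.1, 2: 0.1}, 2: {1: 0.1}}
print(extended_distance(coords, adj, (-0.01, 0.0), (0.21, 0.0), lam=0.05, xi=0.1))
print(extended_distance(coords, adj, (-0.01, 0.0), (0.21, 0.0), lam=0.001, xi=0.1))
```

In the first call the extended distance slightly exceeds the Euclidean distance 0.22 between the two query points, because of the inflated weights on the two new edges; in the second call the radius is too small for either query point to reach a node, so the result is infinite.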
Lemma 1
Let \(G_n={\mathbb {G}}_n(x^*,y_n^*,\delta _n)\) and \(Q>3\) be the constant from Definition 4.1. Then, there exists an event \(\varOmega _n\) satisfying \(\mathbb {P}\left( \varOmega _n\right) \ge 1-o(\delta _n^3)\) such that on this event the following holds:
- (\(\varOmega 1\)) \((\mathcal {B}_\mathcal {M}(x^*;Q\delta _n),{\widetilde{d}}_\mathcal {M})\) is a metric space and
- (\(\varOmega 2\)) \({\widetilde{d}}_\mathcal {M}(u,v)=d_\mathcal {M}(u,v)+o(\delta _n^3)\).
The first property ensures that our extended distance is an actual distance. Moreover, by the second property, this extended distance is a good approximation of the true distance on the manifold. Finally, we also note that the first property makes sure that \(d_G(x^*,y_n^*)={\widetilde{d}}_\mathcal {M}(x^*,y_n^*)<\infty \), so that the curvature \(\kappa \) between \(x^*\) and \(y_n^*\) is well defined and not forced to be zero. The precise definition of \(\varOmega _n\) is not needed to understand the high-level arguments or the proof of the main results. For now, let us refer to \(\varOmega _n\) as the good event. Details on this event can be found in Sect. 5.1.
Let \({\widetilde{W}}_1\) denote the Wasserstein metric with respect to \({\widetilde{d}}_\mathcal {M}\), which is only well defined on the good event \(\varOmega _n\). Since the distance is determined by the graph \(G_n=\mathbb {G}_n(x^*,y_n^*,\delta _n)\), the Wasserstein metric is also a random object. The following proposition states that, on the event \(\varOmega _n\), the difference between the Wasserstein metrics \({\widetilde{W}}_1\) and \(W_1\) is small. The proof is given in Sect. 5.1.
Proposition 1
Let \(G_n={\mathbb {G}}_n(x^*,\varepsilon _n)\) and \(\mu _1,\mu _2\) be two probability measures on \(\mathcal {M}\) with support contained in \(\mathcal {B}_\mathcal {M}(x^*;Q\delta _n)\). Then
Recall that \({\widetilde{d}}_\mathcal {M}(x,y)=d_G(x,y)\) if \(x,y\in G_n\), and therefore \(W_1^G(m_{x^*}^G,m_{y_n^*}^G)={\widetilde{W}}_1(m_{x^*}^G,m_{y_n^*}^G)\). Hence, since the uniform \(\delta _n\)-measures \(\mu _{x^*}^{\delta _n}\) and \(\mu _{y_n^*}^{\delta _n}\) are probability measures on \(\mathcal {M}\) with support contained in \(\mathcal {B}_\mathcal {M}(x^*;Q\delta _n)\), Proposition 1 implies that on the good event,
holds in expectation. This is helpful because both Wasserstein metrics in the expression on the right hand side are now defined on the same space. Therefore, since \({\widetilde{W}}_1\) is a distance, the reverse triangle inequality implies
Applying Proposition 1 again we get that
holds in expectation, conditioned on the good event. However, the right hand side no longer involves the extended distance. Hence, it now suffices to show that for any \(x\in \mathcal {B}_\mathcal {M}(x^*;\delta _n)\),
4.3 Approximating Probability Measures on Graph Balls
Recall that \(\mathcal {B}_\mathcal {M}(x;\delta _n)\) denotes the closed ball around \(x\in \mathcal {M}\) with radius \(\delta _n\) according to the manifold distance \(d_\mathcal {M}\). The first step in establishing (8) is to move from uniform measures on the graph balls \(\mathcal {B}_G(x;\delta _n)\) to uniform measures on the nodes of the graph that lie in the manifold balls \(\mathcal {B}_\mathcal {M}(x;\delta _n)\). The reason for this is that \(y\in \mathcal {B}_G(x;\delta _n)\) does not necessarily imply that \(y\in \mathcal {B}_\mathcal {M}(x;\delta _n)\), nor vice versa. This creates difficulties when comparing the measures \(m_x^G\) and \(\mu _x^{\delta _n}\).
Let \(G_n=\mathbb {G}_n(x^*,\varepsilon _n)\) be rooted random graphs on \(\mathcal {M}\). Then we define the probability measures \({\varvec{m}}^\mathcal {M}\) on the nodes of \(G_n\) as
Although the uniform measures \(m_{x^*}^G\) and \(m_{x^*}^\mathcal {M}\) are not necessarily equal, the Wasserstein metric between them is sufficiently small.
Proposition 2
Let \(G_n=\mathbb {G}_n(x^*,\varepsilon _n)\) be rooted random graphs on \(\mathcal {M}\) with graph distance \(d_G\) that is a \(\delta _n\)-good approximation of \(d_\mathcal {M}\). Let \(x\in \mathcal {B}_\mathcal {M}(x^*;\delta _n)\) and denote by \(m_{x}^G\) the uniform measure on \(\mathcal {B}_G(x;\delta _n)\) and by \(m_{x}^\mathcal {M}\) the uniform measure on \(\mathcal {B}_\mathcal {M}(x;\delta _n)\cap G_n\). Then
The proof of this result is based on some simple computations regarding Poisson random variables and can be found in Sect. 5.2. Proposition 2 allows us to replace (8) with
Note that the only dependence on the graph is now in the number of nodes placed inside the ball \(\mathcal {B}_\mathcal {M}(x;\delta _n)\), which is completely determined by the Poisson process. All dependencies on the actual structure of the graph have been removed. This allows us to compute the Wasserstein metric between \(m_x^\mathcal {M}\) and \(\mu _x^{\delta _n}\).
4.4 Coupling Continuous and Discrete Probability Measures on \(\mathcal {M}\)
Recall that the Wasserstein metric \(W_1(\mu _1,\mu _2)\) takes an infimum over all possible joint distributions (couplings) between the measures \(\mu _1\) and \(\mu _2\). Hence, to show that (10) holds, we need to design an optimal coupling (transport plan) between \(m_x^\mathcal {M}\) and \(\mu _x^{\delta _n}\). The main idea here is to view \(m_x^\mathcal {M}\) as a discrete version of \(\mu _x^{\delta _n}\).
For now, let us assume that we are working in the N-dimensional Euclidean cube \(\mathcal {M}=[0,1]^N\). Given a realization of the Poisson process, a transport plan between \(m_x^\mathcal {M}\) and \(\mu _x^{\delta _n}\) should assign to each measurable set \(A\subseteq \mathcal {B}_\mathcal {M}(x;\delta _n)\) how much of the associated mass \(\mu _x^{\delta _n}(A)\) is transported to each point of the Poisson process. To make it optimal, we should distribute the mass over those points that are closest to A. This problem is actually related to that of finding a minimal matching between points of a Poisson process and points of a grid on \([0,1]^N\), see [22, 35, 39]. Here, minimal means that the largest distance between a point of the Poisson process and its matched grid point is minimized. The idea for the transport plan is as follows:
- Place a grid on \([0,1]^N\).
- Find a minimal matching between the Poisson process and the grid.
- Given a set \(A\subseteq \mathcal {B}_\mathcal {M}(x;\delta _n)\), we take all points of the Poisson process that are matched to grid points inside A and distribute the mass \(\mu _x^{\delta _n}(A)\) equally over those points.
Using known results for minimal matchings, it can then be shown that, under suitable conditions, the Wasserstein metric between \(m_x^\mathcal {M}\) and \(\mu _x^{\delta _n}\) is \(o(\delta _n^3)\).
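The grid-matching idea is easiest to see in one dimension. The sketch below is a simplified stand-in for the construction above, not the paper's argument: it uses a fixed number n of points on \([0,1]\) instead of a Poisson process, splits \([0,1]\) into n equal cells, matches the i-th cell to the i-th smallest point, and returns the cost of moving the uniform mass of each cell to its matched point. In one dimension this monotone matching is an optimal transport plan, so the returned value equals the Wasserstein distance between the uniform measure and the empirical measure. The function names are hypothetical.

```python
import random

def cell_cost(a, b, p):
    """Integral of |t - p| over the cell [a, b]."""
    if p <= a:
        return ((b - p) ** 2 - (a - p) ** 2) / 2
    if p >= b:
        return ((p - a) ** 2 - (p - b) ** 2) / 2
    return ((p - a) ** 2 + (b - p) ** 2) / 2

def transport_cost_1d(points):
    """Cost of sending the uniform measure on [0, 1] to the empirical measure
    of `points`, matching the i-th grid cell to the i-th smallest point."""
    pts = sorted(points)
    n = len(pts)
    return sum(cell_cost(i / n, (i + 1) / n, p) for i, p in enumerate(pts))

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, for reproducibility only
    for n in (10, 100, 1000, 10000):
        sample = [rng.random() for _ in range(n)]
        print(n, transport_cost_1d(sample))
```

For points placed exactly at the cell centers the cost is \(1/(4n)\), the smallest value any n-point configuration can achieve in this setting; random samples give larger values that still shrink as n grows.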
Finally, we need to extend these results in flat Euclidean space to the ball \(\mathcal {B}_\mathcal {M}(x;Q\delta _n)\) in general \(\mathcal {M}\). For this we use that \(\delta _n\rightarrow 0\) and that small neighborhoods of \(x\in \mathcal {M}\) can be mapped diffeomorphically to the flat N-dimensional tangent space by the exponential map \(\exp _x:T_x\mathcal {M}\rightarrow \mathcal {M}\). We then apply the matching results there and map back. Here we need to tread carefully, since the exponential map does not preserve distances. We thus fix a sufficiently small neighborhood U around the origin of the tangent space at x. Then, for some fixed \(0<\xi < 1\) and large enough n we have
where \(\mathcal {B}_N(0;\delta )\) is the Euclidean ball of radius \(\delta \). This then yields matching upper and lower bounds on the Wasserstein metric on \(\mathcal {M}\) in terms of the Wasserstein metric on the Euclidean space. All details of this approach are provided in Sect. 5.3. In the end we obtain the following result.
Proposition 3
For any point \(x\in \mathcal {M}\),
4.5 Proof of the Main Results
We now have all ingredients to prove the main results. We start with Theorem 3, where we bound the expression inside the expectation as a sum of several terms and use the above results and the fact that \(d_G\) is a \(\delta _n\)-good approximation to show that each individual term goes to zero.
Proof of Theorem 3
First, we bound the term inside the expectation as follows:
The last term is deterministic and goes to zero by Theorem 4. For the first term we note that the absolute value of each curvature term can be bounded from above by 2. Now let \(C_n\) denote the event that \(x^*\) and \(y_n^*\) are connected. Since this is implied by the good event \(\varOmega _n\), see Lemma 1, it follows that \(C_n^c\subseteq \varOmega _n^c\), where the superscript c denotes the complement of the event. Moreover, on the event \(C_n^c\), \(\kappa (x^*,y_n^*,{\mathcal {G}}_n)=0\) by definition. Finally, since \(d_G\) is a \(\delta _n\)-good approximation it follows that \(\delta _n^2=d_\mathcal {M}(x^*,y_n^*)^2\le d_G(x^*,y_n^*)^2\). Therefore, we have
and
It then follows that
By construction of the good event we have \(1-\mathbb {P}\left( \varOmega _n\right) =o(\delta _n^3)\) and thus, the last term in the above bound goes to zero. For the other term we recall that
Then the expression inside the conditional expectation can be bounded as follows:
Next, since \(d_G\) is a \(\delta _n\)-good approximation, we can apply (4)
Since \(W_1^G(m_{x^*}^G,m_{y_n^*}^G)\le \delta _n\) it then follows that the second term in (11) goes to zero. For the first term we have
which implies that this term also goes to zero. We are thus left with (12), for which we have to show that
We first replace \(W_1(\mu _{x^*}^{\delta _n},\mu _{y_n^*}^{\delta _n})\) with \({\widetilde{W}}_1(\mu _{x^*}^{\delta _n},\mu _{y_n^*}^{\delta _n})\) by invoking Proposition 1:
This then implies
To show that the first term in the upper bound is also \(o(\delta _n^3)\) we apply the reverse triangle inequality twice to obtain
We proceed to show that \(\widetilde{W}_1(m_{x^*}^G,\mu _{x^*}^{\delta _n})=o(\delta _n^3)\) holds in expectation on the event \(\varOmega _n\). The proof for \({\widetilde{W}}_1(m_{y_n^*}^G,\mu _{y_n^*}^{\delta _n})\) is similar. Applying Proposition 1 again we get
Since both expectations are \(o(\delta _n^3)\) by, respectively, Propositions 2 and 3, we conclude that
which finishes the proof. \(\square \)
Now that we have the general result, Theorems 1 and 2 directly follow from Theorem 3 if we can show that the graph distances that are considered there are \(\delta _n\)-good approximations.
Throughout the remainder of this section we will assume that
for some \(a,b\in \mathbb {R}\) and \(0\le \alpha ,\beta \le 1\). We shall also assume that \(\varepsilon _n\le \delta _n\). The following results show that for appropriate choices of the constants a, b and \(\alpha , \beta \) both the weighted manifold and the rescaled hopcount distance are \(\delta _n\)-good approximations. The proofs are given in Sects. 5.4 and 5.5, respectively.
Proposition 4
Suppose the constants in \(\varepsilon _n\) and \(\delta _n\) satisfy
with \(a\le b\) if \(\alpha =\beta \) and \(a+2b>2/N\) if \(\alpha +2\beta =1/N\). Let \(y_n^*\in \mathcal {M}\) be at distance \(\delta _n\) in the direction of \(\textbf{v}\) and \(G_n=\mathbb {G}_n(x^*,y_n^*,\varepsilon _n)\) be rooted random graphs on \(\mathcal {M}\). Then the manifold-weighted graph distance \(d_G^w\) on \(G_n\) is a \(\delta _n\)-good approximation of \(d_\mathcal {M}\).
Proposition 5
Suppose the constants in \(\varepsilon _n\) and \(\delta _n\) satisfy
and \(a<3b\) if \(\alpha =3\beta \) and \(2a+3b>1\) if \(\alpha =(1-3\beta )/2\). Let \(y_n^*\in \mathcal {M}\) be at distance \(\delta _n\) in the direction of \(\textbf{v}\). Let \(G_n=\mathbb {G}_n(x^*,y_n^*,\varepsilon _n)\) be rooted random graphs on a 2-dimensional Riemannian manifold \(\mathcal {M}\) and denote by \(d_G^s\) the shortest path distance. Then the \(\varepsilon _n\)-weighted graph distance \(d_G^*:=\varepsilon _nd_G^s\) on \(G_n\) is a \(\delta _n\)-good approximation of \(d_\mathcal {M}\).
Observe that the conditions of the constants in Propositions 4 and 5 are exactly the same as in Theorems 1 and 2, respectively. Moreover, these conditions imply that \(\lambda _n=o(\varepsilon _n)\) and \(\lambda _n=o(\delta _n^3)\), with \(\lambda _n\) as defined in (5), as we will now demonstrate.
In Proposition 4 we have \(\beta >0\) and \(\alpha +2\beta \le 1/N\). It then follows that \(\alpha <1/N\) which implies \(\lambda _n=o(\varepsilon _n)\). When the inequality \(3\beta \le \alpha +2\beta \le 1/N\) is strict we have that \(\lambda _n=o(\delta _n^3)\). When \(3\beta =1/N\) it must be that \(\alpha +2\beta =1/N\) and hence the conditions of Proposition 4 imply that \(3b\ge a+2b>2/N\). From this we deduce that \(\lambda _n/\delta _n^3=\varTheta \hspace{0.33325pt}((\log n)^{2/N-a-2b})=o(1)\).
In Proposition 5, since \(N = 2\), the conditions \(\lambda _n=o(\varepsilon _n)\) and \(\lambda _n=o(\delta _n^3)\) follow if \(\alpha <1/2\) and \(3\beta <1/2\). The first inequality holds since \(\beta >0\) and \(\alpha \le (1-3\beta )/2\), while the second is due to the fact that \(3\beta \le 3/9=1/3\).
We thus conclude that under the conditions in both propositions, the radii satisfy the conditions of Theorem 3. Hence, Theorems 1 and 2 follow from it.
5 Proofs
Here we prove all the intermediate results that we used to prove our main results in the previous section. We start with the proofs of Lemma 1 and Proposition 1 in Sect. 5.1. In Sect. 5.2 we provide the details for Proposition 2, while the proof of Proposition 3 is given in Sect. 5.3. We end with Sects. 5.4 and 5.5 where we prove Propositions 4 and 5, respectively, leading to the main results of this paper.
Recall that
\[\lambda _n=(\log n)^{2/N}n^{-1/N}\]
and \(\varepsilon _n\le \delta _n\rightarrow 0\) are such that \(\lambda _n=o(\varepsilon _n)\) and \(\lambda _n=o(\delta _n^3)\).
5.1 Extended Graph Distance
Our first goal is to prove Lemma 1. We start by showing that there exists a radius \(r_n\rightarrow 0\) such that for any finite set of points \(u\in \mathcal {M}\), the balls \(\mathcal {B}_\mathcal {M}(u;r_n)\) will still each contain at least one node from the rooted graphs \(G_n={\mathbb {G}}_n(x^*,y_n^*,\varepsilon _n)\). We need \(r_n\) to decrease because the connection radius \(\varepsilon _n\) also decreases and we want the ball \(\mathcal {B}_\mathcal {M}(u;r_n)\) to be contained inside the connection area of the point u.
Lemma 2
Let \(U\subset \mathcal {M}\) be a finite set of points in \(\mathcal {M}\) such that \(|U|=O(n^c)\), for some \(c>0\), and let \(r_n=\varTheta (\lambda _n)\). Then, for \(G_n={\mathbb {G}}_n(\varepsilon _n)\),
as \(n\rightarrow \infty \).
Proof
First note that for \(r_n\) small enough the ball \(\mathcal {B}_\mathcal {M}(u;r_n)\) can be mapped diffeomorphically onto the tangent space \(T_u\mathcal {M}\) at u. In particular, for small enough \(r_n\) we have that, as \(n\rightarrow \infty \), \(\textrm{vol}_\mathcal {M}(\mathcal {B}_\mathcal {M}(u;r_n))=\varTheta (r_n^N)=\varTheta (\lambda _n^N)\). Next, since the nodes in \(G_n\) are placed according to a Poisson process with intensity \(n/{\textrm{vol}_\mathcal {M}(\mathcal {M})}\) it follows that
Therefore, by applying the union bound we get
To finish the proof we note that \(e^{-\varTheta ((\log n)^2)+c\log n}=o(\lambda _n)\) which by assumption is \(o(\delta _n^3)\). \(\square \)
With this lemma we obtain the following corollary.
Corollary 2
There exists a collection \(\{B_1,\dots ,B_m\}\) of \(m=\varTheta (\lambda _n^{-N})\) balls of radius \(\lambda _n/4\) that cover \(\mathcal {M}\), such that if we denote by \(c_1,\dots ,c_m\) their centers and define the event
Then \(\mathbb {P}\left( C_n\right) =1-o(\delta _n^3)\).
Proof
The collection is constructed using the standard trick of taking a maximal set of disjoint balls of radius \(\lambda _n/8\) in \(\mathcal {M}\). Denote their centers by \(c_1,\dots ,c_m\). Simple volume comparison, and the compactness of \(\mathcal {M}\), gives \(m=O(\lambda _n^{-N})\). By construction, the balls \(B_i=\mathcal {B}_\mathcal {M}(c_i;\lambda _n/4)\) then cover \(\mathcal {M}\), and hence \(m=\varTheta (\lambda _n^{-N})=\varTheta \hspace{0.33325pt}((\log n)^{-2}n)=O(n)\). The result then follows from Lemma 2. \(\square \)
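The packing-to-covering trick in this proof can be sketched as follows, under simplifying assumptions of ours: the manifold is replaced by a finite set of candidate points in the Euclidean unit square, and the maximal packing is built greedily. Centers are added as long as the balls of radius r around them stay pairwise disjoint (centers more than 2r apart); once no further center can be added, every candidate lies within 2r of some center. With \(r=\lambda _n/8\) this gives a cover by balls of radius \(\lambda _n/4\). The function name `greedy_centers` is hypothetical.

```python
import math

def greedy_centers(candidates, r):
    """Greedily pick centers whose closed balls of radius r are pairwise
    disjoint, i.e. whose pairwise distances all exceed 2 * r. The result is
    maximal among the candidates: no further candidate can be added."""
    centers = []
    for p in candidates:
        if all(math.dist(p, c) > 2 * r for c in centers):
            centers.append(p)
    return centers

# Candidate points: a 21 x 21 grid on the unit square (a stand-in for M).
grid = [(i / 20, j / 20) for i in range(21) for j in range(21)]
lam = 0.4
centers = greedy_centers(grid, lam / 8)

# Every candidate is within 2 * (lam / 8) = lam / 4 of some center.
covered = all(
    min(math.dist(p, c) for c in centers) <= 2 * (lam / 8) for p in grid
)
print(len(centers), covered)
```

The covering property holds here only for the finite candidate set; the proof applies the same argument to a maximal packing of the whole manifold.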
The event \(C_n\) will play a crucial part in defining the good event \(\varOmega _n\). Let \(D_n\) denote the event on which (4) holds. Then we define the good event as
On this event, with sufficiently high probability, \((\mathcal {B}_\mathcal {M}(x^*;Q\delta _n),{\widetilde{d}}_\mathcal {M})\) is a metric space for any constant \(Q>0\) and the extended distance \({\widetilde{d}}_\mathcal {M}\) is a good approximation of the original distance \(d_\mathcal {M}\). Note that we do not need to consider the whole manifold since curvature is a local property.
Lemma 3
Let \(\varOmega _n\) be the event defined in (14) and \(Q>3\) the constant from Definition 4.1. Then on the event \(\varOmega _n\),
- each pair of points \(u,v\in \mathcal {B}_\mathcal {M}(x^*;Q\delta _n)\) is connected by a path in the extended graph \(G_n(u,v)\), and
- \((\mathcal {B}_\mathcal {M}(x^*;Q\delta _n),{\widetilde{d}}_\mathcal {M})\) is a metric space.
Proof
We first prove the first statement. For this, take any \(u,v\in \mathcal {B}_\mathcal {M}(x^*;Q\delta _n)\) and let \(\gamma (u,v)\) denote the geodesic between u and v. This geodesic will be covered by a subsequence \(B_{t_1},\dots ,B_{t_k}\) of the cover of \(\mathcal {M}\), which we rank in order of appearance moving from u to v. Let \(c_{t_1},\dots ,c_{t_k}\) denote the corresponding centers of these balls, see Fig. 2. On the event \(C_n\) each ball contains a vertex \(x_{t_i}\in G_n\) and since
the edges \((u, x_{t_1})\) and \((v,x_{t_k})\) are present in \(G_n(u,v)\). Moreover, since \(d_\mathcal {M}(x_{t_i},x_{t_{i+1}})\) is bounded by four times the radius of the balls, it follows that for large enough n, \(d_\mathcal {M}(x_{t_i},x_{t_{i+1}})\le \lambda _n=o(\varepsilon _n)\) and thus, for n large enough, \(\{x_{t_1},\dots ,x_{t_k}\}\) is a path in \(G_n\). We thus conclude that u and v are connected in \(G_n(u,v)\). Note that because of this property, on the event \(\varOmega _n\), the extended manifold distance \({\widetilde{d}}_\mathcal {M}\) is well defined on \(\mathcal {M}\).
We are left to show that on the event \(\varOmega _n\), the extended manifold distance is a true distance. Note that the only non-trivial part is the triangle inequality. Let \(u,v,z\in \mathcal {B}_\mathcal {M}(x^*;Q\delta _n)\) and consider the graphs \(G^{(1)}=G_n(u,v)\) and \(G^{(2)}=G_n(u,v,z)\). Now observe that the triangle inequality can only be violated if z creates a short-cut, i.e., if the shortest weighted path between u and v in \(G^{(1)}\) is longer than in \(G^{(2)}\). Suppose that this is true, and let \(\pi _1=\{u,\dots ,y_1,z,y_2,\dots ,v\}\) denote this new weighted shortest path in \(G^{(2)}\). Since \(y_1\) and \(y_2\) are connected to z in \(G^{(2)}\) it follows that \(d_\mathcal {M}(z,y_i)\le \lambda _n/2\). However, by the triangle inequality for \(d_\mathcal {M}\), this implies that \(d_\mathcal {M}(y_1,y_2)\le \lambda _n=o(\varepsilon _n)\) and hence, for sufficiently large n, the edge \((y_1, y_2)\) is present in \(G_n\) and thus also in \(G^{(1)}\) and \(G^{(2)}\).
Let \({\hat{\pi }}=\{y_1:=x_0,x_1,\dots ,x_{m-1},y_2:=x_m\}\) denote the shortest weighted path in \(G_n\) between \(y_1\) and \(y_2\), i.e., \(d_G(y_1,y_2)=\sum _{t=1}^mw_{x_{t-1}x_t}\), and take \(\pi _2=\{u,\dots ,y_1,x_1,\dots ,x_{m-1},y_2,\dots ,v\}\). Then \(\pi _2\) is a path between u and v that excludes z. See also Fig. 3. We will show that the total weight of this path is at most that of \(\pi _1\).
For simplicity let us denote by \(\Vert \pi \Vert \) the total weight of a path \(\pi \). Since \(d_G\) is a \(\delta _n\)-good approximation,
holds on the event \(\varOmega _n\). Applying the triangle inequality for \(d_\mathcal {M}\) we get
This implies that the total weight of the path \(\pi _2\) is at most that of \(\pi _1\) from which we conclude that z cannot create a short-cut and hence \({\widetilde{d}}_\mathcal {M}\) satisfies the triangle inequality. \(\square \)
We are now ready to prove Lemma 1.
Proof of Lemma 1
Note that for any two nodes \(u,v\in G_n\) with \(u,v\in \mathcal {B}_\mathcal {M}(x^*;Q\delta _n)\), Lemma 3 implies that u and v are connected by a path in \(G_n\). Hence the only part of Lemma 1 to prove is property (\(\varOmega \)1) there.
Take any \(u,v\in \mathcal {B}_\mathcal {M}(x^*;3\delta _n)\). Then on the event \(\varOmega _n\), by definition of the extended distance \({\widetilde{d}}_\mathcal {M}\), there exist \(x_u,x_v\in G_n\) such that \(d_\mathcal {M}(u,x_u)\le \lambda _n/2\), \(d_\mathcal {M}(v,x_v)\le \lambda _n/2\), and
Moreover, since \(Q>3\) and \(\lambda _n=o(\delta _n^3)\) we can assume that \(x_u,x_v\in \mathcal {B}_\mathcal {M}(x^*;Q\delta _n)\), for sufficiently large n. Since the approximation (4) holds on the event \(\varOmega _n\), we have
Combining (15) and (16) we get
Applying the triangle inequality to the last distance,
we get
Finally, we need to prove Proposition 1. Since, on the event \(\varOmega _n\),
the proof follows immediately from the following elementary result on Wasserstein metrics.
Lemma 4
Let \((\mathcal {X},d)\) and \((\mathcal {X},{\widetilde{d}})\) be two metric spaces such that
holds for all \(x,y\in \mathcal {X}\) and some \(K>0\). Denote by \(W_1\) and \({\widetilde{W}}_1\) the Wasserstein metric associated with d and \({\widetilde{d}}\), respectively. Then for any two probability measures \(\mu _1\) and \(\mu _2\) on \(\mathcal {X}\),
Proof
For any coupling \(\mu \) between \(\mu _1\) and \(\mu _2\),
and similarly
Then it follows that
and similarly
from which the result follows. \(\square \)
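Lemma 4 can be sanity-checked numerically on small point sets. The following is a minimal sketch, assuming the hypothesis of the lemma is the uniform additive bound \(|d(x,y)-{\widetilde{d}}(x,y)|\le K\) and the conclusion is \(|W_1-{\widetilde{W}}_1|\le K\). The helper names `w1_uniform` and `rounded_metric`, the choice of \({\widetilde{d}}\) as the Euclidean distance rounded up to a multiple of K, and all numerical values are illustrative assumptions and not part of the paper.

```python
import itertools
import math
import random


def w1_uniform(xs, ys, dist):
    """W1 between the uniform measures on two equal-size point lists.

    For uniform measures on n points each, an optimal coupling can be taken
    to be a permutation, so brute force over all matchings is exact.
    """
    n = len(xs)
    return min(
        sum(dist(xs[i], ys[perm[i]]) for i in range(n))
        for perm in itertools.permutations(range(n))
    ) / n


def rounded_metric(h):
    """Euclidean distance rounded up to a multiple of h (still a metric)."""
    def dist(p, q):
        return h * math.ceil(math.dist(p, q) / h)
    return dist


random.seed(1)
K = 0.05  # hypothetical uniform bound on |d - d_tilde|
xs = [(random.random(), random.random()) for _ in range(6)]
ys = [(random.random(), random.random()) for _ in range(6)]

w = w1_uniform(xs, ys, math.dist)                # W1 for the Euclidean metric d
w_tilde = w1_uniform(xs, ys, rounded_metric(K))  # W1 for the rounded metric
gap = abs(w - w_tilde)
print(f"W1 = {w:.4f}, rounded W1 = {w_tilde:.4f}, gap = {gap:.4f}, K = {K}")
```

The rounded metric loosely mimics a rescaled hop-count distance, which only takes values in multiples of the connection radius. Brute force over permutations is used purely to keep the sketch dependency-free; it is only feasible for a handful of points.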
5.2 Probability Measures on Graphs
In this section we give the proof of Proposition 2. Recall that \(m_x^G\) and \(m_x^\mathcal {M}\) denote the uniform probability measures on the set of nodes in \(\mathcal {B}_G(x;\delta _n)\) and \(\mathcal {B}_\mathcal {M}(x;\delta _n)\), respectively. The goal is then to show that
As we mentioned, these two sets are not necessarily contained in each other. Hence, to bound the Wasserstein metric we will work with slightly smaller and larger balls \(B^-\) and \(B^+\) such that
We can then obtain an upper bound by comparing the Wasserstein metric between \(m_x^G\), \(m_x^\mathcal {M}\), and the uniform probability measure on \(B^+\cap G_n\). This bound can be made \(o(\delta _n^3)\), by carefully selecting the radii of \(B^-\) and \(B^+\).
Before we give the details, we need the following general result concerning Poisson random variables.
Lemma 5
Let \(\alpha _n,\beta _n\rightarrow \infty \) and \(X_n,Y_n\) be two independent Poisson random variables with means \(\alpha _n\) and \(\beta _n\), respectively. Then
Proof
First, let \(C>\sqrt{2}\) be some large fixed constant. Then we have that (cf. [32, Lemma 2.1])
In particular, if we define \(\alpha _n^\pm =\alpha _n\pm C\sqrt{\alpha _n\log \alpha _n}\), then
Similar results hold for \(Y_n\) with \(\beta _n^\pm \) defined similarly. We start by conditioning on \(X_n\):
We will bound each term separately.
First we bound the expectation inside each summation by further conditioning on \(Y_n\):
because \(C>\sqrt{2}\). We can now bound \(I_n^{(1)}\) as follows:
where we used that \(\beta _n^-\sim \beta _n\), i.e., \(\beta _n^-/\beta _n\rightarrow 1\). For \(I_n^{(2)}\) we have, using that \(\alpha _n^-\sim \alpha _n\),
and thus the result follows since we are free to select \(C>\sqrt{2}\) large enough so that \(I_n^{(1)}\) is of smaller order. \(\square \)
We are now ready to prove Proposition 2.
Proof of Proposition 2
Let \(\delta _n^\pm =(\delta _n\pm \xi _n^3)/(1\mp \xi _n^2)\) and let \(D_n\) be the event on which approximation (4) of Definition 4.1 holds and recall that \(\varOmega _n\subset D_n\). Therefore, since \(\mathbb {P}\left( \varOmega _n\right) \rightarrow 1\),
and so it suffices to look at \(\mathbb {E}\bigl [W_1(m_x^G,m_x^\mathcal {M})\mathbbm {1}_{\{D_n\}}\bigr ]\).
Note that on the event \(D_n\),
Let \(V_n\subseteq \mathcal {M}\) be any neighborhood of x such that \(\textrm{vol}_\mathcal {M}(\mathcal {B}_n)=\varTheta (\delta _n^N)\) and
where \(\mathcal {B}_n=V_n\cap G_n\). Denote by \(m_n\) the uniform probability measure on \(\mathcal {B}_n\). We will prove that
Since
applying (17) twice, once with \(\mathcal {B}_n=\mathcal {B}_G(x;\delta _n)\) and once with \(\mathcal {B}_n=\mathcal {B}_\mathcal {M}(x;\delta _n)\cap G_n\), will yield the required result.
Let us write \(\mathcal {B}_n^\pm :=\mathcal {B}_\mathcal {M}(x;\delta _n^\pm )\cap G_n\) and denote by \(m_x^\pm \) the uniform probability measure on \(\mathcal {B}_n^\pm \). To establish (17) we will show that
Note that by definition of \(\delta _n^\pm \) we have \((\delta _n^+)^N-(\delta _n^-)^N=O(\xi _n^2\delta _n^N)\). Therefore, if (18) holds,
since \(\xi _n=o(\delta _n)\). To establish (18) we condition on \(|\mathcal {B}_n^-|\):
For the first term we have
where we used that \(\mathbb {E}\hspace{0.33325pt}[|\mathcal {B}_n^-|]=n{\text {vol}}_\mathcal {M}(\mathcal {B}_n^-)=n\varTheta ((\delta _n^-)^N)\). It now suffices to show that
We will do this by constructing a specific transport plan (coupling) between the measures \(m_n\) and \(m_x^+\). Define the joint probability mass function on \({\mathcal {B}_n\times \mathcal {B}_n^+}\):
and observe that m(u, v) is a coupling between \(m_n\) and \(m_x^+\). Therefore
Now define \(X_n=|\mathcal {B}_n^+\setminus \mathcal {B}_n^-|\) and \(Y_n=|\mathcal {B}_n^-|\). Then \(X_n\) and \(Y_n\) are independent Poisson random variables satisfying
It then follows from Lemma 5 that
Equation (19) then follows by noting that \(\textrm{vol}_\mathcal {M}(\mathcal {B}_n^+\setminus \mathcal {B}_n)=\varTheta \hspace{0.33325pt}((\delta _n^+)^N-(\delta _n^-)^N)\). \(\square \)
5.3 Continuous and Discrete Measures on \(\mathcal {M}\)
5.3.1 Collecting Relevant Known Results
The following is a summary of results on the Wasserstein metric between empirical and uniform measures on the N-dimensional cube. The case \(N=2\) was explicitly stated in [39]. Although the results for \(N\ge 3\) are known, they are not stated in the explicit form we need. For completeness we thus include a proof here.
Proposition 6
Let \(X_1,X_2,\dots \) be independent uniformly distributed random variables on \([0,1]^N\), let \(m_n\) denote the empirical measure
$$\begin{aligned} m_n=\frac{1}{n}\sum _{i=1}^n\delta _{X_i}, \end{aligned}$$
and \(\mu \) the uniform measure on \([0,1]^N\). Then
Proof
The result for \(N=2\) follows from [39, (1.1)], see also the results in [22, 35]. For \(N\ge 3\) we let \(Y_1,Y_2,\dots \) be independent uniformly distributed random variables on \([0,1]^N\) and define
where the infimum is taken over all permutations \(\sigma \) of \(\{1,2,\dots ,n\}\). Then, it follows from [38, Lemma 1] that
where \(Lip _1\) now denotes the set of Lipschitz continuous functions with constant 1, with respect to the Euclidean distance \(d_N\).
Next, we recall the duality formula for the Wasserstein metric on the space \(\mathcal {X}\),
Since
we have
and hence
Finally [38, Thm. 1] implies for \(N\ge 3\),
which then yields
5.3.2 Uniform and Discrete Measures on the Unit Cube
We first extend Proposition 6 to the case where the points correspond to a Poisson process. We will actually prove a slightly more general version which allows for intensities \((1+o(1)) n\).
Lemma 6
Consider the N-dimensional unit cube \([0,1]^N\), with \(N\ge 2\), and consider a Poisson process \(\mathcal {P}\) with intensity measure \((1+f_n) n\,\textrm{d}\textrm{vol}_N\) on \([0,1]^N\), for some sequence \(f_n\rightarrow 0\). Let \(m^N_\mathcal {P}\) denote the empirical random measure with respect to \(\mathcal {P}\), i.e.,
and \(\mu ^N\) the uniform measure on the cube \([0,1]^N\). Then, as \(n\rightarrow \infty \),
Proof
We shall establish the result by conditioning on the size \(|\mathcal {P}|\) which has a Poisson distribution with mean \((1+f_n) n\). Conditioned on \(|\mathcal {P}|=k\), each point is uniformly distributed and therefore it follows from Proposition 6 that as \(k_n\rightarrow \infty \)
Recall the Chernoff concentration result [32, Lemma 1.2] for a Poisson random variable \(\textrm{Po}(a)\) with mean a:
Fix a \(c > 0\). Then by (21) with \(a=(1+f_n) n\) and \(x=c\sqrt{(1+f_n) n\log n}\),
Therefore, if we define
it follows that
and similarly
We shall use this and the upper bound (20) for \(\mathbb {E}\bigl [W_1^N(m^N_\mathcal {P},\mu ^N)\,|\,|\mathcal {P}|=k_n\bigr ]\) to compute an upper bound for \(\mathbb {E}\bigl [W_1^N(m^N_\mathcal {P},\mu ^N)\bigr ]\) as follows:
Since any two points in \([0,1]^N\) are at most at distance \(\sqrt{N}\), we have for \(I_1\)
while for \(I_3\) we get, using (20),
The main contribution comes from \(I_2\) for which we use that \(k\mapsto {\mathbb {P}}(\textrm{Po}((1+f_n) n)=k)\) is concave on \([a_n^-,a_n^+]\) and attains its maximum at \(k=(1+f_n) n\) to obtain
where we used (20) with \(k_n=(1+f_n) n\) for the first line and Stirling’s approximation for n! for the second line. Since \(c>0\) was arbitrary we conclude that
\(\square \)
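The conditioning step in the proof above relies on \(|\mathcal {P}|\) falling, with high probability, inside a window of half-width \(c\sqrt{(1+f_n) n\log n}\) around its mean. The sketch below only illustrates this concentration numerically by summing the exact Poisson probabilities; it does not reproduce the bound (21). The function names and the values of n and c are hypothetical, and \(f_n=0\) is assumed for simplicity.

```python
import math


def poisson_pmf(k, mean):
    """Poisson probability mass at k, computed in log-space for stability."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def mass_outside_window(mean, half_width):
    """P(|Po(mean) - mean| > half_width), via the pmf summed inside the window."""
    lo = max(0, math.ceil(mean - half_width))
    hi = math.floor(mean + half_width)
    inside = sum(poisson_pmf(k, mean) for k in range(lo, hi + 1))
    return max(0.0, 1.0 - inside)


n = 10_000   # hypothetical sample size, taking f_n = 0
c = 1.0      # hypothetical constant c > 0
half_width = c * math.sqrt(n * math.log(n))

tail = mass_outside_window(n, half_width)
tail_wider = mass_outside_window(n, 2 * half_width)
print(f"half-width = {half_width:.1f}, tail = {tail:.2e}, wider tail = {tail_wider:.2e}")
```

Doubling the constant c shrinks the mass outside the window sharply, which is consistent with the proof treating the contributions \(I_1\) and \(I_3\) from outside the window as lower order.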
5.3.3 Uniform and Discrete Measures on the Ball \(\mathcal {B}_\mathcal {M}(x;\delta _n)\)
The following result follows from Lemma 6 by a simple rescaling argument.
Corollary 3
Let \(r_n\rightarrow 0\) and consider a Poisson process \(\mathcal {P}\) with intensity n on the N-dimensional square \([0,2r_n]^N\). Let \(m^N_\mathcal {P}\) denote the empirical measure on the square \([0,2r_n]^N\) with respect to \(\mathcal {P}\), i.e.,
and \(\mu ^N\) the uniform measure on the square \([0,2r_n]^N\). Then
Proof
Consider the map \(\phi :[0,2 r_n]^N\rightarrow [0,1]^N\) defined by \(\phi (x)=r_n^{-1} x/2\). Then \(\phi (\mathcal {P})\) is a Poisson Point Process on \([0,1]^N\) with intensity measure \(2^Nr_n^Nn\). Now let \({\hat{m}}^N_\mathcal {P}=m^N_\mathcal {P}\circ \phi ^{-1}\) and \({\hat{\mu }}^N=\mu ^N\circ \phi ^{-1}\) denote, respectively, the empirical measure with respect to \(\phi (\mathcal {P})\) and the uniform measure on \([0,1]^N\). It follows from Lemma 6 that
Since for any \(x,y\in [0,2r_n]^N\) we have \(d_N(\phi (x),\phi (y))=2^{-1}r_n^{-1}d_N(x,y)\) it follows that
because \(r_n\rightarrow 0\). \(\square \)
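The rescaling step in this proof uses the fact that the Wasserstein distance scales linearly with the underlying metric: since \(\phi \) divides all distances by \(2r_n\), it divides \(W_1\) by the same factor. A minimal one-dimensional sketch of this scaling is given below. It is an illustration only: the corollary concerns dimension \(N\ge 2\) and compares an empirical measure with the uniform measure, whereas the sketch compares two empirical samples on an interval, where the sorted matching is optimal. The value of `r` and the helper `w1_sorted` are hypothetical.

```python
import random


def w1_sorted(xs, ys):
    """W1 between empirical measures of two equal-size samples on the line.

    In one dimension the monotone (sorted) matching is optimal.
    """
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


random.seed(0)
r = 0.01  # hypothetical value of r_n
xs = [random.uniform(0, 2 * r) for _ in range(200)]
ys = [random.uniform(0, 2 * r) for _ in range(200)]


def phi(x):
    """One-dimensional analogue of the map phi(x) = x / (2 r_n)."""
    return x / (2 * r)


w_small = w1_sorted(xs, ys)                                    # on [0, 2r]
w_unit = w1_sorted([phi(x) for x in xs], [phi(y) for y in ys])  # on [0, 1]
print(f"W1 on [0, 2r]: {w_small:.6f}, 2r * W1 on [0, 1]: {2 * r * w_unit:.6f}")
```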
For our analysis we first extend Corollary 3 to N-dimensional balls. For this we note that if \(m_x^N\) and \(\mu _x^N\) denote, respectively, the empirical and uniform measure on the ball \(\mathcal {B}_N(x;\delta _n)\subseteq \mathbb {R}^N\), then
where \(m^N\) and \(\mu ^N\) are, respectively, the empirical and uniform measure on a cube \([0,2\delta _n]^N\). It then follows from Corollary 3 that
We thus have the following result:
Proposition 7
Let \(f_n\rightarrow 0\), \(x\in \mathbb {R}^N\), and consider a Poisson process \(\mathcal {P}\) with intensity measure \((1+f_n) n\,\textrm{d}\textrm{vol}_N\) on the N-dimensional ball \(\mathcal {B}_N(x;\delta _n)\). Let \(m_x^N\) denote the empirical measure with respect to \(\mathcal {P}\), i.e.,
and \(\mu _x^N\) the uniform measure on \(\mathcal {B}_N(x;\delta _n)\). Then
5.3.4 From the Manifold to the Tangent Space and Back
To prove Proposition 3 we have to extend Proposition 7 to the setting of Riemannian manifolds. For this we use that for n large enough, the ball \(\mathcal {B}_\mathcal {M}(x;\delta _n)\) can be mapped diffeomorphically by the exponential map to a slightly larger ball in the tangent space of x. Since the tangent space is diffeomorphic to \(\mathbb {R}^N\) we can use Proposition 7 to obtain the result. However, we have to be careful since the exponential map does not preserve the metric.
Proof of Proposition 3
We shall denote by \(\mathcal {B}_N(x;\delta )\) the ball of radius \(\delta \) around \(x\in {\mathbb {R}}^N\), according to the Euclidean metric. Fix a \(0<\xi <1\) and pick a small enough, but fixed, neighborhood U of the origin in \(T_x\mathcal {M}\) such that: 1) the exponential map restricted to U is a diffeomorphism, 2) there exists a constant \(C>1\) such that \(U\subseteq \mathcal {B}_N(0;C\delta _n)\), and 3) for any two points \(y,z\in \exp (U)\),
In particular, this implies that for n large enough,
Next we note that the probability measures \(m_x^\mathcal {M}\) and \(\mu _x^{\delta _n}\) on \(\mathcal {B}_\mathcal {M}(x;\delta _n)\) only depend on the restriction of the Poisson process to this ball. In particular, they only depend on the restriction \(\mathcal {P}_U\) of the process to the fixed neighborhood U, which is again a Poisson process with intensity \(n\,\textrm{d}\textrm{vol}_\mathcal {M}/{\textrm{vol}_\mathcal {M}(\mathcal {M})}\). Since \(U\subseteq \mathcal {B}_N(0;C\delta _n)\) it follows that on U, \({\textrm{vol}_\mathcal {M}}\circ {\exp _x}=(1+O(\delta _n^2)){\text {vol}}_N\). Therefore, it follows from the Mapping Theorem for Poisson processes [21] that \(\exp _x^{-1}(\mathcal {P}_U)\) is a Poisson process on \(\exp _x^{-1}(U)\) with intensity function \((1+O(\delta _n^2)) n\,\textrm{d}\textrm{vol}_N/{\textrm{vol}_\mathcal {M}(\mathcal {M})}\).
Slightly abusing notation, let \(m_x^N\) and \(\mu _x^N\) denote respectively the empirical and uniform measure on \(\mathcal {B}_N(0;\delta _n/(1-\xi ))\) with respect to the Poisson Point Process \(\exp _x^{-1}(\mathcal {P}_U)\). Then, since \(\delta _n/(1-\xi )=\varTheta (\delta _n)\), Proposition 7 implies that
On the other hand we have, since \(\exp _x\) is a diffeomorphism on U, that
and hence we conclude that
which proves Proposition 3. \(\square \)
5.4 Weighted Graph Distances
Recall that \(\lambda _n=n^{-1/N}(\log n)^{2/N}\). To prove Proposition 4 we first show the following.
Lemma 7
Let \(Q>3\), \(U=\mathcal {B}_\mathcal {M}(x^*;Q\delta _n)\), and define the event
Then \(\mathbb {P}\left( A_n\right) =o(\delta _n^3)\), as \(n\rightarrow \infty \).
Proof
The proof closely follows the strategy of the proof of Lemma 3. Let \(C_n\) denote the event in Corollary 2. We will show that on this event,
for all \(u,v\in U\cap G_n\). This then implies that \({\mathbb {P}}(A_n\cap C_n)=0\) from which the result follows, since by Corollary 2
Take any two \(u,v\in U\cap G_n\) and let \(\gamma (u,v)\) denote the geodesic between u and v. We then partition this geodesic into
pieces of equal length and let \(u:=u_0,u_1,\dots ,u_{k-1},u_k:=v\) denote the \(k+1\) endpoints of the intervals, see Fig. 4. On the event \(C_n\), each \(u_t\) belongs to some ball \(B_t\) of radius \(\lambda _n/4\) which contains a vertex \(x_t\in G_n\), where we can take \(x_0=u\) and \(x_k=v\). In particular, since \(d_\mathcal {M}(u_t,x_t)\le \lambda _n/2\), \(d_\mathcal {M}(u_{t-1},u_{t})\le \varepsilon _n/3\) and \(\lambda _n=o(\varepsilon _n)\), it follows that for large enough n,
so that \(\{u,x_1,\dots ,x_{k-1},v\}\) is a path in \(G_n\) (see Fig. 4). Moreover, \(d_G^w(x_t,x_{t+1})\le d_\mathcal {M}(u_t,u_{t+1})+\lambda _n\) by the triangle inequality. Therefore,
To finish the proof we note that by definition \(d_G^w(u,v)\ge d_\mathcal {M}(u,v)\) and hence
\(\square \)
Proof of Proposition 4
Due to Lemma 7 it suffices to show that the conditions on \(\varepsilon _n\) and \(\delta _n\) imply \(\lambda _n/\varepsilon _n=o(\delta _n^2)\). We compute that
The latter is o(1) precisely when either \(\alpha +2\beta <1/N\) or \(\alpha +2\beta =1/N\) and \(a+2b>2/N\), which are the conditions of Proposition 4. Thus, under the conditions of Proposition 4 it holds that the manifold-weighted graph distance \(d_G^w\) is a \(\delta _n\)-good approximation with \(\xi _n=\max {\bigl \{\sqrt{\lambda _n/\varepsilon _n},\lambda _n^{1/3}\bigr \}}\). \(\square \)
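For reference, the exponent bookkeeping behind these conditions can be written out as follows. This is a sketch under the assumption that \(\varepsilon _n=\varTheta (n^{-\alpha }(\log n)^a)\) and \(\delta _n=\varTheta (n^{-\beta }(\log n)^b)\); the form for \(\delta _n\) appears in the proof of Lemma 8, while the form for \(\varepsilon _n\) is inferred here from the exponents \(\alpha \) and a in the stated conditions.

```latex
\begin{aligned}
\frac{\lambda _n}{\varepsilon _n\delta _n^2}
  &= \varTheta \biggl (\frac{n^{-1/N}(\log n)^{2/N}}
       {n^{-\alpha }(\log n)^{a}\cdot n^{-2\beta }(\log n)^{2b}}\biggr ) \\
  &= \varTheta \bigl (n^{\alpha +2\beta -1/N}(\log n)^{2/N-a-2b}\bigr ).
\end{aligned}
```

A quantity of the form \(n^{p}(\log n)^{q}\) tends to zero exactly when \(p<0\), or \(p=0\) and \(q<0\). With \(p=\alpha +2\beta -1/N\) and \(q=2/N-a-2b\) this gives the two cases named in the proof.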
5.5 Rescaled Graph Distances
Consider the 2-dimensional Euclidean space equipped with the Euclidean distance \(d_2\). Let \(\mathcal {C}=[0,1]^2\) and take \(G_n={\mathbb {G}}_n(\varepsilon )\) to be the random geometric graph on \(\mathcal {C}\) with connection radius \(\varepsilon \). The main result in [9] relates the shortest-path distance \(d_{G_n}^s\) and the Euclidean distance \(d_2\). We state a version of this result here, which includes the error bounds that follow from [9, Propositions 2.2 and 2.4].
Theorem 5
[9, Thm. 1.1] Consider the random geometric graph \(G_n\) on the unit square \([0,1]^2\) with connection radius \(\varepsilon _n=o(1)\). Then for any pair of vertices \(x,y\in G_n\) with \(d_2(x,y)>\varepsilon _n\), the following holds:
- If \(d_2(x,y)\ge \max {\{12(\log n)^{3/2}/(n\varepsilon _n),21\varepsilon _n\log n\}}\), then
$$\begin{aligned} \mathbb {P}\left( d_G^s(x,y)\ge \biggl \lfloor \frac{d_2(x,y)}{\varepsilon _n}\biggl (1+\frac{1}{2 (n\varepsilon _nd_2(x,y))^{2/3}}\biggr )\biggr \rfloor \right) \ge 1-o(n^{-5/2}). \end{aligned}$$
- If \(\varepsilon _n\ge 224\sqrt{(\log n)/n}\), then
$$\begin{aligned} \mathbb {P}\left( d_G^s(x,y)\le \biggl \lceil \frac{d_2(x,y)}{\varepsilon _n}(1+\gamma _n)\biggr \rceil \right) \ge 1-o(n^{-5/2}) \end{aligned}$$
with
$$\begin{aligned} \gamma _n:=\max {\biggl \{1358\biggl (\frac{3\log n}{n\varepsilon _n^2+n\varepsilon _nd_2(x,y)}\biggr )^{\!2/3}\!\!,\frac{4\cdot 10^6(\log n)^2}{n^2\varepsilon _n^4},\biggl (\frac{30000}{n\varepsilon _n^2}\biggr )^{\!2/3}\biggr \}}. \end{aligned}$$
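To get a feeling for the size of the constants in Theorem 5, the error term \(\gamma _n\) and the resulting upper bound on the hop count can be evaluated directly from the formulas above. The sketch below does this for hypothetical values of n, \(\varepsilon _n\) and \(d_2(x,y)\). The function names are illustrative, and the natural logarithm is assumed for \(\log n\).

```python
import math


def gamma_n(n, eps, d):
    """Largest of the three error terms in the definition of gamma_n."""
    log_n = math.log(n)  # natural logarithm assumed
    term1 = 1358 * (3 * log_n / (n * eps**2 + n * eps * d)) ** (2 / 3)
    term2 = 4e6 * log_n**2 / (n**2 * eps**4)
    term3 = (30000 / (n * eps**2)) ** (2 / 3)
    return max(term1, term2, term3)


def hop_upper_bound(n, eps, d):
    """Ceiling from the second bound: ceil(d / eps * (1 + gamma_n))."""
    if eps < 224 * math.sqrt(math.log(n) / n):
        raise ValueError("eps is below the threshold 224 * sqrt(log(n) / n)")
    return math.ceil(d / eps * (1 + gamma_n(n, eps, d)))


n, eps, d = 10**12, 1e-2, 0.5  # hypothetical values
g = gamma_n(n, eps, d)
hops = hop_upper_bound(n, eps, d)
print(f"gamma_n = {g:.3e}, hop bound = {hops}, eps * hops = {eps * hops:.4f}")
```

Because of the large numerical constants, \(\gamma _n\) only becomes small for very large n; for moderate n the threshold on \(\varepsilon _n\) is not even met, which the function reports by raising an error.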
From this we obtain the following result, which gives bounds on the graph distance \(\varepsilon _nd_G^s\) in terms of the manifold distance, between two nodes of the graph \(G_n\) that are within manifold distance \(O(\delta _n)\).
Lemma 8
Let \(\varepsilon _n\ge 244\sqrt{(\log n)/n}\), \(Q>3\), \(U=\mathcal {B}_\mathcal {M}(x^*;Q\delta _n)\), and define the event
Then \({\mathbb {P}}(A_n)=o(\delta _n^3)\), as \(n\rightarrow \infty \).
Proof
Note that since the neighborhood U is shrinking as n increases we can map it to \({\mathbb {R}}^2\) diffeomorphically for sufficiently large n. This affects the distances at most by a constant factor and hence it suffices to prove the statement for \(\mathcal {M}={\mathbb {R}}^2\). By the second statement of Theorem 5 we have that for any two \(u,v\in U\cap G_n\),
By conditioning on the number of nodes in U (\(|U\cap G_n|\)) and applying the union bound we get
Now \(\mathbb {E}[|U\cap G_n|]=\varTheta (n\delta _n^2)\) and therefore
where we used that \(n^{-3/2}=o(\delta _n)\) for all \(\delta _n=\varTheta \hspace{0.33325pt}(n^{-\beta }(\log n)^b)\) and \(\beta \le 1\). \(\square \)
We can now prove Proposition 5.
Proof of Proposition 5
First observe that \(\varepsilon _nd_G^s(u,v)\ge d_\mathcal {M}(u,v)\) for all \(u,v\in \mathcal {B}_\mathcal {M}(x^*;Q\delta _n)\). Moreover, the conditions of the proposition imply that \((\log n)^{1/2}n^{-1/2}=o(\varepsilon _n)\). Therefore, by Lemma 8 we have that with probability \(1-o(\delta _n^3)\),
for all \(u,v\in \mathcal {B}_\mathcal {M}(x^*;Q\delta _n)\cap G_n\). Moreover, since by assumption \(\alpha \ge 3\beta \), with \(a<3b\) if \(\alpha =3\beta \), it follows that \(\varepsilon _n=o(\delta _n^3)\). Thus, to prove Proposition 5 it remains to show that \(\gamma _n=o(\delta _n^2)\). Since \(\gamma _n\) is the maximum of three terms, we will show that each of them is \(o(\delta _n^2)\). For the first term it suffices to show that \(n^{-1}\varepsilon _n^{-2}\log n=o(\delta _n^3)\). This follows since
which is o(1) by the assumption that \(2\alpha +3\beta \le 1\) and \(2a+3b>1\) if \(2\alpha +3\beta =1\). We now immediately have that \((n^{-1}\varepsilon _n^{-2}\log n)^2=o(\delta _n^6)\), which proves that the second term is \(o(\delta _n^2)\). Finally, the result for the third term follows from \(n^{-1}\varepsilon _n^{-2}=o(n^{-1}\varepsilon _n^{-2}\log n)=o(\delta _n^3)\). \(\square \)
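The two asymptotic requirements used in this proof reduce to sign checks on exponents, which can be automated. The sketch below assumes the exact power-law forms \(\varepsilon _n=n^{-\alpha }(\log n)^a\) and \(\delta _n=n^{-\beta }(\log n)^b\) (constants dropped) and uses the fact that \(n^{p}(\log n)^{q}\rightarrow 0\) exactly when \(p<0\), or \(p=0\) and \(q<0\). The function names and the example parameters are hypothetical.

```python
from fractions import Fraction as F


def is_little_o_one(p, q):
    """n**p * (log n)**q -> 0  iff  p < 0, or p == 0 and q < 0."""
    return p < 0 or (p == 0 and q < 0)


def proposition5_checks(alpha, beta, a, b):
    """Exponent checks for eps_n = n^-alpha (log n)^a, delta_n = n^-beta (log n)^b."""
    return {
        # eps_n / delta_n^3 = n^(3*beta - alpha) * (log n)^(a - 3*b)
        "eps_n = o(delta_n^3)": is_little_o_one(3 * beta - alpha, a - 3 * b),
        # log n / (n * eps_n^2 * delta_n^3)
        #   = n^(2*alpha + 3*beta - 1) * (log n)^(1 - 2*a - 3*b)
        "log n / (n eps_n^2) = o(delta_n^3)": is_little_o_one(
            2 * alpha + 3 * beta - 1, 1 - 2 * a - 3 * b
        ),
    }


# Boundary case alpha = 3*beta and 2*alpha + 3*beta = 1, rescued by log factors.
boundary = proposition5_checks(F(1, 3), F(1, 9), a=0, b=1)
# Same polynomial exponents without log factors: both checks fail.
failing = proposition5_checks(F(1, 3), F(1, 9), a=0, b=0)
# Strictly inside the admissible region: log factors are irrelevant.
interior = proposition5_checks(F(1, 4), F(1, 16), a=0, b=0)
print(boundary, failing, interior, sep="\n")
```

Exact rational arithmetic is used so that the boundary cases \(\alpha =3\beta \) and \(2\alpha +3\beta =1\) are detected without floating-point error.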
References
Ache, A.G., Warren, M.W.: Ricci curvature and the manifold learning problem. Adv. Math. 342, 14–66 (2019)
Belenchia, A., Benincasa, D.M.T., Dowker, F.: The continuum limit of a \(4\)-dimensional causal set scalar d’Alembertian. Class. Quantum Gravity 33(24), # 245018 (2016)
Benincasa, D.M.T., Dowker, F.: Scalar curvature of a causal set. Phys. Rev. Lett. 104(18), # 181301 (2010)
Bhattacharya, B.B., Mukherjee, S.: Exact and asymptotic results on coarse Ricci curvature of graphs. Discrete Math. 338(1), 23–42 (2015)
Bringmann, K., Keusch, R., Lengler, J.: Geometric inhomogeneous random graphs. Theor. Comput. Sci. 760, 35–54 (2019)
Cheeger, J., Müller, W., Schrader, R.: On the curvature of piecewise flat spaces. Commun. Math. Phys. 92(3), 405–454 (1984)
Cunningham, W.J., Surya, S.: Dimensionally restricted causal set quantum gravity: examples in two and three dimensions. Class. Quantum Gravity 37(5), # 054002 (2020)
Cushing, D., Kamtue, S.: Long-scale Ollivier Ricci curvature of graphs. Anal. Geom. Metr. Spaces 7(1), 22–44 (2019)
Díaz, J., Mitsche, D., Perarnau, G., Pérez-Giménez, X.: On the relation between graph distance and Euclidean distance in random geometric graphs. Adv. Appl. Probab. 48(3), 848–864 (2016)
Farooq, H., Chen, Y., Georgiou, T.T., Tannenbaum, A., Lenglet, Ch.: Network curvature as a hallmark of brain structural connectivity. Nat. Commun. 10, # 4937 (2019)
Forman, R.: Bochner’s method for cell complexes and combinatorial Ricci curvature. Discrete Comput. Geom. 29(3), 323–374 (2003)
Gu, A., Sala, F., Gunel, B., Ré, Ch.: Learning mixed-curvature representations in products of model spaces. In: International Conference on Learning Representations (New Orleans 2019). https://openreview.net/pdf?id=HJxeWnCcF7
van der Hoorn, P., Cunningham, W.J., Lippner, G., Trugenberger, C., Krioukov, D.: Ollivier–Ricci curvature convergence in random geometric graphs (2020). arXiv:2008.01209
Jacob, E., Mörters, P.: Spatial preferential attachment networks: power laws and clustering coefficients. Ann. Appl. Probab. 25(2), 632–662 (2015)
Jost, J.: Geometry and Physics. Springer, Berlin (2009)
Jost, J., Liu, Sh.: Ollivier’s Ricci curvature, local clustering and curvature-dimension inequalities on graphs. Discrete Comput. Geom. 51(2), 300–322 (2014)
Kempton, M., Lippner, G., Münch, F.: Large scale Ricci curvature on graphs (2019). arXiv:1906.06222
Klitgaard, N., Loll, R.: Introducing quantum Ricci curvature. Phys. Rev. D 97(4), # 046008 (2018)
Krioukov, D.: Clustering implies geometry in networks. Phys. Rev. Lett. 116(20), # 208302 (2016)
Krioukov, D., Papadopoulos, F., Kitsak, M., Vahdat, A., Boguñá, M.: Hyperbolic geometry of complex networks. Phys. Rev. E 82(3), # 036106 (2010)
Last, G., Penrose, M.: Lectures on the Poisson Process. Institute of Mathematical Statistics Textbooks, vol. 7. Cambridge University Press, Cambridge (2018)
Leighton, T., Shor, P.: Tight bounds for minimax grid matching, with applications to the average case analysis of algorithms. In: 18th Annual ACM Symposium on Theory of Computing (Berkeley 1986), pp. 91–103. ACM, New York (1986)
Lin, Y., Lu, L., Yau, Sh.-T.: Ricci curvature of graphs. Tohoku Math. J. 63(4), 605–627 (2011)
Liu, Sh., Münch, F., Peyerimhoff, N.: Bakry–Émery curvature and diameter bounds on graphs. Calc. Var. Partial Differ. Equ. 57(2), # 67 (2018)
Najman, L., Romon, P. (eds.): Modern Approaches to Discrete Curvature. Lecture Notes in Mathematics, vol. 2184. Springer, Cham (2017)
Ni, Ch.-Ch., Lin, Y.-Y., Gao, J., Gu, X.D., Saucan, E.: Ricci curvature of the Internet topology. In: 2015 IEEE Conference on Computer Communications (INFOCOM) (Hong Kong 2015), pp. 2758–2766. IEEE (2015)
Ollivier, Y.: Ricci curvature of metric spaces. C. R. Math. Acad. Sci. Paris 345(11), 643–646 (2007)
Ollivier, Y.: Ricci curvature of Markov chains on metric spaces. J. Funct. Anal. 256(3), 810–864 (2009)
Ollivier, Y.: A survey of Ricci curvature for metric spaces and Markov chains. In: Probabilistic Approach to Geometry (Kyoto 2008). Adv. Stud. Pure Math., vol. 57, pp. 343–381. Mathematical Society of Japan, Tokyo (2010)
O’Neill, B.: Semi-Riemannian Geometry. Pure and Applied Mathematics, vol. 103. Academic Press, New York (1983)
Paeng, S.-H.: Volume and diameter of a graph and Ollivier’s Ricci curvature. Eur. J. Combin. 33(8), 1808–1819 (2012)
Penrose, M.: Random Geometric Graphs. Oxford Studies in Probability, vol. 5. Oxford University Press, Oxford (2003)
Sandhu, R., Georgiou, T., Reznik, E., Zhu, L., Kolesov, I., Senbabaoglu, Y., Tannenbaum, A.: Graph curvature for differentiating cancer networks. Sci. Rep. 5, # 12323 (2015)
Sandhu, R.S., Georgiou, T.T., Tannenbaum, A.R.: Ricci curvature: an economic indicator for market fragility and systemic risk. Sci. Adv. 2(5), # e1501495 (2016)
Shor, P.W., Yukich, J.E.: Minimax grid matching and empirical measures. Ann. Probab. 19(3), 1338–1348 (1991)
Sia, J., Jonckheere, E., Bogdan, P.: Ollivier–Ricci curvature-based method to community detection in complex networks. Sci. Rep. 9 (2019)
Sreejith, R.P., Mohanraj, K., Jost, J., Saucan, E., Samal, A.: Forman curvature for complex networks. J. Stat. Mech. Theory Exp. 2016(6), # 063206 (2016)
Talagrand, M.: Matching random samples in many dimensions. Ann. Appl. Probab. 2(4), 846–856 (1992)
Talagrand, M.: Matching theorems and empirical discrepancy computations using majorizing measures. J. Am. Math. Soc. 7(2), 455–537 (1994)
Trugenberger, C.A.: Combinatorial quantum gravity: geometry from random bits. J. High Energy Phys. 2017(9), # 045 (2017)
Acknowledgements
We thank Jürgen Jost and Renate Loll for useful discussions, suggestions, and comments. This work was supported by ARO Grant Nos. W911NF-16-1-0391 and W911NF-17-1-0491, and by NSF Grant Nos. IIS-1741355 and DMS-1800738.
Additional information
Editor in Charge: Kenneth Clarkson
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Hoorn, P.v.d., Lippner, G., Trugenberger, C. et al. Ollivier Curvature of Random Geometric Graphs Converges to Ricci Curvature of Their Riemannian Manifolds. Discrete Comput Geom 70, 671–712 (2023). https://doi.org/10.1007/s00454-023-00507-y
DOI: https://doi.org/10.1007/s00454-023-00507-y