Abstractly, there are different types of relations. They can vary with respect to the number of elements involved, they can be symmetric or directed, that is, distinguish between inputs and outputs, and they may also carry weights. The simplest case is that of binary, symmetric and unweighted relations, and we start with that case. Such a web of relations is then modelled by an undirected and unweighted graph whose vertices stand for the elements in question and whose edges represent the presence of a relation between the two vertices they connect. For simplicity, we also assume that the graph is simple, that is, there is at most one edge between any two vertices, and that it is connected, that is, by passing from edge to edge we can reach any vertex from any other one, although these assumptions are not essential for anything that follows.
We want to assess how a relation, that is, an edge of such a graph, sits in the web of relations, that is, how it relates to other relations. Two edges are called neighbors when they share a vertex. We can then already define the simplest concept, called Forman–Ricci curvature because it was introduced by Forman (2003) in analogy with the Ricci curvature of Riemannian geometry (the analogy relates to the role it plays in Bochner-type identities). We define the degree of an edge e as
$$\begin{aligned} \deg (e):= \# (\mathrm {neighbors\ of }\ e), \end{aligned}$$
(1)
and define its Forman–Ricci curvature as
$$\begin{aligned} F(e):=2-\deg (e). \end{aligned}$$
(2)
The 2 and the minus sign are somewhat unfortunate for our purposes, but they are there because of the analogy with the well-established Ricci curvature of Riemannian geometry, and they are useful from an abstract geometric perspective.
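When the edge e connects the vertices v, w, we can also assess their contribution to the number of neighbors of e. We let \(\deg _v(e)\) be the number of edges other than e that share with e the vertex v. Since the graph is simple, every neighbor of e shares exactly one of the vertices v, w with e, so that \(\deg (e)=\deg _v(e)+\deg _w(e)\) and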
$$\begin{aligned} F(e)=2 -(\deg _v (e) +\deg _w (e)). \end{aligned}$$
(3)
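Definitions (1)–(3) can be sketched in a few lines of Python. This is a minimal illustration, assuming the graph is stored as a dictionary that maps each vertex to the set of its neighbors; the toy graph and all function names are hypothetical and not taken from the cited works.

```python
# Hypothetical toy graph: a triangle a-b-c with a pendant vertex d attached to a.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}


def side_degree(graph, v):
    """deg_v(e): number of edges at v other than e itself."""
    return len(graph[v]) - 1


def forman_curvature(graph, v, w):
    """F(e) = 2 - (deg_v(e) + deg_w(e)) for the edge e = (v, w), as in (3)."""
    if w not in graph[v]:
        raise ValueError(f"({v}, {w}) is not an edge of the graph")
    return 2 - (side_degree(graph, v) + side_degree(graph, w))


def edges(graph):
    """Every undirected edge exactly once, as a sorted pair of vertices."""
    return sorted({tuple(sorted((v, w))) for v in graph for w in graph[v]})


curvatures = {e: forman_curvature(graph, *e) for e in edges(graph)}
for e, value in curvatures.items():
    print(e, value)
```

Only the neighborhoods of the two endpoints are inspected, which is why the computation stays cheap even on large networks.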
Instead of the sum of the degrees, we may also consider their difference. When the edge is not directed, there is no intrinsic structural difference between the two vertices that it connects, and so, it is natural to take the absolute value of the difference and define the degree difference (Farzam et al. 2020) as
$$\begin{aligned} \daleth (e):=|\deg _v (e) -\deg _w (e)|. \end{aligned}$$
(4)
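The degree difference (4) can be sketched in the same way. The snippet below again assumes a hypothetical adjacency-set representation and a toy graph chosen only for illustration. Because the edge e is excluded from the count on both sides, the result coincides with the absolute difference of the ordinary vertex degrees.

```python
# Hypothetical toy graph: a triangle a-b-c with a pendant vertex d attached to a.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}


def degree_difference(graph, v, w):
    """Degree difference |deg_v(e) - deg_w(e)| of the edge e = (v, w), as in (4)."""
    if w not in graph[v]:
        raise ValueError(f"({v}, {w}) is not an edge of the graph")
    deg_v = len(graph[v]) - 1  # edges at v other than e
    deg_w = len(graph[w]) - 1  # edges at w other than e
    return abs(deg_v - deg_w)


differences = {
    (v, w): degree_difference(graph, v, w)
    for v in graph
    for w in graph[v]
    if v < w  # visit each undirected edge once
}
print(differences)
```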
Let us interpret the geometric significance of these quantities. \(\daleth (e)\) is large when e connects vertices of different types, a well-connected one from which many further edges emanate, and a less well-connected one from which only few edges originate. The statistics of this quantity therefore quantify to what extent the network is assortative, that is, its edges typically connect similar vertices (small \(\daleth (e)\)), or disassortative, that is, its edges typically connect dissimilar vertices (large \(\daleth (e)\)). This is important, for instance, because social networks tend to be assortative (Fisher David et al. 2017): well-connected people like to link with other well-connected people, and this further improves their position in social networks. In contrast, F(e) is very negative, that is, has a particularly large absolute value, when both ends of an edge are well connected. Such edges may play a very important role in the network. In fact, we have found (Samal et al. 2018) that edge-betweenness centrality (see Newman 2010), a quantity that needs a global computation, is statistically well correlated with F(e). Edge-betweenness centrality measures how many shortest connections between pairs of vertices in the network pass through that particular edge. Its computation is expensive because all shortest connections between any two vertices have to be evaluated. In contrast, the computation of F(e) is very quick and easy, because only local neighborhoods have to be evaluated.
Edges with large |F(e)| also play an important role for spreading in the network because from their vertices many other vertices in the network can be reached in a single step. There is one issue here, however. Edges from the two vertices v, w of e may end at the same vertex z, that is, v, w, z may form a triangle. In that case, they would not contribute to spreading into different directions. Or the endpoint of an edge from v and that of an edge from w may themselves be connected by an edge. That is, they form a quadrangle together with v and w. Again, that does not really constitute spreading into different directions. It is possible to address this issue by inserting two-dimensional faces into such triangles and perhaps also into quadrangles, and then to evaluate the Forman curvature of the resulting simplicial or polyhedral complex. Those faces would then increase the Forman curvature and make it less negative or even positive. See, for instance, Saucan et al. (2019).
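The triangles and quadrangles through a given edge can be counted directly from the neighborhoods of its endpoints. The following sketch only performs this counting and does not compute the curvature of the augmented complex; the graph and the function names are hypothetical.

```python
# Hypothetical toy graph: the edge (v, w) lies in the triangle v-w-z
# and in the quadrangle v-x-y-w.
graph = {
    "v": {"w", "z", "x"},
    "w": {"v", "z", "y"},
    "z": {"v", "w"},
    "x": {"v", "y"},
    "y": {"w", "x"},
}


def triangles_on_edge(graph, v, w):
    """Number of vertices z adjacent to both v and w."""
    return len(graph[v] & graph[w])


def quadrangles_on_edge(graph, v, w):
    """Number of 4-cycles v - v1 - w1 - w that contain the edge (v, w)."""
    count = 0
    for v1 in graph[v] - {w}:
        for w1 in graph[w] - {v}:
            # v1 and w1 must be distinct and joined by an edge
            if v1 != w1 and w1 in graph[v1]:
                count += 1
    return count


print(triangles_on_edge(graph, "v", "w"), quadrangles_on_edge(graph, "v", "w"))
```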
This aspect is taken care of in a different way by a more refined concept of Ricci curvature, the Ollivier–Ricci curvature introduced in Ollivier (2009). For that purpose, consider the edge \(e=(v,w)\) and let \(e_v=(v,v_1)\) and \(e_w=(w,w_1)\) be edges emanating from v and w, respectively. We then define their distance w.r.t. e as
$$\begin{aligned} d_e(e_v,e_w):= d(v_1,w_1) \end{aligned}$$
(5)
where \(d(v_1,w_1)\) denotes the distance between \(v_1\) and \(w_1\) in the network, that is, the minimal number of edges that have to be traversed for getting from \(v_1\) to \(w_1\). Let \(E_v\) be the set of edges that have v as a vertex, and let \(|E_v|\) be its cardinality. We then define a probability measure \(\mu _v\) on the set of all edges E by giving each edge \(e_v\in E_v\) the weight \(\frac{1}{|E_v|}\) and all edges not in \(E_v\) the weight 0. We then define the Ollivier–Ricci curvature (Ollivier 2009) of the edge \(e=(v,w)\) as
$$\begin{aligned} O(e):=1-W_1(\mu _v,\mu _w) \end{aligned}$$
(6)
where \(W_1\) is the 1-Wasserstein distance between \(\mu _v\) and \(\mu _w\),
$$\begin{aligned} W_1(\mu _v,\mu _w):=\inf _{p\in \Pi (\mu _v,\mu _w)}\sum _{(e_1,e_2)\in E\times E} d_e(e_1,e_2) p(e_1,e_2) \end{aligned}$$
(7)
and \(\Pi (\mu _v,\mu _w)\) is the set of measures on \(E\times E\) that project to \(\mu _v\) and \(\mu _w\), resp. We thus try to arrange the two collections \(E_v,E_w\) of edges sharing one of their endpoints with e in an optimal manner, that is, so that the average distance between the arranged pairs becomes as small as possible. We note that the sets \(E_v\) and \(E_w\) both include the edge \(e=(v,w)\) that we are evaluating. This convention is only needed to let our definition agree with that originally proposed in the literature, but could otherwise be abandoned, to make the definition more natural in the present context.
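For small graphs, (6) and (7) can be evaluated exactly with a short Python sketch. It is an illustration under stated assumptions, not the procedure used in the cited works: each edge in \(E_v\) is represented by its endpoint other than v, the weights \(\frac{1}{|E_v|}\) and \(\frac{1}{|E_w|}\) are scaled by \(|E_v|\,|E_w|\) to obtain integers, and the transport problem is solved by a textbook successive-shortest-path min-cost flow. All function names and toy graphs are hypothetical.

```python
from collections import deque
from fractions import Fraction


def bfs_distances(graph, source):
    """Graph distances d(source, x) for all x reachable from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for x in graph[u]:
            if x not in dist:
                dist[x] = dist[u] + 1
                queue.append(x)
    return dist


def min_transport_cost(supply, demand, cost):
    """Minimal cost of shipping integer supplies to integer demands
    (successive shortest augmenting paths, found with Bellman-Ford)."""
    m, n = len(supply), len(demand)
    src, sink = m + n, m + n + 1
    arcs = [[] for _ in range(m + n + 2)]  # arc = [head, capacity, cost, reverse index]

    def add_arc(u, v, cap, c):
        arcs[u].append([v, cap, c, len(arcs[v])])
        arcs[v].append([u, 0, -c, len(arcs[u]) - 1])

    for i in range(m):
        add_arc(src, i, supply[i], 0)
        for j in range(n):
            add_arc(i, m + j, supply[i], cost[i][j])
    for j in range(n):
        add_arc(m + j, sink, demand[j], 0)

    total, inf = 0, float("inf")
    while True:
        dist = [inf] * len(arcs)
        prev = [None] * len(arcs)  # (tail, arc index) on the shortest path
        dist[src] = 0
        for _ in range(len(arcs) - 1):
            changed = False
            for u in range(len(arcs)):
                if dist[u] == inf:
                    continue
                for k, (head, cap, c, _) in enumerate(arcs[u]):
                    if cap > 0 and dist[u] + c < dist[head]:
                        dist[head], prev[head] = dist[u] + c, (u, k)
                        changed = True
            if not changed:
                break
        if dist[sink] == inf:
            return total  # no augmenting path left: all mass is shipped
        push, x = inf, sink
        while x != src:  # bottleneck capacity along the path
            u, k = prev[x]
            push = min(push, arcs[u][k][1])
            x = u
        x = sink
        while x != src:  # augment along the path
            u, k = prev[x]
            arcs[u][k][1] -= push
            arcs[x][arcs[u][k][3]][1] += push
            x = u
        total += push * dist[sink]


def ollivier_curvature(graph, v, w):
    """O(e) = 1 - W_1(mu_v, mu_w) for the edge e = (v, w), as in (6)-(7)."""
    ends_v, ends_w = sorted(graph[v]), sorted(graph[w])  # far endpoints of E_v, E_w
    cost = []
    for x in ends_v:
        dist = bfs_distances(graph, x)
        cost.append([dist[y] for y in ends_w])
    scale = len(ends_v) * len(ends_w)  # turns the fractional weights into integers
    supply = [len(ends_w)] * len(ends_v)
    demand = [len(ends_v)] * len(ends_w)
    return 1 - Fraction(min_transport_cost(supply, demand, cost), scale)


triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
double_star = {
    "v": {"w", "v1", "v2"}, "w": {"v", "w1", "w2"},
    "v1": {"v"}, "v2": {"v"}, "w1": {"w"}, "w2": {"w"},
}
print(ollivier_curvature(triangle, "a", "b"))
print(ollivier_curvature(path, "b", "c"))
print(ollivier_curvature(double_star, "v", "w"))
```

Exact rational arithmetic via `Fraction` avoids rounding issues; for large networks a dedicated linear programming or optimal transport solver would be the more practical choice.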
In order to evaluate (6), we have to optimize the arrangement between the edges in \(E_v\) and \(E_w\), to make the transportation cost as small as possible. Since this is a quantity that depends on all edges in those two edge sets, it is not necessarily the case that an optimal transport plan arranges each edge \(e_1\) in \(E_v\) with the edge \(e_0\) in \(E_w\) closest to it. There might be some competition, as there might be other edges \(e_2,e_3,\dots\) for which \(e_0\) is closest. But even if there is no such competition, it might be overall more beneficial to arrange \(e_1\) with an edge different from \(e_0\). Also, because of the normalization, the edges in \(E_v\) and \(E_w\) have fractional weights, and if the cardinalities of the two edge sets are different, the corresponding weights are different as well, necessitating an arrangement where some part of an edge in \(E_v\) is arranged with some part of an edge in \(E_w\), and other parts with other ones.
Notwithstanding these complications, let \(m_i\) be the fraction of edges in \(E_v\) that are moved a distance i in some optimal transport plan (such an optimal arrangement need not be unique, but that does not matter for our discussion). Then (Eidi and Jost 2020)
$$\begin{aligned} O(e)=m_0-m_2-2m_3. \end{aligned}$$
(8)
In particular, moving an edge a distance 1 does not contribute at all to O(e). (While \(m_1\) itself does not appear in (8), its computation is nevertheless needed as an intermediate step for computing \(m_2\) and \(m_3\).) Distance 0, that is, when e participates in a triangle, has a positive contribution. A pentagon, that is, distance 2, has a negative contribution, but not as negative as the maximal distance that can occur in a transportation plan, which is 3. This simple formula thus encodes the essential features of Ollivier–Ricci curvature. In fact, we could simply take (8) as the definition of O(e), instead of utilizing the more complicated formula (7).
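The step from (6) to (8) can be made explicit. Since \(d(v_1,w_1)\le 3\) for any neighbor \(v_1\) of v and any neighbor \(w_1\) of w, the fractions satisfy \(m_0+m_1+m_2+m_3=1\), and the cost of the optimal plan is \(W_1(\mu _v,\mu _w)=m_1+2m_2+3m_3\). Hence

$$\begin{aligned} O(e)=1-W_1(\mu _v,\mu _w)=(m_0+m_1+m_2+m_3)-(m_1+2m_2+3m_3)=m_0-m_2-2m_3. \end{aligned}$$

As a small check on a hypothetical example, let e be an edge of a triangle with no further edges attached. Then \(E_v\) and \(E_w\) each contain two edges. The two edges leading to the third vertex are matched at distance 0, and the two copies of e are matched at distance 1, so that \(m_0=m_1=\frac{1}{2}\) and \(O(e)=\frac{1}{2}\).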
More generally, the Ollivier–Ricci curvature is related to the clustering coefficient, that is, the relative frequency of triangles in the network (Jost and Liu 2014).
Protein–protein interaction networks
To illustrate an application of these structural measures to empirical data, we have studied the protein–protein interaction (PPI) networks in human (Luck et al. 2020), with 8275 nodes and 52,569 edges, and fission yeast S. pombe (Vo et al. 2016), with 1306 nodes and 2278 edges. The edges in these networks represent binary interactions between the pairs of proteins represented as nodes. These undirected and unweighted networks are disconnected, with several components; however, they both include a giant component. The giant component consists of 8152 nodes and 52,036 edges in the human PPI network, and of 1306 nodes and 2278 edges in the fission yeast PPI network. We have computed the Forman–Ricci curvature, Ollivier–Ricci curvature, and degree difference of edges in these networks, and their distributions are shown in Fig. 1.
In the human PPI network, while Ollivier–Ricci curvature has a unimodal distribution, the bimodal distribution of Forman–Ricci curvature in Fig. 1 signals an evident heterogeneity in the space of protein–protein interactions in the giant component: a major group of interactions is distributed around a relatively small-valued mode, and a small group consists of interactions between proteins that are, on average, involved in a significantly larger number of interactions. The degree difference distribution indicates that, although the majority of interactions are between proteins with relatively similar degree, a noticeable proportion of the edges have a considerably large degree difference, which can be as large as 497. This observation is in line with the fact that this network is moderately disassortative, with assortativity value \(\sim -0.119\).
Unlike the Ollivier–Ricci curvature distribution of the human PPI network, the fission yeast PPI network has a trimodal distribution of the Ollivier–Ricci curvature, reaching its global mode at curvature value 0. In fact, in the PPI network for fission yeast, all three measures have multimodal distributions, as demonstrated in Fig. 1. Interestingly, the peaks over highly negative values of Forman–Ricci curvature have larger frequencies than those over the moderately negative values. A similar phenomenon is observed in the degree difference distribution of the fission yeast PPI network. The global degree assortativity of the fission yeast PPI network is \(\sim -0.237\). This means that the fission yeast PPI network is considerably more disassortative than the human one, which is explained by the more substantial proportion of interactions in fission yeast between proteins with significantly different degrees. Thus, we see that the distribution of curvature and degree difference values can point us to biologically relevant properties of the interaction statistics in the PPI networks of different species.
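The text does not state how the global degree assortativity was computed. A common choice, assumed here purely for illustration, is the Pearson correlation between the degrees found at the two ends of an edge, with every undirected edge counted in both orientations. The sketch below implements this assumed definition on hypothetical toy graphs; it does not reproduce the reported values.

```python
def degree_assortativity(graph):
    """Pearson correlation of the vertex degrees at the two ends of an edge.

    Assumed definition: each undirected edge contributes both (v, w) and (w, v),
    so the two marginal distributions coincide.
    """
    xs, ys = [], []
    for v in graph:
        for w in graph[v]:
            xs.append(len(graph[v]))
            ys.append(len(graph[w]))
    n = len(xs)
    mean = sum(xs) / n  # xs and ys have the same mean by symmetry
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        raise ValueError("all vertices have the same degree; coefficient undefined")
    return cov / var


# Hypothetical toy graphs: a star is maximally disassortative.
star = {"h": {"1", "2", "3"}, "1": {"h"}, "2": {"h"}, "3": {"h"}}
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(degree_assortativity(star), degree_assortativity(path))
```

Negative values, as reported for both PPI networks, indicate that edges tend to join vertices of dissimilar degree, which matches a degree difference distribution with substantial weight on large values.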