Abstract
We show that prominent centrality measures in network analysis are all based on additively separable and linear treatments of statistics that capture a node’s position in the network. This enables us to provide a taxonomy of centrality measures that distills them to varying on two dimensions: (i) which information they make use of about nodes’ positions, and (ii) how that information is weighted as a function of distance from the node in question. The three sorts of information about nodes’ positions that are usually used, which we refer to as “nodal statistics”, are the paths from a given node to other nodes, the walks from a given node to other nodes, and the geodesics between other nodes that include a given node. Using such statistics on nodes’ positions, we also characterize the types of trees such that centrality measures all agree, and we discuss the properties that identify some path-based centrality measures.
Notes
For more discussion and references on the distinction between various forms of influence and social capital see Jackson (2020).
For instance, there are settings in which some attributes of nodes (e.g., size) may make them more central or influential, and so anonymity is inappropriate (e.g., see Jackson and Pernoud (2019)). To keep the discussion uncluttered, we focus on the anonymous measures, but the main points that we make extend to weighted versions of centrality measures.
This is somewhat reminiscent of König et al. (2014) who show that many centrality rankings coincide in nested-split graphs, which have a strong hierarchical form. Trees admit more variation, and so the characterization here provides new insight, especially as it helps us understand when nodal statistics coincide.
Some centrality measures proscribe self-loops and so we can adopt the convention that \(g_{ii}=0\); but again, the main results do not require such an assumption.
We define centrality measures as cardinal functions, since that is the way they are all defined in the literature, and are typically used in practice. Of course, any cardinal measure also induces an ordinal ranking, and sometimes cardinal measures are used to identify rankings.
Jackson (2008, Chapter 2.2) provides detailed history and references.
In the case of directed networks, there are both indegree and outdegree versions, which have different interpretations as to how much node i can either receive or broadcast, depending on the direction.
Decay centrality is also defined for \(\delta \notin [0,1]\), but then the interpretation of it as capturing decay is no longer valid.
In the limit, as \(\delta \rightarrow 0\), this places weight only on shortest paths, and then becomes closer to decay centrality, at least in trees.
\(\textbf{1}\) denotes the n-dimensional vector of 1s, and \(\textbf{I}\) is the identity matrix. Invertibility holds for small enough \(\delta\) (less than the inverse of the magnitude of the largest eigenvalue).
In a variation proposed by Bonacich there is a second parameter \(\beta\) that rescales: \(\textbf{c}^{KB} ({\textbf{g}}, \delta , \beta ) = (\textbf{I} - \delta {\textbf{g}})^{-1} \beta {\textbf{g}} \textbf{1}.\) Since the scaling is inconsequential, we ignore it.
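The rescaling remark can be checked numerically. The following is a minimal sketch, assuming the base measure is \((\textbf{I} - \delta {\textbf{g}})^{-1} \delta {\textbf{g}} \textbf{1}\) and approximating the inverse by its power series, which converges only when \(\delta\) is below the inverse of the largest eigenvalue. The function names and the example network are illustrative and are not taken from the paper.

```python
def mat_vec(g, x):
    """Multiply the adjacency matrix g (list of rows) by the vector x."""
    return [sum(g_ij * x_j for g_ij, x_j in zip(row, x)) for row in g]


def katz_bonacich(g, delta, beta=None, tol=1e-12, max_iter=10_000):
    """Approximate (I - delta*g)^{-1} (beta*g) 1 via sum_{k>=0} delta^k g^k (beta g 1).

    With beta=None the rescaling parameter is set equal to delta
    (an assumption about the unscaled measure).
    """
    if beta is None:
        beta = delta
    n = len(g)
    term = [beta * v for v in mat_vec(g, [1.0] * n)]  # k = 0 term: beta * g * 1
    total = term[:]
    for _ in range(max_iter):
        term = [delta * v for v in mat_vec(g, term)]  # next power of delta * g
        total = [t + u for t, u in zip(total, term)]
        if max(abs(u) for u in term) < tol:
            return total
    raise ValueError("series did not converge; delta may be too large")


# Hypothetical example: a star with center 0 and two leaves.
star = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
unscaled = katz_bonacich(star, 0.5)
rescaled = katz_bonacich(star, 0.5, beta=1.0)
```

In this example the rescaled vector is a constant multiple of the unscaled one, so the induced ranking of nodes is the same.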
\(\lambda ^{\max }({\textbf{g}})\) is positive when \({\textbf{g}}\) is nonzero (recalling that it is a nonnegative matrix), the associated vector is nonnegative, and for a connected network the associated eigenvector is positive and unique up to a rescaling (by the Perron-Frobenius Theorem).
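As an illustration of the positivity and uniqueness property, here is a hedged sketch of eigenvector centrality computed by power iteration on a connected undirected network. Iterating on \({\textbf{g}} + \textbf{I}\) rather than \({\textbf{g}}\) is a choice made here (not something stated in the paper): it leaves the eigenvectors unchanged and avoids oscillation on bipartite networks such as trees. The normalization to sum one is also an assumption, since the eigenvector is only defined up to rescaling.

```python
def eigenvector_centrality(g, tol=1e-12, max_iter=100_000):
    """Leading eigenvector of a connected, nonnegative, symmetric adjacency matrix.

    Uses power iteration on g + I and normalizes the entries to sum to one.
    """
    n = len(g)
    x = [1.0 / n] * n
    for _ in range(max_iter):
        # One step of (g + I) x.
        y = [x[i] + sum(g[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        y = [v / total for v in y]
        if max(abs(a - b) for a, b in zip(x, y)) < tol:
            return y
        x = y
    raise ValueError("power iteration did not converge")


# Hypothetical example: a star with center 0 and two leaves.
star = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
centrality = eigenvector_centrality(star)
```

For the star, the largest eigenvalue is \(\sqrt{2}\), and the center's entry is \(\sqrt{2}\) times each leaf's entry.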
This is related in spirit to basic epidemiological models (e.g., see Bailey (1975)), as well as the cascade model of Kempe et al. (2003) that allowed for thresholds of adoption (so that an agent cares about how many neighbors have adopted). A variation of the cascade model leads to a centrality measure introduced by Lim et al. (2015) called cascade centrality, which is related to the communication centrality of Banerjee et al. (2013) and the decay centrality of Jackson (2008). Diffusion centrality differs from these other measures in that it is based on walks rather than paths, which makes it easier to relate to Katz–Bonacich centrality and eigenvector centrality as discussed in Banerjee et al. (2013) and formally shown in Banerjee et al. (2019). Nonetheless, diffusion centrality is representative of a class of measures built on the premise of how much diffusion one gets from various nodes, with variations in how the process is modeled (e.g., see Bramoullé and Genicot (2018)). These are also used as inputs into other measures, such as that of Kermani et al. (2015), which combine information from a variety of centrality measures.
Note that they work directly with a weighted directed network. Thus, their \(\lambda _1=\delta \lambda ^{\max }({\textbf{g}})\).
See Ercsey-Ravasz et al. (2012) for some truncated measures.
This concept is first defined in Nieminen (1973) in discussing a directed centrality notion, and he refers to the neighborhood statistic as the subordinate vector.
The entries of the s’s may not sum to one, so this is not always a form of stochastic dominance, but it is defined analogously when the s’s have the same sum.
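A minimal sketch of such a dominance comparison follows, assuming that \(s \succeq s^{\prime}\) means every partial sum of \(s\) is at least the corresponding partial sum of \(s^{\prime}\). This reading is an assumption made for illustration (it matches the cumulative counts used in the proofs in the appendix); the function names are hypothetical.

```python
from itertools import accumulate


def dominates(s, t):
    """True if every partial sum of s is at least the matching partial sum of t."""
    if len(s) != len(t):
        raise ValueError("statistics must have the same length")
    return all(a >= b for a, b in zip(accumulate(s), accumulate(t)))


def strictly_dominates(s, t):
    """True if s dominates t and t does not dominate s."""
    return dominates(s, t) and not dominates(t, s)


# Neighborhood statistics in a line with three nodes:
middle = [2, 0]  # two nodes at distance 1
end = [1, 1]     # one node at distance 1, one at distance 2
comparison = strictly_dominates(middle, end)
```

Because the relation is only a partial order, two statistics such as (3, 0, 0) and (1, 1, 2) can be incomparable: the first has larger early partial sums while the second has the larger total.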
If \(L=\infty\), then when writing \(s_{i}^{\ell +1},\ldots ,s_{i}^{L},0,\ldots , 0\) below simply ignore the trailing 0’s.
It also implies monotonicity, but since we use monotonicity to establish the aggregator function on which additivity is stated, we maintain it as a separate condition in the statement of the theorem.
For smaller \(\delta\) diffusion centrality coincides with Katz–Bonacich centrality, and so exactly at the inverse of the largest eigenvalue, Katz–Bonacich and eigenvector centrality converge. This presumes that there is a unique first eigenvector, which holds if the adjacency matrix is primitive (e.g., see Jackson (2008)). Bonacich (2007) discusses some interesting properties of eigenvector centrality and how it can differ on signed and other networks, which violate these conditions.
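The relation between diffusion centrality and Katz–Bonacich centrality can be illustrated with a short sketch. It assumes the definition \(\sum _{t=1}^{T} (\delta {\textbf{g}})^{t} \textbf{1}\) for diffusion centrality, as in Banerjee et al. (2013); the function name and the example network are illustrative.

```python
def diffusion_centrality(g, delta, T):
    """Sum over t = 1..T of (delta * g)^t applied to the vector of ones."""
    n = len(g)
    term = [1.0] * n
    total = [0.0] * n
    for _ in range(T):
        # Multiply the previous term by delta * g.
        term = [delta * sum(g[i][j] * term[j] for j in range(n)) for i in range(n)]
        total = [a + b for a, b in zip(total, term)]
    return total


# Hypothetical example: a star with center 0 and two leaves; its largest
# eigenvalue is sqrt(2), so delta = 0.5 is below the inverse of that eigenvalue.
star = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
one_step = diffusion_centrality(star, 0.5, 1)     # proportional to degree
many_steps = diffusion_centrality(star, 0.5, 200)  # close to the infinite sum
```

With \(T=1\) the measure is proportional to degree, and for \(\delta\) below the inverse of the largest eigenvalue the sum converges as T grows, which is the sense in which it approaches the Katz–Bonacich series.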
Alternatively, we could define a closeness statistic \(cl_i({\textbf{g}})= (cl_i^1({\textbf{g}}),\ldots , cl_i^\ell ({\textbf{g}}),\ldots ,cl_i^{n-1}({\textbf{g}}))\), the vector such that \(cl_i^\ell ({\textbf{g}}) = \frac{n_i^\ell ({\textbf{g}}) }{\ell }\) for each \(\ell =1,2,\ldots ,n-1\), tracking nodes at different distances from a given node i, weighted by the inverse of those distances, and add another row. But this would build some of the weighting into the nodal statistics, and it is cleaner, pedagogically, to keep the two separate.
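To make the distinction between the statistic and its weighting concrete, the sketch below computes the neighborhood statistic \(n_i^\ell\) (the number of nodes at distance \(\ell\) from i) by breadth-first search, then derives the closeness statistic \(n_i^\ell /\ell\) and a decay-style weighted sum \(\sum _\ell \delta ^\ell n_i^\ell\) from it. The adjacency-list representation and the function names are assumptions made for this illustration.

```python
from collections import deque


def neighborhood_statistic(adj, i):
    """Return [n_i^1, ..., n_i^{n-1}]: counts of nodes at each distance from i."""
    n = len(adj)
    dist = {i: 0}
    queue = deque([i])
    while queue:  # breadth-first search from i
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    counts = [0] * (n - 1)
    for d in dist.values():
        if d >= 1:
            counts[d - 1] += 1
    return counts


def closeness_statistic(adj, i):
    """Neighborhood counts weighted by the inverse of the distance."""
    return [c / ell for ell, c in enumerate(neighborhood_statistic(adj, i), start=1)]


def decay_centrality(adj, i, delta):
    """Weighted sum of neighborhood counts with weights delta^ell."""
    stat = neighborhood_statistic(adj, i)
    return sum(delta ** ell * c for ell, c in enumerate(stat, start=1))


# Hypothetical example: the line 0 - 1 - 2 - 3.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
end_stat = neighborhood_statistic(line, 0)     # one node at each distance
interior_stat = neighborhood_statistic(line, 1)
```

Keeping `neighborhood_statistic` free of weights means the same counts can feed several measures that differ only in how distance is weighted.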
Without the restriction that \(L=n-1\) one can get additional statistics that repeat entries. For instance, instead of having the neighborhood statistics \((n_i^1({\textbf{g}}), n_i^2({\textbf{g}}), n_i^3({\textbf{g}}), \ldots , n_i^{n-1}({\textbf{g}}))\), one can also get other statistics such as \((n_i^1({\textbf{g}}), n_i^1({\textbf{g}}), n_i^2({\textbf{g}}), n_i^2({\textbf{g}}), n_i^3({\textbf{g}}), n_i^3({\textbf{g}}),\ldots , n_i^{n-1}({\textbf{g}}), n_i^{n-1}({\textbf{g}}))\), which duplicates entries.
Garg’s paper was never completed, and so the axiomatizations are not full characterizations and/or are without proof. Nonetheless some of the axioms in his paper are of interest.
König et al. (2014) prove that degree, closeness, betweenness and eigenvector centrality generate the same ranking on nodes for nested-split graphs, which are a very structured hierarchical form of network (for which all nodal statistics will provide the same orderings, and so the techniques here would provide an alternative proof technique). As noted above, Sadler (2022) investigates situations in which ordinal centrality measures coincide.
The other conditions guarantee that all leaves’ distances from the root differ by no more than one from each other. However, a line with an even number of nodes shows that there will be no well-defined root node that is more central than other nodes, and such examples are ruled out by this condition.
Note that this condition cannot be in conflict with the previous one, as it would violate the ordering of k and l. This latter condition only adds to the definition when i and j have the same immediate predecessor.
Without this condition, there are examples of trees that violate being a monotone hierarchy because of the leaf condition, but still have all nodes being comparable in terms of their neighborhood structures.
Proposition 2 shows a reversal of the partial order \(\succeq\). If the trees are irregular in having closer nodes have lower degree and farther nodes having higher degree, then one can get a reversal of \(\succ\), so that \(s_i \succ s_j\) and \(s^{\prime }_j \succ s^{\prime }_i\).
Even though the Shapley value satisfies an additivity axiom, it is an additivity across value functions and not across nodal statistics; and so does not translate here.
Note, for instance, that a convex combination of nodal statistics generates a different centrality measure from a convex combination of the measures. This opens interesting questions for future research.
In the case of a threshold model, as multiple seeds are needed to initiate any cascade in many networks, one could construct a centrality measure by assuming that k other seeds are distributed at random on all other nodes, and then examine the marginal value of a particular node.
It would generally make sense to have the \(\beta _\ell\) be a nonincreasing function of \(\ell\). The presence of the \(\alpha _{\ell }\)s ensures that there is no excessive penalty for having \(s_i^\ell =0\) for some \(\ell\).
Note that even the ordering produced by this class of measures is equivalent to ordering nodes according to \(\sum _{\ell =1}^{L} \beta _{\ell } \log (\alpha _{\ell } + s_{i}^{\ell })\). This is an additive form, with nodal statistics \(\beta _{\ell } \log (\alpha _{\ell } + s_{i}^{\ell })\). This shows that it can be challenging to escape the additive family. Nonetheless, this is a new and potentially interesting family prompted by our analysis.
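The ordering equivalence can be illustrated with a small sketch. It assumes, purely for illustration, that the class in question takes the multiplicative form \(\prod _{\ell } (\alpha _{\ell } + s_{i}^{\ell })^{\beta _{\ell }}\); the parameter values and node labels below are hypothetical.

```python
import math


def multiplicative_score(s, alpha, beta):
    """Product over ell of (alpha_ell + s_ell) ** beta_ell (assumed form)."""
    return math.prod((a + x) ** b for x, a, b in zip(s, alpha, beta))


def log_additive_score(s, alpha, beta):
    """Sum over ell of beta_ell * log(alpha_ell + s_ell)."""
    return sum(b * math.log(a + x) for x, a, b in zip(s, alpha, beta))


def ranking(stats, score, alpha, beta):
    """Node labels sorted from highest to lowest score."""
    return sorted(stats, key=lambda node: score(stats[node], alpha, beta), reverse=True)


# Hypothetical nodal statistics and parameters.
stats = {"a": [2, 0], "b": [1, 1], "c": [1, 0]}
alpha = [1.0, 1.0]  # positive alphas keep the score finite when an entry is 0
beta = [1.0, 0.5]   # nonincreasing in ell
same_order = ranking(stats, multiplicative_score, alpha, beta) == ranking(
    stats, log_additive_score, alpha, beta
)
```

Since the logarithm is strictly increasing, the two scores always rank nodes identically whenever every \(\alpha _{\ell } + s_{i}^{\ell }\) is positive.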
In addition, diffusion centrality has \(T=5\) in all of the simulations.
See Schoch et al. (2017) for some discussion of how correlation varies with network structure.
References
Ashtiani M, Salehzadeh-Yazdi A, Razaghi-Moghadam Z, Hennig H, Wolkenhauer O, Mirzaie M, Jafari M (2018) A systematic survey of centrality measures for protein–protein interaction networks. BMC Syst Biol 12:1–17
Bailey N (1975) The mathematical theory of infectious diseases. Griffin, London
Ballester C, Calvó-Armengol A, Zenou Y (2006) Who’s who in networks, wanted: the key player. Econometrica 74:1403–1417
Banerjee A, Chandrasekhar A, Duflo E, Jackson MO (2013) Diffusion of microfinance. Science. https://doi.org/10.1126/science.1236498
Banerjee A, Chandrasekhar AG, Duflo E, Jackson MO (2019) Using gossips to spread information: theory and evidence from two randomized controlled trials. Rev Econ Stud 86:2453–2490
Bavelas A (1950) Communication patterns in task-oriented groups. J Acoust Soc Am 22:725–730
Boldi P, Vigna S (2014) Axioms for centrality. Internet Math 10:222–262
Bollen J, Van de Sompel H, Hagberg A, Chute R (2009) A principal component analysis of 39 scientific impact measures. PLoS ONE 4:e6022
Bonacich P (1972) Factoring and weighting approaches to status scores and clique identification. J Math Sociol 2(1):113–120
Bonacich P (1987) Power and centrality: a family of measures. Am J Sociol 92:1170–1182
Bonacich P (2007) Some unique properties of eigenvector centrality. Soc Netw 29:555–564
Borgatti SP (2005) Centrality and network flow. Soc Netw 27:55–71
Borgatti SP, Everett MG (2006) A graph-theoretic perspective on centrality. Soc Netw 28:466–484
Bramoullé Y, Genicot G (2018) Diffusion centrality: Foundations and extensions. Université Aix-Marseille
van den Brink R, Gilles RP (2003) Ranking by outdegree for directed graphs. Discret Math 271:261–270
Christakis N, Fowler J (2010) Social network sensors for early detection of contagious outbreaks. PLoS ONE 5(9):e12948. https://doi.org/10.1371/journal.pone.0012948
Ciardiello F (2018) Interorganizational information sharing with negative externalities, mechanisms and networks: A fair value. Sheffield University Management School, mimeo
Dasaratha K (2020) Distributions of centrality on networks. Games Econom Behav 122:1–27
Debreu G (1960) Topological methods in cardinal utility theory. Mathematical Methods in the Social Sciences, Stanford University Press, 16–26
Dequiedt V, Zenou Y (2017) Local and consistent centrality measures in networks. Math Soc Sci 88:28–36
Ercsey-Ravasz M, Lichtenwalter RN, Chawla NV, Toroczkai Z (2012) Range-limited centrality measures in complex networks. Phys Rev E 85:066103
Freeman L (1977) A set of measures of centrality based on betweenness. Sociometry 40(1):35–41
Garg M (2009) Axiomatic foundations of centrality in networks. unpublished
Gofman M (2015) Efficiency and stability of a financial architecture with too-interconnected-to-fail institutions. Forthcoming at the Journal of Financial Economics
Golub B, Jackson MO (2010) Naive learning in social networks and the wisdom of crowds. Am Econ J Microecon 2:112–149
Gomez D, Gonzalez-Aranguena E, Manuel C, Owen G, Pozo MD, Tejada J (2003) Centrality and power in social networks: A game theoretic approach. Math Soc Sci 46(1):27–54
Hahn Y, Islam A, Patacchini E, Zenou Y (2015) Teams, organization and education outcomes: Evidence from a field experiment in Bangladesh. CEPR Discussion Paper No. 10631
Henriet D (1985) The Copeland choice function: an axiomatic characterization. Soc Choice Welfare 2:49–63
Jackson MO (2008) Social and economic networks. Princeton University Press, Princeton
Jackson MO (2019) The friendship paradox and systematic biases in perceptions and social norms. J Polit Econ 127:777–818
Jackson MO (2019) The human network: how your social position determines your power, beliefs, and behaviors. Pantheon Books, New York
Jackson MO (2020) A typology of social capital and associated network measures. Soc Choice Welfare 54(2):311–336
Jackson MO, Pernoud A (2019) Investment incentives and regulation in financial networks. SSRN working paper http://ssrn.com/abstract=3311839
Jackson MO, Wolinsky A (1996) A strategic model of social and economic networks. J Econ Theory 71:44–74
Jackson MO, Zenou Y (2014) Games on networks. In: Young HP, Zamir S (eds) Handbook of game theory. Elsevier
Katz L (1953) A new status index derived from sociometric analysis. Psychometrika 18:39–43
Kempe D, Kleinberg J, Tardos E (2003) Maximizing the Spread of Influence through a Social Network. Proc. 9th Intl. Conf. on Knowledge Discovery and Data Mining, 137–146
Kermani MAMA, Badiee A, Aliahmadi A, Ghazanfari M, Kalantari H (2015) Introducing a procedure for developing a novel centrality measure (Sociability Centrality) for social networks using TOPSIS method and genetic algorithm. in press
König M, Tessone C, Zenou Y (2014) Nestedness in networks: a theoretical model and some applications. Theor Econ 9(3):695–752
Koopmans TC (1960) Stationary ordinal utility and impatience. Econometrica 28:287–309
Laslier JF (1997) Tournament solutions and majority voting. Springer-Verlag, Berlin
Li D, Schürhoff N (2019) Dealer networks. J Financ 74:91–144
Lim Y, Ozdaglar A, Teytelboym A (2015) A simple model of cascades in networks. mimeo
Michalak TP, Aadithya KV, Szczepanski PL, Ravindran B, Jennings NR (2013) Efficient computation of the Shapley value for game-theoretic network centrality. J Artif Intell Res 46:607–650
Molinero X, Riquelme F, Serna M (2013) Power indices of influence games and new centrality measures for social networks. Arxiv
Myerson RB (1977) Graphs and cooperation in games. Math Oper Res 2:225–229
Nieminen U (1973) On the centrality in a directed graph. Soc Sci Res 2:371–378
Padgett J, Ansell C (1993) Robust action and the rise of the Medici 1400–1434. Am J Sociol 98(6):1259–1319
Page L, Brin S, Motwani R, Winograd T (1998) The pagerank citation ranking: bringing order to the web. Technical Report: Stanford University
Palacios-Huerta I, Volij O (2004) The measurement of intellectual influence. Econometrica 72:963–977
Rochat Y (2009) Closeness centrality extended to unconnected graphs: The harmonic centrality index. mimeo.
Rubinstein A (1980) Ranking the participants in a tournament. J SIAM Appl Math 38:108–111
Sabidussi G (1966) The centrality index of a graph. Psychometrika 31:581–603
Sadler E (2022) Ordinal centrality. J Polit Econ 130:926–955
Schoch D, Brandes U (2016) Reconceptualizing centrality in social networks. Eur J Appl Math 27:971–985
Schoch D, Valente TW, Brandes U (2017) Correlations among centrality indices and a class of uniquely ranked graphs. Soc Netw 50:46–54
Slutzki G, Volij O (2006) Scoring of web pages and tournaments: axiomatizations. Soc Choice Welfare 26:75–92
Valente TW, Coronges K, Lakon C, Costenbader E (2008) How correlated are network centrality measures? Connect (Tor) 28:16–26
van den Brink R, Gilles RP (2000) Measuring domination in directed networks. Soc Netw 22:141–157
Wasserman S, Faust K (1994) Social network analysis. Cambridge University Press, Cambridge
Funding
Francis Bloch benefitted from funding from the Agence Nationale de la Recherche under contract 13BSH10010. Matthew O. Jackson received funding from NSF grants SES0961481, SES1155302, SES2018554, ARO MURI Award No. W911NF1210509, and from grant FA95501210411 from the AFOSR and DARPA.
Ethics declarations
Conflict of interest
The authors have no financial or non-financial interests to declare.
We benefitted from comments by Gabrielle Demange, JeanJacques Herings, Arturo Marquez Flores, Alex Teytelboym, many seminar participants, as well as an editor and referees. We gratefully acknowledge financial support from the NSF grants SES0961481, SES1155302, SES2018554, ARO MURI Award No. W911NF1210509, and from grant FA95501210411 from the AFOSR and DARPA.
Appendices
Appendix: proofs
Proof of Lemma 1
The if part is clear, and so we show the only if part.
Suppose that \(s_i({\textbf{g}}) = s_i({\textbf{g}}^{\prime })\) but \(c_i({\textbf{g}}) \ne c_i({\textbf{g}}^{\prime })\). Without loss of generality let \(c_i({\textbf{g}})>c_i({\textbf{g}}^{\prime })\). By monotonicity, since \(c_i({\textbf{g}})>c_i({\textbf{g}}^{\prime })\) it must be that \(s_i({\textbf{g}}^{\prime }) \nsucceq s_i({\textbf{g}})\). However, this contradicts the fact that \(s_i({\textbf{g}}) = s_i({\textbf{g}}^{\prime })\) (which implies that \(s_i({\textbf{g}}) \sim s_i({\textbf{g}}^{\prime })\) by reflexivity of a partial order). Thus, \(c_i({\textbf{g}}) = c_i({\textbf{g}}^{\prime })\) for any \({\textbf{g}},{\textbf{g}}^{\prime }\) for which \(s_i({\textbf{g}}) = s_i({\textbf{g}}^{\prime })\). Letting S denote the range of \(s_i({\textbf{g}})\) (which is the same for all i by anonymity), it follows that there exists \({\mathcal {C}}_i:S\rightarrow {\mathbb{R}}\) for which \(c_i({\textbf{g}})={\mathcal {C}}_i(s_i({\textbf{g}}))\) for any \({\textbf{g}}\). Moreover \({\mathcal {C}}_i\) must be a monotone function on S, given the monotonicity of \(c_i\).
Next, we show that \({\mathcal {C}}_i={\mathcal {C}}_j\) for any i, j. Consider any \(s'\in S\) and any two nodes i and j. Since \(s'\in S\), it follows that there exists \({\textbf{g}}\) for which \(s_i({\textbf{g}})=s'\). Consider a permutation \(\pi\) such that \(\pi (j)=i\) and \(\pi (i)=j\). Then by the anonymity of s, \(s' =s_j({\textbf{g}}\circ \pi )\). Thus, by anonymity of c, \(c_i({\textbf{g}}) = c_j({\textbf{g}}\circ \pi )\) and so \({\mathcal {C}}_i(s')={\mathcal {C}}_j(s')\). Given that \(s'\) was arbitrary, it follows that \({\mathcal {C}}_i={\mathcal {C}}_j={\mathcal {C}}\) for some \({\mathcal {C}}:S\rightarrow {\mathbb{R}}\) and all i, j.
We extend the function \({\mathcal {C}}\) to be monotone on all of \({\mathbb{R}} ^L\) as follows. Let \(S_1\) be the set of \(s\notin S\) such that there exists some \(s'\in S\) for which \(s'\ge s\). For any \(s\in S_1\) let \({\mathcal {C}}(s)=\inf _{s'\in S, s'\ge s} {\mathcal {C}}(s')\). Next, let \(S_2\) be the set of \(s\notin S \cup S_1\). For any \(s\in S_2\) let \({\mathcal {C}}(s)=\sup _{s'\in S \cup S_1, s'\le s} {\mathcal {C}}(s')\) (and note that this is well defined for all \(s\in S_2\) since there is always some \(s'\in S \cup S_1\) with \(s'\le s\)). This is also monotone, by construction. \(\square\)
IF part:
It is easily checked that if \({\mathcal {C}}\) can be expressed as in equation (2), then independence holds. Similarly, if the representation is as in (3), then recursivity also holds, as does additivity if (4) is satisfied.
ONLY IF part:
Let \(\textbf{e}^{\ell }\) denote the vector in \({\mathbb {N}}^{L}\) with every \(\textbf{e}_{\ell }^{\ell }=1\) and \(\textbf{e}_{j}^{\ell }=0\) for all \(j\not =\ell\). Define \(F_{\ell }:{\mathbb {R}}\rightarrow {\mathbb {R}}_{+}\) as
Iterated applications of independence imply that
To see this, note that for \(s_{i}=\left( x,0,\ldots ,0,\ldots ,\right)\), \(s_{i}^{\prime }=\left( 0,y,0,\ldots ,0,\ldots ,\right)\), and \(s_{i}^{\prime \prime }=\left( x,y,0,\ldots ,0,\ldots ,\right)\), independence requires
Doing this again for \(s_{i}=\left( x,y,0,0,\ldots ,0,\ldots ,\right)\), \(s_{i}^{\prime }=\left( x,0,z,0,\ldots ,0,\ldots ,\right)\), and \(s_{i}^{\prime \prime }=\left( x,y,z,0,\ldots ,0,\ldots ,\right)\), independence requires
By induction, this holds for arbitrary vectors. Monotonicity implies that \(F_{\ell }\) is increasing and \(F_{\ell }\left( 0\right) =0\) for all \(\ell\).
By recursivity, for all \(\ell \le L\) and all x in \({\mathbb {R}}\):
Moreover, recursivity also implies that for any two \(x,x^{\prime }\) in \({\mathbb{R}}\) and any \(\ell\) (provided the denominators are not 0):
From (9) this is true only if \(\delta \left( x\right) \equiv \delta\) is constant.
Next, note that (10), together with the fact \(F_\ell (0)=0\) for all \(\ell\), imply that \(F_\ell (x)=\delta ^\ell f(x)\) for a common f for which \(f(0)=0\). This implies that \({\mathcal {C}}\) can be written as in equation (3).
Finally, additivity—which clearly implies independence—then implies that f is linear (a standard result in vector spaces), and given that it must be that \(f(0)=0\), the final characterization follows. \(\square\)
Proof of Theorems 4 and 5. The “if” parts of both theorems are straightforward, and so we prove the “only if” claims, beginning with Theorem 4.
From Theorem 3 it follows that
The strictness of sensitivity implies that \(\delta >0\) and \(a>0\). It suffices to show that \(s_i({\textbf{g}})\) must be a weighted neighborhood statistic that is \(\delta\)-monotone.
Cycle independence implies that for any \({\textbf{g}}\), \(c_i({\textbf{g}}) = c_i({\textbf{g}}')\) where \({\textbf{g}}'\) is any minimal i-centered subtree of \({\textbf{g}}\). Thus, it suffices to prove the characterization for trees.
Let us construct a series of line networks, with i at one extreme and with k links, denoted \(g^k\). Iteratively, for \(k=1\) to \(n-1\) define
It follows directly (noting that \(n_{i}^{\ell }\left( g^k\right) =1\) for \(\ell \le k\) and 0 otherwise) that
Any i-centered tree \({\textbf{g}}\) with depth (maximum distance to i) of k can be built from \(g^k\) by a successive addition of links. Iteratively building \({\textbf{g}}\) from \(g^k\) by adding links that connect to the tree present at each step, and applying the constant marginal values condition at each step, then implies that
Finally, comparing \(c_i(g^k)\) to \(c_i(g^{k-1}+ hj)\) where h is at distance \(k-2\) and j is not in the network \(g^{k-1}\), implies that \(\delta ^{\ell -1} w_i^\ell > \delta ^{\ell } w_i^{\ell +1}\).
The proof of Theorem 5 comes from changing the final step above, which implies instead that \(\delta ^{\ell -1} w_i^\ell =0\) for all \(\ell >1\). \(\square\)
Proof of Proposition 1
[IF] We first prove the ‘if’ part. Consider a monotone hierarchy and two nodes i, j.
First, suppose that \(i \doteq j\). From the definition of \(\doteq\) it must be that i and j are at the same level of the hierarchy, and that any subgraph containing i starting at the same level as some subgraph containing j are identical (up to the labeling of the nodes). Hence i and j are symmetric, and \(n_i({\textbf{g}}) = n_j({\textbf{g}})\).
So, to complete the proof of the ‘if’ part, it is sufficient to show that if \(i \gtrdot j\) then \(n_i \succ n_j\).
Consider two nodes at the same distance from the root.
If they are at distance 1, then they have the same distance to each node that is not a successor of either node. Given the definition of monotone hierarchy, it must be that either \(d_i>d_j\) or that \(d_i=d_j\) and \(d_k > d_l\) for some successor k of i and any successor l of j such that \(\rho (k)= \rho (l)\). In either case the definition implies that then \(d_k \ge d_l\) for every successor k of i and successor l of j such that \(\rho (k)= \rho (l)\). It directly follows that \(n_i \succ n_j\).
Inductively, if \(\rho (i) = \rho (j)>1\):

If there exist two distinct predecessors k, l of i, j, respectively, such that \(\rho (k) = \rho (l)\) and \(k \gtrdot l\), then \(i \gtrdot j\), and the ordering holds given the ordering of those predecessors and that their neighborhoods are determined by those predecessors.

Otherwise, they follow from a common immediate predecessor and differ only in the subgraphs starting from them, and the condition follows from the reasoning above given the differences in those subgraphs, which must be ordered.
Next, suppose that \(\rho (j) = \rho (i)+1\). We show that \(n_i({\textbf{g}}) \succ n_j({\textbf{g}})\).
For this part of the proof we provide a formula to compute the number of nodes at distance less than or equal to d from node i for a monotone hierarchy, Q(i, d). Let \(\rho (i)\) denote the distance from the root and \(i_0,i_1,..,i_k,\ldots ,i_{\rho (i)}=i\) the unique path between the root and node i. Let \(p(i,\ell )\) denote the number of successors of node i at distance \(\ell\). If \(d \ge \rho (i)\), we compute the number of nodes at distance less than or equal to d as
To understand this computation, notice that all nodes which are at distance less than or equal to \(d-\rho (i)\) from the root are at a distance less than d from node i. Other nodes at a distance less than d from node i are computed considering the path between \(i_0\) and i. Fix \(i_1\). There are successor nodes which are at distance \(d-\rho (i)\) from node \(i_1\) (and hence at a distance \(d-1\) from i) and were not counted earlier because they are at a distance of \(d-\rho (i)+1\) from the root, and successor nodes which are at a distance \(d-\rho (i)+1\) from node \(i_1\) (and hence at a distance d from i) and were not counted earlier because they are at a distance \(d- \rho (i)+2\) from the root. Continuing along the path, for any node \(i_k\) we count successor nodes at a distance \(d-\rho (i)+k-1\) and \(d-\rho (i)+k\) from node \(i_k\) which are at a distance \(d-1\) and d from node i and were not counted earlier, and finally obtain the total number of nodes at a distance less than or equal to d from node i.
Next suppose that \(d \le \rho (i)\). In that case, no node beyond \(i_0\) who does not belong to the subtree starting at \(i_1\) can be at a distance smaller than d. The expression for the number of nodes at a distance less than or equal to d simplifies to
The following claim is useful.
Claim 1
In a monotone hierarchy, for any i, j such that \(\rho (j)=\rho (i)+1\), and any \(\ell\), \(p(i,\ell ) \ge p(j,\ell )\).
Proof of the Claim
The proof is by induction on \(\ell\). For \(\ell =1\), the statement is true as \(p(i,1) \equiv d_i-1 \ge d_j-1 \equiv p(j,1)\). Suppose that the statement is true for all \(\ell ^{\prime }<\ell\). Let \(i_1,\ldots ,i_I\) be the direct successors of i and \(j_1,\ldots ,j_J\) the direct successors of j, with \(J \le I\). Then
where the first inequality is due to the fact that \(I \ge J\) and the second that, by the induction hypothesis, as \(\rho (i_r) = \rho (j_r)-1\) for all r, \(p(i_r,\ell ^{\prime }-1) \ge p(j_r, \ell ^{\prime }-1)\). \(\square\)
Consider \(d \ge \rho (i)+1\) and \(i_0,i_1,\ldots ,i_r,\ldots ,i_{\rho (i)}\), \(i_0,j_1,\ldots ,j_r,\ldots ,j_{\rho (i)+1}\) the paths linking i and j to the root. Then
Note that \(p(i_0, d-\rho (i))= p(j_1, d-\rho (i)-1) + \sum _{k \ne j_1, \rho (k)=1} p(k, d- \rho (i)-1)\) and that \(p(j_1, d- \rho (i)) =\sum _{l: \rho (l)=2, \rho (j_1,l)=1} p(l, d- \rho (i)-1)\). By Claim 1, as \(\rho (l)= \rho (k)+1\), \(p(l, d- \rho (i)-1) \le p(k, d- \rho (i)-1)\) and as \(d(j_1) \le d(i_0)-1\), \(\sum _{k \ne j_1, \rho (k)=1} p(k, d- \rho (i)-1) \ge \sum _{l: \rho (l)=2, \rho (j_1,l)=1} p(l, d- \rho (i)-1)\). Furthermore, by Claim 1, for all r and all d, \(p(i_r,d) \ge p(j_{r+1},d)\), so that \(Q(i,d)-Q(j,d) \ge 0\).
Next, consider \(d \le \rho (i) < \rho (i)+1\). Then
and by a direct application of Claim 1, \(Q(i,d)-Q(j,d) \ge 0\).
We finally observe that there always exists a distance d such that \(Q(i,d)>Q(j,d)\). Let h be the total number of levels in the hierarchy. Consider a distance d such that \(h=d + \rho (i)\). Then there exist successor nodes at distance d from i but no successor nodes at distance d from j. Hence \(p(i,d)>0 = p(j,d)\). This establishes that \(Q(i,d) > Q(j,d)\) and hence \(n_i({\textbf{g}}) \succ n_j({\textbf{g}})\). By a repeated application of the same argument, for any i, j such that \(\rho (i) < \rho (j)\), that is, such that \(\rho (i,i_0) < \rho (j,i_0)\), \(n_i({\textbf{g}}) \succ n_j({\textbf{g}})\).
[ONLY IF]: Suppose that the tree \({\textbf{g}}\) is not a monotone hierarchy and has an even diameter. Consider a line in the tree which has the same length as the diameter of the tree. Pick as a root the unique middle node in the line and let h be the maximal distance between the root and a terminal node.
First consider the case in which there exist two nodes i and j such that \(\rho (j)= \rho (i)+1\) but \(d_j > d_i\). Then clearly \(Q(j,1) > Q(i,1)\). Notice that all nodes are at a distance less than or equal to \(d=h+\rho (i)\) from node i whereas there exist nodes which are at a distance \(h+ \rho (i)+1\) from node j, and hence \(Q(j,h+\rho (i)) < Q(i,h+ \rho (i))\) so that neither \(n_i \succeq n_j\) nor \(n_j \succeq n_i\).
Next suppose that for all nodes i, j such that \(\rho (j)=\rho (i)+1\), \(d_j \le d_i\), but that there exist two nodes i, j at the same level of the hierarchy such that \(d_i > d_j\) and two successors of i and j, k and l, at the same level of the hierarchy such that \(d_k < d_l\). Because \(d_i > d_j\), \(Q(i,1)>Q(j,1)\). Suppose that \(n_i \succ n_j\). Then \(Q(i,d) > Q(j,d)\) for all \(d=1,2,\ldots ,h+ \rho (i)-1\). Now consider the two successors k and l of i and j. As \(d_k < d_l\), \(Q(k,1) < Q(l,1)\). Now count all the nodes which are at a distance less than or equal to \(h+\rho (k)-1\) from k, \(Q(k,h+\rho (k)-1)\). This includes all the nodes but the nodes which are at maximal distance from k. As k is a successor of i, the sets of nodes at maximal distance from k and i are equal, so that \(Q(k,h+\rho (k)-1) =Q(i,h+\rho (i)-1)\). Similarly, the sets of nodes at maximal distance from j and l are equal and \(Q(l, h+ \rho (l)-1) = Q(j, h+ \rho (j)-1)\). Because we assume that \(n_i \succ n_j\), \(Q(i,h+\rho (i)-1) > Q(j,h+\rho (j)-1)\) so that \(Q(k, h + \rho (k)-1) > Q(j, h + \rho (j)-1) = Q(l, h+ \rho (l)-1)\), showing that neither \(n_k \succ n_l\) nor \(n_l \succ n_k\), completing the proof of the Proposition. \(\square\)
Proof of Proposition 2 (IF)
[IF] Because a regular monotone hierarchy is a monotone hierarchy, we know by Proposition 1 that \(i \gtrdot j\) if and only if \(n_i \succ n_j\).
Let \(d(\ell )\) be the degree of nodes at distance \(\ell\) from the root node.
Next we show that the number of geodesic paths of any length d between two nodes is smaller for a node further away from the root. To this end, consider two nodes i and j such that j is a direct successor of i. For any d, if a geodesic path contains j but not i, then i must be an endpoint of the path. Hence, the total number of geodesic paths of length d going through j but not through i is \(2p(j,d-1)\). If \(d_i \ge 3\), pick a direct successor \(k \ne j\) of i, and consider paths of length d connecting successors of k to j. All these paths must go through i and there are \(2 p(k,d-1)= 2p(j,d-1)\) such paths. If \(d_i=2\), then \(d_j \le 2\) so that \(2p(j,d-1)=0\) or \(2p(j,d-1)=2\). If \(2p(j,d-1)=2\), then d is small enough so that there exist at least two paths of length d connecting a node in the network to j through i. Furthermore, if \(d = h- \rho (i,i_0)+1\) where h is the number of levels of the hierarchy, there is no path of length d connecting i to a node through j whereas there exist paths of length d connecting a node to j through i, so that \(I_i \ge I_j\).
Next, we compute the number of walks emanating from two nodes i and j at different levels of the hierarchy. Let \(w_\ell (d)\) denote the number of walks of length d emanating from a node at level \(\ell\). We show that \(w_\ell (d) \ge w_{\ell +1}(d)\). We compute the number of walks recursively:
We also have \(w_\ell (0) = 1\) for all \(\ell\) which allows us to start the recursion.
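For reference, a recursion consistent with the two expressions used at the end of this argument can be written as follows. It is a reconstruction under the assumption that every non-root node at level \(\ell\) has one direct predecessor and \(d(\ell )-1\) direct successors, so the first term vanishes at the leaves, where \(d(\ell )=1\):

$$\begin{aligned} w_0(d)&= d(0)\, w_1(d-1), \\ w_\ell (d)&= [d(\ell )-1]\, w_{\ell +1}(d-1) + w_{\ell -1}(d-1) \quad \text {for } \ell \ge 1. \end{aligned}$$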
Next we prove that \(w_\ell (d) \ge w_{\ell +1}(d)\) for \(i=1,..,I-1\) by induction on d. The statement is trivially true for all \(\ell\) at \(d=0\). Now suppose that the statement is true at \(d-1\). We first show that the inequality holds for all nodes but the root. For \(\ell \ge 1\),
The more difficult step is to show that the statement is also true for the root. To this end, we prove by induction on d that for all \(\ell =1,..,h-1\):
The statement is true at \(d=0\) because \(d(0)\ge d(\ell )\) for all \(\ell \ge 1\). Next compute
By the induction hypothesis,
and
Replacing, we obtain
concluding the inductive argument. Applying this formula for \(\ell =1\), we have \(w_0(d) = d(0) w_1(d-1) \ge [d(1)-1] w_2(d-1) + w_0(d-1) = w_1(d)\), completing the proof that \(w_\ell (d) \ge w_{\ell +1}(d)\) for all d.
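The monotonicity claim can be checked numerically on small examples. The sketch below is illustrative only: the function names are hypothetical, and it assumes the recursion \(w_0(d) = d(0) w_1(d-1)\) and \(w_\ell (d) = [d(\ell )-1] w_{\ell +1}(d-1) + w_{\ell -1}(d-1)\) for \(\ell \ge 1\), which matches the two expressions used in the last step above.

```python
def walk_counts(level_degrees, max_length):
    """Return w, where w[d][l] is the number of walks of length d that start
    at a node of level l in a regular hierarchy.

    level_degrees[l] is the common degree d(l) of nodes at level l
    (level 0 is the root). The recursion is an assumption of this sketch.
    """
    levels = len(level_degrees)
    if levels < 2 or level_degrees[-1] != 1:
        raise ValueError("the last level must consist of leaves (degree 1)")
    w = [[1] * levels]  # w_l(0) = 1 for every level
    for _ in range(max_length):
        prev = w[-1]
        row = [level_degrees[0] * prev[1]]  # root: all neighbours are successors
        for l in range(1, levels):
            # d(l) - 1 successors (none at the leaves) plus one predecessor
            down = (level_degrees[l] - 1) * prev[l + 1] if l + 1 < levels else 0
            row.append(down + prev[l - 1])
        w.append(row)
    return w


def is_nonincreasing_in_level(w):
    """Check w_l(d) >= w_{l+1}(d) for every computed length d and level l."""
    return all(row[l] >= row[l + 1] for row in w for l in range(len(row) - 1))


# Example: root of degree 3, two internal levels of degree 3, then leaves.
counts = walk_counts([3, 3, 3, 1], max_length=3)
print(counts[2], counts[3], is_nonincreasing_in_level(counts))
```

A check of this kind does not replace the induction; it only confirms the inequality on the particular degree sequence supplied.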
[ONLY IF] Consider a leaf i of the tree. Then \(w_i=(0,0\ldots ,0)\). So all leaves have the same centrality based on the intermediary statistic. They must also have the same centrality based on the neighborhood statistic, which implies that the tree is a regular monotone hierarchy. \(\square\)
Appendix: simulations—differences in centrality measures by network type
We simulate networks on 40 nodes. We vary the type of network to have three different basic structures, corresponding to a standard random network, a simple version of a stochastic block model, and a variation of a stochastic block model that includes bridge nodes. The first is an Erdős–Rényi random graph in which all links are formed independently. The second is a network that has some homophily: there are two types of nodes and we connect nodes of the same type with a different probability than nodes of different types. The third is a variant of a homophilous network in which some nodes are ‘bridge nodes’ that connect to other nodes with a uniform probability, thus making them connector nodes between the two homophilous groups. We vary the overall average degrees of the networks to be either 2, 5 or 10. In the homophily and homophily-with-bridge-nodes cases, the relative within-group and across-group link probabilities also vary. Given all of these dimensions, we end up with many different networks on which to compare centrality measures.
We then compare 5 different centrality measures on these networks: degree, decay, closeness, diffusion, and Katz–Bonacich. Decay, diffusion and Katz–Bonacich all depend on a parameter that we call the exponential parameter, and we vary that as well.^{Footnote 35}
The details on the three network types we perform the simulation on are:

ER random graphs: Each possible link is formed independently with probability \(p=\overline{d}/\left( n-1\right)\).

Homophily:
There are two equally sized groups of 20 nodes. Links between pairs of nodes in the same group are formed with probability \(p_{same}\) and between pairs of nodes in different groups are formed with probability \(p_{diff}\), all independently.
Letting \(p_{same}=H\times p_{diff}\), average degree is:
$$\begin{aligned} \overline{d} =\left( \frac{n}{2}-1\right) p_{same} +\frac{n}{2}p_{diff} \end{aligned}$$
Homophily with Bridge Nodes:
There are L bridge nodes and two equal-sized groups of \(\frac{n-L}{2}\) non-bridge nodes. Each bridge node connects to any other node with probability \(p_{b}\). Non-bridge nodes connect to other same-group nodes with probability \(p_{same}\) and to different-group nodes with probability \(p_{diff}\).
Letting \(p_{same}=H\times p_{diff}\), we set \(p_{b} =\overline{d}/\left( n-1\right)\) where \(\overline{d}\) is defined as the average degree
$$\begin{aligned} \overline{d} =\left( \frac{n-L}{2}-1\right) p_{same} +\left( \frac{n-L}{2}\right) p_{diff}+Lp_{b}. \end{aligned}$$
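The three network types can be generated from a single routine once the average-degree equations are solved for the link probabilities. The following is a minimal sketch, not the code used for the reported simulations: the function names, the node ordering (bridge nodes first, then the two groups), and the use of Python's `random` module are all assumptions made for illustration. Setting `H=1` and `L=0` recovers the ER case, since then \(p_{diff}=\overline{d}/(n-1)\).

```python
import random
from itertools import combinations


def link_probabilities(n, avg_degree, H=1.0, L=0):
    """Solve the average-degree equation for (p_same, p_diff, p_b).

    Uses p_same = H * p_diff and, when there are bridge nodes,
    p_b = avg_degree / (n - 1), as in the definitions above.
    """
    if (n - L) % 2 != 0:
        raise ValueError("n - L must be even to form two equal-sized groups")
    p_b = avg_degree / (n - 1) if L > 0 else 0.0
    group = (n - L) / 2
    p_diff = (avg_degree - L * p_b) / ((group - 1) * H + group)
    return H * p_diff, p_diff, p_b


def sample_network(n, avg_degree, H=1.0, L=0, seed=None):
    """Draw one network; returns a dict mapping each node to its neighbour set.

    Nodes 0..L-1 are bridge nodes (an arbitrary labelling chosen here);
    the remaining nodes are split evenly into two groups.
    """
    rng = random.Random(seed)
    p_same, p_diff, p_b = link_probabilities(n, avg_degree, H, L)
    half = (n - L) // 2

    def group_of(i):
        if i < L:
            return None  # bridge node
        return 0 if i < L + half else 1

    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        gi, gj = group_of(i), group_of(j)
        if gi is None or gj is None:
            p = p_b  # any pair involving a bridge node
        elif gi == gj:
            p = p_same
        else:
            p = p_diff
        if rng.random() < p:
            adj[i].add(j)
            adj[j].add(i)
    return adj


# Example: 40 nodes, target average degree 5, H = 3, 4 bridge nodes.
network = sample_network(40, 5, H=3, L=4, seed=0)
```

Realized average degrees will fluctuate around the target from draw to draw, so comparisons across network types are best made over many samples.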
A first thing to note about the simulations (see the tables below) is that the correlation in rankings of the various centrality measures is very high across all of the simulations and measures, often above 0.9, and usually in the 0.8 to 1 range. This is in part reflective of what we have seen from our characterizations: all of these measures operate in a similar manner and are based on nodal statistics that often move in similar ways: nodes with higher degree tend to be closer to other nodes and have more walks to other nodes, and so forth. In terms of differences between measures, closeness and betweenness are more distinguished from the others in terms of correlation, while the other measures all correlate above 0.98 in Table 2.
These extreme correlations are higher than those found in Valente et al. (2008), who also find high correlations, but lower in magnitude, when looking at a series of real data sets.^{Footnote 36} The artificial nature of the Erdős–Rényi networks serves as a benchmark from which we can jump off as it results in less differentiation between nodes than one finds in many real-world networks, but also allows us to know that differences among nodes are coming from random variations. As we add homophily in Table 4, and then bridge nodes in Table 5, we see the correlations drop significantly, especially comparing betweenness centrality to the others, and this then has an intuitive interpretation as bridge nodes naturally have high betweenness centrality, but may not stand out according to other measures.
Correlation is a very crude measure, and it does not capture whether nodes are switching ranking or by how much. Some nodes could have dramatically different rankings and yet the correlation could be relatively high overall. Thus, we also look at how many nodes switch rankings between two measures, as well as the maximal extent to which some node changes rank. There, we see more substantial differences across centrality measures, with most measures being more clearly distinguished from each other.
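The three comparison statistics just described can be computed side by side from two vectors of centrality scores. The sketch below is one possible implementation under stated assumptions: the text does not name the correlation coefficient, so Spearman's rank correlation is used here, and ties are broken by node index, which is an arbitrary choice. The function names are hypothetical.

```python
def ranks(scores):
    """Rank nodes from 0 (most central) downward.

    Ties are broken by node index; this is an assumption of the sketch.
    """
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    result = [0] * len(scores)
    for position, node in enumerate(order):
        result[node] = position
    return result


def compare_rankings(scores_a, scores_b):
    """Return (rank correlation, number of nodes whose rank differs,
    largest rank change of any node) for two centrality measures."""
    ra, rb = ranks(scores_a), ranks(scores_b)
    n = len(ra)
    diffs = [x - y for x, y in zip(ra, rb)]
    switched = sum(1 for d in diffs if d != 0)
    max_shift = max(abs(d) for d in diffs)
    # Spearman's formula for rankings without ties (n >= 2 required)
    spearman = 1 - 6 * sum(d * d for d in diffs) / (n * (n * n - 1))
    return spearman, switched, max_shift


# Four nodes scored by two hypothetical measures.
result = compare_rankings([0.9, 0.5, 0.1, 0.3], [0.8, 0.2, 0.4, 0.1])
print(result)
```

In this toy example three of the four nodes change rank and one node moves two places, which the correlation alone would not reveal.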
As we increase the exponential parameter (e.g., from Table 2 to 3), we see greater differences between the measures, as the correlations drop and we see more changes in the rankings. With a very low exponential parameter, decay, diffusion, and Katz–Bonacich are all very close to degree, while for higher exponential parameters they begin to differentiate themselves. This makes sense as a higher parameter allows the measures to incorporate information that depends on more of the network, and that is less tied to immediate neighborhoods.
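The convergence to degree at low parameter values is easy to see for decay centrality, taken here in its standard form as the sum over other nodes j of \(\delta ^{\text {dist}(i,j)}\). The sketch below is illustrative: the function names are hypothetical and the four-node line network is an arbitrary example. Dividing by \(\delta\) isolates the degree term, and the remainder is of order \(\delta\).

```python
from collections import deque


def distances_from(adj, source):
    """Breadth-first search distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def decay_centrality(adj, delta):
    """Sum of delta ** distance over all other reachable nodes."""
    return {
        i: sum(delta ** d for j, d in distances_from(adj, i).items() if j != i)
        for i in adj
    }


# Line network 0 - 1 - 2 - 3.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
high = decay_centrality(line, 0.5)
low = decay_centrality(line, 0.01)
for i in line:
    print(i, len(line[i]), high[i], low[i] / 0.01)
```

With \(\delta =0.01\) the rescaled scores sit within about 0.01 of the degrees, whereas with \(\delta =0.5\) nodes two and three steps away contribute visibly to the score.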
About this article
Cite this article
Bloch, F., Jackson, M.O. & Tebaldi, P. Centrality measures in networks. Soc Choice Welf 61, 413–453 (2023). https://doi.org/10.1007/s00355-023-01456-4