Abstract
Unweighted network measures are commonly used to analyze real-world networks because they are simple and intuitive. This has motivated the search for generalizations of unweighted network measures that take edge weights into account. We propose a new generalization methodology that captures how focused the interactions over edges are. The less focused the interactions (that is, the more uniform the weights over edges), the closer our generalization is to the original unweighted measure. None of the previously developed generalizations captures this aspect of weighted networks. We analyze several real-world networks using our generalizations of the degree and the clustering coefficient. The analysis shows that our generalizations reveal interesting observations.
Notes
A node’s degree is the number of edges incident to the node, while a node’s strength is the sum of the weights of the edges incident to the node. Section 2 provides the formal definitions.
We use the generalized α-degree as a representative of the state-of-the-art generalizations (Opsahl et al. 2010). The α values of 0.5 and 1.5 were proposed by the original paper.
Other examples include heterophilicity and dyadicity. We describe these measures in further detail later.
Entropy is a measure often used in information theory to quantify the uncertainty of a set of outcomes or events (Shannon 1948). The higher the uncertainty (which here corresponds to more uniform weights), the higher the entropy.
Note that the quantity \(x\log_2{\frac{1}{x}} \rightarrow 0\) as \(x \rightarrow 0\), and that it equals 0 at \(x=1\).
This is because more than one edge can have the same weight.
There are other network measures that also quantify the strength of connections within a class (community) of nodes, such as the modularity measure (Newman and Girvan 2004).
Note that, particularly for directed graphs, some researchers have argued that a clustering signature is more suitable for distinguishing networks (Ahnert and Fink 2008). In a clustering signature, seven types of directed triangles are counted separately. The effective cardinality can still be used to replace the discrete counts of these triangles. For the purpose of this paper we focus on the simpler, more widely used definition of the clustering coefficient.
Available through http://www-personal.umich.edu/~mejn/netdata/.
Source code adapted from http://www.santafe.edu/~aaronc/powerlaws/.
The datasets are publicly available at http://netkit-srl.sourceforge.net/data.html. In a university network, a node represents a web page, which has a label indicating its type (personal web page, department, etc.). A link from one node to another (directed) means there is at least one URL link from the first node to the other. The weight on the link represents the number of such URLs. In an industry dataset, a node represents a company, which has a label indicating its type (transportation, technology, etc.). A link between two nodes exists if the two companies appear in the same news article. The weight represents how many articles the two companies appeared in.
The percentage of weights that equal one captures the variation in weights more accurately than the standard deviation, which is sensitive to outliers.
It is worth noting that the Newcomb Fraternity dataset (available through UCINET (Borgatti et al. 2002)) is very similar to the example depicted in Fig. 1. The dataset provides snapshots of a dynamic social network over time, but the network is not weighted (only rankings of neighbors are provided).
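The degree, strength, and generalized α-degree mentioned in the notes above can be illustrated with a minimal Python sketch. The formula \(k^{1-\alpha} s^{\alpha}\) for the α-degree is recalled from Opsahl et al. (2010) and is not stated in this text, and the two weight lists are made-up toy data. The sketch suggests why these measures cannot tell a focused node from a uniform one when both have the same degree and strength.

```python
def degree(weights):
    """Number of incident edges that carry a positive weight."""
    return sum(1 for w in weights if w > 0)


def strength(weights):
    """Sum of the weights of the incident edges."""
    return sum(weights)


def alpha_degree(weights, alpha):
    """Generalized alpha-degree, assumed here to be k**(1 - alpha) * s**alpha."""
    k = degree(weights)
    s = strength(weights)
    if k == 0:
        return 0.0
    return k ** (1 - alpha) * s ** alpha


# Hypothetical toy nodes: same degree and strength, very different focus.
focused = [97.0, 1.0, 1.0, 1.0]
uniform = [25.0, 25.0, 25.0, 25.0]

for name, w in (("focused", focused), ("uniform", uniform)):
    print(name, degree(w), strength(w),
          alpha_degree(w, 0.5), alpha_degree(w, 1.5))
```

Both nodes receive identical values for all three measures, which is the gap the proposed generalization is meant to address.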
References
Adnan M, Nagi M, Kianmehr K, Tahboub R, Ridley M, Rokne J (2011) Promoting where, when and what? An analysis of web logs by integrating data mining and social network techniques to guide ecommerce business promotions. Soc Netw Anal Min 1
Ahnert SE, Fink TMA (2008) Clustering signatures classify directed networks. Phys Rev E 78(3):036112. doi:10.1103/PhysRevE.78.036112
Ahnert SE, Garlaschelli D, Fink TMA, Caldarelli G (2007) Ensemble approach to the analysis of weighted networks. Phys Rev E 76(1):016101. doi:10.1103/PhysRevE.76.016101
Almaas E, Kovacs B, Vicsek T, Oltvai ZN, Barabasi AL (2004) Global organization of metabolic fluxes in the bacterium Escherichia coli. Nature 427:839. http://www.citebase.org/abstract?id=oai:arXiv.org:q-bio/0403001
Barabasi AL, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512. http://view.ncbi.nlm.nih.gov/pubmed/10521342
Barrat A, Barthélemy M, Pastor-Satorras R, Vespignani A (2004) The architecture of complex weighted networks. Proc Natl Acad Sci 101:3747–3752. doi:10.1073/pnas.0400087101
Barthélemy M, Barrat A, Pastor-Satorras R, Vespignani A (2005) Characterization and modeling of weighted networks. Physica A 346:34–43. doi:10.1016/j.physa.2004.08.047
Boccaletti S, Latora V, Moreno Y, Chavez M, Hwang DU (2006) Complex networks: structure and dynamics. Phys Rep 424:175–308. doi:10.1016/j.physrep.2005.10.009
Borgatti SP, Everett MG, Freeman LC (2002) UCINET for Windows
Chakrabarti D, Faloutsos C (2006) Graph mining: laws, generators, and algorithms. ACM Comput Surv 38(1):2. http://doi.acm.org/10.1145/1132952.1132954
Chapanond A, Krishnamoorthy MS, Yener B (2005) Graph theoretic and spectral analysis of enron email data. Comput Math Organ Theory 11(3):265–281. http://dx.doi.org/10.1007/s10588-005-5381-4
Clauset A, Shalizi CR, Newman MEJ (2009) Power-law distributions in empirical data. SIAM Rev 51:661–703
Faloutsos M, Faloutsos P, Faloutsos C (1999) On power-law relationships of the internet topology. Comput Commun Rev 29:251–262
Gallagher B, Eliassi-Rad T (2009) Leveraging label-independent features for classification in sparsely labeled networks: an empirical study. In: Lecture notes in computer science: advances in social network mining and analysis. Springer, New York
Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The weka data mining software: an update. SIGKDD Explor 11(1)
Kalisky T, Sreenivasan S, Braunstein LA, Buldyrev SV, Havlin S, Stanley HE (2006) Scale-free networks emerging from weighted random graphs. Phys Rev E 73(2):025103. doi:10.1103/PhysRevE.73.025103. http://link.aps.org/abstract/PRE/v73/e025103
Leskovec J, Kleinberg J, Faloutsos C (2007) Graph evolution: densification and shrinking diameters. ACM Trans Knowl Discov Data 1(1):2. http://doi.acm.org/10.1145/1217299.1217301
Li M, Wu J, Wang D, Zhou T, Di Z, Fan Y (2007) Evolving model of weighted networks inspired by scientific collaboration networks. Physica A: Stat Mech Appl 375(1):355–364
McGlohon M, Akoglu L, Faloutsos C (2008) Weighted graphs and disconnected components: patterns and a generator. In: SIGKDD. ACM, New York, pp 524–532. http://doi.acm.org/10.1145/1401890.1401955
Newman ME (2001a) Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Phys Rev E 64(1):016132. doi:10.1103/PhysRevE.64.016132
Newman MEJ (2001b) Coauthorship networks and patterns of scientific collaboration. Proc Natl Acad Sci 98:404–409. doi:10.1073/pnas.0307545100
Newman ME (2004) Analysis of weighted networks. Phys Rev E 70(5):056131
Newman MEJ (2006) Finding community structure in networks using the eigenvectors of matrices. Phys Rev E 74:036104
Newman ME, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E 69(2):026113
Opsahl T, Panzarasa P (2009) Clustering in weighted networks. Soc Netw 31(2):155–163. http://dx.doi.org/10.1016/j.socnet.2009.02.002
Opsahl T, Agneessens F, Skvoretz J (2010) Node centrality in weighted networks: generalizing degree and shortest paths. Soc Netw 32(3):245–251
Park J, Barabasi AL (2007) Distribution of node characteristics in complex networks. Proc Natl Acad Sci 104:17916–17920
Raeder T, Chawla NV (2011) Market basket analysis with networks. Soc Netw Anal Min 1
Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27:379–423, 623–656
Tsourakakis CE, Drineas P, Michelakis E, Koutis I, Faloutsos C (2011) Spectral counting of triangles via element-wise sparsification and triangle-based link recommendation. Soc Netw Anal Min 1
Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393:440–442. doi:10.1038/30918
Additional information
An earlier version of this paper was presented at the International Workshop on Social Network Analysis (SNAKDD) 2009.
Appendix: Proofs of effective cardinality properties
Theorem 1
The effective cardinality satisfies the three properties described above: the maximum cardinality, the minimum cardinality, and the consistent partial ordering.
Proof
The proof follows from the following three lemmas. □
Lemma 1
The effective cardinality satisfies the maximum cardinality property.
Proof
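When all the weights are equal to a constant \(C\) we have
\[\frac{w(e)}{\sum_{o \in E^{\prime}} w(o)} = \frac{C}{|E^{\prime}|\,C} = \frac{1}{|E^{\prime}|} \quad \hbox{for every } e \in E^{\prime}.\]
We then have
\[c(E^{\prime}) = 2^{H(E^{\prime})} = 2^{\sum_{e \in E^{\prime}} \frac{1}{|E^{\prime}|}\lg |E^{\prime}|} = 2^{\lg |E^{\prime}|} = |E^{\prime}|.\]
In other words, the cardinality and the effective cardinality of a weighted set of edges coincide when the weights are uniform. The effective cardinality is also maximum in this case, because the exponent is the entropy of the weight probability distribution, which is maximum when the weights are uniform over edges. □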
Lemma 2
The effective cardinality satisfies the minimum cardinality property.
Proof
When the set of edges is empty, the effective cardinality is zero by definition. When all weights are zero except for one weight that is greater than zero, the weight probability distribution is deterministic and the entropy is zero; therefore, the effective cardinality is 1. □
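Lemmas 1 and 2 can be checked numerically with a minimal Python sketch. It assumes the effective cardinality is \(c(E^{\prime}) = 2^{H(E^{\prime})}\), with \(H\) the base-2 entropy of the normalized weights, which is how the proofs above use it. The function name `effective_cardinality` and the sample weights are illustrative and not taken from the paper.

```python
import math


def effective_cardinality(weights):
    """Return 2**H, where H is the base-2 entropy of the normalized weights.

    Zero weights are skipped because x * log2(1/x) tends to 0 as x tends to 0.
    An empty (or all-zero) set is assigned 0, following the definition used
    in the proof of Lemma 2.
    """
    positive = [w for w in weights if w > 0]
    if not positive:
        return 0.0
    total = sum(positive)
    entropy = sum((w / total) * math.log2(total / w) for w in positive)
    return 2.0 ** entropy


print(effective_cardinality([3.0, 3.0, 3.0, 3.0]))  # uniform weights
print(effective_cardinality([0.0, 7.0, 0.0]))       # one non-zero weight
print(effective_cardinality([]))                    # empty edge set
```

Up to floating-point rounding, the three calls should return the cardinality of the set (4), then 1, then 0, matching the maximum and minimum cardinality properties.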
Lemma 3
The effective cardinality satisfies the consistent partial order property.
Proof
Let \(E^{\prime}_1\) and \(E^{\prime}_2\) be two edge sets such that \(|E^{\prime}_1|=|E^{\prime}_2|=n\) (both have the same cardinality). Let \(W_1\) and \(W_2\) be the corresponding sets of weights, where \(\sum_{e_1 \in E^{\prime}_1} w(e_1)=\sum_{e_2 \in E^{\prime}_2} w(e_2) = S\) (the total weights are equal). Furthermore, let \(|W_1 \cap W_2|=n-2\), \(\{w_{11},w_{12}\} = W_1 - W_2\), and \(\{w_{21},w_{22}\} = W_2 - W_1\), where ‘−’ is the set difference operator (the two sets share the same weights except for two elements in each set), and \(|w_{11}-w_{12}| < |w_{21}-w_{22}|\) (the weights of \(W_1\) are more uniform than the weights of \(W_2\)). To prove that the effective cardinality satisfies the consistent partial ordering property, we need to prove that \(c(E^{\prime}_1)>c(E^{\prime}_2)\).
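Without loss of generality, we can assume that \(w_{11} \ge w_{12}\) and \(w_{21} \ge w_{22};\) therefore, \(w_{11}-w_{12} < w_{21}-w_{22}\). Because the two sets have the same total \(S\) and share all of their other weights, we then have
\[w_{11} + w_{12} = w_{21} + w_{22} = LS\]
for some constant \(0 < L \le 1\), or
\[w_{11} - (LS - w_{11}) < w_{21} - (LS - w_{21}),\]
therefore
\[\frac{L}{2} \le \frac{w_{11}}{S} < \frac{w_{21}}{S} \le L,\]
where \(\frac{w_{12}}{S} = L-\frac{w_{11}}{S}\) and \(\frac{w_{22}}{S} = L-\frac{w_{21}}{S}\). Then from Lemma 4, and because \(h(L,x)\) is concave in \(x\) and therefore decreasing for \(x \ge \frac{L}{2}\), we have \(h(L,\frac{w_{11}}{S}) > h(L,\frac{w_{21}}{S})\), or
\[-\frac{w_{11}}{S}\lg\frac{w_{11}}{S} - \frac{w_{12}}{S}\lg\frac{w_{12}}{S} > -\frac{w_{21}}{S}\lg\frac{w_{21}}{S} - \frac{w_{22}}{S}\lg\frac{w_{22}}{S}.\]
Therefore \(H(E^{\prime}_1) > H(E^{\prime}_2)\), because the rest of the entropy terms (corresponding to \(W_1 \cap W_2\)) are equal, and consequently \(c(E^{\prime}_1)>c(E^{\prime}_2)\). □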
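A small numeric instance of Lemma 3 may help, again assuming \(c(E^{\prime}) = 2^{H(E^{\prime})}\). The weights below are made up for illustration: the two sets share the weights 2 and 5, have the same total, and differ only in one pair, which is closer to uniform in the first set.

```python
import math


def effective_cardinality(weights):
    """Return 2**H for the normalized weights (0 for an empty set)."""
    positive = [w for w in weights if w > 0]
    if not positive:
        return 0.0
    total = sum(positive)
    entropy = sum((w / total) * math.log2(total / w) for w in positive)
    return 2.0 ** entropy


shared = [2.0, 5.0]        # weights common to both sets
w1 = shared + [4.0, 3.0]   # |4 - 3| = 1: the more uniform pair
w2 = shared + [6.0, 1.0]   # |6 - 1| = 5: the more focused pair

# Same cardinality and same total weight, as the lemma requires.
assert len(w1) == len(w2) and sum(w1) == sum(w2)

c1 = effective_cardinality(w1)
c2 = effective_cardinality(w2)
print(c1, c2, c1 > c2)
```

Under these assumptions the more uniform set should receive the larger effective cardinality, and both values stay below the plain cardinality of 4.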
Lemma 4
The quantity \(h(C, x) = -x\lg(x) - (C-x)\lg(C-x)\) is symmetric around and maximized at \(x={\frac{C}{2}}\) for \(C \ge x \ge 0\).
Proof
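Substituting \(C-x\) for \(x\) gives
\[h(C, C-x) = -(C-x)\lg(C-x) - x\lg(x) = h(C, x).\]
Therefore \(h(C, x)\) is symmetric around \(\frac{C}{2}\). Furthermore, \(h(C, x)\) is maximized when
\[\frac{\partial h(C,x)}{\partial x} = -\lg(x) + \lg(C-x) = 0,\]
or
\[\lg(x) = \lg(C-x).\]
The second derivative is negative for \(0 < x < C\), so this stationary point is a maximum. Therefore \(h(C, x)\) is maximized at \(x=C-x = {\frac{C}{2}}\). □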
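The claim of Lemma 4 can also be checked on a grid with a minimal Python sketch. The value \(C = 0.5\) and the grid resolution are arbitrary choices made for illustration, and the convention \(0 \lg 0 = 0\) is assumed at the endpoints.

```python
import math


def h(C, x):
    """h(C, x) = -x lg(x) - (C - x) lg(C - x), with 0 * lg(0) taken as 0."""
    def term(t):
        return 0.0 if t <= 0 else -t * math.log2(t)
    return term(x) + term(C - x)


C = 0.5
grid = [C * i / 100 for i in range(101)]  # 101 points covering [0, C]
values = [h(C, x) for x in grid]

# Grid point with the largest value of h.
best = grid[values.index(max(values))]

# Symmetry around C/2: h(C, x) should equal h(C, C - x) at every grid point.
symmetric = all(
    math.isclose(h(C, x), h(C, C - x), abs_tol=1e-12) for x in grid
)
print(best, symmetric)
```

On this grid the maximizer should coincide with \(\frac{C}{2}\) and the symmetry check should hold at every point.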
Cite this article
Abdallah, S. Generalizing unweighted network measures to capture the focus in interactions. Soc. Netw. Anal. Min. 1, 255–269 (2011). https://doi.org/10.1007/s13278-011-0018-8
DOI: https://doi.org/10.1007/s13278-011-0018-8