Generalizing unweighted network measures to capture the focus in interactions

Abdallah, Sherief

doi:10.1007/s13278-011-0018-8

Original Article
Published: 18 February 2011

Volume 1, pages 255–269, (2011)
Social Network Analysis and Mining

Sherief Abdallah^1,2

Abstract

Unweighted network measures are commonly used to analyze real-world networks because they are simple and intuitive. This has motivated the search for generalizations of unweighted network measures that take edge weights into account. We propose a new generalization methodology that captures how focused the interactions over edges are. The less focused the interaction (that is, the more uniform the weights over edges), the closer our generalization is to the original unweighted measure. None of the previously developed generalizations captures this aspect of weighted networks. We analyze several real-world networks using our generalizations of the degree and the clustering coefficient. The analysis shows that our generalizations reveal interesting observations.

Notes

A node’s degree is the number of edges incident to the node, while a node’s strength is the summation of weights incident to the node. Section 2 provides the formal definitions.
We use the generalized α-degree as a representative of the state-of-the-art generalizations (Opsahl et al. 2010). The α values of 0.5 and 1.5 were proposed by the original paper.
Other examples include heterophilicity and dyadicity. We describe these measures in further detail later.
The entropy is a measure often used in information theory to quantify the uncertainty of a set of outcomes/events (Shannon 1948). The higher the uncertainty is (which is equivalent to more uniform weights), the higher the value of the entropy.
Note that the quantity $x\log_2{\frac{1}{x}} \rightarrow 0$ as $x \rightarrow 0$, and that it equals 0 at $x=1$.
Because more than one edge can have the same weight.
Other network measures also quantify the strength of connections within a class (community) of nodes, such as the modularity measure (Newman and Girvan 2004).
Note that, particularly for directed graphs, some researchers argued that a clustering signature would be more suitable in distinguishing networks (Ahnert and Fink 2008). In a clustering signature, seven types of directed triangles are counted separately. The effective cardinality can still be used to replace the discrete counts of these triangles. For the purpose of this paper we focus on the simpler, more widely used definition of the clustering coefficient.
Available through http://www-personal.umich.edu/~mejn/netdata/.
Source code adopted from http://www.santafe.edu/~aaronc/powerlaws/.
The datasets are publicly available at http://netkit-srl.sourceforge.net/data.html. In a university network, a node represents a web page, which has a label indicating its type (personal web page, department, etc.). A link from one node to another (directed) means there is at least one URL link from the first node to the other. The weight on the link represents the number of such URLs. In an industry dataset, a node represents a company, which has a label indicating its type (transportation, technology, etc.). A link between two nodes exists if the two companies appear in the same news article. The weight represents how many articles the two companies appeared in.
The percentage of weights that equal one captures the variation in weights more accurately than the standard deviation, which is sensitive to outliers.
It is worth noting that the Newcomb Fraternity dataset (available through UCINET (Borgatti et al. 2002)) is very similar to the example depicted in Fig. 1. The dataset provides snapshots of a dynamic social network over time, but the network is not weighted (only rankings of neighbors were provided).
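The quantities mentioned in these notes (degree, strength, the generalized α-degree, and the percentage of weights equal to one) can be illustrated with a minimal Python sketch. The toy edge list and all function and variable names are hypothetical, and the α-degree formula $k^{1-\alpha} s^{\alpha}$ is our reading of Opsahl et al. (2010), not something defined in the text above.

```python
from collections import defaultdict

# Hypothetical toy undirected network: (node, node, weight).
edges = [("a", "b", 3.0), ("a", "c", 1.0), ("a", "d", 1.0), ("b", "c", 2.0)]

# Collect the weights of the edges incident to each node.
incident = defaultdict(list)
for u, v, w in edges:
    incident[u].append(w)
    incident[v].append(w)

# Degree: number of incident edges. Strength: sum of incident weights.
degree = {node: len(ws) for node, ws in incident.items()}
strength = {node: sum(ws) for node, ws in incident.items()}


def alpha_degree(node, alpha):
    """Generalized alpha-degree k^(1 - alpha) * s^alpha (assumed form)."""
    return degree[node] ** (1 - alpha) * strength[node] ** alpha


# Fraction of edges whose weight equals one.
share_of_unit_weights = sum(1 for _, _, w in edges if w == 1.0) / len(edges)

print(degree["a"], strength["a"], alpha_degree("a", 0.5), share_of_unit_weights)
```

With α = 0 the sketch returns the degree and with α = 1 it returns the strength, which is why intermediate values such as 0.5 and 1.5 trade off the two.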

References

Adnan M, Nagi M, Kianmehr K, Tahboub R, Ridley M, Rokne J (2011) Promoting where, when and what? An analysis of web logs by integrating data mining and social network techniques to guide ecommerce business promotions. Soc Netw Anal Min 1
Ahnert SE, Fink TMA (2008) Clustering signatures classify directed networks. Phys Rev E 78(3):036112. doi:10.1103/PhysRevE.78.036112
Ahnert SE, Garlaschelli D, Fink TMA, Caldarelli G (2007) Ensemble approach to the analysis of weighted networks. Phys Rev E 76(1). doi:10.1103/PhysRevE.76.016101
Almaas E, Kovacs B, Vicsek T, Oltvai ZN, Barabasi AL (2004) Global organization of metabolic fluxes in the bacterium Escherichia coli. Nature 427:839. http://www.citebase.org/abstract?id=oai:arXiv.org:q-bio/0403001
Barabasi AL, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512. http://view.ncbi.nlm.nih.gov/pubmed/10521342
Barrat A, Barthélemy M, Pastor-Satorras R, Vespignani A (2004) The architecture of complex weighted networks. Proc Natl Acad Sci 101:3747–3752. doi:10.1073/pnas.0400087101
Barthélemy M, Barrat A, Pastor-Satorras R, Vespignani A (2005) Characterization and modeling of weighted networks. Physica A 346:34–43. doi:10.1016/j.physa.2004.08.047
Boccaletti S, Latora V, Moreno Y, Chavez M, Hwang DU (2006) Complex networks: structure and dynamics. Phys Rep 424:175–308. doi:10.1016/j.physrep.2005.10.009
Borgatti MES, Freeman L (2002) UCINET for Windows
Chakrabarti D, Faloutsos C (2006) Graph mining: laws, generators, and algorithms. ACM Comput Surv 38(1):2. http://doi.acm.org/10.1145/1132952.1132954
Chapanond A, Krishnamoorthy MS, Yener B (2005) Graph theoretic and spectral analysis of Enron email data. Comput Math Organ Theory 11(3):265–281. http://dx.doi.org/10.1007/s10588-005-5381-4
Clauset A, Rohilla Shalizi C, Newman MEJ (2009) Power-law distributions in empirical data. SIAM Review 51:661–703
Faloutsos M, Faloutsos P, Faloutsos C (1999) On power-law relationships of the internet topology. Comput Commun Rev 25:251–262
Gallagher B, Eliassi-Rad T (2009) Leveraging label-independent features for classification in sparsely labeled networks: an empirical study. In: Lecture notes in computer science: advances in social network mining and analysis. Springer, New York
Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The weka data mining software: an update. SIGKDD Explor 11(1)
Kalisky T, Sreenivasan S, Braunstein LA, Buldyrev SV, Havlin S, Stanley HE (2006) Scale-free networks emerging from weighted random graphs. Phys Rev E 73(2):025103. doi:10.1103/PhysRevE.73.025103. http://link.aps.org/abstract/PRE/v73/e025103
Leskovec J, Kleinberg J, Faloutsos C (2007) Graph evolution: densification and shrinking diameters. ACM Trans Knowl Discov Data 1(1):2. http://doi.acm.org/10.1145/1217299.1217301
Li M, Wu J, Wang D, Zhou T, Di Z, Fan Y (2007) Evolving model of weighted networks inspired by scientific collaboration networks. Physica A: Stat Mech Appl 375(1):355–364
McGlohon M, Akoglu L, Faloutsos C (2008) Weighted graphs and disconnected components: patterns and a generator. In: SIGKDD. ACM, New York, pp 524–532. http://doi.acm.org/10.1145/1401890.1401955
Newman ME (2001a) Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Phys Rev E 64(1):016132. doi:10.1103/PhysRevE.64.016132
Newman MEJ (2001b) Coauthorship networks and patterns of scientific collaboration. Proc Natl Acad Sci 98:404–409. doi:10.1073/pnas.0307545100
Newman ME (2004) Analysis of weighted networks. Phys Rev E 70(5):056131
Newman MEJ (2006) Finding community structure in networks using the eigenvectors of matrices. Phys Rev E 74:036104
Newman ME, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E 69(2):026113
Opsahl T, Panzarasa P (2009) Clustering in weighted networks. Soc Netw 31(2):155–163. http://dx.doi.org/10.1016/j.socnet.2009.02.002
Opsahl T, Agneessens F, Skvoretz J (2010) Node centrality in weighted networks: generalizing degree and shortest paths. Soc Netw 32(3):245–251
Park J, Barabasi AL (2007) Distribution of node characteristics in complex networks. Proc Natl Acad Sci 104:17916–17920
Raeder T, Chawla NV (2011) Market basket analysis with networks. Soc Netw Anal Min 1
Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27:379–423, 623–656
Tsourakakis CE, Drineas P, Michelakis E, Koutis I, Faloutsos C (2011) Spectral counting of triangles via element-wise sparsification and triangle-based link recommendation. Soc Netw Anal Min 1
Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393:440–442. doi:10.1038/30918

Author information

Authors and Affiliations

University of Edinburgh, Edinburgh, UK
Sherief Abdallah
British University in Dubai, Dubai, UAE
Sherief Abdallah

Corresponding author

Correspondence to Sherief Abdallah.

Additional information

An earlier version of this paper was presented at the International Workshop on Social Network Analysis (SNAKDD) 2009.

Appendix: Proofs of effective cardinality properties

Theorem 1

The effective cardinality satisfies the three properties described above: the maximum cardinality, the minimum cardinality, and the consistent partial ordering.

Proof

The proof follows from the following three lemmas. □

Lemma 1

The effective cardinality satisfies the maximum cardinality property.

Proof

When all the weights are equal to a constant C we have

$$ \forall e \in E^{\prime}: {\frac{w(e)}{\sum_{o\in E^{\prime}}w(o)}}={\frac{C}{C|E^{\prime}|}}={\frac{1}{|E^{\prime}|}} $$

We then have

$$ \begin{aligned} c(E^{\prime}) &=2^{\sum_{e\in E'}{\frac{1}{|E'|}}\log_2(|E^{\prime}|)}\\ &=2^{\log_2(|E^{\prime}|)} \\ &=|E^{\prime}| \end{aligned} $$

In other words, both the cardinality and the effective cardinality of a weighted set of edges become equivalent when the weights are uniform. The effective cardinality is also maximum in this case, because the exponent is the entropy of the weight probability distribution, which is maximum when weights are uniform over edges. □
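The computation in this proof can be sketched in a few lines of Python. The formula (two raised to the entropy of the normalized weights) is read off the equations above. The function name, the sample weights, and the choice to return zero when every weight is zero are assumptions made for illustration.

```python
import math


def effective_cardinality(weights):
    """Return 2 ** H, where H is the base-2 entropy of the normalized weights."""
    # Zero weights contribute nothing, since x * log2(1/x) -> 0 as x -> 0.
    positive = [w for w in weights if w > 0]
    total = sum(positive)
    if total == 0:
        # Empty set gives 0 by definition; all-zero weights are an assumed case.
        return 0.0
    entropy = sum((w / total) * math.log2(total / w) for w in positive)
    return 2.0 ** entropy


uniform = effective_cardinality([3.0, 3.0, 3.0, 3.0])  # all weights equal to C = 3
skewed = effective_cardinality([9.0, 1.0, 1.0, 1.0])   # same edge count, focused weights

print(uniform, skewed)
```

For the uniform weights the result coincides with the plain cardinality of four edges, while the skewed weights give a strictly smaller value.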

Lemma 2

The effective cardinality satisfies the minimum cardinality property.

Proof

When the set of edges is empty, the effective cardinality is zero by definition. When all weights are zero except for a single weight that is greater than zero, the weight probability distribution is deterministic and its entropy is zero; therefore, the effective cardinality is 1. □
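As a small worked instance of the second case, take three edges with the hypothetical weights 5, 0, and 0. The normalized weights are 1, 0, and 0, and, using the convention that $x\log_2{\frac{1}{x}}$ is taken as 0 at $x=0$, the entropy and the effective cardinality are

$$ \begin{aligned} H(E^{\prime}) &= 1\cdot\log_2(1) + 0 + 0 = 0\\ c(E^{\prime}) &= 2^{H(E^{\prime})} = 2^{0} = 1 \end{aligned} $$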

Lemma 3

The effective cardinality satisfies the consistent partial ordering property.

Proof

Let $E^{\prime}_1$ and $E^{\prime}_2$ be two edge sets with the same cardinality, $|E^{\prime}_1|=|E^{\prime}_2|=n$. Let $W_1$ and $W_2$ be the corresponding sets of weights, where $\sum_{e_1 \in E^{\prime}_1} w(e_1)=\sum_{e_2 \in E^{\prime}_2} w(e_2) = S$ (the total weights are equal). Furthermore, let $|W_1 \cap W_2|=n-2$, $\{w_{11},w_{12}\} = W_1 - W_2$, and $\{w_{21},w_{22}\} = W_2 - W_1$, where ‘−’ is the set difference operator (the two sets share the same weights except for two elements in each set), and $|w_{11}-w_{12}| < |w_{21}-w_{22}|$ (the weights of $W_1$ are more uniform than the weights of $W_2$). To prove that the effective cardinality satisfies the consistent partial ordering property, we need to prove that $c(E^{\prime}_1)>c(E^{\prime}_2)$.

Without loss of generality, we can assume that $w_{11} \ge w_{12}$ and $w_{21} \ge w_{22}$; therefore, $w_{11}-w_{12} < w_{21}-w_{22}$. We then have

$$ w_{11}+w_{12}=S - \sum_{w \in W_1 \cap W_2}w = w_{21}+w_{22} $$

or

$$ {\frac{w_{11}+w_{12}}{S}}=1 - \sum_{w \in W_1 \cap W_2}{\frac{w}{S}} = {\frac{w_{21}+w_{22}}{S}} = L $$

therefore

$$ L \ge {\frac{w_{21}}{S}} > {\frac{w_{11}}{S}} \ge {\frac{L}{2}} \ge L-{\frac{w_{11}}{S}} > L-{\frac{w_{21}}{S}} $$

where $\frac{w_{12}}{S} = L-\frac{w_{11}}{S}$ and $\frac{w_{22}}{S} = L-\frac{w_{21}}{S}$. By Lemma 4, $h(L,x)$ is maximized at $x=\frac{L}{2}$, and its derivative $\lg(L-x)-\lg x$ is negative for $x > \frac{L}{2}$, so $h(L,x)$ decreases as $x$ moves from $\frac{L}{2}$ toward $L$. We therefore have $h(L,\frac{w_{11}}{S}) > h(L,\frac{w_{21}}{S})$, or

$$ -\frac{w_{11}}{S}\lg\left(\frac{w_{11}}{S}\right) - \left(L-\frac{w_{11}}{S}\right)\lg\left(L-\frac{w_{11}}{S}\right) > -\frac{w_{21}}{S}\lg\left(\frac{w_{21}}{S}\right) - \left(L-\frac{w_{21}}{S}\right)\lg\left(L-\frac{w_{21}}{S}\right) $$

Therefore $H(E^{\prime}_1) > H(E^{\prime}_2)$, because the rest of the entropy terms (corresponding to $W_1 \cap W_2$) are equal, and consequently $c(E^{\prime}_1)>c(E^{\prime}_2)$. □
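A numerical spot check of this ordering, written as a minimal Python sketch, may help make the statement concrete. The two weight lists are hypothetical: they have the same size and the same total, share one weight, and differ only in how evenly the remaining two weights are split. The helper function is an illustrative implementation of two raised to the entropy of the normalized weights.

```python
import math


def effective_cardinality(weights):
    """Return 2 ** H, where H is the base-2 entropy of the normalized weights."""
    positive = [w for w in weights if w > 0]
    total = sum(positive)
    if total == 0:
        return 0.0  # empty set; all-zero weights are treated the same way (assumption)
    entropy = sum((w / total) * math.log2(total / w) for w in positive)
    return 2.0 ** entropy


# Shared weight: 2. Differing pairs: (5, 3) versus (7, 1), both summing to 8.
w1 = [5.0, 3.0, 2.0]  # more uniform pair, |5 - 3| = 2
w2 = [7.0, 1.0, 2.0]  # less uniform pair, |7 - 1| = 6

c1 = effective_cardinality(w1)
c2 = effective_cardinality(w2)

print(c1, c2, c1 > c2)
```

A single example does not replace the proof, but it shows the direction of the inequality: the more uniform set has the larger effective cardinality.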

Lemma 4

The quantity $h(C, x) = -x \lg(x) - (C - x) \lg(C - x)$ is symmetric around, and maximized at, $x={\frac{C}{2}}$ for $C \ge x \ge 0$.

Proof

$$ h\left(C,{\frac{C}{2}}+\delta\right)=-\left({\frac{C}{2}}+\delta\right)\lg\left({\frac{C} {2}}+\delta\right) - \left({\frac{C}{2}}-\delta\right) \lg \left({\frac{C}{2}}-\delta\right) = h\left(C,{\frac{C}{2}}-\delta\right) $$

Therefore h(C, x) is symmetric around C/2. Furthermore, h(C, x) is maximized when

$$ {\frac{\partial h(C,x)}{\partial x}} = -\lg x - {\frac{1}{\ln 2}} + \lg(C-x) + {\frac{1}{\ln 2}} = 0 $$

or

$$ \lg x = \lg(C-x) $$

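Therefore the stationary point is at $x=C-x$, that is, $x={\frac{C}{2}}$. It is a maximum because the second derivative, $-{\frac{1}{\ln 2}}\left({\frac{1}{x}}+{\frac{1}{C-x}}\right)$, is negative for $0 < x < C$. □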
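The two claims of this lemma can also be checked numerically. The sketch below is an illustration rather than part of the proof: the value C = 0.8, the grid resolution, and the function names are arbitrary choices, and $\lg$ is taken to be the base-2 logarithm.

```python
import math


def h(c, x):
    """h(C, x) = -x lg(x) - (C - x) lg(C - x), with 0 * lg(0) taken as 0."""
    def term(v):
        # Limit value at 0; the <= also guards against tiny negative rounding.
        return 0.0 if v <= 0.0 else -v * math.log2(v)
    return term(x) + term(c - x)


C = 0.8
grid = [C * i / 200 for i in range(201)]  # evenly spaced points in [0, C]

# Location of the largest value of h on the grid.
peak = max(grid, key=lambda x: h(C, x))

# Symmetry: h(C, C/2 + d) should equal h(C, C/2 - d).
symmetric = all(
    math.isclose(h(C, C / 2 + d), h(C, C / 2 - d)) for d in (0.05, 0.2, 0.4)
)

print(peak, symmetric)
```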

About this article

Cite this article

Abdallah, S. Generalizing unweighted network measures to capture the focus in interactions. Soc. Netw. Anal. Min. 1, 255–269 (2011). https://doi.org/10.1007/s13278-011-0018-8

Received: 30 October 2010
Accepted: 25 January 2011
Published: 18 February 2011
Issue Date: November 2011
DOI: https://doi.org/10.1007/s13278-011-0018-8
