Overlaying Social Networks of Different Perspectives for Inter-network Community Evolution

Sarr, Idrissa; Ndong, Joseph; Missaoui, Rokia

doi:10.1007/978-3-319-12188-8_3

Part of the book series: Lecture Notes in Social Networks (LNSN)

Abstract

In many real-life social networks, a group of individuals may be involved in several kinds of activities, such as professional, leisure and friendship ones. Even though individuals may belong to a social network with a very precise type of link, such as professional ties in LinkedIn, their interactions in other social networks such as Facebook are not reflected in the original network. We believe that overlaying networks with various types of links helps discover interesting patterns. The objective of this paper is therefore to overlay two or more social networks with different kinds of social activities in order to unveil homogeneous groups that could not appear in a single social network. To that end, we propose a community detection approach based on possibility theory, which identifies time-based perspective communities for each kind of social activity occurring within a sequence of time windows. Furthermore, different perspectives are layered to detect communities that may belong to several networks in a given time period. Communities discovered in a given network for a time period can be perceived as views or perspectives in one or more networks.

References

1. Backstrom L, Huttenlocher D, Kleinberg J, Lan X (2006) Group formation in large social networks: membership, growth, and evolution. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining, KDD’06, pp 44–54
2. Bekkerman R, McCallum A (2005) Disambiguating web appearances of people in a social network. In: WWW, pp 463–470
3. Bródka P, Saganowski S, Kazienko P (2012) GED: the method for group evolution discovery in social networks. Soc Netw Anal Min 3(1):1–14
4. Crandall D, Cosley D, Huttenlocher D, Kleinberg J, Suri S (2008) Feedback effects between similarity and social influence in online communities. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining, KDD’08. ACM, pp 160–168
5. Dubois D, Prade H, Sandri S (1991) On possibility/probability transformations. In: Proceedings of the fourth international fuzzy systems association world congress (IFSA’91), Brussels, Belgium, pp 50–53
6. Fortunato S (2010) Community detection in graphs. Phys Rep 486(3–5):75–174
7. Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci USA 99(12):7821–7826
8. Goldberg MK, Magdon-Ismail M, Thompson J (2012) Identifying long lived social communities using structural properties. In: ASONAM, pp 647–653
9. Goodman LA (1965) On simultaneous confidence intervals for multinomial proportions. Technometrics 7(2):247–254
10. Karnstedt M, Hennessy T, Chan J, Hayes C (2010) Churn in social networks: a discussion boards case study. In: Proceedings of the 2010 IEEE second international conference on social computing, SOCIALCOM’10. IEEE Computer Society, pp 233–240
11. Kashoob S, Caverlee J (2012) Temporal dynamics of communities in social bookmarking systems. Soc Netw Anal Min 2(4):387–404
12. Kautz HA, Selman B, Shah MA (1997) The hidden web. AI Mag 18(2):27–36
13. Kautz HA, Selman B, Shah MA (1997) Referral web: combining social networks and collaborative filtering. Commun ACM 40(3):63–65
14. Lakkaraju H, McAuley J, Leskovec J (2013) What’s in a name? Understanding the interplay between titles, content, and communities in social media. In: Seventh international AAAI conference on weblogs and social media. AAAI Publications
15. Lancichinetti A, Fortunato S, Kertész J (2009) Detecting the overlapping and hierarchical community structure in complex networks. New J Phys 11(3):033015
16. Leskovec J, Kleinberg J, Faloutsos C (2007) Graph evolution: densification and shrinking diameters. ACM Trans Knowl Discov Data 1(1):2
17. Leskovec J, Huttenlocher D, Kleinberg J (2010) Predicting positive and negative links in online social networks. In: Proceedings of the 19th international conference on world wide web, WWW’10, pp 641–650
18. Marsden PV (2005) Recent developments in network measurement. In: Carrington PJ, Scott J, Wasserman S (eds) Models and methods in social network analysis. Cambridge University Press, New York, pp 8–30
19. Masson MH, Denoeux T (2006) Inferring a possibility distribution from empirical data. In: Proceedings of fuzzy sets and systems, pp 319–340
20. Matsuo Y, Mori J, Hamasaki M, Ishida K, Nishimura T, Takeda H, Hasida K, Ishizuka M (2006) Polyphonet: an advanced social network extraction system from the web. In: Proceedings of the 15th international conference on world wide web, WWW’06. ACM, pp 397–406
21. Newman MEJ (2004) Detecting community structure in networks. Eur Phys J B-Condens Matter Complex Syst 38(2):321–330
22. Newman MEJ (2004) Fast algorithm for detecting community structure in networks. Phys Rev E 69(6):066133
23. Palla G, Barabási AL, Vicsek T (2007) Quantifying social group evolution. Nature 446:664–667
24. Scott J (1991) Social network analysis: a handbook. Sage, London
25. Sun Y, Han J (2012) Mining heterogeneous information networks: principles and methodologies. Synthesis lectures on data mining and knowledge discovery. Morgan and Claypool Publishers, San Rafael
26. Tantipathananandh C, Wolf TB, Kempe D (2007) A framework for community identification in dynamic social networks. In: Proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining, KDD’07. ACM, pp 717–726
27. Toivonen R, Kovanen L, Kivelä M, Onnela JP, Saramäki J, Kaski K (2009) A comparative study of social network models: network evolution models and nodal attribute models. Soc Netw 31(4):240–254
28. Wei F, Qian W, Wang C, Zhou A (2009) Detecting overlapping community structures in networks. World Wide Web 12:235–261
29. Zadeh LA (1978) Fuzzy sets as a basis for a theory of possibility. In: Fuzzy sets and systems, pp 3–28
30. Zadeh LA (1965) Fuzzy sets. Inf Control 8:338–353

Acknowledgments

The third author acknowledges the financial support of the Natural Sciences and Engineering Research Council of Canada (NSERC).

Author information

Authors and Affiliations

Université Cheikh Anta Diop, Avenue Cheikh Anta Diop, BP 5005, Fann Dakar, Senegal
Idrissa Sarr & Joseph Ndong
Université du Québec en Outaouais, Québec, Canada
Rokia Missaoui

Corresponding author

Correspondence to Idrissa Sarr.

Editor information

Editors and Affiliations

Département d'Informatique et Ingénierie, Université du Québec en Outaouais, Gatineau, Québec, Canada
Rokia Missaoui
Département de Mathématiques et Informatique, Université Cheikh Anta Diop, Dakar, Senegal
Idrissa Sarr

Appendices

Appendix A: Inferring Possibility Distribution from Probability Distribution

A consistency principle between probability and possibility can be stated in a non-formal way [29]: “what is probable should be possible”. This requirement can then be translated via the inequality:

$$\begin{aligned} P(A)\le \varPi (A) \quad \forall A \subseteq \varOmega \end{aligned}$$

(7)

where $P$ and $\varPi $ are, respectively, a probability and a possibility measure on the domain $\varOmega $. In this case, $\varPi $ is considered as dominating $P$.

Transforming a probability measure into a possibilistic one then amounts to choosing a possibility measure in the set of possibility measures dominating $P$. This choice should be made under a strong order preservation constraint, which ensures that the shape of the distribution is preserved:

$$\begin{aligned} p_i < p_j \Leftrightarrow \pi _i<\pi _j \quad \forall i,j \in \lbrace 1,\ldots ,q\rbrace , \end{aligned}$$

(8)

where $p_i=P(\lbrace E_{\omega _i} \rbrace )$ and $\pi _i=\varPi (\lbrace E_{\omega _i} \rbrace )$, $\forall i \in \lbrace 1,\ldots ,q\rbrace $. It is possible to search for the most specific possibility distribution verifying (7) and (8). The solution of this problem exists, is unique and can be described as follows. One can define a strict partial order $\mathsf P $ on $\varOmega $ represented by a set of compatible linear extensions $\varLambda (\mathsf P )=\lbrace l_u, u=1,L\rbrace $. To each possible linear order $l_u$, one can associate a permutation $\upsigma _u$ of the set $\lbrace 1,\ldots ,q \rbrace $ such that:

$$\begin{aligned} {\upsigma _{u}}(i)< \upsigma _{u}(j) \Leftrightarrow (\omega _{\upsigma _{u}(i)},\omega _{\upsigma _{u}(j)}) \in l_u, \end{aligned}$$

(9)

The most specific possibility distribution, compatible with the probability distribution $p = (p_1, p_2,\ldots , p_q)$ can then be obtained by taking the maximum over all possible permutations:

$$\begin{aligned} \pi _i=\displaystyle {\max _{u=1,L}}\displaystyle {\sum _{\lbrace j |\upsigma _{u}^{-1}(j)\le \upsigma _{u}^{-1}(i)\rbrace }}p_j \end{aligned}$$

(10)

Each permutation $\upsigma _u$ is a bijection, and its inverse $\upsigma _u^{-1}$ gives the rank of each $p_i$ in the list of probabilities sorted in ascending order. The number $L$ of permutations depends on the duplicated values in $p$: it is equal to $1$ if all the $p_i$ are distinct, in which case $\mathsf P $ is a strict linear order on $\varOmega $.
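As an illustration of Eq. (10), the following minimal sketch computes the possibility degrees from a known probability vector. It relies on an observation that is not spelled out above: when ties are allowed, the maximum over the compatible permutations is reached by ranking class $i$ last among the classes that share its probability, so that $\pi _i$ is the sum of all $p_j$ with $p_j \le p_i$. The function name `prob_to_poss` and the tolerance `tol` are hypothetical choices made for this example only.

```python
def prob_to_poss(p, tol=1e-12):
    """Most specific possibility distribution dominating p (sketch of Eq. 10).

    For each class i, sum every p_j that is not larger than p_i.
    Tied probabilities are all included, which matches taking the
    maximum over the permutations compatible with the ties.
    """
    # min(1.0, ...) only guards against floating-point overshoot.
    return [min(1.0, sum(pj for pj in p if pj <= pi + tol)) for pi in p]


if __name__ == "__main__":
    print(prob_to_poss([0.5, 0.3, 0.2]))
    print(prob_to_poss([0.4, 0.4, 0.2]))
```

The most probable class always receives possibility degree 1, and the ordering of the $p_i$ is preserved, as required by constraint (8).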

Appendix B: Inferring Possibility Distribution for Classes of Activities

Let $n_k$ denote the number of observations (activities) of class $k$ in a sample of size $N$. Then, the random vector $n = (n_1, \ldots , n_K)$ follows a multinomial distribution with parameter $p = (p_{1}, p_{2},\ldots , p_{K})$. A confidence region for $p$ at level $1-\alpha $ can be computed using simultaneous confidence intervals as described in [19]. Such a confidence region can be considered as a set of probability distributions.

It is proposed to characterize the probabilities $p = (p_{1}, p_{2},\ldots , p_{K})$ of generating the different classes by simultaneous confidence intervals with a given confidence level $1 - \alpha $. Here, $p_{k}$ represents the probability of generating the class of events $A_{\omega _k}$. From this imprecise specification, a procedure for constructing a possibility distribution is described, ensuring that the resulting possibility distribution will dominate the true probability distribution in at least $100(1 - \alpha )\%$ of the cases.

Since the probabilities $p$ of generating classes are unknown, we can build confidence intervals for each one of them. In interval estimation, a scalar population parameter is typically estimated as a range of possible values, namely a confidence interval, with a given confidence level $1 - \alpha $.

To build confidence intervals for multinomial proportions, it is possible to find simultaneous confidence intervals with a joint confidence level $1 - \alpha $. The method attempts to find a confidence region $\mathcal {C}_n$ in the parameter space $\lbrace p = (p_1,\ldots , p_K) \in [0, 1]^{K} \mid \sum \nolimits _{i=1}^{K} p_i=1 \rbrace $ as the Cartesian product of $K$ intervals $[p_1^{-} , p_1^{+}] \times \cdots \times [p_K^{-}, p_K^{+}]$ such that the coverage probability satisfies:

$$\begin{aligned} \mathbb {P}(p \in \mathcal {C}_n)\ge 1-\alpha \end{aligned}$$

(11)

We can use Goodman's formulation [9] in a series of derivations to build the simultaneous confidence intervals. First, let

$$\begin{aligned} A=\chi ^{2}(1-\alpha /K,1)+N \end{aligned}$$

(12)

where $\chi ^{2}(1-\alpha /K,1)$ denotes the quantile of order $1-\alpha /K$ of the chi-square distribution with one degree of freedom, and $N= \displaystyle {\sum \nolimits _{i=1}^{K}} n_i$, denotes the size of the sample. We have also the following quantities:

$$\begin{aligned} B_i=\chi ^{2}(1-\alpha /K,1)+2n_i, \end{aligned}$$

(13)

$$\begin{aligned} C_i=\frac{n_i^{2}}{N}, \end{aligned}$$

(14)

$$\begin{aligned} \varDelta _i=B_i^{2}-4AC_i, \end{aligned}$$

(15)

Finally, for each class of activities $A_{\omega _i}$, the bounds of the confidence interval are defined as follows:

$$\begin{aligned}{}[p_i^{-}, p_i^{+}]=\left[ \frac{B_i-\varDelta _i^{\frac{1}{2}}}{2A},\frac{B_i+\varDelta _i^{\frac{1}{2}}}{2A}\right] \end{aligned}$$

(16)
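Equations (12) to (16) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name `goodman_intervals` and the default `alpha=0.05` are hypothetical, and the chi-square quantile with one degree of freedom is obtained from the standard normal quantile (a chi-square variable with one degree of freedom is the square of a standard normal variable), which keeps the example within the standard library.

```python
from math import sqrt
from statistics import NormalDist


def goodman_intervals(counts, alpha=0.05):
    """Simultaneous confidence intervals for multinomial proportions.

    counts: observed number of activities n_i for each of the K classes.
    Returns a list of (p_minus, p_plus) pairs following Eqs. (12)-(16).
    """
    K = len(counts)
    N = sum(counts)
    # Quantile of order 1 - alpha/K of the chi-square law with 1 d.o.f.,
    # computed as the square of the normal quantile of order 1 - alpha/(2K).
    z = NormalDist().inv_cdf(1 - alpha / (2 * K))
    chi2 = z * z
    A = chi2 + N                                  # Eq. (12)
    intervals = []
    for n_i in counts:
        B = chi2 + 2 * n_i                        # Eq. (13)
        C = n_i * n_i / N                         # Eq. (14)
        delta = max(B * B - 4 * A * C, 0.0)       # Eq. (15), clipped for rounding
        root = sqrt(delta)
        intervals.append(((B - root) / (2 * A), (B + root) / (2 * A)))  # Eq. (16)
    return intervals


if __name__ == "__main__":
    for low, high in goodman_intervals([30, 50, 20]):
        print(f"[{low:.3f}, {high:.3f}]")
```

Each interval contains the empirical frequency $n_i/N$ and stays within $[0, 1]$, and it widens as $\alpha $ decreases or as the number of classes $K$ grows.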

Based on these interval-valued probabilities, it is now possible to compute the most specific possibility distribution dominating any probability measure in the confidence region. Let $\mathsf P $ denote the partial order induced by the intervals $[p_i]=[p_i^{-},p_i^{+}]$:

$$\begin{aligned} (\omega _i,\omega _j) \in \mathsf P \Leftrightarrow p_i^{+} <p_j^{-} \end{aligned}$$

(17)

This partial order may be represented by the set of its compatible linear extensions $\varLambda (\mathsf P )=\lbrace l_u, u=1,L\rbrace $, or equivalently, by the set of the corresponding permutations $\lbrace \upsigma _u, u=1,L\rbrace $. Then, for each possible permutation $\upsigma _u$ associated with each linear order in $\varLambda (\mathsf P )$, and each class $A_{\omega _i}$, we can solve the following linear program:

$$\begin{aligned} \pi _i^{\upsigma _u}=\displaystyle {\max _{p_1,\ldots , p_K}}\displaystyle {\sum _{\lbrace j |\upsigma _{u}^{-1}(j)\le \upsigma _{u}^{-1}(i)\rbrace }}p_j \end{aligned}$$

(18)

under the constraints:

$$\begin{aligned} \left\{ \begin{array}{lll} \displaystyle {\sum _{i=1}^{K}} p_i=1 \\ p_k^{-}\le p_k \le p_k^{+} \quad \forall k \in \lbrace 1,\ldots ,K \rbrace \\ p_{\upsigma _u(1)}\le p_{\upsigma _u(2)}\le \cdots \le p_{\upsigma _u(K)} \end{array} \right. \end{aligned}$$

(19)

Finally, for each class $A_{\omega _i}$, we take the distribution dominating all the distributions $\pi ^{\upsigma _u}$:

$$\begin{aligned} \pi _i=\displaystyle {\max _{u=1, L}}\pi _i^{\upsigma _u} \quad \forall i \in \lbrace 1,\ldots , K \rbrace \end{aligned}$$

(20)

1.1 Complexity

The complexity of our computational procedure is dominated by the computation of the possibility degrees of the $K$ classes. The conceptually simplest approach is to generate all the linear extensions compatible with the partial order induced by the probability intervals, and then to solve the associated linear programs (i.e. Eqs. (18) and (19)). Unfortunately, this approach is limited to small values of $K$ (e.g., $K < 10$): algorithms that generate linear extensions run in $O(L)$, where $L$ is the number of linear extensions, and even for moderate values of $K$, $L$ can be very large ($K!$ in the worst case), so generating all the linear extensions and solving the linear programs soon becomes intractable. A new formulation of the solution can be derived that considerably reduces the computations. It proceeds in several steps. First, all the linear programs to be solved are grouped into different subsets; then, an analytic expression for the best solution in each subset is given; lastly, it is shown that it is not necessary to evaluate the solution for every subset. A simple computational algorithm can then be derived (see [19] for more details). The actual complexity may be close to $O(|P_i|)$, where $P_i$ denotes the set of indices of the classes whose rank is possibly, but not necessarily, smaller than that of $\omega _i$.
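To make the growth of $L$ concrete, the sketch below enumerates, by brute force, the linear extensions compatible with the partial order of Eq. (17). It covers only the enumeration step of the naive approach, not the linear programs and not the faster algorithm of [19]. The function names `partial_order` and `linear_extensions` are hypothetical, and filtering all $K!$ permutations is itself only workable for small $K$.

```python
from itertools import permutations


def partial_order(intervals):
    """Pairs (i, j) such that p_i^+ < p_j^-, as in Eq. (17)."""
    return {
        (i, j)
        for i, (_, high_i) in enumerate(intervals)
        for j, (low_j, _) in enumerate(intervals)
        if i != j and high_i < low_j
    }


def linear_extensions(intervals):
    """All orderings of the classes that respect the partial order.

    Each result lists class indices from the lowest to the highest rank.
    """
    order = partial_order(intervals)
    extensions = []
    for perm in permutations(range(len(intervals))):
        rank = {cls: pos for pos, cls in enumerate(perm)}
        if all(rank[i] < rank[j] for i, j in order):
            extensions.append(perm)
    return extensions


if __name__ == "__main__":
    disjoint = [(0.1, 0.2), (0.3, 0.4), (0.5, 0.6)]
    overlapping = [(0.1, 0.5), (0.1, 0.5), (0.1, 0.5)]
    print(len(linear_extensions(disjoint)), len(linear_extensions(overlapping)))
```

When all intervals are disjoint the order is total and a single extension remains; when all intervals overlap no pair is ordered and every one of the $K!$ permutations is compatible, which is the worst case mentioned above.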

Cite this chapter

Sarr, I., Ndong, J., Missaoui, R. (2014). Overlaying Social Networks of Different Perspectives for Inter-network Community Evolution. In: Missaoui, R., Sarr, I. (eds) Social Network Analysis - Community Detection and Evolution. Lecture Notes in Social Networks. Springer, Cham. https://doi.org/10.1007/978-3-319-12188-8_3

DOI: https://doi.org/10.1007/978-3-319-12188-8_3
Published: 14 January 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12187-1
Online ISBN: 978-3-319-12188-8
eBook Packages: Computer Science, Computer Science (R0)
