Abstract
In many real-life social networks, a group of individuals may be involved in several kinds of activities, such as professional, leisure and friendship ones. Even though individuals may belong to a social network with a very precise type of link, such as professional ties in LinkedIn, the interactions that happen in other social networks such as Facebook are not reflected in the original network. We believe that overlaying networks with various types of links helps discover interesting patterns. The objective of this paper is therefore to overlay two or more social networks with different kinds of social activities in order to unveil homogeneous groups that could not appear in a single social network. To that end, we propose a community detection approach based on possibility theory, which identifies time-based perspective communities for each kind of social activity occurring within a sequence of time windows. Furthermore, different perspectives are layered to detect communities that may belong to several networks in a given time period. Communities discovered in a given network for a time period can be perceived as views or perspectives in one or many networks.
References
Backstrom L, Huttenlocher D, Kleinberg J, Lan X (2006) Group formation in large social networks: membership, growth, and evolution. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining, KDD’06, pp 44–54
Bekkerman R, McCallum A (2005) Disambiguating web appearances of people in a social network. In: WWW, pp 463–470
Bródka P, Saganowski S, Kazienko P (2012) GED: the method for group evolution discovery in social networks. Soc Netw Anal Min 3(1):1–14
Crandall D, Cosley D, Huttenlocher D, Kleinberg J, Suri S (2008) Feedback effects between similarity and social influence in online communities. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining, KDD’08, ACM, pp 160–168
Dubois D, Prade H, Sandri S (1991) On possibility/probability transformations. In: Proceedings of the fourth international fuzzy systems association world congress (IFSA’91), Brussels, Belgium, pp 50–53
Fortunato S (2010) Community detection in graphs. Phys Rep 486(3–5):75–174
Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci USA 99(12):7821–7826
Goldberg MK, Magdon-Ismail M, Thompson J (2012) Identifying long lived social communities using structural properties. In: ASONAM, pp 647–653
Goodman LA (1965) On simultaneous confidence intervals for multinomial proportions. Technometrics 7(2):247–254
Karnstedt M, Hennessy T, Chan J, Hayes C (2010) Churn in social networks: a discussion boards case study. In: Proceedings of the 2010 IEEE second international conference on social computing, SOCIALCOM’10. IEEE Computer Society, pp 233–240
Kashoob S, Caverlee J (2012) Temporal dynamics of communities in social bookmarking systems. Soc Netw Anal Min 2(4):387–404
Kautz HA, Selman B, Shah MA (1997) The hidden web. AI Mag 18(2):27–36
Kautz HA, Selman B, Shah MA (1997) Referral web: combining social networks and collaborative filtering. Commun ACM 40(3):63–65
Lakkaraju H, McAuley J, Leskovec J (2013) What’s in a name? Understanding the interplay between titles, content, and communities in social media. In: Seventh international AAAI conference on weblogs and social media. AAAI Publications
Lancichinetti A, Fortunato S, Kertész J (2009) Detecting the overlapping and hierarchical community structure in complex networks. New J Phys 11(3):033015
Leskovec J, Kleinberg J, Faloutsos C (2007) Graph evolution: densification and shrinking diameters. ACM Trans Knowl Discov Data 1(1):2
Leskovec J, Huttenlocher D, Kleinberg J (2010) Predicting positive and negative links in online social networks. In: Proceedings of the 19th international conference on world wide web, WWW’10, pp 641–650
Marsden PV (2005) Recent developments in network measurement. In: Carrington PJ, Scott J, Wasserman S (eds) Models and methods in social network analysis. Cambridge University Press, New York, pp 8–30
Masson MH, Denoeux T (2006) Inferring a possibility distribution from empirical data. Fuzzy Sets Syst 157(3):319–340
Matsuo Y, Mori J, Hamasaki M, Ishida K, Nishimura T, Takeda H, Hasida K, Ishizuka M (2006) Polyphonet: an advanced social network extraction system from the web. In: Proceedings of the 15th international conference on world wide web, WWW’06. ACM, pp 397–406
Newman MEJ (2004) Detecting community structure in networks. Eur Phys J B-Condens Matter Complex Syst 38(2):321–330
Newman MEJ (2004) Fast algorithm for detecting community structure in networks. Phys Rev E 69(6):066133
Palla G, Barabasi AL, Vicsek T (2007) Quantifying social group evolution. Nature 446:664–667
Scott J (1991) Social network analysis: a handbook. Sage, London
Sun Y, Han J (2012) Mining heterogeneous information networks: principles and methodologies. Synthesis lectures on data mining and knowledge discovery. Morgan and Claypool Publishers, San Rafael
Tantipathananandh C, Wolf TB, Kempe D (2007) A framework for community identification in dynamic social networks. In: Proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining, KDD’07. ACM, pp 717–726
Toivonen R, Kovanen L, Kivelä M, Onnela JP, Saramäki J, Kaski K (2009) A comparative study of social network models: network evolution models and nodal attribute models. Soc Netw 31(4):240–254
Wei F, Qian W, Wang C, Zhou A (2009) Detecting overlapping community structures in networks. World Wide Web 12:235–261
Zadeh LA (1978) Fuzzy sets as a basis for a theory of possibility. Fuzzy Sets Syst 1(1):3–28
Zadeh LA (1965) Fuzzy sets. Inf Control 8:338–353
Acknowledgments
The third author acknowledges the financial support of the Natural Sciences and Engineering Research Council of Canada (NSERC).
Appendices
Appendix A: Inferring Possibility Distribution from Probability Distribution
A consistency principle between probability and possibility can be stated informally [29]: “what is probable should be possible”. This requirement translates into the inequality:
\[ P(A) \le \varPi (A), \quad \forall A \subseteq \varOmega , \tag{7} \]
where \(P\) and \(\varPi \) are, respectively, a probability and a possibility measure on the domain \(\varOmega \). In this case, \(\varPi \) is said to dominate \(P\).
Transforming a probability measure into a possibility measure then amounts to choosing one element in the set of possibility measures dominating \(P\). This should be done by adding a strong order preservation constraint, which preserves the shape of the distribution:
\[ p_i < p_j \Leftrightarrow \pi _i < \pi _j, \quad \forall i, j \in \lbrace 1,\ldots ,q\rbrace , \tag{8} \]
where \(p_i=P(\lbrace E_{\omega _i} \rbrace )\) and \(\pi _i=\varPi (\lbrace E_{\omega _i} \rbrace )\), \(\forall i \in \lbrace 1,\ldots ,q\rbrace \). It is possible to search for the most specific possibility distribution verifying (7) and (8). The solution of this problem exists, is unique and can be described as follows. One can define a strict partial order \(\mathsf P \) on \(\varOmega \) represented by a set of compatible linear extensions \(\varLambda (\mathsf P )=\lbrace l_u, u=1,L\rbrace \). To each possible linear order \(l_u\), one can associate a permutation \(\upsigma _u\) of the set \(\lbrace 1,\ldots ,q \rbrace \) such that:
\[ i < j \Leftrightarrow (\omega _{\upsigma _u(i)}, \omega _{\upsigma _u(j)}) \in l_u. \]
The most specific possibility distribution compatible with the probability distribution \(p = (p_1, p_2,\ldots , p_q)\) can then be obtained by taking the maximum over all possible permutations:
\[ \pi _i = \max _{u=1,L} \sum _{\lbrace j \mid \upsigma _u^{-1}(j) \le \upsigma _u^{-1}(i) \rbrace } p_j. \]
The permutation \(\upsigma \) is a bijection, and the inverse transformation \(\upsigma ^{-1}\) gives the rank of each \(p_i\) in the list of probabilities sorted in ascending order. The number \(L\) of permutations depends on the duplicated \(p_i\) in \(p\). It is equal to \(1\) if no \(p_i\) is duplicated, in which case \(\mathsf P \) is a strict linear order on \(\varOmega \).
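As a minimal sketch of this transformation, assume the probability distribution is given as a plain Python list. Taking the maximum over all linear extensions amounts to ranking each element last among those with the same probability, so \(\pi _i\) reduces to the sum of all \(p_j\) with \(p_j \le p_i\). The function name `prob_to_poss` is hypothetical and is not taken from the chapter.

```python
def prob_to_poss(p):
    """Most specific possibility distribution dominating p (sketch).

    Assumption: p is a list of non-negative numbers summing to 1.
    For each i, pi_i is the sum of every p_j that is <= p_i, which is
    the value reached by the linear extension that ranks i after all
    elements of equal probability.
    """
    return [sum(pj for pj in p if pj <= pi) for pi in p]


if __name__ == "__main__":
    # The most probable element always receives possibility 1
    # (up to floating-point rounding).
    print(prob_to_poss([0.5, 0.3, 0.2]))
    # Tied probabilities receive the same possibility degree.
    print(prob_to_poss([0.4, 0.4, 0.2]))
```

For \(p = (0.5, 0.3, 0.2)\) this gives, up to floating-point rounding, \(\pi = (1, 0.5, 0.2)\). The quadratic-time comprehension is kept for readability; sorting once and accumulating would give the same values faster.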
Appendix B: Inferring Possibility Distribution for Classes of Activities
Let \(n_k\) denote the number of observations (activities) of class \(k\) in a sample of size \(N\). Then, the random vector \(n = (n_1, \ldots , n_K)\) can be considered as following a multinomial distribution with parameter \(p = (p_{1}, p_{2},\ldots , p_{K})\). A confidence region for \(p\) at level \(1-\alpha \) can be computed using simultaneous confidence intervals as described in [19]. Such a confidence region can be considered as a set of probability distributions.
It is proposed to characterize the probabilities \(p = (p_{1}, p_{2},\ldots , p_{K})\) of generating the different classes by simultaneous confidence intervals with a given confidence level \(1 - \alpha \). Here, \(p_{k}\) represents the probability of generating the class of events \(A_{\omega _k}\). From this imprecise specification, a procedure for constructing a possibility distribution is described, ensuring that the resulting possibility distribution dominates the true probability distribution in at least \(100(1 - \alpha )\,\%\) of the cases.
Since the probabilities \(p\) of generating classes are unknown, we can build confidence intervals for each one of them. In interval estimation, a scalar population parameter is typically estimated as a range of possible values, namely a confidence interval, with a given confidence level \(1 - \alpha \).
To build confidence intervals for multinomial proportions, it is possible to find simultaneous confidence intervals with a joint confidence level \(1 - \alpha \). The method seeks a confidence region \(\mathcal {C}_n\) in the parameter space \(\lbrace p = (p_1,\ldots , p_K) \in [0; 1]^{K} \mid \sum \nolimits _{i=1}^{K} p_i=1 \rbrace \), defined as the Cartesian product of \(K\) intervals \([p_1^{-} , p_1^{+}] \times \cdots \times [p_K^{-}, p_K^{+}]\), such that the coverage probability satisfies:
\[ P(p \in \mathcal {C}_n) \ge 1 - \alpha . \]
The simultaneous confidence intervals can be built with the Goodman formulation [9], through the following series of derivations. Let
\[ A = \chi ^{2}(1-\alpha /K,1) + N, \]
where \(\chi ^{2}(1-\alpha /K,1)\) denotes the quantile of order \(1-\alpha /K\) of the chi-square distribution with one degree of freedom, and \(N= \sum \nolimits _{i=1}^{K} n_i\) denotes the size of the sample. We also have the following quantities:
\[ B_i = \chi ^{2}(1-\alpha /K,1) + 2 n_i, \qquad C_i = \frac{n_i^{2}}{N}, \qquad \varDelta _i = B_i^{2} - 4 A C_i. \]
Finally, for each class of activities \(A_{\omega _i}\), the bounds of the confidence intervals are defined as follows:
\[ [p_i^{-}, p_i^{+}] = \left[ \frac{B_i - \varDelta _i^{1/2}}{2A}, \frac{B_i + \varDelta _i^{1/2}}{2A} \right] . \]
Based on these interval-valued probabilities, it is now possible to compute the most specific possibility distribution dominating any particular probability measure in the confidence region. Let \(\mathsf P \) denote the partial order induced by the intervals \([p_i]=[p_i^{-},p_i^{+}]\):
\[ (\omega _i, \omega _j) \in \mathsf P \Leftrightarrow p_i^{+} < p_j^{-}. \]
This partial order may be represented by the set of its compatible linear extensions \(\varLambda (\mathsf P )=\lbrace l_u, u=1,L\rbrace \), or equivalently, by the set of the corresponding permutations \(\lbrace \upsigma _u, u=1,L\rbrace \). Then, for each possible permutation \(\upsigma _u\) associated with a linear order in \(\varLambda (\mathsf P )\), and each class \(A_{\omega _i}\), we can solve the following linear program:
\[ \pi _i^{\upsigma _u} = \max _{p_1,\ldots ,p_K} \sum _{\lbrace j \mid \upsigma _u^{-1}(j) \le \upsigma _u^{-1}(i) \rbrace } p_j \]
under the constraints:
\[ \sum _{k=1}^{K} p_k = 1, \qquad p_k^{-} \le p_k \le p_k^{+} \quad \forall k \in \lbrace 1,\ldots ,K\rbrace , \qquad p_{\upsigma _u(1)} \le p_{\upsigma _u(2)} \le \cdots \le p_{\upsigma _u(K)}. \]
Finally, we can take the distribution of the class \(A_{\omega _k}\) dominating all the distributions \(\pi ^{\upsigma _u}\):
\[ \pi _k = \max _{u=1,L} \pi _k^{\upsigma _u}. \]
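The simultaneous confidence intervals used in this appendix can be sketched in a few lines of Python. The sketch below assumes Goodman's intervals in their usual closed form: for each class, the bounds are the two roots of \(A p^{2} - B_i p + C_i = 0\), with \(A = \chi ^{2} + N\), \(B_i = \chi ^{2} + 2 n_i\) and \(C_i = n_i^{2}/N\). To stay within the standard library, the chi-square quantile with one degree of freedom is obtained as the square of a standard normal quantile. The function name `goodman_intervals` and the clamping to \([0, 1]\) are assumptions made for illustration, not part of the chapter.

```python
from statistics import NormalDist


def goodman_intervals(counts, alpha=0.05):
    """Simultaneous confidence intervals for multinomial proportions (sketch).

    counts: observed number of activities n_k for each of the K classes.
    alpha:  1 - alpha is the joint confidence level.
    Returns a list of (lower, upper) bounds, one pair per class.
    """
    K = len(counts)
    N = sum(counts)
    # Quantile of order 1 - alpha/K of the chi-square law with 1 degree of
    # freedom, computed as the squared normal quantile of order 1 - alpha/(2K).
    z = NormalDist().inv_cdf(1 - alpha / (2 * K))
    chi2 = z * z
    A = chi2 + N
    intervals = []
    for n in counts:
        B = chi2 + 2 * n
        C = n * n / N
        # Discriminant of A*p^2 - B*p + C = 0; guarded against rounding.
        delta = max(B * B - 4 * A * C, 0.0)
        root = delta ** 0.5
        lower = max(0.0, (B - root) / (2 * A))  # clamp: rounding safeguard
        upper = min(1.0, (B + root) / (2 * A))
        intervals.append((lower, upper))
    return intervals


if __name__ == "__main__":
    # Hypothetical counts for three classes of activities.
    for k, (lo, hi) in enumerate(goodman_intervals([30, 50, 20]), start=1):
        print(f"class {k}: [{lo:.3f}, {hi:.3f}]")
```

Each returned interval contains the empirical proportion \(n_k/N\), and the intervals widen as \(K\) grows or \(\alpha \) shrinks, since the quantile is taken at order \(1-\alpha /K\).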
1.1 Complexity
The complexity of our computational procedure is related to the discovery of the possibility degrees of the \(K\) classes. The conceptually simplest approach is to generate all the linear extensions compatible with the partial order induced by the probability intervals, and then to solve the associated linear programs (i.e. Eq. (10)). Unfortunately, this approach is limited to small values of \(K\) (e.g., \(K < 10\)), because generating the linear extensions costs \(O(L)\), where \(L\) is the number of linear extensions. Even for moderate values of \(K\), \(L\) can be very large (\(K!\) in the worst case), so generating all the linear extensions and solving the linear programs soon becomes intractable. A new formulation of the solution can be derived that considerably reduces the computations. It proceeds in several steps. First, the linear programs to be solved are grouped into subsets; then, an analytic expression for the best solution in each subset is given; lastly, it is shown that the solution need not be evaluated for every subset. A simple computational algorithm follows (see [19] for more details). The actual complexity might be close to \(O(|P_i|)\), where \(P_i\) denotes the set of indices of the classes whose rank is possibly, but not necessarily, smaller than that of \(\omega _i\).
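To make the growth of \(L\) concrete, the brute-force sketch below enumerates the linear extensions of the partial order induced by a list of probability intervals, assuming that class \(i\) precedes class \(j\) whenever the upper bound of interval \(i\) lies strictly below the lower bound of interval \(j\). It illustrates the naive approach only, not the efficient algorithm of [19]; the function name `linear_extensions` and the example intervals are hypothetical.

```python
from itertools import permutations


def linear_extensions(intervals):
    """Enumerate linear extensions of the interval-induced partial order.

    intervals: list of (lower, upper) probability bounds, one per class.
    Returns every ordering of the class indices (as tuples) that respects
    the constraint "i before j" whenever upper_i < lower_j.
    Brute force over all K! permutations, so only usable for small K.
    """
    K = len(intervals)
    # Pairs (i, j) such that interval i lies strictly below interval j.
    below = [(i, j) for i in range(K) for j in range(K)
             if intervals[i][1] < intervals[j][0]]
    extensions = []
    for perm in permutations(range(K)):
        rank = {cls: r for r, cls in enumerate(perm)}
        if all(rank[i] < rank[j] for i, j in below):
            extensions.append(perm)
    return extensions


if __name__ == "__main__":
    # Classes 0 and 1 overlap; both lie strictly below class 2.
    print(len(linear_extensions([(0.1, 0.2), (0.15, 0.3), (0.5, 0.6)])))
    # Fully overlapping intervals: every permutation is an extension.
    print(len(linear_extensions([(0.0, 1.0)] * 4)))
```

When all intervals overlap, no pair is ordered and the count reaches \(K!\), which is the worst case mentioned above; when all intervals are disjoint, a single extension remains.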
© 2014 Springer International Publishing Switzerland
Sarr, I., Ndong, J., Missaoui, R. (2014). Overlaying Social Networks of Different Perspectives for Inter-network Community Evolution. In: Missaoui, R., Sarr, I. (eds) Social Network Analysis - Community Detection and Evolution. Lecture Notes in Social Networks. Springer, Cham. https://doi.org/10.1007/978-3-319-12188-8_3
DOI: https://doi.org/10.1007/978-3-319-12188-8_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12187-1
Online ISBN: 978-3-319-12188-8