Encyclopedia of Social Network Analysis and Mining

Living Edition
Editors: Reda Alhajj, Jon Rokne

Community Evolution

  • Stanisław Saganowski
  • Piotr Bródka
  • Przemysław Kazienko
Living reference work entry
DOI: https://doi.org/10.1007/978-1-4614-7163-9_223-1

Keywords

Social Network; Community Evolution; Community Detection; Group Evolution; Inclusion Measure

Synonyms

Glossary

SN

Social network

TSN

Temporal social network

Definition

The evolution of a particular social community can be represented as a sequence of events (changes) that follow each other in successive timeframes within the temporal social network. In other words, the evolution is described by the group transformations identified between timeframes \( T_i \) and \( T_{i+1} \) (i is the period index).

There are several approaches to defining the possible events in social group evolution:
  • Asur et al. distinguish five possible events that may happen to groups, i.e., they may dissolve, form, continue, merge, and split (Asur et al. 2007).

  • Palla et al. identify six distinct transformations: growth, contraction, merging, splitting, birth, and death (Palla et al. 2007).

  • Bródka et al. in turn describe seven event types: continuing, shrinking, growing, splitting, merging, dissolving, and forming (Bródka et al. 2012a).

Other taxonomies can be found in Spiliopoulou et al. (2006), Oliveira and Gama (2010), and Takaffoli et al. (2011), but all of them are very similar and in fact complement each other. A short description of the most common changes is given below.

Continue (Asur et al. 2007) and continuing (Bródka et al. 2012a): the community continues its existence when two groups in consecutive time windows are identical or when they differ by only a few nodes but their sizes remain the same. Intuitively, continuation happens when two communities are so similar that it is hard to see any significant difference.

Contraction (Palla et al. 2007) and shrinking (Bródka et al. 2012a): the community shrinks/contracts when some members have left the group, making its size smaller than in the previous time window. A group can shrink either slightly, losing only a few nodes, or greatly, losing most of its members.

Growth (Palla et al. 2007) and growing (Bródka et al. 2012a): the community grows when some new members have joined the group, making its size bigger than in the previous time window. A group can grow slightly as well as significantly, doubling or even tripling its size.

Split (Asur et al. 2007), splitting (Palla et al. 2007), and splitting (Bródka et al. 2012a): the community splits into two or more communities in the next time window when several groups from timeframe \( T_{i+1} \) consist of nodes of one group from timeframe \( T_i \). Two types of splitting can be distinguished: (1) equal, which means that the resulting groups take more or less the same share of the split group, and (2) unequal, if one of the resulting groups takes a much larger share of the split group than the others. In the latter case, the splitting might look similar to shrinking for the biggest group.

Merge (Asur et al. 2007), merging (Palla et al. 2007), and merging (Bródka et al. 2012a): the community has been created by merging several other groups, i.e., one group from timeframe \( T_{i+1} \) consists of two or more groups from the previous timeframe \( T_i \). A merge, just like a split, might be (1) equal, if the groups contribute almost equally to the merged group, or (2) unequal, if one of the groups contributes much more to the merged group than the others. In the case of unequal merging, the merge looks quite similar to growing for the largest group.

Dissolve (Asur et al. 2007), death (Palla et al. 2007), or dissolving (Bródka et al. 2012a) happens when a community ends its life and does not occur in the next time window at all, i.e., its members have vanished, or they have stopped maintaining their relationships within the group and have scattered among other groups.

Form (Asur et al. 2007), birth (Palla et al. 2007), or forming (Bródka et al. 2012a) of a new community occurs when a group that did not exist in the previous time window \( T_i \) comes into existence in the next time window \( T_{i+1} \). In some cases, a group can be inactive over several timeframes. Such a sequence is then treated as the dissolving of the first community and the birth of a second, new one.

The examples of all events described above are depicted in Fig. 1.
Fig. 1

The events in community evolution (Bródka et al. 2012a)

The whole evolution process of a particular social community combines all changes during its lifetime into a sequence of consecutive events. A simple example of such evolution for a single group is presented in Fig. 2. The community evolution is composed of seven consecutive changes, which occurred over eight consecutive time windows. At the beginning, group G1 forms in T2, i.e., the members of G1 have no relations in T1 or their relations are rare. Next, the community grows in T3 by gaining four new nodes. In the following timeframe T4, group G1 splits into G2 and G3. By losing one node, group G2 shrinks in T5, while group G3 remains unchanged. Then, a new group G4 forms in T6, while both previous communities G2 and G3 continue their existence. All groups merge into one community G5 in timeframe T7, but in the last timeframe T8, this large group abruptly dissolves, preserving only a few relations between its members.
Fig. 2

Changes over time for the single group

Introduction

The continuous interest in social networks contributes to the fast development of this field. New possibilities for obtaining and storing data facilitate deeper analysis of the entire social network, of extracted social groups, and of single individuals. One of the most interesting research topics is network dynamics, and the dynamics of social groups in particular, i.e., the analysis of group evolution over time. It is the natural step forward after social community extraction. Once communities have been extracted, appropriate knowledge and methods for dynamic analysis may be applied in order to identify changes as well as to predict the future of all or selected groups. Furthermore, knowing the most probable change of a given group, additional steps may be taken in order to change this predicted future according to specific needs. Such an ability would be a powerful tool for human resource managers, personnel recruitment, marketing, telecommunication companies, etc.

To be able to describe evolution of social communities, we need to introduce the general concept of temporal social network.

First of all, a social network itself should be defined. Using a graph representation, a social network SN is a tuple <V, E>, where:

V is a non-empty set of nodes (also called vertices, actors, or members) representing social entities: humans, organizations, departments, etc.

E is a set of directed edges (relations between actors, also called arcs or connections), where a single edge is represented by a tuple <x, y>, x, y ∈ V, x ≠ y, and for any two edges <x, y> and <x′, y′>, if x = x′ then y ≠ y′.

A temporal social network (TSN), in turn, is a list of snapshots from consecutive timeframes \( T_i \) (time windows, time steps). Each timeframe \( T_i \) is in fact a single social network \( {SN}_i\left({V}_i,{E}_i\right) \), where \( V_i \) is the set of vertices in the ith timeframe and \( E_i \) is the set of directed edges existing in the ith timeframe, as follows:
$$ TSN=\left\langle {T}_1,{T}_2,\dots, {T}_m\right\rangle, \quad m\ \text{is the total number of timeframes} $$
$$ {T}_i={SN}_i\left({V}_i,{E}_i\right), \quad i=1,2,\dots, m $$
$$ {E}_i=\left\{\left\langle x, y\right\rangle : x, y\in {V}_i\right\}, \quad i=1,2,\dots, m. $$
An example of a temporal social network TSN is presented in Fig. 3. It consists of five timeframes, and each timeframe is a separate social network created from data gathered within the particular interval of time. In the simplest case, one interval starts when the previous one ends; however, in some applications the intervals may overlap each other or even contain the full history of previous timeframes in the aggregated form.
Fig. 3

A temporal social network consisting of five timeframes (Bródka et al. 2012a)
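The construction of a temporal social network from raw interaction data can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is a hypothetical list of `(timestamp, x, y)` records, and the function name `build_tsn` and its parameters `start`, `window`, `m`, and `step` are introduced here and do not come from the cited works. Each snapshot is returned as a pair (V_i, E_i); a window without any interaction yields two empty sets.

```python
from typing import List, Sequence, Set, Tuple

Edge = Tuple[str, str]
Snapshot = Tuple[Set[str], Set[Edge]]  # (V_i, E_i)


def build_tsn(events: Sequence[Tuple[float, str, str]],
              start: float, window: float, m: int,
              step: float = None) -> List[Snapshot]:
    """Split timestamped directed interactions into m timeframes.

    With step == window (the default) the intervals are disjoint;
    a smaller step produces overlapping intervals.
    """
    step = window if step is None else step
    tsn = []
    for i in range(m):
        lo = start + i * step
        hi = lo + window
        # Keep directed edges <x, y> with x != y that fall into [lo, hi).
        edges = {(x, y) for t, x, y in events if lo <= t < hi and x != y}
        nodes = {v for edge in edges for v in edge}
        tsn.append((nodes, edges))
    return tsn


events = [(0, "a", "b"), (1, "b", "c"), (5, "a", "c"), (6, "c", "c")]
tsn = build_tsn(events, start=0, window=5, m=2)
```

Storing edges in a set enforces the requirement that an edge <x, y> occurs at most once per timeframe.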

Key Points

Several different approaches for community evolution detection can be distinguished:
  1. Detection of static communities in each timeframe, followed by matching of the separately detected communities from consecutive periods
  2. Detection of temporal communities, also called evolutionary community mining
  3. Evolutionary clustering, analogous to community mining

In the first approach, the data about relationships between people (usually derived from their activity or behavior) is split into several timeframes, which together form a temporal social network. A selected community detection method is then applied independently to each time window in order to extract social communities. Some group evolution extraction algorithms operate only on the results of one predefined group extraction algorithm, as in Palla et al. (2007), while others are independent of the grouping algorithm, as in Takaffoli et al. (2011) or Bródka et al. (2012a). Next, a similarity measure, e.g., the autocorrelation function (Palla et al. 2007), the Jaccard measure (Greene et al. 2010), or the inclusion measure (Bródka et al. 2012a), is used to determine which group from a given timeframe \( T_i \) corresponds to which group in the next timeframe \( T_{i+1} \). Apart from matching groups between consecutive timeframes, it is also possible to apply clustering to a graph formed by all groups detected at different timeframes (Falkowski et al. 2006) or to calculate the similarity between all groups across all timeframes (Tajeuna et al. 2015). The last step is to assign a change type that describes what happened between a given snapshot (\( T_i \)) and the following one (\( T_{i+1} \)).

The second approach also starts with creating a temporal social network, but the community detection phase is different. Instead of identification of regular, static communities for each timeframe separately, some methods to find temporal communities are applied. They detect continuous/stable social communities that last over many timeframes (Sarkar and Moore 2005; Mucha et al. 2010; Kawadia and Sreenivasan 2012; Zygmunt et al. 2012; Xu et al. 2013).

The third approach is evolutionary clustering, which aims to find the best partition representing the community structure at time t based on the partition at time t – 1 and information about the network at time t. Finding the best partition involves optimization techniques that vary across methods. Chakrabarti et al. (2006) introduced snapshot quality, Sun et al. (2007) presented encoding cost, and Lin et al. (2008) proposed snapshot cost to find the best partition of the network at a given time. Ganti et al. (2002) proposed a change detection framework called FOCUS, in which two datasets are compared by computing a deviation measure between them. Spiliopoulou et al. (2006) proposed an event-based framework called MONIC to model and track cluster transitions. They also introduced the concept of cluster matching to simplify the detection and evaluation of the cluster events that occurred. Oliveira and Gama (2010) addressed the problem of monitoring the transitions experienced by clusters over time by identifying the temporal relationships among them.

The number of methods for tracking the community evolution grows every year, and it will become more and more important to develop a reliable solution to compare these methods. Granell et al. (2015) proposed a benchmark to compare static and dynamic techniques to describe group evolution. They showed that dynamic approaches are more accurate than static ones, but they evaluated only a few methods.

Historical Background

The need to uncover and analyze community evolution derives from two important areas, namely, community detection and social network evolution. Since the publication of the well-known paper by Girvan and Newman (2002) on community structure in social networks and a method to detect it, dozens of new methods have appeared each year (Fortunato 2010). At the same time, scientists have worked to analyze, understand, and model the evolution of networks (Barabasi et al. 2002; Dorogovtsev and Mendes 2003; Kossinets and Watts 2006). Once researchers had gained some understanding of community extraction and of the evolution of entire networks, they started to analyze the evolution of the communities themselves. Chakrabarti et al. (2006), Sun et al. (2007), and Lin et al. (2008) used partitioning to look for changes in the network over time. Kim and Han (2009) used nano-communities to trace the evolution of communities over time. Palla et al. (2007), Asur et al. (2007), and Bródka et al. (2012a) calculated the similarity between groups in consecutive timeframes in order to discover community lifetime. Tajeuna et al. (2015) extended the calculation of similarity to all groups across all timeframes. Xu et al. (2013) followed the past contact frequency between nodes in order to track group evolution.

Tracking Group Evolution

One area of social network analysis investigates the dynamics of a community, i.e., how a particular group changes over time. To deal with this problem, several methods for tracking group evolution have been proposed. Almost all of them take as input a social network in which communities have already been discovered using one of the group extraction methods. Additionally, individual methods for tracking evolution are designed to operate either on disjoint or on overlapping groups, and some of them are able to process both types. The discussion below presents the basic ideas behind the most recent methods for the analysis of social group evolution and a more detailed description of the three most popular methods. A summary of the most representative methods can be found in Table 1.
Table 1

Methods for group evolution identification

| Name/authors | Source | Type of communities | Type of community changes | Idea |
|---|---|---|---|---|
| Chakrabarti, Kumar, Tomkins | (Chakrabarti et al. 2006) | Disjoint | - | Snapshot quality and history cost are calculated to obtain total value of partition \( C_t \) at time t |
| Kim, Han | (Kim and Han 2009) | Disjoint | Forming, dissolving, growing, shrinking, and drifting | Nodes are connected to their future occurrences and neighbors with links creating nano-communities; the number and density of links determine the type of community change |
| Mucha, Richardson, Macon, Porter, Onnela | (Mucha et al. 2010) | Disjoint | - | Multislice generalization of modularity obtained from the Laplacian dynamics is defined on a stacked aggregate network consisting of all snapshots |
| Takaffoli, Sangi, Fagnan, Zaïane | (Takaffoli et al. 2011) | Disjoint | Split, survive, dissolve, merge, and form | Metacommunity is constructed for each series of similar groups detected by the matching algorithm in different timeframes; then, significant events are identified |
| Kawadia, Sreenivasan | (Kawadia and Sreenivasan 2012) | Disjoint | - | Partition distance called estrangement is calculated to find meaningful temporal communities |
| FacetNet/Lin, Chi, Zhu, Sundaram, Tseng | (Lin et al. 2008) | Disjoint, overlapping | - | Snapshot cost and history cost are computed to obtain the appropriate partition of the data |
| GraphScope/Sun, Papadimitriou, Yu, Faloutsos | (Sun et al. 2007) | Disjoint | - | Partitioning is repeated to get the smallest encoding cost of the graph and put it in the correct segment; jumps between segments denote changes in the graph evolution |
| Asur, Parthasarathy, Ucar | (Asur et al. 2007) | Disjoint | Form, dissolve, continue, merge, split | Group size and overlap between groups are calculated to assign type of the community change |
| Palla, Barabási, Vicsek | (Palla et al. 2007) | Overlapping | Birth, death, growth, contraction, merge, split | Groups are separately extracted from the individual timeframes, their following timeframes, and the union of both to find similar communities; the type of community change is manually assigned based on the matching of groups, with their successors and unions |
| GED/Bródka, Saganowski, Kazienko | (Bródka et al. 2012a) | Disjoint, overlapping | Forming, dissolving, continuing, growing, shrinking, merging, splitting | Inclusion measure is calculated to match similar communities; this measure and the group size determine the type of the community change |
| CoCE/Xu, Hu, Wang, Ma, Xiao | (Xu et al. 2013) | Disjoint | Birth, death, merging, splitting, growth, contraction | Cumulative stable contact between the nodes is computed to extract the groups and to discover the group changes |
| Tajeuna, Bouguessa, Wang | (Tajeuna et al. 2015) | Disjoint | Form, dissolve, shrink, expand, split, merge, stable | Matrix of similarity between the groups discovered for all the timeframes is created; based on mutual transition, future group occurrences and types of events are defined |

Chakrabarti et al. Method

Chakrabarti et al. presented an original concept for identifying group changes over time (Chakrabarti et al. 2006). Instead of extracting communities for each timeframe and matching them, the authors introduced the snapshot quality, which measures how well the partition \( C_i \) fits the graph at time \( t_i \). The history cost, in turn, quantifies the difference between partition \( C_i \) and the partition from the previous timeframe \( C_{i-1} \). The total value of \( C_i \) combines the snapshot quality and the history cost at each timeframe: the most valuable partition is the one with high snapshot quality and low history cost. To obtain \( C_i \) from \( C_{i-1} \), Chakrabarti et al. used a relative weight cp (tuned by the user) to balance snapshot quality against history cost. Chakrabarti et al. did not consider whether their method works for overlapping groups.
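As a hedged illustration of how the two quantities are usually combined in evolutionary clustering, the total quality of a sequence of partitions can be written as the accumulated snapshot quality reduced by the cp-weighted history cost. The symbols sq, hc, \( M_t \) (the data observed at time t), and T (the number of timeframes) are notation assumed here for the sketch and are not defined in this entry:
$$ \sum_{t=1}^{T} sq\left({C}_t,{M}_t\right)- cp\cdot \sum_{t=2}^{T} hc\left({C}_{t-1},{C}_t\right) $$
Under this reading, a larger cp favors partitions that stay close to their predecessors, while a smaller cp favors partitions that fit the current snapshot.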

Kim and Han Method

Kim and Han (Kim and Han 2009) used links to connect nodes at timeframe \( T_{i-1} \) with nodes at timeframe \( T_i \), creating nano-communities. The nodes are connected to their future occurrences and to their future neighbors. Next, the authors analyzed the number and density of the links to decide which type of relationship occurs for a given nano-community. Kim and Han defined the most common changes, which are evolving, forming, and dissolving. The evolving of a group can take three forms: growing, shrinking, and drifting. Community \( C_i \) grows between timeframes \( T_i \) and \( T_{i+1} \) if there is a group \( C_{i+1} \) in the following timeframe \( T_{i+1} \) that contains all nodes from \( C_i \). Group \( C_{i+1} \) may, of course, contain additional nodes that are not present in \( C_i \). Conversely, community \( C_i \) shrinks between timeframes \( T_i \) and \( T_{i+1} \) when there is a group \( C_{i+1} \) in the next timeframe \( T_{i+1} \) all of whose nodes are within \( C_i \). Finally, group \( C_i \) drifts between timeframes \( T_i \) and \( T_{i+1} \) if there is a group \( C_{i+1} \) in the following timeframe \( T_{i+1} \) that has at least one node in common with \( C_i \). Kim and Han did not specify whether the method is designed for overlapping or disjoint groups, but the drifting event suggests that the method will not work correctly for overlapping groups.
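The three evolving cases can be read as plain set relations between a group and its candidate successor. The following Python sketch encodes only that set-based reading; it does not reproduce the nano-community construction or the link-density analysis of the original method. The function name `evolving_case` and the label "unchanged" for identical groups (which satisfy both the growing and the shrinking condition as worded above) are assumptions made for illustration.

```python
from typing import Hashable, Optional, Set


def evolving_case(c_i: Set[Hashable], c_next: Set[Hashable]) -> Optional[str]:
    """Classify the relation between group C_i and a group C_{i+1}."""
    if c_i == c_next:
        return "unchanged"   # assumption: identical groups get their own label
    if c_i < c_next:
        return "growing"     # C_{i+1} contains all nodes of C_i plus new ones
    if c_next < c_i:
        return "shrinking"   # all nodes of C_{i+1} are within C_i
    if c_i & c_next:
        return "drifting"    # at least one common node
    return None              # no common node: not an evolving case


case = evolving_case({1, 2, 3}, {3, 4})
```

Because growing and shrinking are tested before drifting, a pair that satisfies several conditions is reported under the most specific one.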

CoCE

A method by Xu et al. called CoCE (Xu et al. 2013) aims to find the evolution of stable communities. The method first calculates the number of interactions between nodes over time (the cumulative stable contact, or CSC, measure). Nodes with a CSC greater than a threshold are joined together as the community core. Next, the remaining nodes are added to the community cores, based on the shortest distance, to form groups. A pair of nodes is also considered a community.

In the following timeframes, when nodes and links are added to and removed from the network, the CSC is recalculated to decide whether a group change has occurred. Removing a link can cause splitting, contraction, or death. Adding a link can cause birth, merging, or growth.

The authors did not mention which types of groups can be used with the method. In the experiments, two datasets with disjoint groups were used.

FacetNet

Lin et al. used evolutionary clustering to create FacetNet (Lin et al. 2008), a framework that allows members to be part of more than one community in a given timeframe. In contrast to the method by Chakrabarti et al., Lin et al. used the snapshot cost rather than the snapshot quality to measure how well the partition fits the data. The Kullback-Leibler divergence (Kullback and Leibler 1951) is used to compute both the snapshot cost and the history cost. Based on the results of FacetNet, it is easier to follow what happens to particular nodes than what happens to a group in general. The algorithm does not assign any events, but users can analyze the results and assign events on their own. Unfortunately, FacetNet is unable to capture forming and dissolving events.

Tajeuna et al. Method

The method by Tajeuna et al. (Tajeuna et al. 2015) tries to improve on methods that look for matches between communities only in consecutive timeframes. To achieve that, the groups are discovered for all timeframes, and a matrix of similarity between all discovered groups is created. Each community then has a vector of similarities to the other communities. Groups are matched if the correlation between their representative vectors is above a threshold. This correlation is called mutual transition. The authors also proposed a method to identify the optimal threshold value. It is unclear whether the method handles overlapping groups. The experiments were conducted on three datasets with disjoint groups.

GraphScope

Sun et al. presented a parameter-free method called GraphScope (Sun et al. 2007). In the first step, partitioning is repeated until the smallest encoding cost for a given graph is found. Subsequent graphs are stored in the same segment \( S_i \) if their encoding cost is similar. When the examined graph G has a higher encoding cost than that of segment \( S_i \), graph G is placed in segment \( S_{i+1} \). Jumps between segments mark change points in graph evolution over time. The main goal of this method is to work with streaming data, i.e., the method has to detect new communities in the network and decide whether the structure of the already existing communities should be changed in the database.

Asur et al. Method

The method by Asur et al. is a simple and intuitive approach for investigating community evolution over time (Asur et al. 2007). The group size and overlap are compared for every possible pair of groups in consecutive timeframes, and events involving those groups are assigned. If none of the nodes of a community from timeframe \( T_i \) occur in the following timeframe \( T_{i+1} \), the method by Asur et al. describes such a case as a dissolve of the group:
$$ \mathrm{Dissolve}\left({C}_i^k\right)=1\ \mathrm{iff}\ \exists\ \mathrm{no}\ {C}_{i+1}^j\ \mathrm{such}\ \mathrm{that}\ {V}_i^k\cap {V}_{i+1}^j>1 $$
where

\( {C}_i^k \) – community number k in timeframe \( T_i \)

\( {V}_i^k \) – the set of vertices (nodes) of community number k in timeframe \( T_i \)

Conversely, if none of the nodes of a community from timeframe \( T_{i+1} \) were present in the previous timeframe \( T_i \), the group is marked as newly formed:
$$ \mathrm{Form}\left({C}_{i+1}^k\right)=1\ \mathrm{iff}\ \exists\ \mathrm{no}\ {C}_i^j\ \mathrm{such}\ \mathrm{that}\ {V}_{i+1}^k\cap {V}_i^j>1 $$
A community continues its existence if an identical occurrence of the group is found in the consecutive timeframe \( T_{i+1} \):
$$ \mathrm{Continue}\left({C}_i^k,{C}_{i+1}^j\right)=1\ \mathrm{iff}\ {V}_i^k={V}_{i+1}^j $$
A merge occurs when two communities from timeframe \( T_i \), taken together, overlap by more than κ% with a single group in the following timeframe \( T_{i+1} \), and each of them shares more than half of its nodes with that group:
$$ \mathrm{Merge}\left({C}_i^k,{C}_i^l,\kappa \right)=1\ \mathrm{iff}\ \exists\ {C}_{i+1}^j\ \mathrm{such}\ \mathrm{that}\ \frac{\left|\left({V}_i^k\cup {V}_i^l\right)\cap {V}_{i+1}^j\right|}{\mathrm{Max}\left(\left|{V}_i^k\cup {V}_i^l\right|,\left|{V}_{i+1}^j\right|\right)}>\kappa \%\ \mathrm{and}\ \left|{V}_i^k\cap {V}_{i+1}^j\right|>\frac{\left|{C}_i^k\right|}{2}\ \mathrm{and}\ \left|{V}_i^l\cap {V}_{i+1}^j\right|>\frac{\left|{C}_i^l\right|}{2} $$
The opposite case is marked as a split: two groups from the following timeframe \( T_{i+1} \), taken together, overlap by more than κ% with a single group from the previous timeframe \( T_i \), and each of them shares more than half of its nodes with that group:
$$ \mathrm{Split}\left({C}_i^j,\kappa \right)=1\ \mathrm{iff}\ \exists\ {C}_{i+1}^k,{C}_{i+1}^l\ \mathrm{such}\ \mathrm{that}\ \frac{\left|\left({V}_{i+1}^k\cup {V}_{i+1}^l\right)\cap {V}_i^j\right|}{\mathrm{Max}\left(\left|{V}_{i+1}^k\cup {V}_{i+1}^l\right|,\left|{V}_i^j\right|\right)}>\kappa \%\ \mathrm{and}\ \left|{V}_{i+1}^k\cap {V}_i^j\right|>\frac{\left|{C}_{i+1}^k\right|}{2}\ \mathrm{and}\ \left|{V}_{i+1}^l\cap {V}_i^j\right|>\frac{\left|{C}_{i+1}^l\right|}{2} $$
The authors of the method suggested 30% or 50% as a value for κ threshold. Examples of the events described by Asur et al. are presented in Fig. 4. Communities \( {C}_1^1 \) and \( {C}_1^2 \) continue between timeframes 1 and 2 and then they merge into one community \( {C}_3^1 \) in timeframe 3. In timeframe 4, community \( {C}_3^1 \) splits into three other groups \( {C}_4^1 \), \( {C}_4^2 \), and \( {C}_4^3 \); next, in timeframe 5, a new community \( {C}_5^4 \) forms, and finally in timeframe 6, the biggest community \( {C}_5^1 \) dissolves.
Fig. 4

Possible group evolution by Asur et al. (Figure from Asur et al. (2007))

The method proposed by Asur et al. also makes it possible to investigate the behavior of individual members over the community lifetime. A node can appear in or disappear from the network and can also join or leave a particular community.

Unfortunately, Asur et al. specified neither which method should be used for community detection nor whether the method works for overlapping groups.
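The five event definitions translate almost directly into set operations. The Python sketch below follows the formulas as printed above, including the strict "> 1" comparison in the dissolve and form conditions; the function names, the representation of a community as a set of node identifiers, and the use of κ as a fraction (0.5 for 50%) are assumptions made for illustration. Groups are assumed to be non-empty.

```python
from itertools import combinations
from typing import Hashable, Sequence, Set

Group = Set[Hashable]


def _joint_overlap(v_a: Group, v_b: Group, v_other: Group, kappa: float) -> bool:
    """Shared test of Merge and Split: joint overlap above kappa and each
    of the two groups sharing more than half of its nodes with v_other."""
    union = v_a | v_b
    ratio = len(union & v_other) / max(len(union), len(v_other))
    return (ratio > kappa
            and len(v_a & v_other) > len(v_a) / 2
            and len(v_b & v_other) > len(v_b) / 2)


def is_continue(v_k: Group, v_j: Group) -> bool:
    return v_k == v_j


def is_merge(v_k: Group, v_l: Group, next_groups: Sequence[Group], kappa: float) -> bool:
    return any(_joint_overlap(v_k, v_l, v_j, kappa) for v_j in next_groups)


def is_split(v_j: Group, next_groups: Sequence[Group], kappa: float) -> bool:
    return any(_joint_overlap(v_k, v_l, v_j, kappa)
               for v_k, v_l in combinations(next_groups, 2))


def is_dissolve(v_k: Group, next_groups: Sequence[Group]) -> bool:
    # Strict "> 1" exactly as in the printed formula.
    return not any(len(v_k & v_j) > 1 for v_j in next_groups)


def is_form(v_k: Group, prev_groups: Sequence[Group]) -> bool:
    return not any(len(v_k & v_j) > 1 for v_j in prev_groups)


merged = is_merge({1, 2, 3}, {4, 5, 6}, [{1, 2, 3, 4, 5, 6}], kappa=0.5)
```

Note that with the "> 1" comparison a group sharing exactly one node with a group in the neighboring timeframe still counts as dissolved or newly formed.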

Palla et al. Method

Palla et al. exploited the clique percolation method (CPM) (Palla 2005) for tracking social group evolution (Palla et al. 2007). Social networks from two consecutive timeframes \( T_i \) and \( T_{i+1} \) are merged into a single graph \( Q(T_i, T_{i+1}) \), and its groups are extracted using CPM. Next, the communities from timeframes \( T_i \) and \( T_{i+1} \) that are part of the same group in the joint graph \( Q(T_i, T_{i+1}) \) are considered to be matching, i.e., the community from timeframe \( T_{i+1} \) is treated as an evolution of the community from timeframe \( T_i \). It is quite common that more than two communities are contained in the same group of the joint graph (Fig. 5b, c). In such a case, candidate pairs are sorted by their relative overlap in descending order and matched in that order. The overlap is calculated as follows:
$$ O\left({C}_1,{C}_2\right)=\frac{\left|{C}_1\cap {C}_2\right|}{\left|{C}_1\cup {C}_2\right|} $$
where:

\( \left|{C}_1\cap {C}_2\right| \) – the number of common nodes in the communities C1 and C2

\( \left|{C}_1\cup {C}_2\right| \) – the number of nodes in the union of the communities C1 and C2

Fig. 5

Most common scenarios in the group evolution by Palla et al. The groups at timeframe \( T_i \) are marked with blue, the groups at timeframe \( T_{i+1} \) are marked with yellow, and the groups in the joint graph are marked with green. (a) A group continues its existence, (b) the dark blue group swallows the light blue, (c) the yellow group is detached from the orange one (Figure from Palla et al. (2007))

However, the authors did not explain how to choose the best match for a community that has the same highest overlap with two different groups in the next timeframe \( T_{i+1} \).

Palla et al. proposed several event types between groups: growth, contraction, merge, split, birth, and death, but provided no algorithm to identify these types. The biggest disadvantage of the method by Palla et al. is that it has to be run with CPM; no other community detection method can be used. Despite these shortcomings, the method is considered the best algorithm for tracking the evolution of overlapping groups.
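The matching step inside one group of the joint graph can be sketched as a greedy pairing by descending relative overlap. The Python code below is an illustration only: it assumes that the communities of \( T_i \) and \( T_{i+1} \) belonging to the same joint group have already been collected (the CPM extraction itself is not shown), the function names are hypothetical, and ties are broken by input order, which is an arbitrary choice because the original method does not specify one.

```python
from typing import Dict, Hashable, Set


def relative_overlap(c1: Set[Hashable], c2: Set[Hashable]) -> float:
    """O(C1, C2) = |C1 ∩ C2| / |C1 ∪ C2|."""
    union = c1 | c2
    return len(c1 & c2) / len(union) if union else 0.0


def match_by_overlap(old_groups: Dict[str, Set[Hashable]],
                     new_groups: Dict[str, Set[Hashable]]) -> Dict[str, str]:
    """Pair communities of T_i with communities of T_{i+1}, best overlap first."""
    pairs = [(relative_overlap(a, b), i, j)
             for i, a in old_groups.items()
             for j, b in new_groups.items()]
    pairs.sort(key=lambda p: -p[0])  # descending overlap; stable for ties
    matched: Dict[str, str] = {}
    used_new = set()
    for overlap, i, j in pairs:
        if overlap > 0 and i not in matched and j not in used_new:
            matched[i] = j
            used_new.add(j)
    return matched


old = {"A": {1, 2, 3, 4}, "B": {5, 6}}
new = {"X": {1, 2, 3, 4, 5, 6}, "Y": {6, 7, 8}}
matches = match_by_overlap(old, new)
```

In this example A is matched with X first (overlap 4/6), so B, whose best candidate X is already taken, is matched with Y (overlap 1/4).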

GED: Group Evolution Discovery

Yet another method to discover group evolution in a social network is GED (group evolution discovery) (Bródka et al. 2012a). The most important component of this method is a measure called inclusion, which evaluates the extent to which one group is included in another. The inclusion I(G1, G2) of group G1 in group G2 is calculated as follows:
$$ I\left({G}_1,{G}_2\right)=\overset{\mathrm{group}\ \mathrm{quantity}}{\overbrace{\frac{\left|{G}_1\cap {G}_2\right|}{\left|{G}_1\right|}}}\cdot \underset{\mathrm{group}\ \mathrm{quality}}{\underbrace{\frac{\sum_{x\in \left({G}_1\cap {G}_2\right)}{NI}_{G_1}(x)}{\sum_{x\in \left({G}_1\right)}{NI}_{G_1}(x)}}}, $$

where \( {NI}_{G_1}(x) \) is the value reflecting importance of node x in group G1.

Any metric that indicates a member's position within the community can be used as the node importance measure \( {NI}_{G_1}(x) \), e.g., degree centrality, betweenness centrality, PageRank, social position, etc. The second factor in the above equation has to be adapted to the selected measure.

The GED method, used to discover group evolution, respects both the quantity and quality of the group members. The quantity is reflected by the first part of the inclusion measure, i.e., what portion of the members from group G1 is in group G2, whereas the quality is expressed by the second part of the inclusion measure, namely, what contribution of important members from group G1 is in G2. It provides a balance between the groups that contain many of the less important members and groups with only few but key members. A complete procedure for GED can be found in Bródka et al. (2012a), whereas studies on influence of timeframe type and size are available in Saganowski et al. (2012).
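A minimal Python sketch of the inclusion measure is given below. It assumes that groups are sets of node identifiers and that the node importance values of G1 are supplied as a dictionary; the function name `inclusion` and the toy importance values are hypothetical and serve only to make the two factors of the formula concrete.

```python
from typing import Dict, Hashable, Set


def inclusion(g1: Set[Hashable], g2: Set[Hashable],
              ni_g1: Dict[Hashable, float]) -> float:
    """Inclusion I(G1, G2): group quantity times group quality."""
    if not g1:
        return 0.0
    common = g1 & g2
    quantity = len(common) / len(g1)              # share of G1 members found in G2
    total = sum(ni_g1[x] for x in g1)
    quality = sum(ni_g1[x] for x in common) / total if total else 0.0
    return quantity * quality


g1 = {"a", "b", "c"}
g2 = {"a", "b", "d"}
ni_g1 = {"a": 3.0, "b": 2.0, "c": 1.0}  # toy importance values, e.g., degrees
value = inclusion(g1, g2, ni_g1)          # (2/3) * (5/6)
```

Two thirds of the members of G1 are in G2, and they carry five sixths of the total importance in G1, so the inclusion is about 56%. The measure is not symmetric: I(G2, G1) must be computed with the importance values of G2.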

The procedure for the group evolution discovery (GED) method is as follows:

GED: Group Evolution Discovery Method

Input: Temporal social network TSN, in which groups are extracted by any community detection algorithm separately for each timeframe T i and any user importance measure is calculated for each group.
  1. For each pair of groups <G1, G2> in consecutive timeframes \( T_i \) and \( T_{i+1} \), the inclusion I(G1, G2) of G1 in G2 and the inclusion I(G2, G1) of G2 in G1 are computed according to Eq. 3.
  2. Based on both inclusions I(G1, G2) and I(G2, G1) and the sizes of both groups, only one type of event may be identified:
    (a) Continuing: I(G1, G2) ≥ α and I(G2, G1) ≥ β and |G1| = |G2|
    (b) Shrinking: I(G1, G2) ≥ α and I(G2, G1) ≥ β and |G1| > |G2| OR I(G1, G2) < α and I(G2, G1) ≥ β and |G1| ≥ |G2| OR I(G1, G2) ≥ α and I(G2, G1) < β and |G1| ≥ |G2| and there is only one match between G2 and groups in the previous time window \( T_i \)
    (c) Growing: I(G1, G2) ≥ α and I(G2, G1) ≥ β and |G1| < |G2| OR I(G1, G2) ≥ α and I(G2, G1) < β and |G1| ≤ |G2| OR I(G1, G2) < α and I(G2, G1) ≥ β and |G1| ≤ |G2| and there is only one match between G1 and groups in the next time window \( T_{i+1} \)
    (d) Splitting: I(G1, G2) < α and I(G2, G1) ≥ β and |G1| ≥ |G2| OR I(G1, G2) ≥ α and I(G2, G1) < β and |G1| ≥ |G2| and there is more than one match between G1 and groups in the next time window \( T_{i+1} \)
    (e) Merging: I(G1, G2) ≥ α and I(G2, G1) < β and |G1| ≤ |G2| OR I(G1, G2) < α and I(G2, G1) ≥ β and |G1| ≤ |G2| and there is more than one match between G2 and groups in the previous time window \( T_i \)
    (f) Dissolving: for G1 in \( T_i \) and each group G2 in \( T_{i+1} \), I(G1, G2) < 10% and I(G2, G1) < 10%
    (g) Forming: for G2 in \( T_{i+1} \) and each group G1 in \( T_i \), I(G1, G2) < 10% and I(G2, G1) < 10%

The general scheme, which facilitates understanding of the event selection (identification) for the pair of groups in the GED method, is presented in Fig. 6.
Fig. 6 The decision tree for assigning the event type to a pair of groups

Constants α and β are the GED method parameters, which can be used to adjust the method to the particular social network and community detection method.

For example, if both α and β are set to 70% and there are two identical groups G1 and G2 in timeframes T5 and T6, respectively (see Fig. 2), the inclusion measures I(G1, G2) and I(G2, G1) both equal 100%. Since the groups are the same size, a continuing event is assigned between them. As another example, consider three groups: G1 in timeframe T3, and G2 and G3 in timeframe T4 (see Fig. 2), with inclusion measures I(G1, G2) = 67%, I(G2, G1) = 100%, I(G1, G3) = 33%, and I(G3, G1) = 100%. Since G1 is bigger than both G2 and G3, and there is more than one match between G1 and groups in the next time window (namely G2 and G3), a splitting event is assigned between G1 and G2 as well as between G1 and G3.

Key Applications

Detection of social group evolution is one of the crucial components of dynamic analysis of social networks. Comparing the states of various social groups enables identification of the key factors that influence group evolution. It helps, for example, to answer the following question: do small groups evolve similarly to large ones?

Additionally, once changes have been identified, predictive models may be built to forecast what is most likely to happen to a given community in the following period (Bródka et al. 2012b; Saganowski et al. 2015). Quantification of changes facilitates comparison of communities existing in various populations, e.g., among users of different services, or of group dynamics in different periods (this year compared to the previous one).

The possible applications extend beyond typical online social networks to the analysis of group formation and evolution in face-to-face contact networks (Atzmueller et al. 2014) and in the Linux operating system network (Xiao et al. 2017).

Future Directions

Further research in the field of social community evolution will probably focus on extracting useful group evolution patterns and on analyzing not only single changes between two consecutive timeframes but also long-term series of changes.

Notes

Acknowledgments

This work was partially supported by Wrocław University of Science and Technology statutory funds and the Polish National Science Centre, decision no. 2013/09/B/ST6/02317.

References

  1. Asur S, Parthasarathy S, Ucar D (2007) An event-based framework for characterizing the evolutionary behavior of interaction graphs. In: KDD '07 Proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining, San Jose, California, USA, 12–15 August 2007. ACM, New York, pp 913–921
  2. Atzmueller M, Ernst A, Krebs F, Scholz C, Stumme G (2014) Formation and temporal evolution of social groups during coffee breaks. Nancy, France, 15 September 2014
  3. Barabasi AL, Jeong H, Neda Z, Ravasz E, Schubert A, Vicsek T (2002) Evolution of the social network of scientific collaborations. Phys A 311:590–614
  4. Bródka P, Saganowski S, Kazienko P (2012a) GED: the method for group evolution discovery in social networks. Soc Netw Anal Min. doi:10.1007/s13278-012-0058-8
  5. Bródka P, Kazienko P, Kołoszczyk B (2012b) Predicting group evolution in the social network. In: Social informatics, Lecture notes in computer science. Springer, Berlin/Heidelberg
  6. Chakrabarti D, Kumar R, Tomkins A (2006) Evolutionary clustering. In: KDD '06 Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining, Philadelphia, PA, USA, 20–23 August 2006. ACM, New York, pp 554–560
  7. Dorogovtsev SN, Mendes JFF (2003) Evolution of networks: from biological nets to the internet and WWW. Oxford University Press, Oxford
  8. Falkowski T, Bartelheimer J, Spiliopoulou M (2006) Mining and visualizing the evolution of subgroups in social networks. In: Proceedings of the 2006 IEEE/WIC/ACM international conference on web intelligence (WI '06), Hong Kong, China, 18–22 December 2006, pp 52–58
  9. Fortunato S (2010) Community detection in graphs. Phys Rep 486(3–5):75–174
  10. Ganti V, Gehrke J, Ramakrishnan R, Loh W-Y (2002) A framework for measuring differences in data characteristics. J Comput Syst Sci 64:542–578
  11. Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci U S A 99(12):7821–7826
  12. Granell C, Darst RK, Arenas A, Fortunato S, Gómez S (2015) Benchmark model to assess community structure in evolving networks. Phys Rev E 92(1):012805
  13. Greene D, Doyle D, Cunningham P (2010) Tracking the evolution of communities in dynamic social networks. In: Proceedings of the international conference on advances in social networks analysis and mining (ASONAM), Odense, 9–11 August 2010. ACM, pp 176–183
  14. Kawadia V, Sreenivasan S (2012) Online detection of temporal communities in evolving networks by estrangement confinement. arXiv:1203.5126v1
  15. Kim MS, Han J (2009) A particle-and-density based evolutionary clustering method for dynamic networks. In: Proceedings of the VLDB '09 Lyon, 24–28 Aug 2009. France endowment, ACM, pp 622–633
  16. Kossinets G, Watts DJ (2006) Empirical analysis of an evolving social network. Science 311:88–90
  17. Kullback S, Leibler RA (1951) On information and sufficiency. Ann Math Stat 22:49
  18. Lin YR, Chi Y, Zhu S, Sundaram H, Tseng BL (2008) Facetnet: a framework for analyzing communities and their evolutions in dynamic networks. In: WWW '08 Proceedings of the 17th international conference on World Wide Web, Beijing, China, 21–25 April 2008. ACM, New York, pp 685–694
  19. Mucha PJ, Richardson T, Macon K, Porter MA, Onnela J-P (2010) Community structure in time-dependent, multiscale, and multiplex networks. Science 328(5980):876–878
  20. Oliveira MCM, Gama J (2010) Bipartite graphs for monitoring clusters transitions. In: Proceedings of the 9th international conference on intelligent data analysis. Springer, Berlin, pp 114–124
  21. Palla G, Barabási AL, Vicsek T (2007) Quantifying social group evolution. Nature 446:664–667
  22. Palla G, Derényi I, Farkas I, Vicsek T (2005) Uncovering the overlapping community structure of complex networks in nature and society. Nature 435:814–818
  23. Saganowski S, Bródka P, Kazienko P (2012) Influence of the dynamic social network timeframe type and size on the group evolution discovery. In: Istanbul, Turkey, 26–29 August 2012. IEEE Computer Society, pp 678–682
  24. Saganowski S, Gliwa B, Bródka P, Zygmunt A, Kazienko P, Koźlak J (2015) Predicting community evolution in social networks. Entropy 17(5):3053–3096
  25. Sarkar P, Moore AW (2005) Dynamic social network analysis using latent space models. SIGKDD Explor Newsl 7:31–40
  26. Spiliopoulou M, Ntoutsi I, Theodoridis Y, Schult R (2006) Monic: modeling and monitoring cluster transitions. In: KDD '06 Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining, Philadelphia, PA, USA, 20–23 August 2006. ACM, New York, pp 706–711
  27. Sun J, Papadimitriou S, Yu PS, Faloutsos C (2007) GraphScope: parameter-free mining of large time-evolving graphs. In: Proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining (KDD). ACM, New York, pp 687–696
  28. Tajeuna EG, Bouguessa M, Wang S (2015) Tracking the evolution of community structures in time-evolving social networks. In: Proceedings of the 2015 IEEE international conference on data science and advanced analytics (IEEE DSAA). IEEE, Piscataway, pp 1–10
  29. Takaffoli M, Sangi F, Fagnan J, Zaïane OR (2011) Community evolution mining in dynamic social networks. Procedia Soc Behav Sci 22:49–58
  30. Xiao G, Zheng Z, Wang H (2017) Evolution of Linux operating system network. Phys A Stat Mech Appl 466:249–258
  31. Xu H, Hu Y, Wang Z, Ma J, Xiao W (2013) Core-based dynamic community detection in mobile social networks. Entropy 15:5419–5438
  32. Zygmunt A, Bródka P, Kazienko P, Koźlak J (2012) Key person analysis in social communities within the blogosphere. J Univ Comput Sci 18(4):577–597

Copyright information

© Springer Science+Business Media LLC 2017

Authors and Affiliations

  • Stanisław Saganowski (1), Email author
  • Piotr Bródka (1)
  • Przemysław Kazienko (1)
  1. Department of Computational Intelligence, Wrocław University of Science and Technology, Wrocław, Poland

Section editors and affiliations

  • Huan Liu (1)
  • Lei Tang (2)
  1. Arizona State University, Tempe, USA
  2. Chief Data Scientist, Clari Inc., Sunnyvale, USA