An Enhanced Particle Swarm Optimization Based on Physarum Model for Community Detection
Abstract
Community detection, an effective tool for analyzing and understanding network data, has attracted growing attention in recent years. One of the most popular approaches to detecting community structure is to find the division with maximal modularity. However, modularity maximization is an NP-complete problem. Among swarm intelligence algorithms, particle swarm optimization (PSO) has been widely used to solve such NP-complete problems. Nevertheless, premature convergence and low accuracy limit its performance in community detection. To overcome these shortcomings, this paper proposes a novel PSO, called P-PSO, for community detection that exploits the computational ability of Physarum, a kind of slime mold. The proposed algorithm improves the efficiency of PSO by recognizing inter-community edges based on a Physarum-inspired network model (PNM). Experiments on eight networks show that the proposed algorithm is effective and promising for community detection compared with other algorithms.
Keywords
Community detection, PSO, Physarum network model
1 Introduction
Complex networks have numerous characteristics, among which community structure is an important one. Community detection, a powerful tool for discovering community structures, has a wide range of applications, such as predicting protein functions [1] and analyzing information dissemination [2].
In the past few decades, a large number of algorithms have been proposed for community detection. They can be classified into optimization algorithms and heuristic algorithms. Meanwhile, the modularity measure Q [3], which evaluates the quality of a community division, has been widely used. Swarm intelligence optimization algorithms, including particle swarm optimization (PSO) [4], have shown their strength in local learning and global search. Recently, Cai et al. successfully used the greedy discrete particle swarm optimization algorithm (GDPSO) [5] to detect community structures in networks. However, because it neither makes full use of prior knowledge of the network nor generates a high-quality initial population, this algorithm falls short in global search performance and accuracy.
According to the latest reports, a large number of biological experiments have demonstrated that a slime mold named Physarum is capable of solving mazes and constructing efficient and robust networks [6, 9]. Tero et al. proposed the Physarum-inspired mathematical model (PM) [7], which has been used to optimize heuristic algorithms [8]. Building on the PM, we propose a Physarum-inspired network model (PNM) that distinguishes inter-community edges from intra-community edges, and we use it to initialize PSO. In this way, we optimize the initialization phase of PSO to achieve higher-quality community detection.
The remainder of this paper is organized as follows: Sect. 2 presents the related background and introduces the particle swarm optimization algorithm for community detection. Section 3 proposes the Physarum-inspired particle swarm optimization algorithm. Section 4 reports the experiments on eight real-world networks and the comparisons with state-of-the-art algorithms. Section 5 concludes this paper.
2 Related Work
2.1 Community Detection
A network is composed of nodes and edges, where nodes usually stand for members and edges represent relationships between members. Let \(G= (V, E)\) denote a network, where V and E are the sets of nodes and edges, respectively. Community detection aims to divide the nodes of a network into communities such that nodes in different communities are sparsely connected, while nodes within a community are relatively densely connected. Given that a community is a subset of V and \(n_{c}\) is the number of communities, a community division is a set of communities \(C=\{C_{1}, C_{2}, \dots , C_{n_{c}}\}\) with \(C_{i}\subseteq V\), where \(C_{i}\ne \varnothing \), \(C_{i}\cap C_{j}=\varnothing \) for \(i\ne j\), and \(\bigcup \limits _{i=1}^{n_{c}}C_{i}=V\).
2.2 PSO for Community Detection
Derived from the social behavior seen in some animal populations, such as fish schools and bird flocks, PSO is a swarm intelligence algorithm proposed by Eberhart and Kennedy in 1995 [4]. Its concise framework, simple principle and fast convergence make PSO a popular algorithm for solving continuous optimization problems. Each particle has a position vector and a velocity vector. The position vector usually represents a candidate solution to the optimization problem, and the velocity vector denotes the tendency of the position update. A particle updates its status iteratively according to its own experience and that of the other particles in order to search for the optimal solution. Here, we take a typical PSO for network clustering, termed GDPSO, as an example to introduce the basic parts of PSO for community detection.
Particle representation: Because community detection is a discrete optimization problem, the particle positions have to be redefined. One position vector represents a network division, and the position vector of particle i is defined as \(X_{i}=\{x_{i}^{1}, x_{i}^{2}, \dots , x_{i}^{n}\}\), where \(x_{i}^{j}\in [1, n]\) is an integer giving the label identifier of node j.
The coding scheme of the particle in GDPSO. Each particle is coded as a string of integers, which represents the label identifier of the corresponding node.
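The label-based coding described above can be illustrated with a minimal Python sketch. The function name `decode_particle` and the six-node example are hypothetical and are not taken from GDPSO; the sketch only shows how a string of integer labels maps to a community division.

```python
from collections import defaultdict


def decode_particle(position):
    """Group 1-based node indices by the label stored at each position."""
    communities = defaultdict(list)
    for node, label in enumerate(position, start=1):
        communities[label].append(node)
    # Sort so that the result does not depend on label order.
    return sorted(communities.values())


# Hypothetical particle: nodes 1-3 carry label 2, nodes 4-6 carry label 5.
X_i = [2, 2, 2, 5, 5, 5]
print(decode_particle(X_i))  # [[1, 2, 3], [4, 5, 6]]
```

Note that the label values themselves carry no meaning; only the grouping they induce matters, so different position vectors can encode the same division.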
Mutation: GDPSO applies a mutation operation to preserve diversity and avoid falling into local optima. The procedure is as follows: generate a random number between 0 and 1; then, for each node in the network, if the random number is smaller than the mutation probability pm, assign that node's label identifier to all of its neighbors.
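The mutation step can be sketched as follows. This is an illustrative reading of the description, not the GDPSO reference implementation: the function name `mutate`, the 0-based node indexing, the adjacency dictionary, and the choice to draw a fresh random number for every node are all assumptions.

```python
import random


def mutate(position, neighbors, pm, rng=random):
    """Return a mutated copy of a label vector.

    position  : list where position[j] is the label of node j (0-based here)
    neighbors : dict mapping each node to an iterable of its neighbors
    pm        : mutation probability in [0, 1]
    """
    mutated = list(position)
    for node in range(len(mutated)):
        # Assumption: one random draw per node, compared against pm.
        if rng.random() < pm:
            # Propagate the node's current label to all of its neighbors.
            for nb in neighbors[node]:
                mutated[nb] = mutated[node]
    return mutated


# Hypothetical path graph 0 - 1 - 2 with every node in its own community.
path = {0: [1], 1: [0, 2], 2: [1]}
print(mutate([0, 1, 2], path, pm=0.0))  # [0, 1, 2], nothing mutates
```

With pm = 0 the particle is returned unchanged, while larger values of pm spread labels across neighborhoods more aggressively.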
3 Physarum-inspired PSO for Community Detection
3.1 The Physarum-Based Network Mathematical Model
In this paper, the PM is modified into a Physarum-based network model (PNM), which can be used to recognize the intra-community edges in a network. The key mechanism of the PM is the feedback system between the fluxes and conductivities of tubes based on Poiseuille flow.
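For orientation, the flux-conductivity feedback of the PM is commonly written in the following form, given here as a sketch of the standard formulation of Tero et al. [7]. The symbols \(L_{ij}\) (tube length), \(p_{i}\) (pressure at node i), \(I_{0}\) (total flux injected at the source), \(f\) (an increasing function with \(f(0)=0\)) and \(r\) (decay rate) do not appear elsewhere in this text, and the PNM variant used in this paper may differ in its details.

\[
Q_{ij} = \frac{D_{ij}}{L_{ij}}\,(p_{i} - p_{j}), \qquad
\sum_{j} Q_{ij} =
\begin{cases}
I_{0}, & i \text{ is the source},\\
-I_{0}, & i \text{ is the sink},\\
0, & \text{otherwise},
\end{cases}
\qquad
\frac{dD_{ij}}{dt} = f(|Q_{ij}|) - r\,D_{ij}.
\]

Here \(Q_{ij}\) is the flux from node i to node j through a tube of conductivity \(D_{ij}\). Tubes that carry a large flux grow thicker, whereas tubes with little flux decay, which is the positive feedback referred to above.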
3.2 Physarum-Inspired Network Model for Community Detection
Taking advantage of the PNM, we roughly distinguish inter-community edges from intra-community edges through their conductivities. We then use the PNM to optimize the initialization, generating a high-quality initial solution and accelerating convergence.
We obtain a conductivity matrix D through the PNM. Suppose that node i has a neighbor set \(L (i)=\{l_{1}, l_{2}, \dots , l_{k}\}\), and let label(i) be the label of the community to which node i belongs. First, for each node i, we initialize label(i) as i. Next, we define \(\varOmega _{i} = \{label (j)|j\in L (i)\; and\; D_{i,j}< (1-R\%)*D_{max}\}\), the set of community labels of those neighbors of node i that are connected to i by an edge with conductivity below the threshold. In other words, edges whose conductivities \(D_{i,j}\) rank in the top \(R\%\) are regarded as inter-community edges and are excluded. Then, each node randomly selects an element from \(\varOmega _{i}\) as its new label.
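The initialization rule above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function name `init_labels`, the nested-dictionary layout of D, taking \(D_{max}\) as the largest conductivity in D, the synchronous update (all \(\varOmega _{i}\) are built from the initial labels), and keeping a node's own label when \(\varOmega _{i}\) is empty are all choices made for this example.

```python
import random


def init_labels(neighbors, D, R, rng=random):
    """Assign initial community labels from PNM conductivities.

    neighbors : dict mapping node i to the list L(i) of its neighbors
    D         : nested dict, D[i][j] is the conductivity of edge (i, j)
    R         : percentage in [0, 100]; edges at or above (1 - R%) * D_max
                are treated as inter-community edges
    """
    d_max = max(D[i][j] for i in neighbors for j in neighbors[i])
    threshold = (1 - R / 100.0) * d_max

    labels = {i: i for i in neighbors}  # label(i) = i
    new_labels = dict(labels)
    for i in neighbors:
        # Omega_i: labels of neighbors reached through low-conductivity edges.
        omega = [labels[j] for j in neighbors[i] if D[i][j] < threshold]
        if omega:
            new_labels[i] = rng.choice(omega)
        # Assumption: a node with an empty Omega_i keeps its own label.
    return new_labels


# Hypothetical example: two triangles joined by a high-conductivity bridge 2-3.
nbrs = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
cond = {i: {j: 0.2 for j in nbrs[i]} for i in nbrs}
cond[2][3] = cond[3][2] = 1.0
print(init_labels(nbrs, cond, R=20))
```

In this example the bridge between nodes 2 and 3 exceeds the threshold, so neither endpoint can adopt a label from the other triangle during initialization.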
4 Experiments and Results
All experiments are executed in the same environment to enable fair comparisons between our algorithm and other algorithms, including GDPSO [5], IACO-Net [12] and PNGACD [13]. All results are averaged over 30 independent runs in order to reduce the effect of random fluctuations. Two popular metrics are used to evaluate the performance of community detection: the modularity Q and the normalized mutual information (NMI) [10].
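As a reminder of what the first metric measures, the modularity of Newman [3] for an undirected, unweighted network with m edges can be written as \(Q=\sum _{c}\left[ l_{c}/m-(d_{c}/2m)^{2}\right] \), where \(l_{c}\) is the number of edges inside community c and \(d_{c}\) is the total degree of its nodes. The sketch below computes Q directly from this formula; the function name `modularity` and the toy graph are illustrative only, and the code assumes a simple graph without self-loops or duplicate edges.

```python
def modularity(edges, labels):
    """Modularity Q of a division of an undirected, unweighted simple graph.

    edges  : list of (u, v) pairs, each undirected edge listed once
    labels : dict mapping each node to its community label
    """
    m = len(edges)
    intra_edges = 0
    degree_sum = {}  # total degree d_c per community
    for u, v in edges:
        degree_sum[labels[u]] = degree_sum.get(labels[u], 0) + 1
        degree_sum[labels[v]] = degree_sum.get(labels[v], 0) + 1
        if labels[u] == labels[v]:
            intra_edges += 1
    # Q = (fraction of intra-community edges) - sum_c (d_c / 2m)^2
    return intra_edges / m - sum((d / (2 * m)) ** 2 for d in degree_sum.values())


# Hypothetical graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
division = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(round(modularity(edges, division), 4))  # 0.3571
```

Placing all nodes in a single community gives Q = 0, so positive values indicate more intra-community edges than expected by chance.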
4.1 Results on Benchmark Networks
Experiments are carried out on the GN benchmark networks proposed by Lancichinetti et al. [11]. \(\alpha \) denotes the mixing parameter, which controls the proportion of links within and between communities. We test all algorithms on eleven computer-generated networks with the value of \(\alpha \) ranging from 0 to 0.5.
The experimental results from the GN benchmark networks.
4.2 Results on Real-World Networks
Networks used in this paper. Clusters stands for the number of communities in standard divisions, in which “\(\textendash \)” means that the standard division is non-existent.
| Network | Nodes | Edges | Clusters | Network | Nodes | Edges | Clusters |
|---|---|---|---|---|---|---|---|
| Karate | 34 | 78 | 4 | Dolphins | 62 | 159 | 2 |
| Polbooks | 105 | 441 | 3 | Football | 115 | 613 | 12 |
| Lesmis | 77 | 254 | \(\textendash \) | Adjnoun | 112 | 425 | \(\textendash \) |
| SFI | 118 | 200 | \(\textendash \) | Celegans | 297 | 1540 | \(\textendash \) |
The average Q of the final iteration in four real-world networks. The upper and lower ends of the whiskers represent the maximum and minimum of Q, and the box spans the first to the third quartile. The small square and the band inside the box denote the average and median of Q, respectively. These box charts indicate that P-PSO tends to be more robust in community detection.
The test results for the Football, SFI and Celegans in terms of \(Q_{max}\) and \(Q_{avg}\)
| Algorithm | Football \(Q_{max}\) | Football \(Q_{avg}\) | SFI \(Q_{max}\) | SFI \(Q_{avg}\) | Celegans \(Q_{max}\) | Celegans \(Q_{avg}\) |
|---|---|---|---|---|---|---|
| P-PSO | 0.6046 | 0.6046 | 0.7470 | 0.7389 | 0.4732 | 0.4717 |
| GDPSO | 0.6046 | 0.6046 | 0.7470 | 0.7370 | 0.4707 | 0.4685 |
| IACO-Net | 0.6032 | 0.5817 | 0.1940 | 0.1969 | 0.3733 | 0.3622 |
| PNGACD | 0.5973 | 0.5856 | 0.7457 | 0.7400 | 0.2914 | 0.2903 |
The evolution of Q as the number of iterations increases. The results show that the proposed algorithm accelerates convergence compared with GDPSO and IACO-Net.
The visualizations of community divisions in two networks
Figure 5 shows the community divisions in Polbooks and Football. In Fig. 5(a), the geometric shapes denote the real communities and the colors denote the communities detected by P-PSO. Owing to the content of the books, some books are connected more closely and form smaller communities, which deviates from the original real-world division. For the Football network, the positions of the nodes indicate the real division and the colors indicate the five communities in the division found by P-PSO. Each node represents a real-world football team, and an edge stands for a game played between two teams. The marked circle highlights the main difference between the communities detected by P-PSO and the real communities.
5 Conclusion
Research on community detection helps us analyze the basic characteristics of networks. Taking advantage of the Physarum network model (PNM) and the greedy discrete particle swarm optimization algorithm (GDPSO), we propose a Physarum-inspired particle swarm optimization algorithm (P-PSO). The experimental results on eight real-world networks demonstrate that P-PSO is better at optimizing the initial solution and obtains more effective and promising results than other state-of-the-art algorithms.
Notes
Acknowledgments
Zhengpeng Chen and Fanzhen Liu contributed equally to this work and should be considered as co-first authors. This work is supported by the National Natural Science Foundation of China (Nos. 61402379, 61403315), Fundamental Research Funds for the Central Universities (No. XDJK2016A008, XDJK2016B029, XDJK2016E074), CQ CSTC (cstc2015gjhz40002).
References
- 1. Fortunato, S.: Community detection in graphs. Phys. Rep. 486, 75–174 (2010)
- 2. Weng, L., Menczer, F., Ahn, Y.Y.: Virality prediction and community structure in social networks. Sci. Rep. 3, 2522 (2013)
- 3. Newman, M.E.: Modularity and community structure in networks. Proc. Natl. Acad. Sci. 103, 8577–8582 (2006)
- 4. Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of 1995 IEEE International Conference on Neural Networks, pp. 1942–1948. IEEE Press, New York (1995)
- 5. Cai, Q., Gong, M., Ma, L.: Greedy discrete particle swarm optimization for large-scale social network clustering. Inf. Sci. 316, 503–516 (2015)
- 6. Tero, A., Takagi, S., Saigusa, T., Ito, K., Bebber, D.P., Fricker, M.D., Yumiki, K., Kobayashi, R., Nakagaki, T.: Rules for biologically inspired adaptive network design. Science 327, 439–442 (2010)
- 7. Tero, A., Kobayashi, R., Nakagaki, T.: A mathematical model for adaptive transport network in path finding by true slime mold. J. Theor. Biol. 224, 553–564 (2007)
- 8. Liu, Y., Gao, C., Zhang, Z., Lu, Y., Chen, S., Liang, M., Tao, L.: Solving NP-hard problems with Physarum-based ant colony system. IEEE/ACM Trans. Comput. Biol. Bioinf. 14, 108–120 (2017)
- 9. Nakagaki, T., Yamada, H., Tóth, Á.: Intelligence: maze-solving by an amoeboid organism. Nature 407, 470–470 (2000)
- 10. Danon, L., Díaz-Guilera, A., Duch, J., Arenas, A.: Comparing community structure identification. J. Stat. Mech. Theory Exp. 2005, P09008 (2005)
- 11. Lancichinetti, A., Fortunato, S., Radicchi, F.: Benchmark graphs for testing community detection algorithms. Phys. Rev. E 78, 046110 (2008)
- 12. Mu, C., Zhang, J., Jiao, L.: An intelligent ant colony optimization for community detection in complex networks. In: 2014 IEEE Congress on Evolutionary Computation (CEC), pp. 700–706. IEEE Press, New York (2014)
- 13. Gao, C., Liang, M., Li, X., Zhang, Z., Wang, Z.: Network community detection based on the Physarum-inspired computational framework. IEEE/ACM Trans. Comput. Biol. Bioinf. (2016). doi: 10.1109/TCBB.2016.2638824






