## Abstract

Social network analysis has become an important topic for researchers in sociology and computer science. Similarities among individuals form communities as the basic constitutions of social networks. Regarding the importance of communities, community detection is a fundamental step in the study of social networks typically modeled as large-scale graphs. Detecting communities in such large-scale graphs which generally suffers from the curse of dimensionality is the main objective followed in this study. An efficient modularity-based community detection algorithm called MDPCluster is introduced in order to detect communities in large-scale graphs in a timely manner. To address the high dimensionality problem, first, a Louvain-based algorithm is utilized by MDPCluster to distinguish initial communities as super-nodes and then a Modified Discrete Particle Swarm Optimization algorithm, called MDPSO is leveraged to detect communities through maximizing modularity measure. MDPSO discretizes Particle Swarm Optimization using the idea of transmission tendency and also escapes from premature convergence thereby a mutation operator inspired by Genetic Algorithm. To evaluate the proposed method, six standard datasets, i.e., American College Football, Books about US Politics, Amazon Product Co-purchasing, DBLP, GR-QC and HEP-TH have been employed. The first two are known as synthetic datasets whereas the rest are real-world datasets. In comparison to eight state-of-the-art algorithms, i.e., Stationary Genetic Algorithm, Generational Genetic Algorithm, Simulated Annealing-Stationary Genetic Algorithm, Simulated Annealing-Generational Genetic Algorithm, Grivan–Newman, Danon and Label Propagation Algorithm, the results indicate the superiorities of MDCluster in terms of modularity, Normalized Mutual Information and execution time as well.

This is a preview of subscription content, access via your institution.

## Notes

- 1.
- 2.
Modularity and an Improved Genetic Algorithm.

## References

- 1.
González-Bailón S (2013) Social science in the era of big data. Policy Int 5(2):147–160

- 2.
Javadi, SHS, Gharani P, Khadivi S (2018) Detecting community structure in dynamic social networks using the concept of leadership. In: Sustainable interdependent networks. Springer, pp 97–118

- 3.
Jiang JQ, McQuay LJ (2012) Modularity functions maximization with nonnegative relaxation facilitates community detection in networks. Physica A 391(3):854–865

- 4.
Yang J, Leskovec J (2015) Defining and evaluating network communities based on ground-truth. Knowl Inf Syst 42(1):181–213

- 5.
Kadirkamanathan V, Selvarajah K, Fleming PJ (2006) Stability analysis of the particle dynamics in particle swarm optimizer. IEEE Trans Evol Comput 10(3):245–255

- 6.
Du K-L, Swamy M (2016) Particle swarm optimization. In: Search and optimization by metaheuristics. Springer, pp 153–173

- 7.
Poli R, Kennedy J, Blackwell T (2007) Particle swarm optimization. Swarm Intell 1(1):33–57

- 8.
Yuan X, Wang L, Yuan Y (2008) Application of enhanced PSO approach to optimal scheduling of hydro system. Energy Convers Manag 49(11):2966–2972

- 9.
Liu B, Wang L, Jin Y-H (2007) An effective PSO-based memetic algorithm for flow shop scheduling. IEEE Trans Syst Man Cybern Part B Cybern 37(1):18–27

- 10.
Liu B, Wang L, Jin Y, Huang D (2005) Advances in particle swarmoptimization algorithm. Control Instrum Chem Ind 32(3):1–6

- 11.
Aung TT, Nyunt TTS (2019) Discrete artificial bee colony algorithm for community detection in social networks. In: Seventeenth international conference on computer applications (ICCA 2019)

- 12.
Ghoshal AK, Das N, Bhattacharjee S, Chakraborty G (2019) A fast parallel genetic algorithm based approach for community detection in large networks. In: 2019 11th international conference on communication systems and networks (COMSNETS). IEEE, pp 95–101

- 13.
Guerrero M, Montoya FG, Baños R, Alcayde A, Gil C (2017) Adaptive community detection in complex networks using genetic algorithms. Neurocomputing 266:101–113

- 14.
Raghavan UN, Albert R, Kumara S (2007) Near linear time algorithm to detect community structures in large-scale networks. Phys Rev E 76(3):036106

- 15.
Newman ME, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E 69(2):026113

- 16.
Traag VA, Waltman L, van Eck NJ (2019) From Louvain to Leiden: guaranteeing well-connected communities. Scientific Rep 9:e5233

- 17.
Guimera R, Danon L, Diaz-Guilera A, Giralt F, Arenas A (2003) Self-similar community structure in a network of human interactions. Phys Rev E 68(6):065103

- 18.
Que X, Checconi F, Petrini F, Gunnels JA (2015) Scalable community detection with the louvain algorithm. In: 2015 IEEE international on parallel and distributed processing symposium (IPDPS). IEEE, pp 28–37

- 19.
Newman ME (2012) Communities, modules and large-scale structure in networks. Nat Phys 8(1):25

- 20.
Schaeffer SE (2007) Graph clustering. Comput Sci Rev 1(1):27–64

- 21.
Rossi F, Villa-Vialaneix N (2010) Optimizing an organized modularity measure for topographic graph clustering: a deterministic annealing approach. Neurocomputing 73(7–9):1142–1163

- 22.
Wang R-S, Zhang S, Wang Y, Zhang X-S, Chen L (2008) Clustering complex networks and biological networks by nonnegative matrix factorization with various similarity measures. Neurocomputing 72(1–3):134–141

- 23.
Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008(10):P10008

- 24.
Newman ME (2001) Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Phys Rev E 64(1):016132

- 25.
Shang R, Bai J, Jiao L, Jin C (2013) Community detection based on modularity and an improved genetic algorithm. Physica A 392(5):1215–1231

- 26.
Moradi M, Parsa S (2019) An evolutionary method for community detection using a novel local search strategy. Physica A 523:457–475

- 27.
Ying W et al (2019) Parallel conical area community detection using evolutionary multi-objective optimization. Processes 7(2):111

- 28.
Ji J, Song X, Liu C, Zhang X (2013) Ant colony clustering with fitness perception and pheromone diffusion for community detection in complex networks. Physica A 392(15):3260–3272

- 29.
Cai Q, Gong M, Shen B, Ma L, Jiao L (2014) Discrete particle swarm optimization for identifying community structures in signed social networks. Neural Netw 58:4–13

- 30.
Hassan EA, Hafez AI, Hassanien AE, Fahmy AA (2014) Community detection algorithm based on artificial fish swarm optimization. In: Intelligent Systems’ 2014. Springer, pp 509–521

- 31.
Wang Y, Xu W, Kang Q (2018) AFSMA: an enhanced artificial fish swarm algorithm based on mutuality for community detection. In: Proceedings of the 2nd international conference on big data research. ACM, pp 107–113

- 32.
Li X, Wu X, Xu S, Qing S, Chang P-C (2019) A novel complex network community detection approach using discrete particle swarm optimization with particle diversity and mutation. Appl Soft Comput 81:105476

- 33.
Eberhart R, Kennedy J (1995) A new optimizer using particle swarm theory. In: Proceedings of the sixth international symposium on micro machine and human science, 1995. MHS’95. IEEE, pp 39–43

- 34.
Kennedy R, Eberhart J (1995) Particle swarm optimization. In: Proceedings of IEEE international conference on neural networks IV, vol 1000, p 1995

- 35.
Yang X-S (2014) Swarm intelligence based algorithms: a critical analysis. Evol Intell 7(1):17–28

- 36.
Yang X-S, He X (2016) Nature-inspired optimization algorithms in engineering: overview and applications. In: Nature-inspired computation in engineering. Springer, pp 1–20

- 37.
Yu YF, Li G, Xu C (2013) An improved particle swarm optimization algorithm. In: Applied mechanics and materials. Trans Tech Publ, vol 401, pp 1328–1335

- 38.
Selvi V, Umarani DR (2010) Comparative analysis of ant colony and particle swarm optimization techniques. Int J Comput Appl 5(4):1–6

- 39.
Shi Y, Eberhart R (1998) A modified particle swarm optimizer. In: The 1998 IEEE international conference on evolutionary computation proceedings, 1998. IEEE World Congress on Computational Intelligence. IEEE, pp 69–73

- 40.
Clerc M (2004) Discrete particle swarm optimization, illustrated by the traveling salesman problem. In: New optimization techniques in engineering. Springer, pp 219–239

- 41.
Pan Q-K, Tasgetiren MF, Liang Y-C (2008) A discrete particle swarm optimization algorithm for the no-wait flowshop scheduling problem. Comput Oper Res 35(9):2807–2839

- 42.
Strasser S, Goodman R, Sheppard J, Butcher S (2016) A new discrete particle swarm optimization algorithm. In: Proceedings of the genetic and evolutionary computation conference 2016. ACM, pp 53–60

- 43.
Wu H, Nie C, Kuo F-C, Leung H, Colbourn CJ (2015) A discrete particle swarm optimization for covering array generation. IEEE Trans Evol Comput 19(4):575–591

- 44.
Chatterjee S, Sarkar S, Hore S, Dey N, Ashour AS, Balas VE (2017) Particle swarm optimization trained neural network for structural failure prediction of multistoried RC buildings. Neural Comput Appl 28(8):2005–2016

- 45.
Momeni E, Armaghani DJ, Hajihassani M, Amin MFM (2015) Prediction of uniaxial compressive strength of rock samples using hybrid particle swarm optimization-based artificial neural networks. Measurement 60:50–63

- 46.
Lynn N, Suganthan PN (2015) Distance based locally informed particle swarm optimizer with dynamic population size. In: Proceedings of the 18th Asia Pacific symposium on intelligent and evolutionary systems-volume 2. Springer, pp 577–587

- 47.
Meunier D, Lambiotte R, Fornito A, Ersche K, Bullmore ET (2009) Hierarchical modularity in human brain functional networks. Front Neuroinform 3:37

- 48.
Pujol JM, Erramilli V, Rodriguez P (2009) Divide and conquer: partitioning online social networks. arXiv preprint arXiv:0905.4918

- 49.
Raeder T, Chawla NV (2011) Market basket analysis with networks. Social Netw Anal Min 1(2):97–113

- 50.
Newman ME (2006) Modularity and community structure in networks. Proc Natl Acad Sci 103(23):8577–8582

- 51.
Leskovec J, Kleinberg J, Faloutsos C (2007) Graph evolution: Densification and shrinking diameters. ACM Trans Knowl Discovery Data (TKDD) 1(1):2

- 52.
Boseniuk T, Ebeling W (1990) Boltzmann-, Darwin-and Haeckel-strategies in optimization problems. In International conference on parallel problem solving from nature. Springer, pp 429–444

- 53.
Kirkpatrick S, Gelatt CD, Vecchi MP (1983) Optimization by simulated annealing. Science 220(4598):671–680

- 54.
Cai X, Shi Y, Zhu Y, Qiao Y, Hu F (2017) An algorithm Q-PSO for community detection in complex networks. In 2017 16th international symposium on distributed computing and applications to business, engineering and science (DCABES). IEEE, pp 76–79

- 55.
Bilal S, Abdelouahab M (2017) Evolutionary algorithm and modularity for detecting communities in networks. Physica A 473:89–96

- 56.
Gong M, Fu B, Jiao L, Du H (2011) Memetic algorithm for community detection in networks. Phys Rev E 84(5):056101

## Author information

### Affiliations

### Corresponding author

## Additional information

### Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Appendix

### Appendix

Figure 13 shows the pseudocode of PSO algorithm. In this algorithm, the main loop iterates as many times as total iterations or as it is known as T. In this loop, there is an inner loop which runs as many times as the number of individuals in the population, N_{p}, in which velocities and positions are updated. The fitness function is calculated for all the individuals of the population as many times as the number of problem dimensions or |V|. The time complexity of calculating fitness function is \( O\left( {\left| V \right|^{2} } \right) \) in this paper. Therefore, the time complexity of PSO is \( O\left( {\left( {\left| V \right|*N_{p} + \left| V \right|^{2} *N_{p} } \right)*T} \right) \), and with some simplification, the time complexity of the algorithm is \( O\left( {\left( {\left| V \right|^{2} *N_{p} } \right)*T} \right) \).

In GA, the main loop iterates as many times as all the iterations T. in this loop, there is an inner loop which runs as many times as the number of individuals in population N_{p} in which the fitness function for all individual is calculated and mutation and cross-over are applied on population by mutation and cross-over rates. The time complexity of fitness function in this problem (modularity) is \( O\left( {\left| V \right|^{2} } \right) \). Thus, time complexity of GA would be \( O\left( {\left( {\left| V \right|^{2} *N_{p} } \right)*T*(p_{c} *O\left( {cross over} \right) + p_{m} *O\left( {mutation} \right))} \right) \). With some simplifications we have \( O\left( {\left( {\left| V \right|^{2} *N_{p} } \right)*T} \right) \) for the time complexity of the algorithm.

See Fig. 17.

## Rights and permissions

## About this article

### Cite this article

Fozuni Shirjini, M., Farzi, S. & Nikanjam, A. MDPCluster: a swarm-based community detection algorithm in large-scale graphs.
*Computing* **102, **893–922 (2020). https://doi.org/10.1007/s00607-019-00787-4

Received:

Accepted:

Published:

Issue Date:

### Keywords

- Particle swarm optimization
- Community detection
- Swarm intelligence
- Large scale graphs
- Louvain algorithm

### Mathematics Subject Classification

- 05C85