Clique-Based Method for Social Network Clustering

  • Guang Ouyang
  • Dipak K. Dey
  • Panpan Zhang (corresponding author)


In this article, we develop a clique-based method for social network clustering. We introduce a new index to evaluate the quality of clustering results, and propose an efficient algorithm based on recursive bipartition to maximize an objective function of the proposed index. Because the optimization problem is NP-hard, we approximate a semi-optimal solution via an implicitly restarted Lanczos method. One advantage of our algorithm is that the proposed index of each community in the clustering result is guaranteed to be higher than a predetermined threshold, p, which is fully controlled by the user. We also account for the situation in which p is unknown. A statistical procedure that controls under-clustering and over-clustering errors simultaneously is carried out to select a localized threshold for each subnetwork, such that the community detection accuracy is optimized. Accordingly, we propose a localized clustering algorithm based on a binary tree structure. Finally, we use stochastic blockmodels to conduct simulation studies and demonstrate the accuracy and efficiency of our algorithms, both numerically and graphically.
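As a rough illustration of a single bipartition step of the kind summarized above, the following Python sketch samples a small two-block stochastic blockmodel and splits it by the sign of a leading eigenvector. It rests on several assumptions that are not taken from the article: the clique-score index is not defined in this abstract, so Newman's modularity matrix is substituted as a stand-in objective; the function names `sample_sbm` and `spectral_bipartition`, the block sizes, and the edge probabilities are hypothetical; and the eigenvector is obtained with SciPy's `eigsh`, which wraps ARPACK's implicitly restarted Lanczos routine. It should be read as a sketch under these assumptions, not as the authors' algorithm.

```python
import numpy as np
from scipy.sparse.linalg import eigsh


def sample_sbm(sizes, p_in, p_out, seed=0):
    """Sample a symmetric 0/1 adjacency matrix from a simple stochastic blockmodel."""
    rng = np.random.default_rng(seed)
    labels = np.repeat(np.arange(len(sizes)), sizes)
    same_block = labels[:, None] == labels[None, :]
    probs = np.where(same_block, p_in, p_out)
    # Draw each undirected edge once (strict upper triangle), then mirror it.
    upper = np.triu(rng.random(probs.shape) < probs, k=1)
    adj = (upper | upper.T).astype(float)
    return adj, labels


def spectral_bipartition(adj):
    """Split nodes by the sign of the leading eigenvector of the modularity matrix.

    Stand-in objective: Newman's modularity, NOT the article's clique-score index.
    """
    degrees = adj.sum(axis=1)
    two_m = degrees.sum()
    modularity_matrix = adj - np.outer(degrees, degrees) / two_m
    # eigsh wraps ARPACK, which uses an implicitly restarted Lanczos iteration.
    _, vecs = eigsh(modularity_matrix, k=1, which="LA")
    return vecs[:, 0] >= 0


# Hypothetical simulation settings: two planted blocks of 40 nodes each.
adj, labels = sample_sbm([40, 40], p_in=0.5, p_out=0.02)
split = spectral_bipartition(adj)
# The sign of an eigenvector is arbitrary, so compare against both labelings.
agreement = max(np.mean(split == (labels == 0)), np.mean(split == (labels == 1)))
print(f"fraction of nodes assigned to the planted block: {agreement:.2f}")
```

A recursive version would apply `spectral_bipartition` again to each resulting subnetwork and stop when a chosen quality criterion is met; in the article that criterion is the threshold p on the proposed index, which this sketch does not implement.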


Keywords: Clique-score index; Localized clustering algorithm; Modularity; Social network; Spectral analysis; Stochastic blockmodel



The authors would like to thank the Associate Editor and two anonymous reviewers for their insightful comments and suggestions on this manuscript.



Copyright information

© The Classification Society 2019

Authors and Affiliations

  1. Google Inc., Mountain View, USA
  2. Department of Statistics, University of Connecticut, Storrs, USA
