Cluster Computing, Volume 21, Issue 1, pp 945–953

Clustering based on words distances

  • Hongtao Liu
  • Hongwei Guan (Email author)
  • Jie Jian
  • Xueyan Liu
  • Ying Pei


To find the relevance among keywords in hot topics effectively, we propose a clustering method based on word distances. We first calculate the distances between the words and then calculate the local (sectional) density of each word. Words that have a high local density and lie far from any word of higher local density are taken as the cluster centers. Once the centers are found, the words are clustered around them. The method estimates the number of clusters from a decision diagram. Finally, we evaluate the results using the accuracy rate and the recall rate.
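The steps described in the abstract can be illustrated with a minimal Python sketch in the spirit of density-peak clustering (Rodriguez and Laio, 2014). It is not the authors' implementation. The function name `density_peaks`, the cutoff `d_c`, the explicit `n_clusters` argument (the paper reads the number of clusters off a decision diagram instead), and the toy Euclidean distance matrix standing in for word distances are all assumptions made for illustration.

```python
import numpy as np


def density_peaks(dist, d_c, n_clusters):
    """Cluster items from a symmetric distance matrix (illustrative sketch)."""
    n = dist.shape[0]
    # Local density: number of other items closer than the cutoff d_c
    # (subtract 1 because every item is at distance 0 from itself).
    rho = (dist < d_c).sum(axis=1) - 1
    # Visit items from densest to sparsest; ties are broken by index.
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)                  # distance to the nearest denser item
    nearest_denser = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest item has no denser neighbour: use its largest distance.
            delta[i] = dist[i].max()
            continue
        denser = order[:rank]
        j = denser[np.argmin(dist[i, denser])]
        delta[i] = dist[i, j]
        nearest_denser[i] = j
    # Centers combine high density with a large distance to denser items.
    gamma = rho * delta
    gamma[order[0]] = np.inf             # the densest item is always a center
    centers = np.argsort(-gamma, kind="stable")[:n_clusters]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    # Each remaining item inherits the label of its nearest denser item,
    # which has already been labelled because we go in density order.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_denser[i]]
    return labels, centers


# Toy data: two well-separated groups of three points each (hypothetical).
points = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
labels, centers = density_peaks(dist, d_c=2.0, n_clusters=2)
print(labels, centers)
```

In a decision diagram, each item would be plotted by its `rho` and `delta` values, and the items standing apart in the upper right would be chosen as centers; here that choice is replaced by taking the `n_clusters` largest products of the two.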


Keywords: Hot topic, Clustering, Accuracy rate, Recall rate



This research is supported by the following funds and programs: the National Natural Science Foundation of China (61402309), the Fundamental Research Funds for the Central Universities (No. XDJK2014B012), the National Social Science Foundation of China (13CGL146), the National Social Science Foundation of China (15BGL2729), and the Study on the Key Common Characteristics of Network Transaction Fraud (14SKF01).



Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  • Hongtao Liu (1)
  • Hongwei Guan (1), Email author
  • Jie Jian (2)
  • Xueyan Liu (2)
  • Ying Pei (2)

  1. College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, China
  2. School of Economics and Management, Chongqing University of Posts and Telecommunications, Chongqing, China
