Cluster Computing, Volume 22, Supplement 3, pp 5997–6007

Research on clustering analysis of Internet public opinion

  • XueGang Chen
  • Sheng Duan
  • Lu-da Wang


To improve the real-time performance of Internet public opinion recognition and early warning, and to improve the accuracy of hot-spot analysis, this paper puts forward similarity analysis methods for Internet public opinion. First, web crawler technology is introduced to obtain accurate and comprehensive public opinion data. Second, similarity algorithms are proposed for two cases: when the subject is known and when it is unknown. In addition, considering the uncertainty and fuzziness of Internet public opinion, the concept of information entropy is introduced, and a similarity analysis approach based on information entropy is presented that can cluster and identify hot spots and crisis events. Experimental results show that the proposed methods obtain Internet public opinion quickly and achieve a high clustering accuracy rate, providing important technical support for Internet public opinion monitoring and recognition.
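The abstract does not give the paper's similarity formula, so the following is only a minimal sketch of one common way to build an entropy-based text similarity, not the authors' method. It assumes whitespace tokenization (Chinese text would need a word segmenter first), non-empty inputs, and a similarity defined as one minus the Jensen-Shannon divergence of the two term distributions. The names `term_distribution`, `shannon_entropy`, and `entropy_similarity` are hypothetical.

```python
import math
from collections import Counter


def term_distribution(text):
    """Map each lowercase whitespace-separated token to its relative frequency."""
    # Assumption: whitespace tokenization stands in for a real word segmenter.
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def shannon_entropy(dist):
    """Shannon entropy H(p) = -sum(p_i * log2(p_i)), measured in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def entropy_similarity(text_a, text_b):
    """Similarity in [0, 1]: one minus the base-2 Jensen-Shannon divergence.

    Assumption: this is an illustrative measure, not the formula from the paper.
    """
    p = term_distribution(text_a)
    q = term_distribution(text_b)
    vocab = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2 over the joint vocabulary.
    mix = {t: 0.5 * (p.get(t, 0.0) + q.get(t, 0.0)) for t in vocab}
    # JSD(P, Q) = H(M) - (H(P) + H(Q)) / 2, which lies in [0, 1] for log base 2.
    jsd = shannon_entropy(mix) - 0.5 * (shannon_entropy(p) + shannon_entropy(q))
    # Clamp to guard against tiny floating-point overshoot.
    return 1.0 - max(0.0, min(1.0, jsd))
```

Under these assumptions, two posts with identical term distributions score 1 and two posts with no shared terms score 0, so a clustering step could group posts whose pairwise score exceeds a chosen threshold.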


Keywords: Document analysis, Uncertainty, Information recognition, Information entropy, Clustering analysis



The authors would like to thank the following for financial support: the Youth Fund Project of Humanities and Social Sciences of the Ministry of Education (No. 15YJC870004), the Science and Technology Innovation Team of XiangNan University (Recognition and analysis based on big data public opinion), the Social Sciences Fund Project of Hunan Province (No. 13YBA302), and the Education Science 12th Five-Year Plan Project of Hunan Province (XJK014CGD081, XJK011BXJ004).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. College of Software and Communication Engineering, XiangNan University, Chenzhou, People’s Republic of China
