Internet Public Opinion Hotspot Detection Research Based on K-means Algorithm
The Internet is becoming a platform for spreading public opinion. It is important to capture Internet public opinion (IPO) in a timely manner and to understand its trends correctly. Text mining plays a fundamental role in a number of information management and retrieval tasks. This paper studies Internet public opinion hotspot detection using text mining approaches. First, we create an algorithm to obtain a vector space model for all text documents. Second, this algorithm is combined with the K-means clustering algorithm to develop an unsupervised text mining approach. We use the proposed approach to group Internet public opinion into clusters, with the center of each cluster representing a hotspot public opinion within the current time span. Experimental results show the efficiency and effectiveness of the algorithm.
Keywords: Internet public opinion, k-means clustering algorithm, vector space model, text classification
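The two-step approach outlined in the abstract (a vector space model followed by K-means clustering) can be illustrated with a minimal sketch. It is not the authors' implementation. The TF-IDF weighting, the whitespace tokenizer, the farthest-first seeding of the centers, the helper names (`build_vsm`, `init_centers`, `kmeans`, `top_terms`), and the sample posts are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math
from collections import Counter


def build_vsm(docs):
    """Map documents to L2-normalized TF-IDF vectors (assumed weighting scheme)."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vec = [tf[t] * (math.log((1 + n_docs) / (1 + df[t])) + 1.0) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centers(vectors, k):
    """Farthest-first seeding (an assumption; keeps the sketch deterministic)."""
    centers = [list(vectors[0])]
    while len(centers) < k:
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centers))
        centers.append(list(farthest))
    return centers


def kmeans(vectors, k, max_iters=50):
    """Plain K-means: alternate assignment and mean-update until labels stop changing."""
    centers = init_centers(vectors, k)
    labels = None
    for _ in range(max_iters):
        # Assignment step: each document goes to its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(v, centers[c])) for v in vectors]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def top_terms(center, vocab, n=3):
    """Highest-weighted terms of a cluster center, read here as the hotspot topic."""
    ranked = sorted(zip(center, vocab), reverse=True)
    return [term for weight, term in ranked[:n] if weight > 0]


# Hypothetical posts collected within one time span.
docs = [
    "subway fare increase protest",
    "food safety scandal milk",
    "subway fare hike commuters protest",
    "milk powder food safety recall",
    "commuters angry subway fare increase",
    "food safety inspection milk scandal",
]

vocab, vectors = build_vsm(docs)
labels, centers = kmeans(vectors, k=2)
for c, center in enumerate(centers):
    print("hotspot", c, top_terms(center, vocab))
```

In this sketch each cluster center is summarized by its highest-weighted terms, which is one plausible way to read "the center of each representing a hotspot". A real system would also need word segmentation, stop-word removal, and a method for choosing the number of clusters.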