Extracting news blog hot topics based on the W2T Methodology

World Wide Web

Abstract

Although topic detection and tracking techniques have made great progress, most researchers pay little attention to two aspects. First, the construction of a topic model does not take the characteristics of different topics into consideration. Second, the factors that determine the formation and development of hot topics are not analyzed further. To extract news blog hot topics correctly, this paper approaches these problems from a new perspective based on the W2T (Wisdom Web of Things) methodology, in which the characteristics of blog users, the context of topic propagation, and information granularity are investigated in a unified way. The motivations and features of blog users are first analyzed to understand the characteristics of news blog topics. The context of topic propagation is then decomposed into the blog community, the topic network, and the opinion network. Important factors such as the user behavior pattern, opinion leaders, and network opinion are identified to track the development trends of news blog topics. Moreover, a blog hot topic detection algorithm is proposed, in which news blog hot topics are identified by measuring duration, topic novelty, the attention degree of users, and topic growth. Experimental results show that the proposed method is feasible and effective. These results are also useful for further study of the formation mechanism of opinion leaders in blogspace.
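As a rough illustration of how the four measures named in the abstract (duration, topic novelty, attention degree of users, and topic growth) might be combined into one score, the sketch below uses an equally weighted sum with a threshold. The abstract does not give the paper's formulas, so everything here is an assumption made for illustration: the `TopicStats` fields, the `hotness` and `hot_topics` functions, the 30-day cap, the weights, and the threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TopicStats:
    """Hypothetical per-topic measurements (not taken from the paper)."""
    name: str
    duration_days: float  # how long the topic has been active
    novelty: float        # assumed pre-normalized to [0, 1]
    attention: float      # assumed pre-normalized to [0, 1], e.g. views and comments
    growth: float         # assumed pre-normalized to [0, 1], e.g. growth rate of posts


def hotness(topic, max_duration_days=30.0, weights=(0.25, 0.25, 0.25, 0.25)):
    """Combine the four measures into one score in [0, 1].

    The equal weights and the 30-day cap are illustrative assumptions.
    """
    # Cap duration so that long-running topics saturate at 1.0.
    duration = min(topic.duration_days / max_duration_days, 1.0)
    features = (duration, topic.novelty, topic.attention, topic.growth)
    return sum(w * f for w, f in zip(weights, features))


def hot_topics(topics, threshold=0.5):
    """Return (name, score) pairs at or above the threshold, hottest first."""
    scored = [(t.name, hotness(t)) for t in topics]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


topics = [
    TopicStats("A", duration_days=15, novelty=0.8, attention=0.9, growth=0.6),
    TopicStats("B", duration_days=3, novelty=0.2, attention=0.1, growth=0.2),
]
result = hot_topics(topics)
```

With these made-up inputs only topic "A" passes the threshold. In practice the weights and threshold would need to be tuned against labeled data, and the method described in the paper may combine the measures quite differently.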

Author information

Corresponding author

Correspondence to Ning Zhong.

About this article

Cite this article

Zhou, E., Zhong, N. & Li, Y. Extracting news blog hot topics based on the W2T Methodology. World Wide Web 17, 377–404 (2014). https://doi.org/10.1007/s11280-013-0207-7

  • DOI: https://doi.org/10.1007/s11280-013-0207-7
