Linking Topics of News and Blogs with Wikipedia for Complementary Navigation

  • Yuki Sato
  • Daisuke Yokomoto
  • Hiroyuki Nakasaki
  • Mariko Kawaba
  • Takehito Utsuro
  • Tomohiro Fukuhara
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6045)

Abstract

We study complementary navigation of news and blogs, where Wikipedia entries are used as a fundamental knowledge source for linking news articles and blog feeds/posts. In the proposed framework, a topic is given as the title of a Wikipedia entry; the body text of that entry is analyzed as the fundamental knowledge source for the topic, and terms strongly related to the topic are extracted. Those terms are then used for ranking news articles and blog posts. In the scenario of complementary navigation from a news article to closely related blog posts, Japanese Wikipedia entries are ranked according to the number of strongly related terms shared by the given news article and each Wikipedia entry. The 10 top-ranked entries are then regarded as indices for retrieving closely related blog posts. Finally, the retrieved blog posts are ranked all together and shown to users as blogs of personal opinions and experiences that are closely related to the given news article. In our preliminary evaluation, the rate of successfully retrieving relevant blog posts improved when an interface for manually selecting relevant Wikipedia entries was used.
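The ranking steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the strongly related terms of each Wikipedia entry have already been extracted and that articles and posts are already tokenized (Japanese text would need a morphological analyzer first). The function names, the tie-breaking rule, and the scoring of posts by overlap with the pooled terms of the selected entries are all hypothetical choices made for illustration.

```python
def shared_term_count(doc_terms, entry_terms):
    """Number of distinct terms that a document shares with an entry's related terms."""
    return len(set(doc_terms) & set(entry_terms))


def rank_entries(article_terms, related_terms_by_entry, top_k=10):
    """Rank Wikipedia entries by the number of related terms shared with a news article.

    related_terms_by_entry maps an entry title to its (pre-extracted) related terms.
    Entries sharing no term are dropped; ties are broken by title (an assumption).
    """
    scored = [
        (title, shared_term_count(article_terms, terms))
        for title, terms in related_terms_by_entry.items()
    ]
    scored = [(title, score) for title, score in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


def rank_posts(posts, related_terms_by_entry, entry_titles):
    """Rank blog posts all together against the pooled terms of the selected entries.

    posts maps a post id to its terms. Scoring by overlap with the pooled terms
    is an illustrative assumption, not a method stated in the abstract.
    """
    pooled = set()
    for title in entry_titles:
        pooled |= set(related_terms_by_entry[title])
    scored = [(post_id, shared_term_count(terms, pooled)) for post_id, terms in posts.items()]
    scored = [(post_id, score) for post_id, score in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


# Toy data, invented for this example only.
related = {
    "Tokyo Marathon": {"marathon", "runner", "Tokyo", "course"},
    "Olympic Games": {"Olympic", "athlete", "medal", "marathon"},
    "Sushi": {"rice", "fish", "vinegar"},
}
article = ["Tokyo", "marathon", "runner", "record", "medal"]
blog_posts = {
    "post-1": ["marathon", "runner", "blisters"],
    "post-2": ["fish", "rice"],
    "post-3": ["medal", "athlete", "Olympic", "marathon"],
}

top_entries = rank_entries(article, related)
ranked_posts = rank_posts(blog_posts, related, [title for title, _ in top_entries])
```

In this toy run the unrelated entry and the unrelated post share no terms with the article's topics, so both are filtered out before ranking. A manual selection step, like the interface mentioned in the evaluation, would correspond to editing the list of titles passed to `rank_posts`.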

Keywords

IR, Wikipedia, news, blog, topic analysis

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Yuki Sato (1)
  • Daisuke Yokomoto (1)
  • Hiroyuki Nakasaki (2)
  • Mariko Kawaba (3)
  • Takehito Utsuro (1)
  • Tomohiro Fukuhara (4)

  1. Graduate School of Systems and Information Engineering, University of Tsukuba, Tsukuba, Japan
  2. NTT DATA CORPORATION, Tokyo, Japan
  3. NTT Cyber Space Laboratories, NTT Corporation, Yokosuka, Kanagawa, Japan
  4. Center for Service Research, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
