Chinese and Korean Cross-Lingual Issue News Detection based on Translation Knowledge of Wikipedia

  • Shengnan Zhao
  • Bayar Tsolmon
  • Kyung-Soon Lee
  • Young-Seok Lee
Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 285)


Detecting cross-lingual issue news and analyzing its content is an important and challenging task. The core of cross-lingual research is the translation process. In this paper, we focus on extracting cross-lingual issue news from Chinese and Korean Twitter data. We propose a translation knowledge method based on Wikipedia concepts as well as the Chinese and Korean cross-lingual inter-Wikipedia link relations. The relevance relations are extracted from Wikipedia categories and page titles. In the evaluation, the method achieved an average precision of 83% over the top 10 extracted issue news items. The result indicates that our method is effective for cross-lingual issue news detection.
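As a rough illustration of how inter-Wikipedia links can serve as translation knowledge, the sketch below maps Korean terms to Chinese page titles and finds the concepts shared by a Korean and a Chinese text. It is a minimal sketch under stated assumptions, not the method of the paper: the table `KO_TO_ZH`, its toy entries, and the helper names `translate_terms` and `shared_concepts` are all hypothetical and introduced only for this example.

```python
# Toy inter-language link pairs (Korean page title -> Chinese page title).
# Illustrative data only (assumption); a real system would load such pairs
# from Wikipedia's inter-language links rather than hard-code them.
KO_TO_ZH = {
    "지진": "地震",
    "올림픽": "奥林匹克运动会",
    "대한민국": "大韩民国",
}


def translate_terms(terms, link_table):
    """Map each term through the link table, dropping terms with no link."""
    return {link_table[t] for t in terms if t in link_table}


def shared_concepts(ko_terms, zh_terms, link_table=KO_TO_ZH):
    """Return the Chinese concepts that occur in both texts.

    Korean terms are first translated via the link table, then
    intersected with the terms of the Chinese text.
    """
    return translate_terms(ko_terms, link_table) & set(zh_terms)


if __name__ == "__main__":
    korean_tweet_terms = ["지진", "뉴스"]
    chinese_tweet_terms = ["地震", "新闻"]
    print(shared_concepts(korean_tweet_terms, chinese_tweet_terms))
```

Terms without an inter-language link (such as "뉴스" in the toy table) are simply dropped here. How the paper handles them is not stated in the abstract.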


Issue news detection; Cross-lingual link discovery; Wikipedia knowledge





This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2012R1A1A2044811).



Copyright information

© Springer Science+Business Media Singapore 2014

Authors and Affiliations

  • Shengnan Zhao (1)
  • Bayar Tsolmon (1)
  • Kyung-Soon Lee (1)
  • Young-Seok Lee (1)
  1. Division of Computer Science and Engineering, CAIIT, Chonbuk National University, Deokjin-gu, Jeonju-si, Republic of Korea
