Entity Linking for Vietnamese Tweets

  • Conference paper
Knowledge and Systems Engineering

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 326)

Abstract

We study the task of entity linking for Vietnamese tweets, which aims to detect entity mentions and link them to the corresponding entries in a given knowledge base. Unlike authored news or textual web content, tweets are noisy, irregular, and short, which makes entity linking in tweets much more challenging. We propose an approach to building an end-to-end entity linking system for Vietnamese tweets. The system consists of two stages: the first detects mentions and the second performs entity disambiguation. We create a dataset of 524 Vietnamese tweets with 1,061 mentions and evaluate the system on it, where it achieves a 69.2% F1-score. To show that the system is language-independent, we also evaluate it on a public dataset of 562 English tweets. The experimental results show that it achieves a 54.5% F1-score and outperforms the state-of-the-art end-to-end entity linking methods for tweets. To the best of our knowledge, this is the first attempt to build an end-to-end entity linking system for Vietnamese tweets, and the system achieves very encouraging performance.
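
The abstract does not specify how either stage is implemented, so the following is only a minimal sketch of what a two-stage pipeline and its F1 evaluation could look like, not the authors' method. It assumes a toy knowledge base (`RAW_KB`), dictionary-based longest-match mention detection, and disambiguation by highest prior probability; all names, entity identifiers, and numbers in it are hypothetical.

```python
import unicodedata
from typing import Dict, List, Set, Tuple

Link = Tuple[int, int, str]  # (start token, end token exclusive, entity id)


def norm(text: str) -> str:
    """Normalise to NFC and lowercase so Vietnamese diacritics compare reliably."""
    return unicodedata.normalize("NFC", text).lower()


# Hypothetical toy knowledge base: surface form -> {entity id: prior probability}.
RAW_KB: Dict[str, Dict[str, float]] = {
    "Hà Nội": {"Hà_Nội": 0.95, "Hà_Nội_FC": 0.05},
    "Apple": {"Apple_Inc.": 0.8, "Apple_(fruit)": 0.2},
}
KB = {norm(surface): cands for surface, cands in RAW_KB.items()}


def detect_mentions(tokens: List[str], max_len: int = 3) -> List[Tuple[int, int]]:
    """Stage 1 (assumed): greedy longest-match lookup of token n-grams in the KB."""
    spans, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if norm(" ".join(tokens[i:i + n])) in KB:
                spans.append((i, i + n))
                i += n
                break
        else:  # no n-gram starting at i matched
            i += 1
    return spans


def disambiguate(tokens: List[str], spans: List[Tuple[int, int]]) -> List[Link]:
    """Stage 2 (assumed): choose the candidate entity with the highest prior."""
    links = []
    for start, end in spans:
        candidates = KB[norm(" ".join(tokens[start:end]))]
        links.append((start, end, max(candidates, key=candidates.get)))
    return links


def f1_score(predicted: Set[Link], gold: Set[Link]) -> float:
    """F1 over exact (span, entity) matches."""
    tp = len(predicted & gold)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(predicted), tp / len(gold)
    return 2 * precision * recall / (precision + recall)


tokens = "Apple mở cửa hàng ở Hà Nội".split()
links = disambiguate(tokens, detect_mentions(tokens))
# Hypothetical gold annotation with one mention the toy KB cannot find.
gold = {(0, 1, "Apple_Inc."), (5, 7, "Hà_Nội"), (2, 4, "Cửa_hàng")}
f1 = f1_score(set(links), gold)
print(links, round(f1, 2))
```

In this toy run both predicted links are correct but one gold mention is missed, so precision is 1.0, recall is 2/3, and F1 is 0.8. This shows how an end-to-end score penalises errors from the detection stage as well as the disambiguation stage.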

Author information

Corresponding author

Correspondence to Duy K. Van.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Van, D.K., Huynh, H.M., Nguyen, H.T., Vo, V.T. (2015). Entity Linking for Vietnamese Tweets. In: Nguyen, VH., Le, AC., Huynh, VN. (eds) Knowledge and Systems Engineering. Advances in Intelligent Systems and Computing, vol 326. Springer, Cham. https://doi.org/10.1007/978-3-319-11680-8_48

  • DOI: https://doi.org/10.1007/978-3-319-11680-8_48

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11679-2

  • Online ISBN: 978-3-319-11680-8

  • eBook Packages: Engineering, Engineering (R0)
