Chinese Named Entity Recognition and Disambiguation Based on Wikipedia

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 333)

Abstract

This paper presents a method for named entity recognition and disambiguation based on Wikipedia. First, we build a Wikipedia database using JWPL, an open-source toolkit. Second, we extract the definition term from the first sentence of each Wikipedia page and use it as external knowledge for named entity recognition. Finally, we perform named entity disambiguation using Wikipedia disambiguation pages and contextual information. Experiments show that Wikipedia features improve the accuracy of named entity recognition.
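
The two Wikipedia-based steps named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the paper targets Chinese text and builds its database with JWPL, whereas the code below is an English-only toy in Python that assumes the first sentence and the candidate sense descriptions are already available as strings. The function names `definition_term` and `disambiguate`, the copula pattern, and the word-overlap score are all hypothetical choices made for illustration.

```python
import re

# Assumption: an English "X is a/an/the <phrase>" pattern stands in for
# whatever rule the paper applies to Chinese first sentences.
_COPULA = re.compile(r"\b(?:is|was|are|were)\s+(?:an?|the)\s+(.+)", re.IGNORECASE)
# Assumption: the defining noun phrase ends at the first of these words or marks.
_STOP = re.compile(
    r"\b(?:of|in|that|which|who|from|for|and|located|founded)\b|[,.;(]",
    re.IGNORECASE,
)


def definition_term(first_sentence):
    """Return the last word of the phrase after the copula, or None."""
    match = _COPULA.search(first_sentence)
    if not match:
        return None
    phrase = _STOP.split(match.group(1), maxsplit=1)[0].strip()
    words = phrase.split()
    return words[-1].lower() if words else None


def _tokens(text):
    """Lowercased set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def disambiguate(context, candidates):
    """Pick the candidate whose description shares the most words with the context.

    `candidates` maps a sense title (as might be listed on a disambiguation
    page) to a short description; both are hypothetical inputs here.
    """
    ctx = _tokens(context)
    return max(candidates, key=lambda title: len(ctx & _tokens(candidates[title])))


if __name__ == "__main__":
    print(definition_term("Beijing is the capital of China."))
    senses = {
        "Apple Inc.": "American technology company that makes the iPhone",
        "Apple (fruit)": "edible fruit produced by an apple tree",
    }
    print(disambiguate("Apple released a new iPhone and the company stock rose", senses))
```

In a recognizer, the extracted term (for example "company" or "capital") would be attached to a matched mention as an extra feature. The overlap score is only a stand-in for the contextual information the abstract mentions; a real system would need Chinese word segmentation and a more robust similarity measure.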

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Miao, Y., Yajuan, L., Qun, L., Jinsong, S., Hao, X. (2012). Chinese Named Entity Recognition and Disambiguation Based on Wikipedia. In: Zhou, M., Zhou, G., Zhao, D., Liu, Q., Zou, L. (eds) Natural Language Processing and Chinese Computing. NLPCC 2012. Communications in Computer and Information Science, vol 333. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34456-5_25

  • DOI: https://doi.org/10.1007/978-3-642-34456-5_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34455-8

  • Online ISBN: 978-3-642-34456-5

  • eBook Packages: Computer Science, Computer Science (R0)
