Chinese Named Entity Recognition Using Improved Bi-gram Model Based on Dynamic Programming

Le, Juan; Niu, ZhenDong

doi:10.1007/978-3-642-37832-4_40

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 214)

Abstract

This paper proposes a bi-gram model based on dynamic programming for recognizing Chinese person names. From a study of previous work, we conclude that the precision of NER can be improved by raising the recall rate and then narrowing the gap between recall and precision. The algorithm defines five recognition rules that ensure candidate names are recognized and returned first, which improves the recall rate. The paper's innovation is a filtering stage that filters out invalid names by combining inverse maximum matching with the bi-gram model. When segmenting a sentence, the bi-gram model takes four pairs of transition probabilities into consideration, which effectively narrows the gap between precision and recall. In open tests on different corpora and on material extracted directly from the Internet, the method achieves a precision of 83.53%, a recall of 91.43%, and an F-value of 87.3%.
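As a rough illustration of the ideas named in the abstract, the sketch below segments a sentence by dynamic programming over bi-gram transition scores and then checks the reported F-value against the reported precision and recall. It is not the authors' algorithm: the five recognition rules, the four pairs of transition probabilities, and the inverse-maximum-matching step are not described on this page. The function names, the toy vocabulary, and all log-probabilities are hypothetical, chosen only so that an over-generated name candidate ("李明") loses to a better-scoring segmentation.

```python
def f_measure(precision, recall):
    """Balanced F-value: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def segment(sentence, vocab, bigram_logp, max_len=4, unk_logp=-10.0):
    """Find the highest-scoring segmentation under a bi-gram model.

    best[i] maps each word that can end at position i to a pair
    (best log-score so far, backpointer to (previous position, previous word)).
    Transitions missing from bigram_logp receive the assumed penalty unk_logp.
    """
    n = len(sentence)
    best = [dict() for _ in range(n + 1)]
    best[0]["<s>"] = (0.0, None)  # sentence-start marker

    for i in range(n):
        for prev, (score, _) in best[i].items():
            for j in range(i + 1, min(n, i + max_len) + 1):
                word = sentence[i:j]
                # Multi-character words must be in the vocabulary;
                # single characters are always allowed as a fallback.
                if j - i > 1 and word not in vocab:
                    continue
                cand = score + bigram_logp.get((prev, word), unk_logp)
                if word not in best[j] or cand > best[j][word][0]:
                    best[j][word] = (cand, (i, prev))

    # Backtrack from the best word ending at the last position.
    word = max(best[n], key=lambda w: best[n][w][0])
    i, words = n, []
    while i > 0:
        words.append(word)
        i, word = best[i][word][1]
    return words[::-1]


# Hypothetical toy data: "李明" is a candidate person name, "明天" means "tomorrow".
vocab = {"李", "李明", "明天", "天", "来"}
bigram_logp = {
    ("<s>", "李"): -2.0, ("李", "明天"): -1.0, ("明天", "来"): -1.0,
    ("<s>", "李明"): -2.0, ("李明", "天"): -6.0, ("天", "来"): -5.0,
}

result = segment("李明天来", vocab, bigram_logp)
print(result)                                # the candidate name "李明" is rejected
print(round(f_measure(83.53, 91.43), 2))     # consistent with the reported 87.3
```

Keeping one table entry per (position, last word) pair is what lets the score depend on the previous word, which a plain position-indexed table could not do for a bi-gram model.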

Author information

Authors and Affiliations

College of Computer Science, Beijing Institute of Technology, Beijing, China
Juan Le & ZhenDong Niu

Corresponding author

Correspondence to Juan Le.

Editor information

Editors and Affiliations

Department of Computer Science and Technology, Tsinghua University, Beijing, China, People's Republic
Fuchun Sun
School of Information Science and Technology, Southwest Jiaotong University, Chengdu, China, People's Republic
Tianrui Li
Department of Computer Science and Techn, Tsinghua University, Beijing, China, People's Republic
Hongbo Li

About this paper

Cite this paper

Le, J., Niu, Z. (2014). Chinese Named Entity Recognition Using Improved Bi-gram Model Based on Dynamic Programming. In: Sun, F., Li, T., Li, H. (eds) Knowledge Engineering and Management. Advances in Intelligent Systems and Computing, vol 214. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37832-4_40

DOI: https://doi.org/10.1007/978-3-642-37832-4_40
Published: 24 July 2013
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37831-7
Online ISBN: 978-3-642-37832-4
eBook Packages: Engineering, Engineering (R0)
