Chinese Named Entity Recognition Based on Hierarchical Hybrid Model
Chinese named entity recognition is a challenging yet important task in natural language processing. This paper presents a novel approach to recognizing Chinese named entities based on a hierarchical hybrid model. The model integrates three mutually dependent stages: boosting, recognition based on Markov Logic Networks (MLNs), and abbreviation detection. First, the AdaBoost algorithm is used for fast recognition of simple named entities. More complex named entities are then passed to the MLNs for accurate recognition; in particular, the recognition of the left boundary of named entities is considered. Lastly, abbreviated named entities are classified with special care, using global context information from the same document. Experiments were conducted on the People’s Daily corpus. The results show that the approach significantly improves performance, achieving a precision of 94.38%, a recall of 93.89%, and an F-measure (β = 1) of 93.97%.
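The control flow of the three stages can be illustrated with a minimal sketch. It is not the authors' implementation: `fast_model` and `slow_model` are hypothetical stand-ins for the AdaBoost and MLN recognizers, the confidence `threshold` and the subsequence test for abbreviations are assumptions made for illustration, and left boundary recognition is not modeled.

```python
from typing import Callable, Dict, List, Optional, Tuple

# (label or None, confidence in [0, 1]); hypothetical interface for both stages
Prediction = Tuple[Optional[str], float]
Model = Callable[[str], Prediction]


def is_subsequence(short: str, full: str) -> bool:
    """True if the characters of `short` appear in `full` in the same order."""
    it = iter(full)
    return all(ch in it for ch in short)


def cascade(candidates: List[str], fast_model: Model, slow_model: Model,
            threshold: float = 0.9) -> Dict[str, Optional[str]]:
    labels: Dict[str, Optional[str]] = {}
    for cand in candidates:
        # Stage 1: accept the fast model's answer only when it is confident.
        label, conf = fast_model(cand)
        if conf < threshold:
            # Stage 2: defer harder candidates to the slower, more accurate model.
            label, conf = slow_model(cand)
        labels[cand] = label

    # Stage 3: label unresolved candidates that look like abbreviations of an
    # entity already recognized in the same document (assumed heuristic).
    full_forms = {c: l for c, l in labels.items() if l is not None}
    for cand, label in list(labels.items()):
        if label is None:
            for full, full_label in full_forms.items():
                if len(cand) < len(full) and is_subsequence(cand, full):
                    labels[cand] = full_label
                    break
    return labels


# Toy stand-in models, for illustration only.
def toy_fast(c: str) -> Prediction:
    return ("LOC", 0.95) if c == "北京" else (None, 0.5)


def toy_slow(c: str) -> Prediction:
    return ("ORG", 0.8) if c == "北京大学" else (None, 0.6)


result = cascade(["北京", "北京大学", "北大"], toy_fast, toy_slow)
```

In this toy run, "北京" is accepted by the fast stage, "北京大学" is deferred to the slower stage, and "北大" receives its label only because the full form "北京大学" was recognized in the same document.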