Recognizing Chinese Proper Nouns with Transformation-Based Learning and Ontology

  • Peifeng Li
  • Qiaoming Zhu
  • Lei Wang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4733)


This paper proposes an approach to recognizing Chinese proper nouns that combines the Ontology with transformation-based error-driven learning (TBL). First, the approach redefines the label set and tags Chinese words according to the usage of proper nouns and their context; it then extracts the Characteristic Information (CI) of each proper noun from the text and merges it based on the Ontology. Second, it tags the training corpus following the new definition of Multi-dimension Attribute Points (MAP) and extracts rules using TBL. Finally, it recognizes proper nouns using the rule set and the Ontology. In our open test, the approach achieves a precision of 92.5% and a recall of 86.3%.
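To make the rule-extraction step more concrete, the following is a minimal sketch of generic Brill-style transformation-based error-driven learning. It is not the authors' implementation: the paper's redefined label set, the MAP definition, and the Ontology-based merging of CI are not reproduced here. The tag names (`O`, `PN`, `CUE`), the single rule template (change a tag when the previous tag matches), and the function names are all assumptions chosen for illustration.

```python
def apply_rule(tags, rule):
    """Apply one transformation rule (from_tag, to_tag, prev_tag) to a tag sequence."""
    from_tag, to_tag, prev_tag = rule
    out = list(tags)
    for i in range(1, len(tags)):
        # Contexts are matched against the tags as they stood before this pass.
        if tags[i] == from_tag and tags[i - 1] == prev_tag:
            out[i] = to_tag
    return out


def count_errors(sequences, gold):
    """Count positions where the current tags disagree with the gold tags."""
    return sum(t != g for seq, gs in zip(sequences, gold) for t, g in zip(seq, gs))


def learn_rules(initial, gold, max_rules=10):
    """Greedily pick the rule that removes the most errors, apply it, and repeat."""
    current = [list(seq) for seq in initial]
    rules = []
    for _ in range(max_rules):
        # Propose candidate rules only at positions that are currently wrong.
        candidates = set()
        for seq, gs in zip(current, gold):
            for i in range(1, len(seq)):
                if seq[i] != gs[i]:
                    candidates.add((seq[i], gs[i], seq[i - 1]))
        base = count_errors(current, gold)
        best_rule, best_gain = None, 0
        for rule in sorted(candidates):
            retagged = [apply_rule(seq, rule) for seq in current]
            gain = base - count_errors(retagged, gold)
            if gain > best_gain:
                best_rule, best_gain = rule, gain
        if best_rule is None:  # no rule reduces the error count any further
            break
        rules.append(best_rule)
        current = [apply_rule(seq, best_rule) for seq in current]
    return rules


# Hypothetical toy corpus: "CUE" marks a context word, "PN" a proper noun, "O" anything else.
initial = [["CUE", "O", "O"], ["O", "CUE", "O"]]
gold = [["CUE", "PN", "O"], ["O", "CUE", "PN"]]
print(learn_rules(initial, gold))
```

On this toy corpus the learner can only propose one candidate, a rule that retags `O` as `PN` after a `CUE`. A realistic system would use richer rule templates and a much larger annotated corpus.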


Chinese proper nouns recognition TBL MAP rule 





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Peifeng Li (1)
  • Qiaoming Zhu (1)
  • Lei Wang (1)
  1. School of Computer Science & Technology, Soochow University, Suzhou 215006, China
