Abstract
Chinese base phrase identification plays an important role in natural language processing, but current approaches remain limited in both the scope of phrases they recognize and the methods they use. This paper presents a method for Chinese base noun phrase identification based on a word frequency statistics model. The method builds a noun phrase dictionary from a training corpus, calculates the co-occurrence frequency and threshold of the noun phrases, and constructs word tables according to the different roles that words play within a noun phrase. Unknown word processing and rule templates are then added, and a final error correction step improves the results. Experiments on the test corpus show that the average precision and average recall of base noun phrase identification across different domains are 91.28% and 93.22%, respectively.
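The dictionary, co-occurrence, and word-table steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`build_tables`, `identify`), the head/inner/tail role split, the greedy left-to-right matching, and the toy phrases are all assumptions made for illustration, and unknown word processing, rule templates, and error correction are omitted.

```python
from collections import Counter

def build_tables(phrases):
    """Count word roles and adjacent-pair co-occurrences.

    phrases: noun phrases from a training corpus, each a list of
    already-segmented words (hypothetical input format).
    """
    roles = {"head": Counter(), "inner": Counter(), "tail": Counter()}
    pairs = Counter()
    for words in phrases:
        if not words:
            continue
        roles["head"][words[0]] += 1      # first word of the phrase
        roles["tail"][words[-1]] += 1     # last word of the phrase
        for w in words[1:-1]:
            roles["inner"][w] += 1        # words strictly inside
        for a, b in zip(words, words[1:]):
            pairs[(a, b)] += 1            # adjacent co-occurrence count
    return roles, pairs

def identify(sentence, roles, pairs, threshold=1):
    """Greedily mark spans whose adjacent pairs meet the threshold.

    A span is accepted only if its first word was seen as a phrase
    head, its last word as a phrase tail, and every interior word as
    an inner word (an assumed acceptance rule, for illustration).
    """
    found, i, n = [], 0, len(sentence)
    while i < n:
        j = i
        while j + 1 < n and pairs[(sentence[j], sentence[j + 1])] >= threshold:
            j += 1
        ok = (
            j > i
            and roles["head"][sentence[i]] > 0
            and roles["tail"][sentence[j]] > 0
            and all(roles["inner"][w] > 0 for w in sentence[i + 1:j])
        )
        if ok:
            found.append(sentence[i:j + 1])
            i = j + 1
        else:
            i += 1
    return found

# Toy training phrases (invented for this sketch, not from the paper).
training = [["自然", "语言", "处理"], ["自然", "语言"], ["语言", "模型"]]
roles, pairs = build_tables(training)
sentence = ["我们", "研究", "自然", "语言", "处理"]
print(identify(sentence, roles, pairs, threshold=1))
print(identify(sentence, roles, pairs, threshold=2))
```

Raising `threshold` makes the matcher more conservative: with these toy counts, a threshold of 1 accepts the three-word span, while a threshold of 2 keeps only the two-word span whose pair was seen twice.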
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Kong, L., Ren, F., Sun, X., Quan, C. (2014). Word Frequency Statistics Model for Chinese Base Noun Phrase Identification. In: Huang, DS., Jo, KH., Wang, L. (eds) Intelligent Computing Methodologies. ICIC 2014. Lecture Notes in Computer Science, vol 8589. Springer, Cham. https://doi.org/10.1007/978-3-319-09339-0_64
DOI: https://doi.org/10.1007/978-3-319-09339-0_64
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09338-3
Online ISBN: 978-3-319-09339-0
eBook Packages: Computer Science (R0)