Abstract
Head-driven statistical models for natural language parsing are the most representative lexicalized syntactic parsing models, but they use only the semantic dependency between words and do not incorporate other semantic information such as semantic collocation and semantic category. Some improvements to this parser are presented. Firstly, “valency” is an essential semantic feature of words. Once the valency of a word is determined, the collocation of the word is clear, and the sentence structure can be directly derived. Thus, a syntactic parsing model combining valence structure with semantic dependency is proposed on the basis of head-driven statistical syntactic parsing models. Secondly, semantic role labeling (SRL) is necessary for deep natural language processing. An integrated parsing approach is proposed that incorporates semantic parsing into the syntactic parsing process. Experiments are conducted on the refined statistical parser. The results show that 87.12% precision and 85.04% recall are obtained, and the F-measure is improved by 5.68% compared with the head-driven parsing model introduced by Collins.
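For readers who want to relate the reported precision and recall to a single score, the following is a minimal sketch of the standard balanced F-measure (the harmonic mean of precision and recall). It assumes the abstract's "F measure" is the usual F1 with equal weighting, which the abstract does not state explicitly; the function name `f_measure` and the `beta` parameter are illustrative and not taken from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F_beta).

    beta = 1.0 gives the balanced F1 score. This is the textbook
    definition, assumed here; the paper may use a different weighting.
    """
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported in the abstract, in percent.
precision, recall = 87.12, 85.04
f1 = f_measure(precision, recall)
print(f"F1 = {f1:.2f}%")
```

Under this assumption the reported figures correspond to an F1 of roughly 86.07%. The baseline score of the Collins model is not given in the abstract, so the 5.68% improvement cannot be reproduced from these two numbers alone.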
References
MANNING C D, SCHÜTZE H. Foundations of statistical natural language processing [M]. London: The MIT Press, 1999: 184–197.
JURAFSKY D, MARTIN J H. Speech and language processing [M]. New Jersey: Prentice Hall, 2009: 210–265.
DAI Yin-tang, WU Cheng-rong, MA Sheng-xiang, ZHONG Yi-ping. Hierarchically classified probabilistic grammar parsing [J]. Journal of Software, 2011, 22(2): 245–257. (in Chinese)
AVIRAN S, SIEGEL P H, WOLF J K. Optimal parsing trees for run-length coding of biased data [J]. IEEE Transactions on Information Theory, 2008, 54(2): 841–849.
SUN Ang, JIANG Ming-hu, HE Yi-fan, CHEN Lin, YUAN Bao-zong. Chinese question answering based on syntax analysis and answer classification [J]. Acta Electronica Sinica, 2008, 36(5): 833–839. (in Chinese)
CHEN Yi-heng, QIN Bing, SONG Fan, LIU Ting. Search result clustering based on centroid optimization by ontology extraction [J]. Acta Electronica Sinica, 2008, 36(12A): 166–171. (in Chinese)
ZHOU Fa-guo, ZHANG Fan, YANG Bing-ru. Problems and review of statistical parsing language model [C]// Proceedings of 2010 International Conference on Asian Language Processing. Harbin, China, 2010: 77–80.
CHARNIAK E. Immediate-head parsing for language models [C]// Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics. Toulouse, France, 2001: 116–123.
SIMA'AN K. Tree-gram parsing: Lexical dependencies and structural relations [C]// Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics. Hong Kong, 2000: 53–56.
COLLINS M. Head-driven statistical models for natural language parsing [J]. Computational Linguistics, 2003, 29(4): 589–637.
LIU Shui, LI Sheng, ZHAO Tie-jun, LIU Peng-yuan. Directly smooth interpolation algorithm in head-driven parsing [J]. Journal of Software, 2009, 20(11): 2915–2924. (in Chinese)
ZHOU M. A block-based dependency parser for unrestricted Chinese text [C]// Proceedings of the 2nd Chinese Language Processing Workshop. Hong Kong, 2000: 78–84.
GAO J F, SUZUKI H. Unsupervised learning of dependency structure for language modeling [C]// Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics. Sapporo, Japan, 2003: 521–528.
LAI T B Y, HUANG C N, ZHOU M, MIAO J B, SIU K C. Span-based statistical dependency parsing of Chinese [C]// Proceedings of the 6th Natural Language Processing Pacific Rim Symposium (NLPRS2001). Tokyo, Japan, 2001: 677–684.
CHELBA C, JELINEK F. Exploiting syntactic structure for language modeling [C]// Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics. Quebec, Canada, 1998: 225–231.
ZHOU De-yu, HE Yu-lan. Discriminative training of the hidden vector state model for semantic parsing [J]. IEEE Transactions on Knowledge and Data Engineering, 2009, 21(1): 66–77.
VILARES J, ALONSO M A, VILARES M. Extraction of complex index terms in non-English IR: A shallow parsing based approach [J]. Information Processing and Management, 2008, 44(4): 1517–1537.
LI Jun-hui. Research on joint syntactic and semantic parsing for Chinese [D]. Suzhou, China: Soochow University, 2010: 6–40. (in Chinese)
YU Jiang-de, FAN Xiao-zhong, PANG Wen-bo, YU Zheng-tao. Semantic role labeling based on conditional random fields [J]. Journal of Southeast University: English Edition, 2007, 23(3): 361–364.
LI Jun-hui, ZHOU Guo-dong, ZHU Qiao-ming, QIAN Pei-de. Semantic role labeling in Chinese language for nominal predicates [J]. Journal of Software, 2011, 22(8): 1725–1737. (in Chinese)
PAUN G. A new generative device: Valence grammars [J]. Rev Roumaine Math Pures Appl, 1980, XXV(6): 911–924.
AGIĆ Ž, ŠOJAT K, TADIĆ M. An experiment in verb valency frame extraction from Croatian dependency treebank [C]// Proceedings of the 32nd International Conference on Information Technology Interfaces. Cavtat, Croatia, 2010: 55–60.
TESNIÈRE L. Éléments de syntaxe structurale [M]. Paris, France: Klincksieck, 1959: 35–76. (in French)
YUAN Li-chi. Dependency language parsing model based on word clustering [J]. Journal of Central South University: Natural Science, 2011, 42(7): 2023–2027. (in Chinese)
YUAN Li-chi. Statistical parsing with linguistic features [J]. Journal of Central South University: Natural Science, 2012, 43(3): 986–991. (in Chinese)
Additional information
Foundation item: Project(61262035) supported by the National Natural Science Foundation of China; Projects(GJJ12271, GJJ12742) supported by the Science and Technology Foundation of Education Department of Jiangxi Province, China; Project(20122BAB201033) supported by the Natural Science Foundation of Jiangxi Province, China
Cite this article
Yuan, Lc. Improved head-driven statistical models for natural language parsing. J. Cent. South Univ. 20, 2747–2752 (2013). https://doi.org/10.1007/s11771-013-1793-3