Lexicalized Token Subcategory and Complex Context Based Shallow Parsing

Liu, Shui; Zhang, Zheng; Liu, Pengyuan

doi:10.1007/978-3-319-27194-1_46

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9332)

Included in the following conference series:

Workshop on Chinese Lexical Semantics

Abstract

Based on a second-order hidden Markov model (HMM), this paper proposes a chunking algorithm with Viterbi decoding and a novel chunking post-processing algorithm. The HMM parameters are estimated using token subcategory and lexicalization information, in a way that balances the added disambiguation ability against the data sparseness that subcategorization and lexicalization cause in maximum likelihood estimation (MLE). To compensate for the absence of complex context in HMM-based chunking, the paper also proposes a post-processing algorithm that yields a stable improvement over the chunking algorithm and avoids illegal token paths in the chunking output. Experiments indicate that the chunking system achieves an F-measure of 93% on the CoNLL 2000 standard test corpus.
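As a rough illustration of the decoding step named in the abstract, the following is a minimal sketch of Viterbi decoding for a second-order HMM, in which each search state is a pair of adjacent tags. It is not the authors' implementation. The IOB2 tag set, the toy part-of-speech tokens, the uniform transition model, and all function names (`viterbi2`, `legal`, `toy_log_trans`, `toy_log_emit`) are assumptions made for illustration. The legality check applied during decoding is only a simple stand-in for the idea of avoiding illegal token paths; the abstract does not describe how the paper's post-processing algorithm works.

```python
import math

def legal(prev_tag, tag):
    """Assumed IOB2 constraint: I-X may only follow B-X or I-X."""
    if tag.startswith("I-"):
        return prev_tag in ("B-" + tag[2:], "I-" + tag[2:])
    return True

def viterbi2(tokens, tags, log_trans, log_emit, start="<s>"):
    """Most probable tag sequence under a second-order HMM.

    log_trans(u, v, t) is log P(t | u, v); log_emit(t, w) is log P(w | t).
    Each search state is the pair (previous tag, current tag).
    Assumes at least one tag (such as "O") is legal in any position.
    """
    if not tokens:
        return []
    best = {(start, start): 0.0}   # best log-score of a path ending in each state
    back = []                      # back[i][(v, t)] = tag two positions before i
    for word in tokens:
        new_best, pointers = {}, {}
        for (u, v), score in best.items():
            for t in tags:
                if not legal(v, t):
                    continue       # skip illegal tag transitions
                s = score + log_trans(u, v, t) + log_emit(t, word)
                if (v, t) not in new_best or s > new_best[(v, t)]:
                    new_best[(v, t)] = s
                    pointers[(v, t)] = u
        best = new_best
        back.append(pointers)
    # Backtrace from the best final state.
    u, v = max(best, key=best.get)
    path = [v]
    for i in range(len(tokens) - 1, 0, -1):
        path.append(u)
        u, v = back[i][(u, v)], u
    return path[::-1]

# Hypothetical toy model: tokens are POS tags, transitions are uniform.
TAGS = ["B-NP", "I-NP", "O"]
EMIT = {
    "B-NP": {"DT": 0.6, "NN": 0.3, "VBD": 0.1},
    "I-NP": {"DT": 0.1, "NN": 0.8, "VBD": 0.1},
    "O":    {"DT": 0.1, "NN": 0.1, "VBD": 0.8},
}

def toy_log_trans(u, v, t):
    return math.log(1.0 / len(TAGS))

def toy_log_emit(t, w):
    return math.log(EMIT[t].get(w, 1e-6))  # small floor for unseen tokens

print(viterbi2(["DT", "NN", "VBD"], TAGS, toy_log_trans, toy_log_emit))
```

In a real system the transition and emission functions would come from smoothed corpus estimates rather than the toy tables used here.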

Author information

Authors and Affiliations

School of Foreign Languages and Literature, Beijing Normal University, Beijing, China
Shui Liu & Zheng Zhang
Institute of Information Science and Technology, Beijing Language and Culture University, Beijing, China
Pengyuan Liu

Corresponding author

Correspondence to Shui Liu.

Editor information

Editors and Affiliations

The Hong Kong Polytechnic University, Hong Kong, Hong Kong
Qin Lu
Nanyang Technological University, Singapore, Singapore
Helena Hong Gao

About this paper

Cite this paper

Liu, S., Zhang, Z., Liu, P. (2015). Lexicalized Token Subcategory and Complex Context Based Shallow Parsing. In: Lu, Q., Gao, H. (eds) Chinese Lexical Semantics. CLSW 2015. Lecture Notes in Computer Science, vol 9332. Springer, Cham. https://doi.org/10.1007/978-3-319-27194-1_46

DOI: https://doi.org/10.1007/978-3-319-27194-1_46
Published: 12 January 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27193-4
Online ISBN: 978-3-319-27194-1
eBook Packages: Computer Science; Computer Science (R0)
