
Lexicalized Token Subcategory and Complex Context Based Shallow Parsing

Conference paper
Chinese Lexical Semantics (CLSW 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9332)

Abstract

Based on a second-order hidden Markov model (HMM), this paper proposes a Viterbi-decoding chunking algorithm and a novel chunking post-processing algorithm. The HMM parameter estimation method makes use of token subcategory and lexicalization information, balancing the disambiguation ability gained from subcategory and lexicalization against the data sparseness they introduce into maximum likelihood estimation (MLE). To compensate for the absence of complex context in HMM-based chunking, the paper proposes a post-processing algorithm that yields a stable improvement over the base chunking algorithm and avoids illegal token paths. Experiments show that the chunking system achieves a 93% F-measure on the CoNLL 2000 standard test corpus.
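The abstract only names the decoding procedure, so the following Python fragment is a minimal sketch of what Viterbi decoding over a second-order (trigram) HMM with IOB chunk tags involves. The function name, tag set, and probability tables are illustrative assumptions, not taken from the paper; the paper's actual parameter estimator (token subcategory plus lexicalization, with smoothing) and its post-processing stage are not reproduced here.

```python
def viterbi_chunk(tokens, tags, log_trans, log_emit):
    """Viterbi decoding for a second-order (trigram) HMM chunker (sketch).

    tokens    -- observed units, e.g. (word, POS) pairs
    tags      -- candidate chunk tags in IOB notation (B-NP, I-NP, O, ...)
    log_trans -- dict: (t_prev2, t_prev1, t) -> log P(t | t_prev2, t_prev1)
    log_emit  -- dict: (t, token) -> log P(token | t)

    Both probability tables are assumed to be pre-estimated and smoothed;
    this fragment only performs the decoding, not the estimation.
    """
    NEG_INF = float("-inf")
    BOS = "<s>"  # sentence-boundary pseudo-tag

    # delta[(t_prev1, t)]: best log-score of a path whose last two tags are (t_prev1, t)
    delta = {(BOS, BOS): 0.0}
    backptrs = []

    for token in tokens:
        new_delta, bp = {}, {}
        for (t_prev2, t_prev1), score in delta.items():
            for t in tags:
                s = (score
                     + log_trans.get((t_prev2, t_prev1, t), NEG_INF)
                     + log_emit.get((t, token), NEG_INF))
                if s > new_delta.get((t_prev1, t), NEG_INF):
                    new_delta[(t_prev1, t)] = s
                    bp[(t_prev1, t)] = t_prev2
        delta = new_delta
        backptrs.append(bp)

    # Recover the best tag sequence by following the back-pointers.
    t_prev1, t = max(delta, key=delta.get)
    path = [t, t_prev1]
    for bp in reversed(backptrs[1:]):  # backptrs[0] only points back to BOS
        t_prev2 = bp[(t_prev1, t)]
        path.append(t_prev2)
        t_prev1, t = t_prev2, t_prev1
    return [x for x in reversed(path) if x != BOS]
```

The paper's post-processing algorithm would operate on the output of such a decoder, using wider context to correct chunk decisions and to rule out illegal tag sequences (for example, an I-NP tag that does not follow B-NP or I-NP).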



Author information

Correspondence to Shui Liu.


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Liu, S., Zhang, Z., Liu, P. (2015). Lexicalized Token Subcategory and Complex Context Based Shallow Parsing. In: Lu, Q., Gao, H. (eds) Chinese Lexical Semantics. CLSW 2015. Lecture Notes in Computer Science, vol 9332. Springer, Cham. https://doi.org/10.1007/978-3-319-27194-1_46


  • DOI: https://doi.org/10.1007/978-3-319-27194-1_46

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27193-4

  • Online ISBN: 978-3-319-27194-1

  • eBook Packages: Computer Science, Computer Science (R0)
