Preliminary Study on the Construction of Bilingual Phrase Structure Treebank

  • Conference paper
Chinese Lexical Semantics (CLSW 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8922)

Abstract

Treebanks are an important resource for natural language processing. Most existing treebanks are monolingual, yet bilingual treebanks are an important basis for syntactic models in machine translation. This paper describes the preliminary construction of a bilingual phrase structure treebank intended for machine translation. Its tagging system adopts the part-of-speech (POS) and syntactic tagsets of the University of Pennsylvania (UPenn) English Treebank and Chinese Treebank. The Chinese-English sentence pairs in the treebank, drawn from machine translation evaluation data, were pre-processed, POS-tagged, and annotated with phrase structure, and all processed data were then proofread. An analysis of the phrase structures modified during proofreading shows that the usages of Chinese function words play an important role in Chinese phrase structure grammar.
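As a rough illustration of what one entry in such a treebank might look like, the sketch below parses a pair of Penn-style bracketed trees and counts their phrase labels. The sentence pair, the `Node` class, and the helper names `parse_bracketed`, `phrase_labels`, and `leaves` are hypothetical and are not taken from the paper; only the bracketed notation and the tag names (S, NP, VP, IP and so on) follow the Penn English and Chinese Treebank conventions.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)  # Node objects or word strings


def parse_bracketed(text):
    """Parse one Penn-style bracketed tree, e.g. "(NP (NN tree))"."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        node = Node(tokens[pos])  # first token after '(' is the label
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.children.append(read_node())
            else:
                node.children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # consume ')'
        return node

    root = read_node()
    if pos != len(tokens):
        raise ValueError("trailing tokens after tree")
    return root


def phrase_labels(node):
    """Count labels of phrase-level nodes (those with at least one Node child)."""
    counts = Counter()
    if any(isinstance(c, Node) for c in node.children):
        counts[node.label] += 1
    for child in node.children:
        if isinstance(child, Node):
            counts.update(phrase_labels(child))
    return counts


def leaves(node):
    """Return the words of the tree in left-to-right order."""
    words = []
    for child in node.children:
        words.extend(leaves(child) if isinstance(child, Node) else [child])
    return words


# Hypothetical sentence pair, invented for illustration only.
en_tree = parse_bracketed("(S (NP (PRP I)) (VP (VBP like) (NP (NNS books))))")
zh_tree = parse_bracketed("(IP (NP (PN 我)) (VP (VV 喜欢) (NP (NN 书))))")

print(leaves(en_tree), dict(phrase_labels(en_tree)))
print(leaves(zh_tree), dict(phrase_labels(zh_tree)))
```

Comparing label counts before and after proofreading is one simple way such a structure could support the kind of modification analysis the abstract mentions, though the paper's actual procedure is not described here.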

Author information

Corresponding author

Correspondence to Kunli Zhang.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhang, K., Zan, H., Han, Y., Mu, L. (2014). Preliminary Study on the Construction of Bilingual Phrase Structure Treebank. In: Su, X., He, T. (eds) Chinese Lexical Semantics. CLSW 2014. Lecture Notes in Computer Science, vol 8922. Springer, Cham. https://doi.org/10.1007/978-3-319-14331-6_40

  • DOI: https://doi.org/10.1007/978-3-319-14331-6_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-14330-9

  • Online ISBN: 978-3-319-14331-6

  • eBook Packages: Computer Science, Computer Science (R0)
