
Abstract

This paper describes the development of the Traditional Mongolian Dependency Treebank (TMDT), which aims to facilitate dependency analysis of Traditional Mongolian. The annotation scheme is based on Traditional Mongolian grammar and on its usability in syntactic analysis. The treebank is annotated at two levels, morphological and analytical. At the morphological level, a semi-automatic strategy is adopted: the part-of-speech (POS) tag and the stem of each word in a sentence are produced by automatic tools and then manually corrected. At the analytical level, the dependencies in each sentence are annotated entirely by hand, following the constituent structure and the annotation scheme. The treebank lays the foundation for dependency parsing of Traditional Mongolian and can be extended to a multi-dependency treebank.
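As a rough illustration of the two annotation levels described above, the following minimal Python sketch stores the morphological layer (stem and POS) and the analytical layer (head and relation) on each token, and checks that the dependencies form a single rooted tree. It is not the TMDT format: the `Token` fields, the `is_well_formed` helper, the placeholder word forms, and the tag and relation names are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    index: int   # 1-based position in the sentence
    form: str    # surface word form (placeholder strings below)
    stem: str    # morphological level: stem of the word
    pos: str     # morphological level: POS tag (hypothetical tag names)
    head: int    # analytical level: index of the governing token, 0 = root
    deprel: str  # analytical level: relation label (hypothetical names)


def is_well_formed(sentence: List[Token]) -> bool:
    """Return True if the heads describe a single rooted tree."""
    n = len(sentence)
    heads = {t.index: t.head for t in sentence}
    # Token indices must be exactly 1..n.
    if sorted(heads) != list(range(1, n + 1)):
        return False
    # Every head must point at the root (0) or at a token in the sentence.
    if any(h < 0 or h > n for h in heads.values()):
        return False
    # Exactly one token may attach to the artificial root.
    if sum(1 for h in heads.values() if h == 0) != 1:
        return False
    # Following heads upward from any token must reach the root without a cycle.
    for start in heads:
        seen = set()
        node = start
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node]
    return True


# Placeholder sentence: the forms, stems, tags, and labels are invented.
sentence = [
    Token(1, "w1", "s1", "NOUN", 3, "subj"),
    Token(2, "w2", "s2", "NOUN", 3, "obj"),
    Token(3, "w3", "s3", "VERB", 0, "root"),
]
print(is_well_formed(sentence))
```

Keeping both layers on one record mirrors the workflow in the abstract: the `stem` and `pos` fields could be filled by automatic tools and then corrected, while `head` and `deprel` would be entered by an annotator and validated with a check of this kind.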

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Su, X., Gao, G., Yan, X. (2013). Development of Traditional Mongolian Dependency Treebank. In: Sun, M., Zhang, M., Lin, D., Wang, H. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. NLP-NABD CCL 2013. Lecture Notes in Computer Science, vol 8202. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41491-6_23

  • DOI: https://doi.org/10.1007/978-3-642-41491-6_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41490-9

  • Online ISBN: 978-3-642-41491-6

  • eBook Packages: Computer Science (R0)
