Abstract
This paper describes the development of the Traditional Mongolian Dependency Treebank (TMDT), which aims to facilitate dependency analysis of Traditional Mongolian. The annotation scheme is based on Traditional Mongolian grammar and on the scheme's usability in syntactic analysis. The treebank is annotated at two levels, morphological and analytical. At the morphological level, a semi-automatic strategy is adopted: the part-of-speech (POS) tag and the stem of each word in a sentence are first produced with automatic tools and then manually corrected. At the analytical level, the dependencies in each sentence are annotated entirely by hand, according to the constituent structure and the annotation scheme. The treebank lays the foundation for dependency parsing of Traditional Mongolian and can be extended to a multi-dependency treebank.
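The two annotation levels described above can be pictured with a small data structure. The following is only a minimal sketch under stated assumptions, not the TMDT file format: the class `Token`, the function `is_well_formed`, the placeholder word forms, and the tags `N`, `V`, `subj`, `obj`, `root` are all hypothetical names chosen for illustration and are not taken from the TMDT annotation scheme.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    """One word of a sentence with both annotation levels (hypothetical layout)."""
    index: int                    # 1-based position in the sentence
    form: str                     # surface word (placeholder strings below)
    pos: str                      # morphological level: POS tag
    stem: str                     # morphological level: stem
    head: Optional[int] = None    # analytical level: index of the governor, 0 = root
    deprel: Optional[str] = None  # analytical level: dependency label


def is_well_formed(sentence: List[Token]) -> bool:
    """Check that the head links form a single-rooted dependency tree."""
    n = len(sentence)
    # Token indices must be exactly 1..n.
    if sorted(t.index for t in sentence) != list(range(1, n + 1)):
        return False
    heads = {t.index: t.head for t in sentence}
    # Every token needs a head in 0..n (None means not yet annotated).
    if any(h is None or not 0 <= h <= n for h in heads.values()):
        return False
    # Exactly one token may attach to the artificial root 0.
    if sum(1 for h in heads.values() if h == 0) != 1:
        return False
    # Following head links from any token must reach the root without a cycle.
    for start in heads:
        seen = set()
        node = start
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node]
    return True


# Invented verb-final example: tags and labels are illustrative only.
example = [
    Token(1, "w1", "N", "w1", head=3, deprel="subj"),
    Token(2, "w2", "N", "w2", head=3, deprel="obj"),
    Token(3, "w3", "V", "w3", head=0, deprel="root"),
]
print(is_well_formed(example))
```

In such a layout, the `pos` and `stem` fields would be pre-filled by automatic tools and then corrected by hand, while `head` and `deprel` would be entered manually; a check like `is_well_formed` is one way an annotation tool might catch missing heads or cycles.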
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Su, X., Gao, G., Yan, X. (2013). Development of Traditional Mongolian Dependency Treebank. In: Sun, M., Zhang, M., Lin, D., Wang, H. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. NLP-NABD 2013, CCL 2013. Lecture Notes in Computer Science, vol. 8202. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41491-6_23
DOI: https://doi.org/10.1007/978-3-642-41491-6_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41490-9
Online ISBN: 978-3-642-41491-6
eBook Packages: Computer Science, Computer Science (R0)