Issues in Analyzing Telugu Sentences towards Building a Telugu Treebank

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2010)

Abstract

This paper describes an effort towards building a Telugu Dependency Treebank. We discuss the basic framework and the issues we encountered while annotating. A total of 1487 sentences have been annotated in the Paninian framework. We also discuss how some of the annotation decisions would affect the development of a parser for Telugu.




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vempaty, C. et al. (2010). Issues in Analyzing Telugu Sentences towards Building a Telugu Treebank. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2010. Lecture Notes in Computer Science, vol 6008. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12116-6_5

  • DOI: https://doi.org/10.1007/978-3-642-12116-6_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12115-9

  • Online ISBN: 978-3-642-12116-6

  • eBook Packages: Computer Science, Computer Science (R0)
