Verb Subcategorisation Acquisition for Estonian Based on Morphological Information

  • Siim Orasmaa
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8082)


This paper presents a method for the automatic acquisition of verb subcategorisation information for Estonian. The method focuses on detecting subcategorisation relations between verbs and nominal phrases. To decide whether an argument candidate is likely governed by a verb, it compares the candidate’s verb-specific frequency ranking with its global frequency ranking. The method requires only limited linguistic annotation of the input corpora: morphological annotations and clause boundary annotations. The results are evaluated against a manually built valency lexicon.
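The ranking comparison described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's actual procedure: the abstract does not give the decision rule, so the threshold `min_rank_gain`, the tie-breaking, the function names, and the toy clause data (verb lemmas paired with case labels of nominals in the same clause) are all hypothetical.

```python
from collections import Counter

def rank_positions(counts):
    """Map each item to its 1-based rank by descending frequency.
    Ties are broken alphabetically (an arbitrary choice for this sketch)."""
    ordered = sorted(counts, key=lambda item: (-counts[item], item))
    return {item: pos for pos, item in enumerate(ordered, start=1)}

def likely_governed(clauses, verb, min_rank_gain=1):
    """Return candidates that rank higher with `verb` than they do globally.

    clauses: iterable of (verb_lemma, [candidate labels]) pairs, one per clause.
    min_rank_gain: hypothetical threshold on (global rank - verb-specific rank).
    """
    global_counts = Counter()
    verb_counts = Counter()
    for clause_verb, candidates in clauses:
        global_counts.update(candidates)
        if clause_verb == verb:
            verb_counts.update(candidates)
    global_rank = rank_positions(global_counts)
    verb_rank = rank_positions(verb_counts)
    return sorted(
        c for c in verb_rank
        if global_rank[c] - verb_rank[c] >= min_rank_gain
    )

# Toy data (invented): case labels of nominals co-occurring with a verb in a clause.
clauses = [
    ("andma", ["nom", "part", "all"]),
    ("andma", ["nom", "all"]),
    ("andma", ["all", "part"]),
    ("olema", ["nom", "part"]),
    ("olema", ["nom"]),
    ("lugema", ["nom", "part"]),
    ("lugema", ["nom", "part"]),
]

print(likely_governed(clauses, "andma"))
print(likely_governed(clauses, "lugema"))
```

In this toy corpus the allative label is the least frequent candidate globally but the most frequent one with "andma", so it is the only candidate flagged for that verb; a real system would need frequency cut-offs and a more careful treatment of ties.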


verb subcategorisation acquisition; morphological information; frequency ranking comparisons; Estonian





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Siim Orasmaa (1)
  1. Institute of Computer Science, University of Tartu, Tartu, Estonia
