Problems with Pruning in Automatic Creation of Semantic Valence Dictionary for Polish

  • Elżbieta Hajnicz
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5729)


In this paper we present the first step towards the automatic creation of a semantic valence dictionary of Polish verbs. First, the resources used in the process are listed. Second, we discuss how corpus-based observations are gathered into a semantic valence dictionary and then pruned. Finally, an experiment applying the method is presented and evaluated.





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Elżbieta Hajnicz, Institute of Computer Science, Polish Academy of Sciences, Poland
