Abstract
In this paper we present the first step towards the automatic creation of semantic valence dictionary of Polish verbs. First, resources used in the process are listed. Second, the way of gathering corpus-based observations into a semantic valence dictionary and pruning them is discussed. Finally, an experiment in the application of the method is presented and evaluated.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Brent, M.R.: From grammar to lexicon: unsupervised learning of lexical syntax. Computational Linguistics 19(2), 243–262 (1993)
Briscoe, T., Carrol, J.: Automatic extraction of subcategorization from corpora. In: Proceedings of the 5th ACL Conference on Applied Natural Language Processing, Washington, DC, pp. 356–363 (1997)
Hajnicz, E.: Dobór czasowników do badań przy tworzeniu słownika semantycznego czasowników polskich. Technical Report 1003, Institute of Computer Science, Polish Academy of Sciences, Warsaw (2007)
Świdziński, M.: Syntactic Dictionary of Polish Verbs. Uniwersytet Warszawski / Universiteit van Amsterdam (1994)
Dȩbowski, Ł., Woliński, M.: Argument co-occurrence matrix as a description of verb valence. In: Vetulani, Z. (ed.) Proceedings of the 3rd Language & Technology Conference, Poznań, Poland, pp. 260–264 (2007)
Hajnicz, E.: Semantic annotation of verb arguments in shallow parsed Polish sentences by means of EM selection algorithm. In: Marciniak, M., Mykowiecka, A. (eds.) Aspects of Natural Language Processing. LNCS, vol. 5070. Springer, Heidelberg (2009)
Derwojedowa, M., Piasecki, M., Szpakowicz, S., Zawisławska, M.: Polish WordNet on a shoestring. In: Data Structures for Linguistic Resources and Applications: Proceedings of the GLDV 2007 Biannual Conference of the Society for Computational Linguistics and Language Technology, Universita̋t Tűbingen, Tűbingen, Germany, pp. 169–178 (2007)
Derwojedowa, M., Piasecki, M., Szpakowicz, S., Zawisławska, M., Broda, B.: Words, concepts and relations in the construction of Polish WordNet. In: Tanacs, A., Csendes, D., Vincze, V., Fellbaum, C., Vossen, P. (eds.) Proceedings of the Global WordNet Conference, Seged, Hungary, pp. 162–177 (2008)
Derwojedowa, M., Szpakowicz, S., Zawisławska, M., Piasecki, M.: Lexical units as the centrepiece of a wordnet. In: Kłopotek, M.A., Przepiórkowski, A., Wierzchoń, S.T. (eds.) Proceedings of the Intelligent Information Systems XVI (IIS 2008). Challenging Problems in Science: Computer Science. Academic Publishing House Exit, Zakopane (2008)
Przepiórkowski, A.: The IPI PAN corpus. Preliminary version. Institute of Computer Science, Polish Academy of Sciences, Warsaw (2004)
Woliński, M.: Komputerowa weryfikacja gramatyki Świdzińskiego. PhD thesis, Institute of Computer Science, Polish Academy of Sciences, Warsaw (2004)
Świdziński, M.: Gramatyka formalna jȩzyka polskiego. Rozprawy Uniwersytetu Warszawskiego. Wydawnictwa Uniwersytetu Warszawskiego, Warsaw (1992)
Dȩbowski, Ł.: Valence extraction using the EM selection and co-occurrence matrices. arXiv (2007)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hajnicz, E. (2009). Problems with Pruning in Automatic Creation of Semantic Valence Dictionary for Polish. In: Matoušek, V., Mautner, P. (eds) Text, Speech and Dialogue. TSD 2009. Lecture Notes in Computer Science(), vol 5729. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04208-9_21
Download citation
DOI: https://doi.org/10.1007/978-3-642-04208-9_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04207-2
Online ISBN: 978-3-642-04208-9
eBook Packages: Computer ScienceComputer Science (R0)