Abstract
This article presents the design of a syntactico-semantic dictionary for Polish, i.e., a valence dictionary enriched with certain semantic information. Valence dictionaries, which specify the number and morphosyntactic form of the arguments of verbs, are useful in many Natural Language Processing applications, including deep parsing (e.g., for machine translation), shallow parsing (e.g., for information extraction), and rule-based morphosyntactic disambiguation (e.g., for corpus annotation). The article proposes an approach based on recent results in formal and computational linguistics. It takes into consideration the morphosyntactic and syntactic structure of Polish and avoids various known problems of previous valence dictionaries, some of which stem from an impoverished theoretical framework that cannot properly handle the syntax-semantics interface, case variations and raising predicates. An implementation of a grammar of Polish deploying the ideas presented here is currently under development.
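To make the notion of a valence entry concrete, the following is a minimal sketch of how "the number and morphosyntactic form of the arguments" of a verb might be recorded. It is an illustration only, not the representation proposed in the article: the names `Argument`, `ValenceFrame` and `licenses` are hypothetical, and the sketch deliberately ignores case variation and raising, which the abstract names as problems a real design has to handle.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Argument:
    """One argument slot: a phrase category plus an optional case value."""
    category: str                 # e.g. "np" for a nominal phrase
    case: Optional[str] = None    # e.g. "nom", "acc", "dat"; None if not applicable


@dataclass(frozen=True)
class ValenceFrame:
    """A verb lemma together with the arguments it takes (hypothetical layout)."""
    lemma: str
    arguments: Tuple[Argument, ...]

    @property
    def arity(self) -> int:
        # The number of arguments is simply the number of slots.
        return len(self.arguments)


def licenses(frame: ValenceFrame,
             observed: Sequence[Tuple[str, Optional[str]]]) -> bool:
    """Check whether observed (category, case) pairs fill the frame exactly.

    Order is ignored, since Polish word order is relatively free.
    """
    expected = sorted((a.category, a.case or "") for a in frame.arguments)
    found = sorted((cat, case or "") for cat, case in observed)
    return expected == found


# Illustrative entry for "dać" ("to give"): nominative subject,
# accusative direct object, dative indirect object.
DAC = ValenceFrame(
    lemma="dać",
    arguments=(
        Argument("np", "nom"),
        Argument("np", "acc"),
        Argument("np", "dat"),
    ),
)
```

A flat list of fixed case values like this is exactly the kind of simplification that breaks down once an argument's case depends on its syntactic context, which is one reason a richer theoretical framework is needed.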
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Przepiórkowski, A. (2004). Towards the Design of a Syntactico-Semantic Lexicon for Polish. In: Kłopotek, M.A., Wierzchoń, S.T., Trojanowski, K. (eds) Intelligent Information Processing and Web Mining. Advances in Soft Computing, vol 25. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39985-8_25
DOI: https://doi.org/10.1007/978-3-540-39985-8_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21331-4
Online ISBN: 978-3-540-39985-8
eBook Packages: Springer Book Archive