International Conference on Text, Speech, and Dialogue

Text, Speech, and Dialogue pp 378-386

Heuristic Algorithm for Zero Subject Detection in Polish

Conference paper

DOI: 10.1007/978-3-319-24033-6_43

Part of the Lecture Notes in Computer Science book series (LNCS, volume 9302)
Cite this paper as:
Kaczmarek A., Marcińczuk M. (2015) Heuristic Algorithm for Zero Subject Detection in Polish. In: Král P., Matoušek V. (eds) Text, Speech, and Dialogue. Lecture Notes in Computer Science, vol 9302. Springer, Cham

Abstract

This article describes a heuristic approach to zero subject detection in Polish. It focuses on zero subject detection as a crucial step in end-to-end coreference resolution. Zero subject verbs are recognized using a set of manually created rules that draw on information from several sources: a dependency parser, a shallow relational parser, and a valence dictionary. The rules were developed and evaluated on the Polish Coreference Corpus. The experimental results show that the presented method significantly outperforms MentionDetector, the only machine learning-based alternative for Polish. We also discuss and evaluate the importance of zero subject detection for existing coreference resolution tools for Polish.

Keywords

Zero subject, Anaphora detection, Coreference resolution, Polish

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Computational Intelligence Research Group, Institute of Computer Science, University of Wrocław, Wrocław, Poland
  2. G4.19 Research Group: Computational Linguistics and Language Technology, Department of Computational Intelligence, Wrocław University of Technology, Wrocław, Poland
