Named-Entity Recognition for Polish with SProUT

  • Jakub Piskorski
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3490)

Abstract

Although considerable work on named-entity recognition exists for a few major languages, research on this topic in the context of Slavonic languages has been largely neglected. This paper presents a rule-based named-entity recognition system for Polish built on top of SProUT, a novel multilingual NLP platform. We pinpoint the difficulties encountered and present some promising evaluation results.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Jakub Piskorski (DFKI GmbH, Saarbrücken, Germany)
