Neural Networks for Featureless Named Entity Recognition in Czech

Conference paper

DOI: 10.1007/978-3-319-45510-5_20

Part of the Lecture Notes in Computer Science book series (LNCS, volume 9924)
Cite this paper as:
Straková J., Straka M., Hajič J. (2016) Neural Networks for Featureless Named Entity Recognition in Czech. In: Sojka P., Horák A., Kopeček I., Pala K. (eds) Text, Speech, and Dialogue. TSD 2016. Lecture Notes in Computer Science, vol 9924. Springer, Cham

Abstract

We present a completely featureless, language-agnostic named entity recognition system. Following recent advances in artificial neural network research, the recognizer employs parametric rectified linear units (PReLU), word embeddings, and character-level embeddings based on gated recurrent units (GRU). Without any feature engineering, using only surface forms, lemmas and tags as input, the network achieves excellent results in Czech NER and surpasses the previously published state-of-the-art Czech NER systems, which rely on manually designed rule-based orthographic classification features. Furthermore, the neural network achieves robust results even when only surface forms are available as input. In addition, the proposed neural network can also take the manually designed rule-based orthographic classification features as input, and in this combination it exceeds the current state of the art by a wide margin.
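
As a brief illustration of one building block named in the abstract, the following is a minimal sketch of the standard PReLU activation, which passes positive inputs unchanged and scales negative inputs by a slope `a`. It is not the authors' implementation: the function names `prelu` and `prelu_layer` are hypothetical, and the slope is fixed at an illustrative 0.25 here, whereas in a trained network it would be a learned parameter.

```python
def prelu(x, a):
    """Parametric ReLU: identity for positive inputs, slope `a` for the rest."""
    return x if x > 0 else a * x


def prelu_layer(values, a=0.25):
    """Apply PReLU element-wise; `a` is fixed here only for illustration."""
    return [prelu(v, a) for v in values]


activations = prelu_layer([-2.0, 0.0, 3.0], a=0.25)
print(activations)  # [-0.5, 0.0, 3.0]
```

With `a = 0` this reduces to the ordinary rectified linear unit; a small positive `a` lets negative pre-activations carry a scaled signal instead of being zeroed out.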

Keywords

Neural networks; Named entity recognition; Czech; Word embeddings; Character-level embeddings; Parametric rectified linear unit (PReLU); Gated recurrent unit (GRU)

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics, Charles University in Prague, Prague, Czech Republic
