A Named Entity Recognition Approach for Albanian Using Deep Learning

  • Evis Trandafili
  • Elinda Kajo Meçe
  • Enea Duka
Chapter
Part of the Studies in Computational Intelligence book series (SCI, volume 880)

Abstract

Named Entity Recognition (NER) is an information extraction task that identifies and tags generic and/or domain-specific named entities. NER is a crucial task in the semantic processing of text, which makes it a key component of Natural Language Processing applications such as Question Answering and Machine Translation. In this paper we propose an approach to Named Entity Recognition for Albanian based on Deep Learning models. We focus on generic named entities: persons' names, geographical locations, names of organizations/institutions, and other categories. Since no annotated Albanian corpus is publicly available, we created one manually. We then built a deep neural network that uses LSTM cells as the hidden layers and a Conditional Random Field as the output layer, using both word and character tagging. Considering the complexity of the Albanian language and the small amount of NLP research done for Albanian, the results achieved are promising. The experiments also indicate that NER performance can be further improved by training the model on a larger annotated corpus.
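The abstract does not spell out how the Conditional Random Field output layer turns network scores into tags, so the following is only an illustrative sketch of the standard decoding step (Viterbi) for a linear-chain CRF, not the authors' implementation. It assumes that the LSTM layers have already produced a per-token score for every tag (the emissions) and that the CRF holds a score for every tag-to-tag transition. The tag set, the scores, the example sentence, and the function name `viterbi_decode` are all hypothetical.

```python
from typing import Dict, List


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for one sentence.

    emissions[t][tag]    : score of `tag` at token t (assumed to come from the LSTM layers)
    transitions[a][b]    : score of moving from tag `a` to tag `b` (CRF parameters)
    """
    if not emissions:
        return []

    # best[t][tag] = best score of any tag path that ends in `tag` at token t
    best = [{tag: emissions[0][tag] for tag in tags}]
    # back[t - 1][tag] = previous tag on that best path
    back = []

    for t in range(1, len(emissions)):
        scores = {}
        pointers = {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final tag to the start.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical three-token sentence, e.g. "Ismail Kadare shkroi", with made-up scores.
tags = ["O", "B-PER", "I-PER"]
emissions = [
    {"O": 1.0, "B-PER": 0.8, "I-PER": 0.0},
    {"O": 0.2, "B-PER": 0.3, "I-PER": 1.5},
    {"O": 2.0, "B-PER": 0.0, "I-PER": 0.0},
]
# All transitions are neutral except two: an entity may not continue
# directly after "O", and "B-PER" followed by "I-PER" is rewarded.
transitions = {p: {c: 0.0 for c in tags} for p in tags}
transitions["O"]["I-PER"] = -10.0
transitions["B-PER"]["I-PER"] = 1.0

decoded = viterbi_decode(emissions, transitions, tags)
print(decoded)  # ['B-PER', 'I-PER', 'O']
```

Picking the best tag for each token independently would give `O, I-PER, O` here, which is not a valid entity span. The transition scores are what let the CRF layer prefer the consistent sequence `B-PER, I-PER, O`.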

Keywords

Named entity recognition, Albanian language, Deep learning, Neural network, LSTM, CRF, Information extraction, Natural language processing

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Evis Trandafili (1, corresponding author)
  • Elinda Kajo Meçe (1)
  • Enea Duka (1)
  1. Department of Computer Engineering, Faculty of Information Technology, Polytechnic University of Tirana, Tirana, Albania
