A deep neural framework for named entity recognition with boosted word embeddings

Multimedia Tools and Applications

Abstract

Deep neural architectures now outperform earlier techniques for named entity recognition (NER), requiring less effort while improving accuracy. The present study applies a Bidirectional Long Short-Term Memory (BiLSTM) network, a recurrent neural network variant well suited to sequence classification problems, to the NER task. Contextual word features play an important role in the correct prediction of named entities. The main contribution of this work is a novel word representation, boosted word embeddings, which can learn contextual features efficiently. Boosted word embeddings combine character-based convolutional embeddings, part-of-speech embeddings, and word length embeddings. With boosted word embeddings, the model achieves F-scores of 73.99%, 66.94%, and 77.95% for the Hindi, Punjabi, and bilingual Hindi-Punjabi named entity recognition systems, respectively. Boosted word embeddings are found to raise the accuracy of the named entity recognition model compared with models previously reported in the literature.
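To make the composition of a boosted word embedding concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not say how the three components are combined, so concatenation is an assumption, and all dimensions, the toy tag set, the random (untrained) weights, and the names `char_cnn` and `boosted_embedding` are hypothetical. In a full model, one such vector per token would be the input to the BiLSTM tagger.

```python
import random

random.seed(0)  # reproducible toy weights

# Hypothetical sizes and tag set, chosen only for illustration.
CHAR_DIM, N_FILTERS, KERNEL = 4, 6, 3
POS_DIM, LEN_DIM, MAX_LEN = 3, 2, 10
POS_TAGS = ["NN", "NNP", "VM", "PSP", "UNK"]


def rand_vec(n):
    """Return n random weights; a trained model would learn these."""
    return [random.uniform(-0.5, 0.5) for _ in range(n)]


class LookupTable:
    """Maps a key to a fixed random vector, created on first use."""

    def __init__(self, dim):
        self.dim = dim
        self.rows = {}

    def __getitem__(self, key):
        if key not in self.rows:
            self.rows[key] = rand_vec(self.dim)
        return self.rows[key]


char_table = LookupTable(CHAR_DIM)
pos_table = LookupTable(POS_DIM)
len_table = LookupTable(LEN_DIM)
# One filter holds KERNEL * CHAR_DIM weights applied to a window of characters.
filters = [rand_vec(KERNEL * CHAR_DIM) for _ in range(N_FILTERS)]


def char_cnn(word):
    """Character-level 1D convolution followed by max-over-time pooling."""
    chars = [char_table[c] for c in word]  # one vector per Unicode code point
    while len(chars) < KERNEL:             # zero-pad very short words
        chars.append([0.0] * CHAR_DIM)
    pooled = []
    for f in filters:
        scores = []
        for start in range(len(chars) - KERNEL + 1):
            window = [x for vec in chars[start:start + KERNEL] for x in vec]
            scores.append(sum(w * x for w, x in zip(f, window)))
        pooled.append(max(scores))
    return pooled


def boosted_embedding(word, pos_tag):
    """Concatenate char-CNN, POS, and word-length features (assumed scheme)."""
    tag = pos_tag if pos_tag in POS_TAGS else "UNK"
    length_bucket = min(len(word), MAX_LEN)  # cap long words in one bucket
    return char_cnn(word) + pos_table[tag] + len_table[length_bucket]


sentence = [("राम", "NNP"), ("दिल्ली", "NNP"), ("गया", "VM")]
vectors = [boosted_embedding(w, t) for w, t in sentence]
print(len(vectors), len(vectors[0]))  # 3 tokens, 6 + 3 + 2 = 11 features each
```

Each token vector has a fixed width regardless of word length, because max-over-time pooling reduces the variable number of character windows to one value per filter.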



Author information

Corresponding author

Correspondence to Archana Goyal.

Ethics declarations

Competing interest/Conflicts of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. The authors also declare that they have no conflicts of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Goyal, A., Gupta, V. & Kumar, M. A deep neural framework for named entity recognition with boosted word embeddings. Multimed Tools Appl 83, 15533–15546 (2024). https://doi.org/10.1007/s11042-023-16176-1


  • DOI: https://doi.org/10.1007/s11042-023-16176-1
