
Deep learning for named entity recognition: a survey

  • Review
  • Published in: Neural Computing and Applications

Abstract

Named entity recognition (NER) aims to identify entities and their types in unstructured text, and its output can be used, for example, to construct knowledge graphs. Traditional methods rely heavily on manual feature engineering and struggle to scale to large datasets with complex linguistic contexts. In recent years, with the development of deep learning, many deep learning-based NER methods have emerged. This paper begins with a succinct introduction to the problem definition and the limitations of traditional methods. It enumerates commonly used NER datasets suitable for deep learning methods and categorizes them into three classes based on the complexity of the named entities involved. It then summarizes typical deep learning-based NER methods in detail, following the historical development of deep learning models. Subsequently, it conducts an in-depth analysis and comparison of methods that achieve outstanding performance on representative, widely used datasets. Furthermore, the paper reproduces and analyzes the recognition results of several typical models on three typical datasets of different types. Finally, it concludes with insights into future trends in NER research.
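To make the task definition concrete, NER is commonly cast as sequence labeling: each token receives a tag under a scheme such as BIO (B-X begins an entity of type X, I-X continues it, O marks non-entity tokens), and entity spans are decoded from the tag sequence. The sketch below is purely illustrative and not taken from the paper; the sentence, tags, and function name are invented for demonstration.

```python
def extract_entities(tokens, tags):
    """Decode (entity_text, entity_type) spans from BIO-tagged tokens."""
    entities = []
    current, current_type = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins; flush any span in progress.
            if current:
                entities.append((" ".join(current), current_type))
            current, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == current_type:
            # Continuation of the current entity.
            current.append(token)
        else:
            # O tag (or an ill-formed I- tag): close any open span.
            if current:
                entities.append((" ".join(current), current_type))
            current, current_type = [], None
    if current:
        entities.append((" ".join(current), current_type))
    return entities

tokens = ["Barack", "Obama", "visited", "Microsoft", "in", "Seattle"]
tags   = ["B-PER",  "I-PER", "O",       "B-ORG",     "O",  "B-LOC"]
print(extract_entities(tokens, tags))
# [('Barack Obama', 'PER'), ('Microsoft', 'ORG'), ('Seattle', 'LOC')]
```

This flat-tagging view covers the simplest class of datasets; nested and discontinuous entities, which the survey also covers, require richer output structures than a single tag per token.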



Data availability statement

The authors confirm that all relevant data are included in the article.


Acknowledgements

The authors would like to thank the anonymous reviewers for their critical and constructive comments and suggestions. This work was supported in part by the National Natural Science Foundation of China under Grant 61976080, in part by the Academic Degrees and Graduate Education Reform Project of Henan Province under Grant 2021SJGLX195Y and in part by the Innovation and Quality Improvement Project for Graduate Education of Henan University under Grants SYL20010101, SYLYC2022191, and SYLYC2022192.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Wei Hou.

Ethics declarations

Conflict of interest

The authors declare that there are no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Hu, Z., Hou, W. & Liu, X. Deep learning for named entity recognition: a survey. Neural Comput & Applic (2024). https://doi.org/10.1007/s00521-024-09646-6

