Abstract
Named entity recognition (NER) aims to identify entities with specific meanings in text and underpins many downstream tasks in natural language processing. Deep learning-based methods have further improved NER accuracy, and most of them build on word-level and character-level embeddings. However, these methods overlook how much the global context contributes to entity recognition, so this paper proposes an attention mechanism that aggregates comprehensive information about the same word from its different contexts. Meanwhile, character-level representations affect not only the accuracy of recognizing unseen words but also the extraction of contextual representations. Considering this, we propose a label attention mechanism for extracting character-to-word representations. The proposed model uses CNN-LSTM-CRF as the baseline and integrates the two representation-extraction methods above, yielding CNN-CWR-LSTM-GCR-CRF. On top of this model, we further integrate the pre-trained language model BERT. Experiments show that our model achieves results competitive with the state of the art on the CoNLL-2002 Spanish dataset and the CoNLL-2003 and OntoNotes 5.0 English datasets.
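The abstract does not give the exact formulation of the label attention mechanism, but the general idea of pooling character embeddings into a word representation via attention over a set of learned label embeddings can be sketched as follows. This is a minimal illustration, assuming dot-product attention and mean pooling over label views; the dimensions, the `char_to_word` function, and the pooling choice are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def char_to_word(char_embs, label_embs):
    """Pool character embeddings into one word vector via label attention.

    char_embs:  (num_chars, d)  embeddings of the word's characters
    label_embs: (num_labels, d) learned embeddings, one per entity label
    """
    # Dot-product relevance of each character to each label.
    scores = char_embs @ label_embs.T            # (num_chars, num_labels)
    # Attention distribution over characters, separately per label.
    attn = softmax(scores, axis=0)               # columns sum to 1
    # One label-specific view of the word per entity label.
    label_views = attn.T @ char_embs             # (num_labels, d)
    # Pool the label views into a single character-to-word vector.
    return label_views.mean(axis=0)              # (d,)

rng = np.random.default_rng(0)
chars = rng.normal(size=(5, 8))    # a 5-character word, embedding dim 8
labels = rng.normal(size=(4, 8))   # 4 entity labels
word_vec = char_to_word(chars, labels)
print(word_vec.shape)  # (8,)
```

In the full model, a vector produced this way would be concatenated with the word embedding before the BiLSTM layer, and the CRF layer would decode the label sequence.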
Acknowledgements
This research was supported by the Shanxi Scholarship Council of China (Grant No. HGKY2019024).
Funding
Shanxi Scholarship Council of China, HGKY2019024, Xiaohong Han.
Ethics declarations
Conflict of interest
The authors declare that they have no potential or pertinent conflicts of interest with respect to this work.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Chang, J., Han, X. Character-to-Word Representation and Global Contextual Representation for Named Entity Recognition. Neural Process Lett 55, 8551–8567 (2023). https://doi.org/10.1007/s11063-023-11168-6