Abstract
Named entity recognition (NER) aims to identify entities with specific meanings in text and underpins many downstream tasks in natural language processing. Deep learning-based methods have further improved NER accuracy, and most of them build on word-level and character-level embeddings. However, these methods overlook how much the global context contributes to entity recognition, so this paper proposes an attention mechanism that aggregates comprehensive information about the same word from its different contexts. Meanwhile, character-level representations affect not only the accuracy of recognizing unseen words but also the extraction of contextual representations. Considering this, we propose a label attention mechanism for extracting character-to-word representations. The proposed model uses CNN-LSTM-CRF as the baseline and integrates the two representation-extraction methods above, yielding CNN-CWR-LSTM-GCR-CRF. On top of this model, we further integrate the pre-trained language model BERT. Experiments show that our model achieves results competitive with the state of the art on the CoNLL-2002 Spanish dataset and the CoNLL-2003 and OntoNotes 5.0 English datasets.
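The abstract does not give the exact formulation of the label attention mechanism, but the general idea of pooling character embeddings into a word representation via attention over a set of learned label embeddings can be sketched as follows. This is a minimal illustration, assuming dot-product attention and mean pooling over label views; the dimensions, the `char_to_word` function, and the pooling choice are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def char_to_word(char_embs, label_embs):
    """Pool character embeddings into one word vector via label attention.

    char_embs:  (num_chars, d)  embeddings of the word's characters
    label_embs: (num_labels, d) learned embeddings, one per entity label
    """
    # Dot-product relevance of each character to each label.
    scores = char_embs @ label_embs.T            # (num_chars, num_labels)
    # Attention distribution over characters, separately per label.
    attn = softmax(scores, axis=0)               # columns sum to 1
    # One label-specific view of the word per entity label.
    label_views = attn.T @ char_embs             # (num_labels, d)
    # Pool the label views into a single character-to-word vector.
    return label_views.mean(axis=0)              # (d,)

rng = np.random.default_rng(0)
chars = rng.normal(size=(5, 8))    # a 5-character word, embedding dim 8
labels = rng.normal(size=(4, 8))   # 4 entity labels
word_vec = char_to_word(chars, labels)
print(word_vec.shape)  # (8,)
```

In the full model, a vector produced this way would be concatenated with the word embedding before the BiLSTM layer, and the CRF layer would decode the label sequence.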
Acknowledgements
This research was supported by the Shanxi Scholarship Council of China (Grant No. HGKY2019024).
Funding
Shanxi Scholarship Council of China, HGKY2019024, Xiaohong Han.
Ethics declarations
Conflict of interest
The authors declare that they have no potential or pertinent conflicts of interest with respect to this work.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Chang, J., Han, X. Character-to-Word Representation and Global Contextual Representation for Named Entity Recognition. Neural Process Lett 55, 8551–8567 (2023). https://doi.org/10.1007/s11063-023-11168-6