
A parallel computing-based Deep Attention model for named entity recognition

The Journal of Supercomputing

Abstract

Named entity recognition (NER) is an important and widely studied task in natural language processing. In recent years, end-to-end NER with bidirectional long short-term memory (BiLSTM) has received growing attention. However, BiLSTM remains limited in three respects: it cannot be parallelized across time steps, it struggles to capture long-range dependencies, and it maps inputs into a single feature space. We propose a deep neural network model based on a parallelizable self-attention mechanism to address these problems. We use only a small number of BiLSTM layers to capture the temporal structure of texts, and then apply a self-attention mechanism, which computes over all positions in parallel, to capture long-range dependencies. Experiments on two NER datasets show that our model achieves higher quality with less training time. Our model achieves an F1 score of 92.63% on the SIGHAN bakeoff 2006 MSRA portion for Chinese NER, improving over the existing best result by over 1.4%. On the CoNLL2003 shared task portion for English NER, our model achieves an F1 score of 92.17%, outperforming the previous state-of-the-art result by 0.91%.
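As a concrete illustration of the architecture the abstract describes, below is a minimal PyTorch sketch pairing a shallow BiLSTM with multi-head self-attention for per-token tag prediction. The layer sizes, head count, tag count, and the residual-plus-layer-norm wiring are illustrative assumptions, not the authors' reported configuration (the abstract does not specify hyperparameters or the output layer).

    import torch
    import torch.nn as nn

    class BiLSTMSelfAttentionNER(nn.Module):
        """Sketch only: a shallow BiLSTM for local temporal structure,
        followed by self-attention for long-range dependencies, then a
        per-token classifier over NER tags. All sizes are assumptions."""

        def __init__(self, vocab_size, embed_dim=128, hidden_dim=128,
                     num_heads=4, num_tags=9):
            super().__init__()
            self.embedding = nn.Embedding(vocab_size, embed_dim)
            # A single BiLSTM layer captures the time series of the text;
            # forward and backward halves concatenate to hidden_dim.
            self.bilstm = nn.LSTM(embed_dim, hidden_dim // 2,
                                  batch_first=True, bidirectional=True)
            # Multi-head self-attention computes over all positions in
            # parallel, capturing long-range dependencies.
            self.attention = nn.MultiheadAttention(hidden_dim, num_heads,
                                                   batch_first=True)
            self.norm = nn.LayerNorm(hidden_dim)
            self.classifier = nn.Linear(hidden_dim, num_tags)

        def forward(self, token_ids, padding_mask=None):
            x = self.embedding(token_ids)          # (B, T, E)
            h, _ = self.bilstm(x)                  # (B, T, H)
            a, _ = self.attention(h, h, h,
                                  key_padding_mask=padding_mask)
            h = self.norm(h + a)                   # residual + layer norm
            return self.classifier(h)              # (B, T, num_tags)

    # Toy usage: a batch of 2 sentences, 5 tokens each.
    model = BiLSTMSelfAttentionNER(vocab_size=1000)
    tokens = torch.randint(0, 1000, (2, 5))
    print(model(tokens).shape)  # torch.Size([2, 5, 9])

Keeping the recurrent stack shallow and pushing long-range interactions into the attention layer is what allows most of the computation to be parallelized, which is the efficiency claim the abstract makes.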


Notes

  1. The toolkit is developed by the Tsinghua University Natural Language Processing Laboratory. See http://thulac.thunlp.org/ for more details.

  2. https://code.google.com/p/word2vec/.

  3. http://www.sogou.com/labs/resource/ca.php.

  4. https://dumps.wikimedia.org/zhwiki/latest.


Acknowledgements

We thank the reviewers for their thoughtful comments and suggestions. This work was supported by the National Natural Science Foundation of China (Grant Nos. 31771679, 31371533, 31671589), the Special Fund for Key Program of Science and Technology of Anhui Province of China (Grant Nos. 16030701092, kJ2016A836, 18030901034), the Key Laboratory of Agricultural Electronic Commerce (Grant Nos. AEC2018003, AEC2018006) and Hefei Major Research Project of Key Technology (Grant No. J2018G14).

Author information


Corresponding author

Correspondence to Lichuan Gu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Liu, X., Yang, N., Jiang, Y. et al. A parallel computing-based Deep Attention model for named entity recognition. J Supercomput 76, 814–830 (2020). https://doi.org/10.1007/s11227-019-02985-5

