Abstract
Pre-trained language models have ushered in a new era of named entity recognition (NER), but additional domain knowledge is still needed to improve their performance on specific problems. In Chinese government-domain NER in particular, most entities are long and have vague boundaries, and this uncertainty in entity length and boundary makes entities difficult to recognise or causes them to be recognised incorrectly. To address this problem, this paper proposes a Chinese NER model based on multi-feature fusion, in which lexical features, word-boundary features, and pinyin features are fused through a multi-head attention mechanism to enrich the model's semantic representation of government texts. This paper also studies the contribution of each feature to entity recognition and finds that pinyin features offer a unique advantage in recognising government entities. The study provides new ideas and methods for the research and application of Chinese government entity recognition, and offers insights into NER in other language domains. Experimental results show that the proposed model outperforms the baseline models.
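The abstract names the fusion mechanism but gives no implementation details, so the following is a minimal PyTorch sketch of what fusing lexical, word-boundary, and pinyin embeddings through multi-head attention could look like. Everything here is an assumption rather than the paper's actual architecture: the module name `MultiFeatureFusion`, the per-position attention in which the character embedding queries its three feature embeddings, the shared hidden size, and the residual connection are all hypothetical choices.

```python
# Hypothetical sketch of multi-feature fusion via multi-head attention.
# The paper only states that lexical, word-boundary, and pinyin features
# are fused with a multi-headed attention mechanism; all names, shapes,
# and the exact fusion scheme below are assumptions.
import torch
import torch.nn as nn

class MultiFeatureFusion(nn.Module):
    def __init__(self, d_model: int = 768, n_heads: int = 8):
        super().__init__()
        # One learnable projection per feature type (assumes each feature
        # embedding has already been mapped to d_model dimensions).
        self.lexical_proj = nn.Linear(d_model, d_model)
        self.boundary_proj = nn.Linear(d_model, d_model)
        self.pinyin_proj = nn.Linear(d_model, d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, char_emb, lexical_emb, boundary_emb, pinyin_emb):
        # All inputs: (batch, seq_len, d_model); char_emb would come from
        # a contextual encoder such as BERT.
        B, T, D = char_emb.shape
        # Stack the three feature embeddings per character position:
        # (batch * seq_len, 3, d_model), so attention runs per position.
        feats = torch.stack(
            [self.lexical_proj(lexical_emb),
             self.boundary_proj(boundary_emb),
             self.pinyin_proj(pinyin_emb)],
            dim=2,
        ).reshape(B * T, 3, D)
        query = char_emb.reshape(B * T, 1, D)
        # The character embedding attends over its own feature embeddings,
        # letting the model weight each feature type per character.
        fused, _ = self.attn(query, feats, feats)
        # Residual connection keeps the original contextual signal.
        return char_emb + fused.reshape(B, T, D)
```

In this sketch, attention runs independently at each character position, so the model can weight the three feature types differently from character to character, while the residual connection preserves the original contextual embedding when none of the extra features is informative; the fused representation would then feed a standard sequence-labeling head.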