WCP-RNN: a novel RNN-based approach for Bio-NER in Chinese EMRs

Li, Jianqiang; Zhao, Shenhe; Yang, Jijiang; Huang, Zhisheng; Liu, Bo; Chen, Shi; Pan, Hui; Wang, Qing

doi:10.1007/s11227-017-2229-x

Paper ID: FC_17_25

Published: 16 January 2018

Volume 76, pages 1450–1467 (2020)
The Journal of Supercomputing

Abstract

Deep learning has achieved remarkable success in a wide range of domains. However, it has not been comprehensively evaluated as a solution for Chinese biomedical named entity recognition (Bio-NER). Traditional deep-learning approaches to Bio-NER are usually based on recurrent neural networks (RNNs) and take only word embeddings into consideration, ignoring the value of character-level embeddings for encoding morphological and shape information. We propose an RNN-based approach, WCP-RNN, for the Chinese Bio-NER problem. Our method combines word embeddings and character embeddings to capture orthographic and lexicosemantic features. In addition, part-of-speech (POS) tags are incorporated as a priori word information to improve the final performance. The experimental results show that the proposed approach outperforms the baseline methods: the highest F-scores for the subject and lesion detection tasks reach 90.36% and 90.48%, increases of 3.10% and 2.60% over the baseline methods, respectively.
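
The abstract describes combining word embeddings, character embeddings, and POS tags as input features. A minimal sketch of that idea follows, showing only how the three representations might be concatenated into one vector per token before being fed to an RNN. Everything concrete here is an assumption for illustration and is not taken from the paper: the dimensions, the random lookup tables, the toy sentence, the helper names (`make_table`, `char_vector`, `token_features`), and the use of mean pooling over characters.

```python
import random

random.seed(0)

# Hypothetical embedding sizes; the paper's actual dimensions are not given here.
WORD_DIM, CHAR_DIM, POS_DIM = 4, 3, 2


def make_table(keys, dim):
    """Build a randomly initialised lookup table: key -> vector of length dim."""
    return {k: [random.uniform(-1.0, 1.0) for _ in range(dim)] for k in sorted(keys)}


def char_vector(word, char_table):
    """Mean-pool the character embeddings of a word (an assumed composition)."""
    vecs = [char_table[c] for c in word]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def token_features(word, pos, word_table, char_table, pos_table):
    """Concatenate word, character-level, and POS vectors for one token."""
    return word_table[word] + char_vector(word, char_table) + pos_table[pos]


# Toy segmented and POS-tagged sentence (illustrative, not from the paper's corpus).
sentence = [("甲状腺", "n"), ("结节", "n"), ("增大", "v")]

word_table = make_table({w for w, _ in sentence}, WORD_DIM)
char_table = make_table({c for w, _ in sentence for c in w}, CHAR_DIM)
pos_table = make_table({p for _, p in sentence}, POS_DIM)

# One feature vector per token; an RNN tagger would consume this sequence.
features = [
    token_features(w, p, word_table, char_table, pos_table) for w, p in sentence
]
```

Each token ends up with a vector of length `WORD_DIM + CHAR_DIM + POS_DIM`. In a trained model the lookup tables would be learned or pretrained rather than random, and the character composition could be something other than a mean.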

Acknowledgements

This work is supported by the Beijing Natural Science Foundation (4152007) and the China National Key Technology Research and Development Program (Project No. 2015BAH13F01).

Author information

Authors and Affiliations

School of Software Engineering, Beijing University of Technology, Beijing Engineering Research Center for IoT Software and Systems, Beijing, 100124, China
Jianqiang Li
Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing, 100084, China
Jianqiang Li, Jijiang Yang & Qing Wang
School of Software Engineering, Beijing University of Technology, Beijing, 100124, China
Shenhe Zhao & Bo Liu
Computer Science Department, VU University Amsterdam, Amsterdam, The Netherlands
Zhisheng Huang
Department of Endocrinology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100730, China
Shi Chen & Hui Pan

Corresponding author

Correspondence to Jijiang Yang.

About this article

Cite this article

Li, J., Zhao, S., Yang, J. et al. WCP-RNN: a novel RNN-based approach for Bio-NER in Chinese EMRs. J Supercomput 76, 1450–1467 (2020). https://doi.org/10.1007/s11227-017-2229-x

Issue Date: March 2020