Research of Clinical Named Entity Recognition Based on Bi-LSTM-CRF

Qin, Ying; Zeng, Yingfei

doi:10.1007/s12204-018-1954-5

Published: 07 June 2018

Volume 23, pages 392–397, (2018)

Journal of Shanghai Jiaotong University (Science)

Ying Qin (秦颖)¹ &
Yingfei Zeng (曾颖菲)¹

Abstract

Electronic medical records (EMRs), with their unstructured sentences and varied conceptual expressions, provide rich material for medical information extraction. However, general-purpose named entity recognition (NER) methods from natural language processing (NLP) are not well suited to clinical NER in EMRs. This study applies neural networks to clinical concept extraction. We integrate a bidirectional long short-term memory network (Bi-LSTM) with a conditional random field (CRF) layer to detect three types of clinical named entities. The word representations fed into the network concatenate character-based word embeddings with continuous bag-of-words (CBOW) embeddings trained on both domain and non-domain corpora. We test our NER system on the i2b2/VA open datasets and compare its performance with six related works; it achieves the best NER result, with an F1 value of 0.8537. We also point out a few specific problems in clinical concept extraction, which may offer hints for deeper study.
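
To illustrate what a CRF layer adds on top of per-token Bi-LSTM scores, the following is a minimal sketch of Viterbi decoding in plain Python. It is not the authors' implementation. The tag set (a single "problem" entity type in BIO notation), the emission scores, and the transition scores are all made-up values chosen for illustration; in a trained model the emissions would come from the Bi-LSTM and the transitions would be learned CRF parameters.

```python
from typing import List

# Hypothetical BIO tag set with one entity type, kept small for readability.
TAGS = ["O", "B-problem", "I-problem"]


def viterbi_decode(emissions: List[List[float]],
                   transitions: List[List[float]]) -> List[int]:
    """Return the highest-scoring sequence of tag indices.

    emissions[t][j]   : score of tag j at token t (e.g. Bi-LSTM output).
    transitions[i][j] : score of moving from tag i to tag j (CRF parameters).
    """
    if not emissions:
        return []
    n_tags = len(transitions)
    score = list(emissions[0])          # best score of any path ending in each tag
    backpointers: List[List[int]] = []  # best previous tag for each (token, tag)
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n_tags):
            best_prev = max(range(n_tags),
                            key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j]
                             + emissions[t][j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    best = max(range(n_tags), key=lambda j: score[j])
    path = [best]
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    path.reverse()
    return path


# Made-up scores for the tokens "denies", "chest", "pain".
emissions = [
    [2.0, 0.1, 0.0],   # "denies": clearly O
    [0.5, 0.6, 1.0],   # "chest": per-token argmax would pick I-problem
    [0.2, 0.3, 1.5],   # "pain": I-problem
]
# Rows are the previous tag, columns the next tag (order as in TAGS).
# The large negative score makes the invalid move O -> I-problem very unlikely.
transitions = [
    [0.0, 0.0, -100.0],
    [0.0, 0.0, 0.5],
    [0.0, 0.0, 0.5],
]

decoded = [TAGS[i] for i in viterbi_decode(emissions, transitions)]
print(decoded)  # ['O', 'B-problem', 'I-problem']
```

Picking the highest-scoring tag for each token independently would yield O, I-problem, I-problem here, which is not a valid BIO sequence. Decoding over the whole sequence with transition scores instead selects O, B-problem, I-problem, which is the kind of label consistency a CRF layer is meant to enforce.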

Author information

Authors and Affiliations

Department of Computer Science, Beijing Foreign Studies University, Beijing, 100089, China
Ying Qin (秦颖) & Yingfei Zeng (曾颖菲)

Corresponding author

Correspondence to Ying Qin (秦颖).

Additional information

Foundation item: the National Social Science Foundation of China (No. 17BYY047)

About this article

Cite this article

Qin, Y., Zeng, Y. Research of Clinical Named Entity Recognition Based on Bi-LSTM-CRF. J. Shanghai Jiaotong Univ. (Sci.) 23, 392–397 (2018). https://doi.org/10.1007/s12204-018-1954-5

Received: 21 June 2017
Issue Date: June 2018

CLC number

TP 391.4

Document code

A
