Investigation of Biomedical Named Entity Recognition Methods

Çelikten, Azer; Onan, Aytuğ; Bulut, Hasan

doi:10.1007/978-3-031-31956-3_18

Part of the book series: Engineering Cyber-Physical Systems and Critical Infrastructures (ECPSCI, volume 7)

Included in the following conference series:

The International Conference on Artificial Intelligence and Applied Mathematics in Engineering

Abstract

Biomedical named entity recognition is the task of identifying mentions of entities such as diseases, symptoms, drugs, proteins, and chemicals in biomedical texts. It plays an important role in natural language processing applications such as relation extraction, question answering, keyword extraction, machine translation, and text summarization. Information extraction in the biomedical domain can support early diagnosis of diseases, detection of missing relationships between biomedical entities such as diseases and chemicals, and identification of drug interactions and side effects. Because biomedical texts contain domain-specific words, complicated phrases, and abbreviations, named entity recognition in this domain remains a challenging task. In this study, we first review methods for named entity recognition in the biomedical domain and classify them into four categories: dictionary-based, rule-based, machine learning, and deep learning methods. Recent advances such as deep learning and transformer-based biomedical language models have helped to achieve successful results on the named entity recognition task. Second, we conduct an experimental study on MedMentions, an annotated dataset that is available to researchers. Finally, we present our experimental results and discuss the challenges and opportunities of the existing methods. The experimental study shows that the most successful method for extracting diseases and symptoms from biomedical texts is BioBERT, with an F1 score of 0.72.
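To make two of the terms in the abstract concrete, the following is a minimal Python sketch of a dictionary-based tagger (the simplest of the four method categories) together with an entity-level F1 computation. It is an illustration under stated assumptions, not the chapter's implementation: the lexicon, the example sentence, the gold annotations, and the function names `dictionary_tag` and `entity_f1` are all hypothetical, and exact span-and-label matching is assumed as the scoring criterion, which the abstract does not specify.

```python
from typing import Dict, List, Set, Tuple

# A span is (start_token, end_token_exclusive, label).
Span = Tuple[int, int, str]


def dictionary_tag(tokens: List[str], lexicon: Dict[str, str], max_len: int = 3) -> Set[Span]:
    """Greedy longest-match lookup of token n-grams in a (hypothetical) lexicon."""
    spans: Set[Span] = set()
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n]).lower()
            if phrase in lexicon:
                spans.add((i, i + n, lexicon[phrase]))
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return spans


def entity_f1(gold: Set[Span], pred: Set[Span]) -> float:
    """Entity-level F1: a prediction counts only if span and label both match exactly."""
    true_positives = len(gold & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Illustrative data only; not taken from MedMentions.
tokens = "The patient with type 2 diabetes reported chronic fatigue and fever".split()
lexicon = {"type 2 diabetes": "Disease", "fatigue": "Symptom", "fever": "Symptom"}
gold = {(3, 6, "Disease"), (7, 9, "Symptom"), (10, 11, "Symptom")}

pred = dictionary_tag(tokens, lexicon)
score = entity_f1(gold, pred)
print(sorted(pred), round(score, 3))
```

In this toy case the lexicon contains "fatigue" but the gold annotation covers "chronic fatigue", so one of the three predicted spans does not match and both precision and recall drop to 2/3. Boundary mismatches of this kind are one reason dictionary lookup alone tends to struggle on the complicated phrases the abstract mentions.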

Author information

Authors and Affiliations

Manisa Celal Bayar University, Software Engineering, Manisa, Turkey
Azer Çelikten
İzmir Katip Çelebi University, Computer Engineering, İzmir, Turkey
Aytuğ Onan
Ege University, Computer Engineering, İzmir, Turkey
Hasan Bulut

Corresponding author

Correspondence to Azer Çelikten.

Editor information

Editors and Affiliations

Department of ECE, Karunya University, Karunya Nagar, Tamil Nadu, India
D. Jude Hemanth
Department of Computer Engineering, Faculty of Engineering, Süleyman Demirel University, Isparta, Türkiye
Tuncay Yigit
Department of Computer Engineering, Süleyman Demirel University, Isparta, Türkiye
Utku Kose
Electric-Electronic Department, Duzce University, Düzce, Türkiye
Ugur Guvenc

Cite this paper

Çelikten, A., Onan, A., Bulut, H. (2023). Investigation of Biomedical Named Entity Recognition Methods. In: Hemanth, D.J., Yigit, T., Kose, U., Guvenc, U. (eds) 4th International Conference on Artificial Intelligence and Applied Mathematics in Engineering. ICAIAME 2022. Engineering Cyber-Physical Systems and Critical Infrastructures, vol 7. Springer, Cham. https://doi.org/10.1007/978-3-031-31956-3_18

DOI: https://doi.org/10.1007/978-3-031-31956-3_18
Published: 27 May 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-31955-6
Online ISBN: 978-3-031-31956-3
eBook Packages: Intelligent Technologies and Robotics (R0)
