Deep Learning Based Biomedical Named Entity Recognition Systems

  • Pragatika Mishra
  • Sitanath Biswas
  • Sujata Dash
Part of the Studies in Big Data book series (SBD, volume 68)


In this chapter, we address an important problem: biomedical named entity recognition (Bio-NER). Named entity recognition is a vital task in natural language processing, with relevance to artificial intelligence, information retrieval, and information extraction. Natural language processing (NLP) is a subfield of computer science, artificial intelligence, and information engineering that deals with the interaction between computers and human language. It is concerned with processing and analysing natural language data: computers are made to understand, manipulate, and analyse language, which includes automating activities and methods of communication. One of the important components of NLP is named entity recognition (NER), which is used to find and classify expressions with a specific meaning in natural language text. Types of named entities include person names, organization names, place names, numbers, and so on. This chapter deals only with Bio-NER, a fundamental task in processing biomedical text terms such as RNA, cell type, cell line, protein, and DNA. Bio-NER is one of the most central and important tasks in extracting biomedical information from documents, and identifying biomedical named entities appears to be harder than identifying ordinary named entities. We use a deep learning algorithm. Deep learning, also known as deep structured learning or hierarchical learning, is part of a broader family of machine learning methods based on learning data representations, as opposed to task-specific algorithms. Learning can be supervised, semi-supervised, or unsupervised.
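To make the task concrete, the following minimal Python sketch shows how token-level labels can be turned into typed entity mentions for the five classes named above (protein, DNA, RNA, cell type, cell line). It assumes a BIO tagging scheme (B- begins an entity, I- continues it, O is outside), which the abstract does not specify; the function name `extract_entities` and the example sentence are hypothetical and used only for illustration.

```python
from typing import List, Optional, Tuple


def extract_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity_text, entity_type) pairs from BIO tags (assumed scheme)."""
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any entity that is still open.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the open entity of the same type.
            current_tokens.append(token)
        else:
            # "O" tag, or an I- tag that does not continue the open entity
            # (such a stray token is dropped): close what is open.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None

    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Hypothetical example sentence with hand-assigned tags.
tokens = ["IL-2", "gene", "expression", "in", "T", "cells"]
tags = ["B-DNA", "I-DNA", "O", "O", "B-cell_type", "I-cell_type"]
entities = extract_entities(tokens, tags)
print(entities)
```

In this framing, a Bio-NER system is whatever model predicts the `tags` sequence; the span extraction itself is simple bookkeeping.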
Deep learning models are loosely inspired by information processing and communication patterns in biological nervous systems, yet differ in various ways from the structural and functional properties of biological brains. For experiments and evaluation, we used the GENIA corpus, which was created by a group of researchers to support the development and evaluation of information extraction and text mining systems in molecular biology. It consists of 1,999 MEDLINE abstracts. The GENIA corpus has been widely used by the NLP community for developing semantic search systems and establishing BioNLP tasks. In this study, we propose a multi-task learning framework for Bio-NER based on neural network models in order to save human effort. The deep neural architecture has several layers, and each layer abstracts features from those generated by the layers below it. We compared our results with those of several earlier systems: Saha et al. (Pattern Recogn. Lett. 3:1591–1597, 2010), with a precision of 68.12, recall of 67.66, and F-score of 67.89; Liao et al. (Biomedical Named Entity Recognition Based on Skip-Chain CRFs, pp. 1495–1498, 2012), with a precision of 72.8, recall of 73.6, and F-score of 73.2; ABNER (A Biomedical Named Entity Recognizer, pp. 46–51, 2013), with a precision of 69.1, recall of 72.0, and F-score of 70.5; Sasaki et al. (How to Make the Most of NE Dictionaries in Statistical NER, pp. 63–70, 2008), with a precision of 68.58, recall of 79.85, and F-score of 73.78; and Sun et al. (Comput. Biol. Med. 37:1327–1333, 2007), with a precision of 70.2, recall of 72.3, and F-score of 71.2. Our system achieved a precision of 66.54%, recall of 76.13%, and F-score of 71.01% on the GENIA standard test corpus. This is close to state-of-the-art performance while using only part-of-speech features, and it shows that deep learning can be applied effectively to biomedical named entity recognition.
This chapter is organized into the following sections: Introduction, Literature Review, Architecture, Experiment, Results and Analysis, Conclusion and Future Work, and References.
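The F-scores quoted above can be sanity-checked from the reported precision and recall. The sketch below assumes the F-score is the balanced F1 measure, that is, the harmonic mean F = 2PR / (P + R); the abstract does not state the formula, so this is an assumption, and the dictionary simply restates the figures given in the text.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F1: harmonic mean of precision and recall (assumed definition)."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-score), copied from the abstract.
reported = {
    "Saha et al. (2010)": (68.12, 67.66, 67.89),
    "Liao et al. (2012)": (72.8, 73.6, 73.2),
    "ABNER (2013)": (69.1, 72.0, 70.5),
    "Sasaki et al. (2008)": (68.58, 79.85, 73.78),
    "Sun et al. (2007)": (70.2, 72.3, 71.2),
    "This chapter": (66.54, 76.13, 71.01),
}

for system, (p, r, f_reported) in reported.items():
    f_recomputed = f_score(p, r)
    print(f"{system}: reported F = {f_reported:.2f}, recomputed F = {f_recomputed:.2f}")
```

Under this assumption, each recomputed value agrees with the reported one to within rounding, which suggests that all six systems are being compared on the same balanced measure.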


GENIA corpus; Deep learning; Machine learning; Natural language processing; Named entity recognition


  1. Lim, S., Lee, K., Kang, J.: Drug drug interaction extraction from the literature using a recursive neural network. PLoS ONE 13(1), e0190926 (2018)
  2. Lee, K., Hwang, Y., Kim, S., Rim, H.: Biomedical named entity recognition using two-phase model based on SVMs. J. Biomed. Inform. 37(6), 436–447 (2004)
  3. Hettne, K.M., Stierum, R.H., Schuemie, M.J., Hendriksen, P.J., Schijvenaars, B.J., Mulligen, E.M.V., et al.: A dictionary to identify small molecules and drugs in free text. Bioinformatics 25(22), 2983–2991 (2009)
  4. Song, M., Yu, H., Han, W.S.: Developing a hybrid dictionary-based bio-entity recognition technique. BMC Med. Inform. Decis. Mak. 15(1), S9 (2015)
  5. Fukuda, K.I., Tsunoda, T., Tamura, A., Takagi, T., et al.: Toward information extraction: identifying protein names from biological papers. In: Pac. Symp. Biocomput., vol. 707, pp. 707–718 (1998)
  6. Lample, G., Ballesteros, M., Subramanian, S., Kawakami, K., Dyer, C.: Neural architectures for named entity recognition. In: HLT-NAACL. The Association for Computational Linguistics, pp. 260–270 (2016)
  7. Pyysalo, S., Ginter, F., Moen, H., Salakoski, T., Ananiadou, S.: Distributional semantics resources for biomedical text processing. In: Proceedings of the 5th International Symposium on Languages in Biology and Medicine, Tokyo, Japan, pp. 39–43 (2013)
  8. Kim, J.D., Ohta, T., Tsuruoka, Y., Tateisi, Y., Collier, N.: Introduction to the bio-entity recognition task at JNLPBA. In: Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications. Association for Computational Linguistics, pp. 70–75 (2004)
  9. Caruana, R.: Multitask learning. Mach. Learn. 28(1), 41–75 (1997)
  10. Collobert, R.: Deep learning for efficient discriminative parsing. In: International Conference on Artificial Intelligence and Statistics (2011)
  11. Dai, H., Chang, Y.C., Tsai, R.T.Z.H., Hsu, W.: New challenges for biological text-mining in the next decade. J. Comput. Sci. Technol. 25(1), 169–179 (2010)
  12. Krallinger, M., Morgan, A., Smith, L., Leitner, F., Tanabe, L., Wilbur, J., Hirschman, L., Valencia, A.: Evaluation of text-mining systems for biology: overview of the second BioCreative community challenge. Genome Biol. 9(2) (2008)
  13. Dai, H., Huang, C., Lin, R., Tsai, R., Hsu, W.: BIOSMILE web search: a web application for annotating biomedical entities and relations. Nucleic Acids Res. 36, 390–397 (2008)
  14. Rebholz-Schuhmann, D., Arregui, M., Gaudan, S., Kirsch, H., Jimeno, A.: Text processing through web services: calling Whatizit. Bioinformatics 24(2), 296–300 (2008)
  15. Si, L., Kanungo, T., Huang, X.: Boosting performance of bio-entity recognition by combining results from multiple systems. In: Proceedings of the 5th International Workshop on Bioinformatics. ACM, pp. 76–83 (2005)
  16. Tsuruoka, Y., Tateishi, Y., Kim, J.-D., Ohta, T., McNaught, J., Ananiadou, S., Tsujii, J.I.: Developing a robust part-of-speech tagger for biomedical text. In: Advances in Informatics. Springer, pp. 382–392 (2005)
  17. Vlachos, A.: Evaluating and combining biomedical named entity recognition systems. In: BioNLP 2007: Biological, Translational, and Clinical Language Processing, pp. 199–206 (2007)
  18. Li, L., Zhou, R., Huang, D.: Two-phase biomedical named entity recognition using CRFs. Comput. Biol. Chem. 33(4), 334–338 (2009)
  19. Li, L., Fan, W., Huang, D.: A two-phase bio-NER system based on integrated classifiers and multi-agent strategy. IEEE/ACM Trans. Comput. Biol. Bioinf. 10(4), 897–904 (2013)
  20. Lee, S., Kim, D., Lee, K., Choi, J., Kim, S., Jeon, M., et al.: BEST: next-generation biomedical entity search tool for knowledge discovery from biomedical literature. PLoS ONE 11(10), e0164680 (2016)
  21. Proux, D., Rechenmann, F., Julliard, L., Pillet, V., Jacq, B.: Detecting gene symbols and names in biological texts. Genome Inform. 9, 72–80 (1998)
  22. Tsai, R.T.H., Sung, C.L., Dai, H.J., Hung, H.C., Sung, T.Y., Hsu, W.L.: NERBio: using selected word conjunctions, term normalization, and global patterns to improve biomedical named entity recognition. In: BMC Bioinformatics. BioMed Central. 7, S11 (2006)
  23. Ju, M., Miwa, M., Ananiadou, S.: A neural layered model for nested named entity recognition. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 1446–1459 (2018)
  24. Doğan, R.I., Leaman, R., Lu, Z.: NCBI disease corpus: a resource for disease name recognition and concept normalization. J. Biomed. Inform. 47, 1–10 (2014)
  25. Crichton, G., Pyysalo, S., Chiu, B., Korhonen, A.: A neural network multi-task learning approach to biomedical named entity recognition. BMC Bioinform. 18(1), 368 (2017)
  26. Zheng, J.G., Howsmon, D., Zhang, B., Hahn, J., McGuinness, D., Hendler, J., et al.: Entity linking for biomedical literature. In: Proceedings of the ACM 8th International Workshop on Data and Text Mining in Bioinformatics. ACM, pp. 3–4 (2014)
  27. Tsutsui, S., Ding, Y., Meng, G.: Machine reading approach to understand Alzheimers disease literature. In: Proceedings of the Tenth International Workshop on Data and Text Mining in Biomedical Informatics (DTMBIO) (2016)
  28. Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. In: NIPS, vol. 13 (2001)
  29. Collobert, R., Weston, J.: A unified architecture for natural language processing: deep neural networks with multitask learning. In: ICML (2008)
  30. Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., Kuksa, P.: Natural language processing (almost) from scratch. JMLR (2011)
  31. Bengio, Y., Ducharme, R., Vincent, P., Janvin, C.: A neural probabilistic language model. J. Mach. Learn. Res. 3, 1137–1155 (2003)
  32. Schwenk, H.: Continuous space language models. Comput. Speech Lang. 21(3), 492–518 (2007)
  33. Mikolov, T., Karafiat, M., Burget, L., Cernocky, J., Khudanpur, S.: Recurrent neural network based language model. In: Eleventh Annual Conference of the International Speech Communication Association (INTERSPEECH), pp. 1045–1048 (2010)
  34. Mnih, A., Teh, Y.W.: A fast and simple algorithm for training neural probabilistic language models. In: Proceedings of the 29th International Conference on Machine Learning (ICML-12), pp. 1751–1758 (2012)
  35. Collobert, R.: Deep learning for efficient discriminative parsing. In: International Conference on Artificial Intelligence and Statistics (AISTATS) (2011)
  36. Turian, J., Ratinov, L., Bengio, Y.: Word representations: a simple and general method for semi-supervised learning. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 384–394 (2010)
  37. Mikolov, T., Yih, W.T., Zweig, G.: Linguistic regularities in continuous space word representations. In: Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 746–751 (2013)
  38. Bottou, L.: Stochastic gradient learning in neural networks. In: Proceedings of Neuro-Nimes, vol. 91 (1991)
  39. Saha, S.K., Narayan, S., Sarkar, S., Mitra, P.: A composite kernel for named entity recognition. Pattern Recogn. Lett. 3, 1591–1597 (2010)
  40. Liao, Z., Wu, H.: Biomedical named entity recognition based on skip-chain CRFs. In: 2012 International Conference on Industrial Control and Electronics Engineering (ICICEE). IEEE, pp. 1495–1498 (2012)
  41. ABNER: A Biomedical Named Entity Recognizer, pp. 46–51 (2013)
  42. Sasaki, Y., Tsuruoka, Y., McNaught, J., Ananiadou, S.: How to make the most of NE dictionaries in statistical NER. In: Proceedings Workshop Current Trends in Biomedical Natural Language Processing, pp. 63–70 (2008)
  43. Sun, C., Guan, Y., Wang, X., Lin, L.: Rich features based conditional random fields for biological named entities recognition. Comput. Biol. Med. 37, 1327–1333 (2007)

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Pragatika Mishra (1)
  • Sitanath Biswas (2)
  • Sujata Dash (2, email author)

  1. Gandhi Institute for Technology, Bhubaneswar, India
  2. North Orissa University, Baripada, India
