Knowledge graph enrichment from clinical narratives using NLP, NER, and biomedical ontologies for healthcare applications

  • Original Research
International Journal of Information Technology

Abstract

Electronic health records (EHRs) contain patients’ health information in varied formats, such as clinical reports written in natural language, X-rays, MRI scans, and case or discharge summaries. One essential constituent of an EHR is the clinical narrative, which records a patient’s significant clinical findings. Because clinical narratives are stored in natural language, the clinical evidence, significant findings, and other observations that doctors write in them remain locked away. This free-text information is incomprehensible to machines that process EHRs in healthcare applications such as Clinical Decision Support Systems. The proposed work, the Clinical Narratives to Knowledge Graph (N2K) Mapper algorithm, maps clinical narratives to a Knowledge Graph (KG) so that the important clinical details recommended by doctors can be used effectively by healthcare applications. The KG is defined by ontological semantic metadata called the Healthcare Ontology. The N2K Mapper algorithm uses natural language processing to parse the clinical narratives and recognizes medical vocabulary using named entity recognition and various existing biomedical ontologies. This semantically retrieved information is then mapped onto the KG. Besides enriching the KG from clinical narratives, the algorithm also augments the biomedical ontologies with new medical vocabulary, yielding a sustainable semantic structure for future reference. An experimental study was performed on chest X-ray radiology reports taken from the Medical Information Mart for Intensive Care (MIMIC)-III dataset. It showed that the N2K Mapper found and recognized approximately 80% of the total identified entities in the existing biomedical ontologies. The remaining 20% were added to the biomedical ontologies with an expert’s assistance. The N2K Mapper achieved an accuracy of 81.5%, precision of 78.5%, recall of 85.07%, and F1-score of 78.97%.
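The mapping step described in the abstract can be illustrated with a minimal sketch. This is not the authors' N2K Mapper implementation: it assumes that an NER step has already produced a list of candidate entity strings, and it replaces the biomedical ontologies with a small in-memory dictionary. The names `ONTOLOGY_INDEX`, `map_entities`, and `augment`, the report identifier, the class labels, and the predicate names are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical stand-in for the biomedical ontologies: term -> (source, class).
# Real ontologies would be loaded from OWL/OBO files, not hard-coded.
ONTOLOGY_INDEX: Dict[str, Tuple[str, str]] = {
    "pleural effusion": ("DOID", "Disease"),
    "cough": ("SYMP", "Symptom"),
    "left lung": ("FMA", "AnatomicalEntity"),
    "opacity": ("RadLex", "ImagingObservation"),
}


def map_entities(
    report_id: str,
    entities: List[str],
    index: Dict[str, Tuple[str, str]],
) -> Tuple[List[Triple], List[str]]:
    """Split entities into KG triples (term found) and unmatched terms."""
    triples: List[Triple] = []
    unmatched: List[str] = []
    for entity in entities:
        term = entity.strip().lower()
        hit = index.get(term)
        if hit is None:
            # Not in any ontology: queue the term for expert review.
            unmatched.append(term)
            continue
        source, cls = hit
        # Predicate names such as "has_disease" are illustrative only.
        triples.append((report_id, f"has_{cls.lower()}", term))
        triples.append((term, "defined_in", source))
    return triples, unmatched


def augment(
    index: Dict[str, Tuple[str, str]], term: str, source: str, cls: str
) -> None:
    """Record an expert-approved term so that later reports can match it."""
    index[term.strip().lower()] = (source, cls)


if __name__ == "__main__":
    index = dict(ONTOLOGY_INDEX)  # work on a copy of the toy index
    entities = ["Pleural effusion", "left lung", "cough", "opacity",
                "bibasilar atelectasis"]
    triples, unmatched = map_entities("report_001", entities, index)
    print(triples)
    print("needs expert review:", unmatched)

    # Simulate the expert adding the missing term, then re-run the mapping.
    augment(index, "bibasilar atelectasis", "DOID", "Disease")
    _, unmatched_after = map_entities("report_001", entities, index)
    print("still unmatched:", unmatched_after)
```

In this toy run, terms found in the index become triples attached to the report, while the unmatched term is held back until `augment` records it. This mirrors, in a very simplified way, the two outcomes the abstract describes: recognition against existing ontologies and expert-assisted augmentation.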

Data Availability

The data that support the findings of this study are available from [35] (https://physionet.org/content/mimiciii/1.4/), but restrictions apply to the availability of these data, which were used under license for the current study.

Notes

  1. https://www.ebi.ac.uk/ols/ontologies/symp

  2. https://disease-ontology.org/

  3. http://radlex.org/

  4. https://www.ebi.ac.uk/ols/ontologies/fma

References

  1. Nurdiati S, Hoede C (2008) 25 years development of knowledge graph theory: the results and the challenge. Memorandum 1876(2):1–10

  2. Singhal A (2012) Introducing the knowledge graph: things, not strings. Official Google Blog

  3. Ji S, Pan S, Cambria E, Marttinen P, Yu PS (2021) A survey on knowledge graphs: representation, acquisition, and applications. IEEE Trans Neural Netw Learn Syst. https://doi.org/10.1109/TNNLS.2021.3070843

  4. Tian L, Zhou X, Wu YP, Zhou WT, Zhang JH, Zhang TS (2022) Knowledge graph and knowledge reasoning: a systematic review. J Electron Sci Technol. https://doi.org/10.1016/j.jnlest.2022.100159

  5. Zhang Y et al (2020) HKGB: an inclusive, extensible, intelligent, semi-auto-constructed knowledge graph framework for healthcare with clinicians’ expertise incorporated. Inf Process Manag 57(6):102324. https://doi.org/10.1016/j.ipm.2020.102324

  6. Fang Y, Wang H, Wang L, Di R, Song Y (2019) Diagnosis of COPD based on a knowledge graph and integrated model. IEEE Access 7:46004–46013. https://doi.org/10.1109/ACCESS.2019.2909069

  7. Malik KM, Krishnamurthy M, Alobaidi M, Hussain M, Alam F, Malik G (2020) Automated domain-specific healthcare knowledge graph curation framework: subarachnoid hemorrhage as phenotype. Expert Syst Appl. https://doi.org/10.1016/j.eswa.2019.113120

  8. Spasić I, Zhao B, Jones CB, Button K (2015) KneeTex: an ontology-driven system for information extraction from MRI reports. J Biomed Semantics 6(1):1–26. https://doi.org/10.1186/s13326-015-0033-1

  9. Yao L, Liu H, Liu Y, Li X, Anwar MW (2015) Biomedical named entity recognition based on deep neutral network. Int J Hybrid Inf Technol 8(8):279–288. https://doi.org/10.14257/ijhit.2015.8.8.29

  10. Rotmensch M, Halpern Y, Tlimat A, Horng S, Sontag D (2017) Learning a health knowledge graph from electronic medical records. Sci Rep 7(1):1–11. https://doi.org/10.1038/s41598-017-05778-z

  11. Harnoune A, Rhanoui M, Mikram M, Yousfi S, Elkaimbillah Z, El Asri B (2021) BERT based clinical knowledge extraction for biomedical knowledge graph construction and analysis. Comput Methods Progr Biomed Update 1:100042. https://doi.org/10.1016/j.cmpbup.2021.100042

  12. Nath N, Lee SH, McDonnell M, Lee I (2021) The quest for better clinical word vectors: ontology based and lexical vector augmentation versus clinical contextual embeddings. Comput Biol Med. https://doi.org/10.1016/j.compbiomed.2021.104433

  13. Kamdar MR et al (2020) Text snippets to corroborate medical relations: an unsupervised approach using a knowledge graph and embeddings. AMIA Summits Transl Sci Proc 2020:288–297

  14. Li L et al (2020) Real-world data medical knowledge graph: construction and applications. Artif Intell Med 103:101817. https://doi.org/10.1016/j.artmed.2020.101817

  15. Yuan H, Deng W (2021) Doctor recommendation on healthcare consultation platforms: an integrated framework of knowledge graph and deep learning. Internet Res. https://doi.org/10.1108/INTR-07-2020-0379

  16. Ernst P, Siu A, Weikum G (2015) KnowLife: a versatile approach for constructing a large knowledge graph for biomedical sciences. BMC Bioinformatics. https://doi.org/10.1186/s12859-015-0549-5

  17. Shi L, Li S, Yang X, Qi J, Pan G, Zhou B (2017) Semantic health knowledge graph: semantic integration of heterogeneous medical knowledge and services. Biomed Res Int. https://doi.org/10.1155/2017/2858423

  18. Yu T et al (2017) Knowledge graph for TCM health preservation: design, construction, and applications. Artif Intell Med 77:48–52. https://doi.org/10.1016/j.artmed.2017.04.001

  19. Xia E, Sun W, Mei J, Xu E, Wang K, Qin Y (2018) Mining disease-symptom relation from massive biomedical literature and its application in severe disease diagnosis. AMIA Annu Symp Proc 2018:1118–1126

  20. Tao X et al (2020) Mining health knowledge graph for health risk prediction. World Wide Web. https://doi.org/10.1007/s11280-020-00810-1

  21. Xiu X, Qian Q, Wu S (2020) Construction of a digestive system tumor knowledge graph based on Chinese electronic medical records: development and usability study. JMIR Med Inform. https://doi.org/10.2196/18287

  22. Smith B et al (2007) The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nat Biotechnol 25(11):1251–1255. https://doi.org/10.1038/nbt1346

  23. Dhiman S, Thukral A, Bedi P (2022) OHF: an ontology based framework for healthcare. In: Dev A, Agrawal SS, Sharma A (eds) Artificial intelligence and speech technology: AIST 2021. Communications in computer and information science, vol 1546. Springer, Cham, pp 318–328

  24. Vidhate DA, Kulkarni P (2018) Improved decision making in multiagent system for diagnostic application using cooperative learning algorithms. Int J Inf Technol. https://doi.org/10.1007/s41870-017-0079-7

  25. Gruber T (2009) Definition of ontology. Database systems, pp 10–12

  26. Mohammed O, Benlamri R, Fong S (2012) Building a diseases symptoms ontology for medical diagnosis: an integrative approach. In: 1st international conference on future generation communication technologies, FGCT 2012, pp 104–108. https://doi.org/10.1109/FGCT.2012.6476567

  27. Schriml LM et al (2019) Human Disease Ontology 2018 update: classification, content and workflow expansion. Nucleic Acids Res 47(D1):955–962. https://doi.org/10.1093/nar/gky1032

  28. Langlotz CP (2006) RadLex: a new method for indexing online educational materials. Radiographics. https://doi.org/10.1148/rg.266065168

  29. Rosse C, Mejino JLV (2003) A reference ontology for biomedical informatics: the foundational model of anatomy. J Biomed Inform 36(6):478–500. https://doi.org/10.1016/j.jbi.2003.11.007

  30. Cyganiak R, Wood D, Lanthaler M (2014) RDF 1.1 concepts and abstract syntax. W3C Recommendation

  31. Tjong Kim Sang EF, Buchholz S (2000) Introduction to the CoNLL-2000 shared task: Chunking

  32. Akhil KK, Rajimol R, Anoop VS (2020) Parts-of-Speech tagging for Malayalam using deep learning techniques. Int J Inf Technol. https://doi.org/10.1007/s41870-020-00491-z

  33. Thanawala P, Pareek J (2018) MwTExt: automatic extraction of multi-word terms to generate compound concepts within ontology. Int J Inf Technol. https://doi.org/10.1007/s41870-018-0111-6

  34. Sintayehu H, Lehal GS (2021) Named entity recognition: a semi-supervised learning approach. Int J Inf Technol. https://doi.org/10.1007/s41870-020-00470-4

  35. Johnson AEW et al (2016) MIMIC-III, a freely accessible critical care database. Sci Data 3(1):1–9. https://doi.org/10.1038/sdata.2016.35

  36. Sato K (2012) An inside look at Google BigQuery. Google Inc

  37. Loper E, Bird S (2002) NLTK: the natural language Toolkit. Assoc. Comput. Linguist.

  38. Lamy JB (2017) Owlready: Ontology-oriented programming in Python with automatic classification and high level constructs for biomedical ontologies. Artif Intell Med 80:11–28. https://doi.org/10.1016/j.artmed.2017.07.002

  39. DuCharme B (2010) Learning SPARQL querying and updating with SPARQL 1.1

  40. Zaki MJ, Meira W Jr (2014) Data mining and analysis: fundamental concepts and algorithms, 2nd edn. Cambridge University Press, Cambridge

Author information

Corresponding author

Correspondence to Shivani Dhiman.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Thukral, A., Dhiman, S., Meher, R. et al. Knowledge graph enrichment from clinical narratives using NLP, NER, and biomedical ontologies for healthcare applications. Int. j. inf. tecnol. 15, 53–65 (2023). https://doi.org/10.1007/s41870-022-01145-y

  • DOI: https://doi.org/10.1007/s41870-022-01145-y
