Medical Entity and Relation Extraction from Narrative Clinical Records in Italian Language

  • Crescenzo Diomaiuta
  • Maria Mercorella
  • Mario Ciampi
  • Giuseppe De Pietro
Conference paper
Part of the Smart Innovation, Systems and Technologies book series (SIST, volume 76)


Applying Natural Language Processing (NLP) techniques makes it possible to unlock valuable information contained in free-text clinical reports. In this paper, we propose a system that annotates medical entities in narrative records. Because existing NLP systems mainly address entity recognition in English, we propose an NLP pipeline for handling clinical free text in Italian. The overall architecture includes a spell checker, a sentence detector, a word tokenizer, a part-of-speech tagger, a dictionary lookup annotator, and a parsing rules annotator. Essentially, it uses a rule-based approach to extract relevant concepts regarding a patient's conditions, administered medications, or performed procedures, detecting their attributes, negated forms, and relation expressions. Indexing the documents allows users to retrieve relevant information and increase their medical knowledge.
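As a rough illustration of the rule-based idea described above, the following Python sketch chains a naive sentence detector, a word tokenizer, a dictionary lookup, and a simple negation rule. It is not the authors' implementation: the term dictionary, the negation cues, the three-token window, and all function names are hypothetical choices made for this example, and the spell checker, part-of-speech tagger, and relation rules are omitted.

```python
import re
from dataclasses import dataclass

# Hypothetical toy dictionary: surface form -> entity type.
# A real system would load terms from a curated medical vocabulary.
DICTIONARY = {
    "febbre": "CONDITION",
    "diabete": "CONDITION",
    "paracetamolo": "MEDICATION",
    "radiografia": "PROCEDURE",
}

# Hypothetical negation cues and look-back window (assumptions for this sketch).
NEGATION_CUES = {"non", "nessuna", "nessun", "senza", "assenza"}
NEGATION_WINDOW = 3


@dataclass
class Entity:
    text: str
    label: str
    negated: bool


def split_sentences(text):
    """Naive sentence detector: split after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    """Naive word tokenizer: lowercase runs of word characters."""
    return re.findall(r"\w+", sentence.lower())


def annotate(text):
    """Return dictionary matches, flagging those preceded by a negation cue."""
    entities = []
    for sentence in split_sentences(text):
        tokens = tokenize(sentence)
        for i, token in enumerate(tokens):
            label = DICTIONARY.get(token)
            if label is None:
                continue
            # Look back a few tokens within the same sentence for a negation cue.
            window = tokens[max(0, i - NEGATION_WINDOW):i]
            negated = any(t in NEGATION_CUES for t in window)
            entities.append(Entity(token, label, negated))
    return entities


if __name__ == "__main__":
    note = "Il paziente non presenta febbre. Somministrato paracetamolo."
    for entity in annotate(note):
        print(entity)
```

Restricting the negation window to a single sentence keeps a cue in one sentence from affecting entities in the next, which is one reason sentence detection precedes the lookup step in a pipeline of this kind.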


Italian natural language processing; Medical entity recognition; Information extraction; Unstructured medical records; UIMA


Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Crescenzo Diomaiuta (1)
  • Maria Mercorella (1, corresponding author)
  • Mario Ciampi (1)
  • Giuseppe De Pietro (1)
  1. National Research Council of Italy, Institute of High Performance Computing and Networking - ICAR, Naples, Italy
