Integrating Patient-Related Entities Using Hospital Information System Data and Automatic Analysis of Free Text

  • Svetla Boytcheva
  • Galia Angelova
  • Zhivko Angelov
  • Dimitar Tcharaktchiev
  • Hristo Dimitrov
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6908)


The article presents research on the secondary use of information about medical entities that are automatically extracted from the free text of hospital patient records. Four extractors that analyse Bulgarian medical texts have been developed to capture patient diagnoses, drugs, lab data and status. An integrated repository has been constructed that comprises the extracted entities and the relevant records of the hospital information system. The repository is further applied in experiments on the discovery of adverse drug events. This paper presents the extractors and the strategy for assigning time anchors to the entities identified in the patient record texts. Evaluation results are summarised, together with application scenarios that use the extraction tools and the resulting integrated repository.


Keywords: automatic information extraction, secondary use of patient records, temporal aspects of data integration



Copyright information

© IFIP International Federation for Information Processing 2011

Authors and Affiliations

  • Svetla Boytcheva (1)
  • Galia Angelova (1)
  • Zhivko Angelov (1)
  • Dimitar Tcharaktchiev (2)
  • Hristo Dimitrov (2)

  1. Institute of Information and Communication Technologies (IICT), Bulgarian Academy of Sciences, Sofia, Bulgaria
  2. University Specialised Hospital for Active Treatment of Endocrinology “Acad. I. Penchev” (USHATE), Medical University, Sofia, Bulgaria
