Information Retrieval System for Medical Narrative Reports

  • Lior Rokach
  • Oded Maimon
  • Mordechai Averbuch
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3055)


This paper presents a novel information retrieval system designed specifically for medical case-finding applications. The proposed system begins by extracting medical information from free-text narrative reports and storing it in a predefined relational clinical data mart. The extraction is performed using a medical thesaurus and regular-expression pattern matching. Following the extraction phase, inclusion/exclusion criteria are provided to the system through a physician-friendly user interface. The system converts the entered criteria into a single SQL command, which can then be executed on the relational data mart. To achieve the response time required for on-line analysis, the system implements several caching mechanisms. The proposed system has been evaluated on a real-world database, and its performance has been compared with the results obtained manually by a physician. The comparison indicates that the proposed system can be used for non-critical case-finding applications, such as finding appropriate patients for clinical trials.
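The pipeline described above (thesaurus-driven regular-expression extraction, storage in a relational data mart, and translation of inclusion/exclusion criteria into a single SQL command) can be illustrated with a minimal sketch. This is not the authors' implementation: the three-entry thesaurus, the `patient` and `finding` tables, and the function names `extract_concepts`, `build_query`, and `find_cases` are all assumptions made for illustration, and the sketch ignores negation, sentence boundaries, and caching.

```python
import re
import sqlite3

# Hypothetical mini-thesaurus: regex of surface forms -> canonical concept.
# A real system would draw these from a medical thesaurus.
THESAURUS = {
    r"heart failure|chf": "heart failure",
    r"atrial fibrillation|afib": "atrial fibrillation",
    r"diabetes(?: mellitus)?": "diabetes mellitus",
}


def extract_concepts(report_text):
    """Return the set of canonical concepts whose surface forms appear in the text."""
    found = set()
    for pattern, concept in THESAURUS.items():
        if re.search(r"\b(?:" + pattern + r")\b", report_text, re.IGNORECASE):
            found.add(concept)
    return found


def build_query(include, exclude):
    """Turn inclusion/exclusion concept lists into one parameterized SQL command."""
    subquery = (
        "(SELECT 1 FROM finding f "
        "WHERE f.patient_id = p.patient_id AND f.concept = ?)"
    )
    clauses, params = [], []
    for concept in include:
        clauses.append("EXISTS " + subquery)
        params.append(concept)
    for concept in exclude:
        clauses.append("NOT EXISTS " + subquery)
        params.append(concept)
    where = " AND ".join(clauses) or "1=1"
    sql = "SELECT p.patient_id FROM patient p WHERE " + where + " ORDER BY p.patient_id"
    return sql, params


def find_cases(reports, include, exclude):
    """Load extracted concepts into an in-memory data mart and run the criteria query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE patient (patient_id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE finding (patient_id INTEGER, concept TEXT)")
    for patient_id, text in reports.items():
        conn.execute("INSERT INTO patient VALUES (?)", (patient_id,))
        conn.executemany(
            "INSERT INTO finding VALUES (?, ?)",
            [(patient_id, concept) for concept in extract_concepts(text)],
        )
    sql, params = build_query(include, exclude)
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    reports = {
        1: "Patient admitted with CHF and diabetes mellitus.",
        2: "History of atrial fibrillation; heart failure diagnosed 2001.",
        3: "Routine check-up, no complaints.",
    }
    # Patients with heart failure but without diabetes mellitus.
    print(find_cases(reports, ["heart failure"], ["diabetes mellitus"]))
```

Using one `EXISTS` or `NOT EXISTS` subquery per criterion keeps the whole request in a single SQL statement, which is the property the abstract emphasizes; other formulations (joins, set operations) would serve equally well in a sketch like this.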


Regular Expression, Information Retrieval System, Medical Concept, Sentence Boundary, Data Mart
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Lior Rokach (1)
  • Oded Maimon (1)
  • Mordechai Averbuch (2)
  1. Department of Industrial Engineering, Tel-Aviv University, Tel-Aviv, Israel
  2. Tel-Aviv Sourasky Medical Center and Faculty of Medicine, Tel-Aviv University, Tel-Aviv, Israel
