Processing Text in Medical Databases

Computer Medical Databases

Part of the book series: Health Informatics ((HI))

Abstract

In the 1950s, clinical data in patients’ medical records in the United States were recorded mostly as natural-language English text. Physicians wrote notes on paper sheets for a patient’s medical history and physical examination, reported their interpretations of x-ray images and electrocardiograms, and dictated descriptions of medical and surgical procedures. Health-care professionals generally recorded such data as handwritten notes or as dictated reports that were transcribed and typed on paper sheets; these were collated in paper-based charts, which were stored on shelves in the medical record room. Manually retrieving data from paper-based charts was always cumbersome and time-consuming. A frequent additional problem arose when a patient saw more than one physician on the same day in the same medical facility: the paper-based chart was often left in the first doctor’s office and was therefore unavailable to the other physicians, who then had to see the patient without access to any previously recorded information. Pratt (1974) observed that the data a medical professional recorded and collected during the care of a patient were largely non-numeric and, in the United States, were formulated almost exclusively in the English language.
He noted that a word, a phrase, or a sentence in this language was generally understood when spoken or read, and that the marks of punctuation and the order of words in a sentence represented quasi-formal structures that could be analyzed for content according to common rules for: (a) the recognition and validation of the string of language data, a matter of morphology and syntax; (b) the recognition and registration of each datum and of its meaning, a matter of semantics; and (c) the mapping of the recognized, defined syntactic and semantic elements into a data structure that reflected the informational content of the original language data string. He added (d) that these processes required definition and interpretation of the information by the user.
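
As a rough illustration of steps (a) through (c), the following minimal Python sketch tokenizes a sentence, looks each token up in a small lexicon, and maps the recognized terms into a record. The lexicon entries, the category names, and the function names (`tokenize`, `tag`, `to_record`, `process`) are hypothetical and chosen only for this example; they are not taken from Pratt's work, and a real system would need far richer morphology, syntax, and vocabulary.

```python
import re

# Hypothetical, user-defined lexicon (step d): surface term -> semantic category.
# The entries are illustrative assumptions, not a real medical vocabulary.
LEXICON = {
    "pneumonia": "diagnosis",
    "fracture": "diagnosis",
    "lung": "body_site",
    "femur": "body_site",
    "left": "laterality",
    "right": "laterality",
}


def tokenize(sentence):
    """Step (a): recognize the form of the string by splitting it into lowercase word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def tag(tokens):
    """Step (b): register each recognized datum together with its semantic category."""
    return [(tok, LEXICON[tok]) for tok in tokens if tok in LEXICON]


def to_record(tagged):
    """Step (c): map the recognized elements into a data structure keyed by category."""
    record = {}
    for term, category in tagged:
        record.setdefault(category, []).append(term)
    return record


def process(sentence):
    """Run the three steps in order on one sentence."""
    return to_record(tag(tokenize(sentence)))
```

Calling `process("Pneumonia of the right lung.")` would yield a record with one entry each under `diagnosis`, `laterality`, and `body_site`; words absent from the lexicon are simply ignored, which is one reason the user's definition of terms matters so much in this kind of processing.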

References

  • Adams LB. Three surveillance and query languages. MD Comput. 1986;3:11–9.

  • Addison CH, Blackwell PW, Smith WE, et al. GYPSY: General information processing system remote terminal users guide. Information science series, Monograph No. 3, Norman: University of Oklahoma; 1969.

  • Anderson MF, Moazamipour H, Hudson DL, Cohen ME. The role of the Internet in medical decision-making. Int J Med Inform. 1997;47:43–9.

  • Bakken S, Hyun S, Friedman C, Johnson S. A comparison of semantic categories of the ISO reference terminology models for nursing and the MedLEE natural language processing system. Proc MEDINFO. 2004:472–6.

  • Barnett GO, Hoffman PB. Computer technology and patient care; experiences of a hospital research effort. Inquiry. 1968;5:51–7.

  • Barnett GO, Greenes RA, Grossman JM. Computer processing of medical text information. Methods Inf Med. 1969;8:177–82.

  • Barrows RC, Busuioc M, Friedman C. Limited parsing of notational text visit notes: Ad-hoc vs. NLP approaches. Proc AMIA. 2000:51–5.

  • Bishop CW. A name is not enough. MD Comput. 1989;6:200–6.

  • Blois MS. Medical records and clinical data bases: what is the difference. Proc AMIA. 1982:86–9.

  • Blois MS. Information and medicine: the nature of medical descriptions. Berkeley: University of California Press; 1984.

  • Blois MS, Tuttle MS, Sherertz D. RECONSIDER: a program for generating differential diagnoses. Proc SCAMC. 1981:263–8.

  • Borlawsky TB, Li J, Shagina L, et al. Evaluation of an ontology-anchored natural language-based approach for asserting multi-scale biomolecular networks for systems medicine. Proc AMIA CRI. 2010:6–10.

  • Broering NC, Potter J, Mistry P. Linking bibliographic and information databases: an IAIMS prototype. Proc AAMSI. 1987:169–73.

  • Broering NC, Bagdoyan H, Hylton J, Strickler J. BioSYNTHESIS: integrating multiple databases into a virtual database. Proc SCAMC. 1989:360–4.

  • Buck ER, Reese GR, Lindberg DAB. A general technique for computer processing of coded patient diagnoses. Mo Med. 1966;68:276–9, 285.

  • Campbell KE, Cohn SP, Chute CG, et al. Galapagos: computer-based support for evolution of a convergent medical terminology. Symp AMIA. 1996:26–273.

  • Campbell KE, Cohn SP, Chute CG, et al. Scalable methodologies for distributed development of logic-based convergent medical terminology. Methods Inf Med. 1998;37:426–39.

  • Campion TR, Weinberg ST, Lorenzi NM, Waltman LR. Evaluation of computerized free-text sign-out notes. Appl Clin Inform. 2010;1:304–17.

  • Cao H, Chiang MF, Cimino J, Friedman C, Hripcsak G. Automatic summarization of patient discharge summaries to create problem lists using medical language processing. Proc MEDINFO. 2004:1540.

  • Chamberlin DD, Boyce RF. SEQUEL: a structured English query language. Proc ACM SIGFIDET workshop on data description, access and control. 1974:249–64.

  • Chen ES, Hripcsak G, Friedman C. Disseminating natural language processed clinical narratives. Proc AMIA Annu Symp. 2006:126–30.

  • Chen ES, Hripcsak G, Xu H, et al. Automated acquisition of disease-drug knowledge from biomedical and clinical documents: an initial study. J Am Med Inform Assoc. 2008;15:87–98.

  • Childs LC, Enelow R, Simonsen L, et al. Description of a rule-based system for the i2b2 challenge in natural language processing for clinical data. J Am Med Inform Assoc. 2009;16:571–5.

  • Chueh H, Murphy S. The i2b2 (informatics for integrating biology and the bedside) hive and the clinical research chart. https://www.i2b2.org 2006:1–58.

  • Chute CG. The Copernican era of healthcare terminology: a re-centering of health information systems. Proc AMIA. 1998:68–73.

  • Chute CG. The journey of meaningful use. In Interoperability Reviews, AMIA The Standards Standard 2010;1:3–4.

  • Chute CG, Crowson DL, Buntrock JD. Medical information retrieval and WWW browsers at Mayo. Proc AMIA. 1995:903–7.

  • Chute CG, Elkin PL, Sherertz DD, Tuttle MS. Desiderata for a clinical terminology server. Proc AMIA. 1999:42–6.

  • Cimino JJ. Linking patient information systems to bibliographic resources. Methods Inf Med. 1996;35:122–6.

  • Cimino JJ. Desiderata for controlled medical vocabularies in the twenty-first century. Methods Inf Med. 1998;37:394–403.

  • Cimino JJ. From data to knowledge through concept-oriented terminologies. J Am Med Inform Assoc. 2000;7:288–97.

  • Cimino JJ, Barnett GO. Automated translation between medical terminologies using semantic definitions. MD Comput. 1990;7:104–9.

  • Cimino JJ, Aguirre A, Johnson SB, Peng P. Generic queries for meeting clinical information needs. Bull Med Libr Assoc. 1993;81:195–205.

  • Cimino JJ, Clayton PD, Hripcsak G, Johnson SB. Knowledge-based approaches to the maintenance of a large controlled medical terminology. J Am Med Inform Assoc. 1994;1:35–50.

  • Cimino JJ, Socratous SA, Grewal R. The informatics superhighway: prototyping on the World Wide Web. Proc SCAMC. 1995:111–5.

  • Codd EF. A relational model of data for large shared data banks. Commun ACM. 1970;13:377–87.

  • Codd EF, Codd SB, Salley CT. Providing OLAP (On-line analytical processing) to user-analysts: an IT Mandate. San Jose: Codd & Date, Inc.; 1993.

  • Connolly TM, Begg CE. Database management systems: a practical approach to design, implementation, and management. 2nd ed. New York: Addison-Wesley; 1999.

  • Cote RA. The SNOP-SNOMED concept: evolution towards common medical nomenclature and classification. Pathologist. 1977;31:383–9.

  • Cote RA. Architecture of SNOMED, its contribution to medical language processing. Proc SCAMC. 1986:74–84.

  • Cousins SB, Silverstein JC, Frisse ME. Query networks for medical information retrieval – assigning probabilistic relationships. Proc SCAMC. 1990:800–4.

  • Das AK, Musen MA. A comparison of the temporal expressiveness of three database query methods. Proc AMIA. 1995:331–7.

  • Demuth AI. Automated ICD-9-CM coding: an inevitable trend to expert systems. Health Care Commun. 1985;2:62–5.

  • Denny JC, Ritchie MD, Basford MA, et al. PheWAS: Demonstrating the feasibility of a phenome-wide scan to discover gene-disease associations. Bioinformatics. 2010;26:1205–10.

  • Dolin RH, Spackman K, Abilla A, et al. The SNOMED RT procedure model. Proc AMIA. 2001:139–43.

  • Dolin RH, Mattison JE, Cohn S, et al. Kaiser Permanente’s convergent medical terminology. Proc MEDINFO. 2004:346–50.

  • Doszkocs TE. CITE NLM: natural-language searching in an online catalog. Inf Technol Libr. 1983;2:364–80.

  • Dozier JA, Hammond WE, Stead WW. Creating a link between medical and analytical databases. Proc SCAMC. 1985:478–82.

  • Eden M. Storage and retrieval of the results of clinical research. Proc IRE Trans Med Electronics (ME-7). 1960:265–8.

  • Enlander D. Computer data processing of medical diagnoses in pathology. Am J Clin Pathol. 1975;63:538–44.

  • Farrington JF. CPT-4: a computerized system of terminology and coding. In: Emlet HE, editor. Challenges and prospects for advanced medical systems. Miami: Symposia Specialists; 1978. p. 147–50.

  • Feinstein AR. Unsolved scientific problems in the nosology of clinical medicine. Arch Int Med. 1988;148:2269–74.

  • Forman BH, Cimino JJ, Johnson SB, et al. Applying a controlled terminology to a distributed, production clinical information system. Proc AMIA. 1995:421–5.

  • Friedman C. Towards a comprehensive medical language processing system: methods and issues. Proc AMIA. 1997:595–9.

  • Friedman C. A broad-coverage natural language processing system. Proc AMIA. 2000:270–4.

  • Friedman C, Hripcsak G. Evaluating natural language processors in the clinical domain. Methods Inf Med. 1998;37:334–44.

  • Friedman C, Hripcsak G. Natural language processing and its future in medicine: can computers make sense out of natural language text. Acad Med. 1999;74:890–5.

  • Friedman C, Johnson SB. Medical text processing: past achievements, future directions. Chap 13. In: Ball MJ, Collen MF, editors. Aspects of the computer-based patient record. New York: Springer; 1992. p. 212–28.

  • Friedman C, Alderson PO, Austin JHM, et al. A general natural-language text processor for clinical radiology. J Am Med Inform Assoc. 1994;1:161–74.

  • Friedman C, Hripcsak G, DuMouchel W, et al. Natural language processing in an operational clinical information system. Nat Lang Eng. 1995a;1:83–108.

  • Friedman C, Johnson SB, Forman B, Starren J. Architectural requirements for a multipurpose natural language processor in the clinical environment. Proc AMIA. 1995b:347–51.

  • Friedman C, Shagina L, Socratous S, Zeng X. A WEB-based version of MedLEE: a medical language extraction and encoding system. Proc AMIA. 1996:938.

  • Friedman C, Hripcsak G, Shablinsky I. An evaluation of natural language processing methodologies. Proc AMIA. 1998b:855–9.

  • Friedman C, Shagina L, Lussier Y, Hripcsak G. Automated encoding of clinical documents based on natural language processing. J Am Med Inform Assoc. 2004;11:392–402.

  • Frisse ME. Digital libraries and information retrieval. Proc AMIA. 1996:320–2.

  • Frisse ME, Cousins SB. Query by browsing: an alternative hypertext information retrieval method. Proc SCAMC. 1989:388–91.

  • Fusaro VA, Kos PJ, Tector M, et al. Electronic medical record analysis using cloud computing. Proc AMIA CRI. 2010:90.

  • Gabrieli ER. The medicine-compatible computer: a challenge for medical informatics. Methods Inf Med. 1984;9:233–50.

  • Gabrieli ER. Computerizing text from office records. MD Comput. 1987;4:444–9.

  • Gainer V, Goryachev S, Zeng Q, et al. Using derived concepts from electronic medical records for discovery research in informatics for integrating biology and the bedside (i2b2). Proc AMIA TBI. 2010:91.

  • Gantner GE. SNOMED: the Systematized Nomenclature of Medicine as an ideal standard language for computer applications in medical care. Proc SCAMC. 1980:1224–6.

  • Goldstein L. MEDUS/A: a high-level database management system. Proc SCAMC. 1980:1653–60.

  • Gordon BL. Standard medical terminology. JAMA. 1965;191:311–3.

  • Gordon BL. Biomedical language and format for manual and computer applications. Dis Chest. 1968;53:38–42.

  • Gordon BL. Terminology and content of the medical record. Comput Biomed Res. 1970;3:436–44.

  • Gordon BL. Linguistics for medical records. In: Driggs MF, editor. Problem-directed and medical information systems. New York: Intercontinental Medical Book Co; 1973. p. 5–13.

  • Graepel PH. Manual and automatic indexing of the medical record: categorized nomenclature (SNOP) versus classification (ICD). Med Inform. 1976;1:77–86.

  • Graepel PH, Henson DE, Pratt AW. Comments on the use of Systematized Nomenclature of Pathology. Methods Inf Med. 1975;14:72–5.

  • Grams RR, Jin ZM. The natural language processing of medical databases. J Med Syst. 1989;2:79–87.

  • Hammond WE, Straube MJ, Blunden PB, Stead WW. Query: the language of databases. Proc SCAMC. 1989:419–23.

  • Haug PJ, Warner HR. Decision-driven acquisition of qualitative data. Proc SCAMC. 1984:189–92.

  • Haug PJ, Gardner RM, Tate KE, et al. Decision support in medicine: examples from the HELP System. Comput Biomed Res. 1994;27:396–418.

  • Hendrix GG, Sacerdoti ED. Natural language processing; the field in perspective. Byte. 1981;6:304–52.

  • Henkind SJ, Benis AM, Teichholz LE. Quantification as a means to increase the utility of nomenclature-classification systems. Proc MEDINFO. 1986:858–61.

  • Hersh WR. Informatics retrieval at the millennium. Proc AMIA. 1998:38–45.

  • Hersh WR, Donohoe LC. SAPHIRE International: a tool for cross-language information retrieval. Proc AMIA. 1998:673–7.

  • Hersh WR, Greenes RA. SAPHIRE – An information retrieval system featuring concept matching, automatic indexing, probabilistic retrieval, and hierarchical relationships. Comput Biomed Res. 1990;23:410–25.

  • Hersh WR, Hickam D. Information retrieval in medicine: the SAPHIRE experience. Proc MEDINFO. 1995:1433–7.

  • Hersh WR, Leone TJ. The SAPHIRE server: a new algorithm and implementation. Proc AMIA. 1995:858–63.

  • Hersh WR, Pattison-Gordon E, Evans DA. Adaptation of Meta-1 for SAPHIRE, A general purpose information retrieval program. Proc SCAMC. 1990b:156–60.

  • Hersh WR, Campbell EH, Evans DA, Brownlow ND. Empirical, automated vocabulary discovery using large text corpora and advanced natural language processing tools. Proc AMIA. 1996a:159–63.

  • Hersh WR, Brown KE, Donohoe LC, et al. CliniWeb: managing clinical information on the World Wide Web. JAMIA. 1996b;3(4):273–80.

  • Himes BE, Kohane IS, Ramoni MF, Weiss ST. Characterization of patients who suffer asthma using data extracted from electronic medical records. Proc AMIA Ann Symp. 2008:308–12.

  • Hogan WR, Wagner MM. Free-text fields change the meaning of coded data. Proc AMIA. 1996:517–21.

  • Hogarth MA, Gertz M, Gorin FA. Terminology query language: a server interface for concept-oriented terminology systems. Proc AMIA. 2000:349–53.

  • Hripcsak G, Friedman C, Alderson PO, et al. Unlocking clinical data from narrative reports: a study of natural language processing. Ann Intern Med. 1995;122:681–8.

  • Hripcsak G, Allen B, Cimino JJ, Lee R. Access to data: comparing AccessMed with Query by Review. J Am Med Inform Assoc. 1996;3:288–99.

  • Humphreys BL. De facto, de rigeur, and even useful: standards for the published literature and their relationship to medical informatics. Proc SCAMC. 1990:2–8.

  • Humphreys BL, Lindberg DAB. Building the unified medical language. Proc SCAMC. 1989:475–80.

  • Jacobs H. A natural language information retrieval system. Proc 8th IBM Med Symp; Poughkeepsie; 1967:47–56.

  • Jacobs H. A natural language information retrieval system. Methods Inf Med. 1968;7:8–16.

  • Johnson SB. Conceptual graph grammar – a simple formalism for sublanguage. Methods Inf Med. 1998;37:345–52.

  • Johnson SB, Friedman C. Integrating data from natural language processing into a clinical information system. Proc AMIA. 1996:537–41.

  • Johnson SB, Aguirre A, Peng P, Cimino J. Interpreting natural language queries using the UMLS. Proc AMIA. 1994:294–8.

  • Johnson KB, Rosenbloom ST, et al. Computer-based documentation: past, present, and future, Chap 14. In: Lehman HP, Abbott PA, Roderer NK, editors. Aspects of electronic health record systems. 2nd ed. 2006. p. 309–28.

  • Johnston HB, Higgins SB, Harris TR, Lacy WW. The effect of a CLINFO management and analysis system on clinical research. Proc MEDCOMP. IEEE, 1982a:517–8.

  • Johnston HB, Higgins SB, Harris TR, Lacy WW. Five years experience with the CLINFO data base management and analysis system. Proc SCAMC. 1982b:833–6.

  • Karpinski RHS, Bleich HL. MISAR: a miniature information storage and retrieval system. Comput Biomed Res. 1971;4:655–71.

  • Katz B. Clinical research system. MD Comput. 1986;3:53–5, 61.

  • Kementsietsidis A, Lipyeow L, Wang M. Profile-based retrieval of records in medical databases. Proc AMIA Annu Symp. 2009:312–6.

  • Kent A. Computers and biomedical information storage and retrieval. JAMA. 1966;196:927–32.

  • King C, Strong RM, Dovovan K. MEDUS/A: 1983 status of a database system for research and patient care. Proc SCAMC. 1983a:709–11.

  • King C, Strong RM, Goldstein L. MEDUS/A: Distributing database management for research and patient data. Proc SCAMC. 1988:818–26.

  • Kingsland LC. RDBS: Research data base system for microcomputers; coding techniques and file structures. Proc AAMSI Conf. 1982:85–9.

  • Korein J. The computerized medical record. The variable-field-length format system and its applications. Proc IFIPS TCH Conf. 1970:259–91.

  • Korein J, Tick L, Woodbury MA, et al. Computer processing of medical data by variable-field-length format. JAMA. 1963;186:132–8.

  • Korein J, Goodgold AJ, Randt CT. Computer processing of medical data by variable-field-length format. II: progress and application to narrative documents. JAMA. 1966;196:950–6.

  • Lacson R, Long W. Natural language processing of spoken diet records. Proc AMIA Annu Symp. 2006:454–8.

  • Lamson BG, Glinsky BC, Hawthorne GS, et al. Storage and retrieval of uncoded tissue pathology diagnoses in the original English free-text. Proc 7th IBM Med Symp; Poughkeepsie; 1965:411–26.

  • Layard MW, McShane DJ. Applications of MEDLOG, A microprocessor-based system for time-oriented clinical data. Proc SCAMC. 1983:731–4.

  • Levy AH, Lawrance DP. Information retrieval, Chap 7. In: Ball MJ, Collen MF, editors. Aspects of the computer-based patient record. New York: Springer; 1992. p. 146–52.

  • Levy C, Rogers E. Clinician oriented access to data – C.O.A.D. A natural language interface to a VA DHCP database. Proc AMIA. 1995:933.

  • Lincoln TL, Groner GF, Quinn JJ, Lukes RJ. The analysis of functional studies in acute lymphatic leukemia using CLINFO – A small computer information and analysis system for clinical investigators. Med Inform. 1976;1:95–103.

  • Lindberg DAB. The computer and medical care. Springfield: Charles C. Thomas; 1968.

  • Lindberg DAB, Rowland LR, Bush WF, et al. CONSIDER: a computer program for medical instruction. 9th IBM Med Symp. 1968:59–61.

  • Logan JR, Britell S, Delcambre LM, et al. Representing multi-database study schemas for reusability. Proc STB. 2010:21–5.

  • Lupovitch A, Memminger JJ, Corr RM. Manual and computerized cumulative reporting systems for the clinical microbiology laboratory. Am J Clin Pathol. 1979;72:841–7.

  • Lussier YA, Rothwell DJ, Cote RA. The SNOMED model: a knowledge source for the controlled terminology of the computerized patient record. Methods Inf Med. 1998;37:161–4.

  • Lussier Y, Borlawsky T, Rappaport D, et al. PHENOGO: assigning phenotypic context to gene ontology annotations with natural language processing. Pac Symp Biocomput. 2006;11:64–75.

  • Lyman M, Sager N, Friedman C, Chi E. Computer-structured narrative in ambulatory care: its use in longitudinal review of clinical data. Proc SCAMC. 1985:82–6.

  • Mabry JC, Thompson HK, Hopwood MD, Baker WR. A prototype data management and analysis system (CLINFO): system description and user experience. Proc MEDINFO. 1977:71–5.

  • Mays E, Weida R, Dionne R, et al. Scalable and expressive medical terminologies. Proc AMIA. 1996:259–63.

  • McCormick BH, Chang SK, Boroved RT, et al. Technological trends in clinical information systems. Proc MEDINFO. 1977:43–8.

  • McCormick PJ, Elhadad N, Stetson PD, et al. Use of semantic features to classify patient smoking status. Proc AMIA. 2008:450–4.

  • McCray AT. The nature of lexical knowledge. Methods Inf Med. 1998;37:353–60.

  • McCray AT, Sponsler JL, Brylawski B, Browne AC. The role of lexical knowledge in biomedical text understanding. Proc SCAMC. 1987:103–7.

  • McCray AT, Bodenreider O, Malley JD, Browne AC. Evaluating UMLS strings for natural language processing. Proc AMIA. 2001:448–52.

  • McDonald CJ. Protocol-based computer reminders, the quality of care and the non-perfectibility of man. N Engl J Med. 1976;295:1351–5.

  • McDonald CJ, Blevens L, Glazener T, et al. Data base management, feedback control and the Regenstrief medical record. Proc SCAMC. 1982:52–60.

  • Melski JW, Geer DE, Bleich HL. Medical information storage and retrieval using preprocessed variables. Comput Biomed Res. 1978;11:613–21.

  • Mendonca EA, Cimino JJ, Johnson SB, Seol YH. Accessing heterogeneous sources of evidence to answer clinical questions. J Biomed Inform. 2001;34:85–98.

  • Meystre S, Haug PJ. Medical problem and document model for natural language understanding. Proc AMIA Ann Symp. 2003:455–9.

  • Meystre SM, Haug PJ. Comparing natural language processing tools to extract medical problems from narrative text. Proc AMIA Annu Symp. 2005:525–9.

  • Meystre SM, Deshmukh VG, Mitchell J. A clinical use case to evaluate the i2b2 Hive: predicting asthma exacerbations. Proc AMIA Annu Symp. 2009:442–6.

  • Miller PB, Strong RM. Clinical care and research using MEDUS/A, a medically oriented data base management system. Proc SCAMC. 1978:288–97.

  • Miller RA, Kapoor WN, Peterson J. The use of relational databases as a tool for conducting clinical studies. Proc SCAMC. 1983:705–8.

  • Mirel BR, Wright DZ, Tenenbaum JD, et al. User requirements for exploring a resource inventory for clinical research. Proc AMIA CRI. 2010:31–5.

  • Morgan MM, Beaman PD, Shusman DL, et al. Medical query language. Proc SCAMC. 1981:322–5.

  • Mullins HC, Scanland PM, Collins D, et al. The efficacy of SNOMED, Read Codes, and UMLS in coding ambulatory family practice clinical records. Proc AMIA. 1996:135–9.

  • Munoz F, Hersh W. MCM Generator: a Java-based tool for generating medical metadata. Proc AMIA. 1998:648–52.

  • Murphy SN, Morgan MM, Barnett GO, Chueh HC. Optimizing healthcare research data warehouse design through a past COSTAR query analysis. Proc AMIA. 1999:892–6.

  • Murphy SN, Mendis M, Hackett K, et al. Architecture of the open-source clinical research chart from informatics for integrating biology and the bedside. Proc AMIA. 2007:548–52.

  • Myers J, Gelblat M, Enterline HT. Automatic encoding of pathology data. Arch Pathol. 1970;89:73–8.

  • Nelson S, Hoffman S, Karnekal H, Varma A. Making the most of RECONSIDER; an evaluation of input strategies. Proc SCAMC. 1983:852–5.

  • Nielson J, Wilcox A. Linking structured text to medical knowledge. Proc MEDINFO. 2004:1777.

  • Nigrin DJ, Kohane IS. Scaling a data retrieval and mining application to the enterprise-wide level. Proc AMIA. 1999:901–5.

  • NIH-DRR: General Clinical Research Centers, A Research Resources Directory, seventh revised edition. Bethesda: Division of Res Resources, NIH; 1988.

  • Niland JC, Rouse L, et al. Clinical research needs, Chap 3. In: Lehman HP, Abbott PA, Roderer NK, editors. Aspects of electronic health record systems. New York: Springer; 2006. p. 31–46.

  • Nunnery AW. A medical information storage and statistical system (MICRO-MISSY). Proc SCAMC. 1984:383–5.

  • O’Connor MJ, Samson W, Musen MA. Representation of temporal indeterminacy in clinical databases. Proc AMIA Symp. 2000:615–9.

  • Obermeier KK. Natural-language processing, an introductory look at some of the technology used in this area of artificial intelligence. BYTE. 1987;12:225–32.

  • Okubo RS, Russell WS, Dimsdale B, Lamson BG. Natural language storage and retrieval of medical diagnostic information. Comput Programs Biomed. 1975;75:105–30.

  • Oliver DE, Barnes MR, Barnett GO, et al. InterMed: an Internet-based medical collaboratory. Proc AMIA. 1995:1023.

  • Oliver DE, Shortliffe EH, et al. Collaborative model development for vocabulary and guidelines. Proc AMIA. 1996:826.

  • Olson NE, Sherertz DD, Erlbaum MS, et al. Explaining your terminology to a computer. Proc AMIA. 1995:957.

  • Ozbolt JG, Russo M, Stultz MP. Validity and reliability of standard terms and codes for patient care data. Proc AMIA. 1995:37–41.

  • Pendse N. Online analytical processing. Wikipedia. Retrieved in 2008. http://en.wikipedia.org/wiki/Online_analytical_processing.

  • Porter D, Safran C. On-line searches of a hospital data base for clinical research and patient care. Proc SCAMC. 1984:277–9.

  • Powsner SM, Barwick KW, Morrow JS, et al. Coding semantic relationships for medical bibliographic retrieval: a preliminary study. Proc SCAMC. 1987:108–12.

  • Prather JC, Lobach DF, Hales JW, et al. Converting a legacy system database into relational format to enhance query efficiency. Proc SCAMC. 1995:372–6.

  • Pratt AW. Automatic processing of pathology data. Journees D’Informatique Medicale. 1971:595–609.

  • Pratt AW. Medicine, computers, and linguistics. In: Brown JHU, Dickson JF, editors. Biomedical engineering. New York: Academic; 1973. p. 97–140.

  • Pratt AW. Medicine and linguistics. MEDINFO. 1974:5–11.

  • Pratt AW. Representation of medical language data utilizing the Systematized Nomenclature of Pathology. In: Enlander D, editor. Computers in laboratory medicine. New York: Academic; 1975. p. 42–53.

  • Pratt AW, Pacak M. Identification and transformation of terminal morphemes in medical English. Methods Inf Med. 1969;8:84–90.

  • Pratt AW, Pacak M. Automatic processing of medical English. Preprint No. 11, Classification: IR 3.4. Reprinted by USHEW, NIH. 1969b.

  • Price SL, Hersh WR, Olson DD, et al. SmartQuery: context-sensitive links to medical knowledge sources from the electronic patient record. Proc AMIA. 2002:627–31.

  • Pryor DB, Stead WW, Hammond WE, et al. Features of TMR for a successful clinical and research database. Proc SCAMC. 1982:79–83.

  • Ranum DL. Knowledge based understanding of radiology text. Proc SCAMC. 1988:141–5.

  • Robinson RE. Acquisition and analysis of narrative medical record data. In: Collen MF, editor. Proceedings of the Conference on Med Inform Systems. Rockville: NCHSR&D; 1970. p. 111–27.

  • Robinson RE. Pathology subsystem. In: Collen MF, editor. Hospital computer systems. New York: Wiley; 1974. p. 194–205.

  • Robinson RE. Surgical pathology information processing system. In: Coulson WF, editor. Surgical pathology. Philadelphia: JB Lippincott; 1978. p. 1–20.

  • Roper WL. From the Health Care Financing Administration. JAMA. 1989;261:1550.

  • Roper WL, Winkenwerder W, Hackbarth GM, Krakauer H. Effectiveness in health care: an initiative to evaluate and improve medical practice. N Engl J Med. 1988;319:1197–202.

  • Rothwell DJ, Cote RA. Optimizing the structure of a standardized vocabulary. Proc SCAMC. 1990:181–4.

  • Rothwell DJ, Cote RA. Managing information with SNOMED: Understanding the model. Proc AMIA. 1996:80–3.

  • Safran C, Porter D. New uses of a large clinical data base, Chap 7. In: Orthner HF, Blum BI, editors. Implementing health care systems. New York: Springer; 1989. p. 123–32.

  • Safran C, Rury C, Lightfoot J, Porter D. CLINQUERY: a program that allows physicians to search a large clinical database. Proc MEDINFO. 1989a:966–70.

  • Safran C, Porter D, Lightfoot J, et al. ClinQuery: a system for online searching of data in a teaching hospital. Ann Intern Med. 1989b;111:751–6.

  • Sager N, Hirschman L. Computerized language processing for multiple use of narrative discharge summaries. Proc SCAMC. 1978:330–43.

  • Sager N, Kosaka M. A database of literature organized by relations. Proc SCAMC. 1983:692–5.

  • Sager N, Tick L, Story G, Hirschman L. A CODASYL-type schema for natural language medical records. Proc SCAMC. 1980:1027–33.

  • Sager N, Bross IDJ, Story G, et al. Automatic encoding of clinical narrative. Comput Biol Med. 1982a;12:43–55.

  • Sager N, Chi EC, Tick LJ, Lyman M. Relational database design for computer-analyzed medical narrative. Proc SCAMC. 1982b:797–804.

  • Sager N, Friedman C, Lyman MS, et al. The analysis and process of clinical narrative. Proc MEDINFO. 1986:1101–5.

  • Sager N, Lyman M, Bucknall C, et al. Natural language processing and the representation of clinical data. JAMIA. 1994;1:142–60.

  • Sager N, Nhan NT, Lyman M, Tick LJ. Medical language processing with SGML display. Proc AMIA. 1996:547–51.

  • Schoch NA, Sewell W. The many faces of natural language searching. Proc AMIA. 1995:914.

  • Seol YH, Johnson HB, Cimino JJ. Conceptual guidelines in information retrieval. Proc AMIA. 2001:1026.

  • Shapiro AR. Exploratory analysis of the medical record. Proc SCAMC. 1982:781–5.

  • Shusman DJ, Morgan MM, Zielstorff R, Barnett GO. The medical query language. Proc SCAMC. 1983:742–5.

  • Sim I, Carini S, Tu S, Wynden R, et al. The human studies database project: Federating human studies design data using the ontology of clinical research. Proc AMIA CRI. 2010:51–5.

  • Smith JW, Svirbely JR. Laboratory information systems. MD Comput. 1988;5:38–47.

  • Spackman KA. Rates of change in a large clinical terminology: three years experience with SNOMED clinical terms. Proc AMIA. 2005:714–8.

  • Spackman KA, Hersh WR. Recognizing noun phrases in medical discharge summaries: an evaluation of two natural language parsers. Proc AMIA. 1996:155–8.

  • Stearns MQ, Price C, Spackman KA, Wang AY. SNOMED clinical terms: overview of the development process and project status. Proc AMIA. 2001:662–6.

  • Story G, Hirschman L. Data base design for natural language medical data. J Med Syst. 1982;6:77–88.

  • Tatch D. Automatic encoding of medical diagnoses. Proc 6th IBM Med Symp. Poughkeepsie;1964:1–7.

  • Thompson HK, Baker WR, Christopher TG, et al. CLINFO, a research data management and analysis system acceptable to physician users. Proc SCAMC. 1977:140–2.

  • Tuttle MS, Campbell KE, Olson NE, et al. Concept, code, term and word: preserving the distinctions. Proc AMIA. 1995:956.

  • Wang X, Chused A, Elhadad N, et al. Automated knowledge acquisition from clinical narrative reports. Proc AMIA Symp. 2008:783–7.

  • Wang L, Wang G, Shi X, et al. User experience evaluation of Google search for obtaining medical knowledge: a case study. Proc AMIA STB. 2010:120.

  • Ward RE, MacWilliam CH, Ye E, et al. Development and multi-institutional implementation of coding and transmission standards for health outcomes data. Proc AMIA. 1996:438–42.

  • Ware H, Mullett CJ, Jagannathan V. Natural language processing framework to assess clinical conditions. J Am Med Inform Assoc. 2009;16:585–9.

  • Warner HR, Guo D, Mason C, et al. Enroute toward a computer based patient record: the ACIS project. Proc AMIA. 1995:152–6.

  • Webster S, Morgan M, Barnett GO. Medical Query Language: improved access to MUMPS databases. Proc SCAMC. 1987:306–9.

  • Wells AH. The conversion of SNOP to the computer languages of medicine. Pathologists. 1971;25:371–8.

  • Weyl S, Fries J, Wiederhold G, Germano F. A modular self-describing clinical databank system. Comput Biomed Res. 1975;8:279–93.

  • Whitehead SF, Streeter M. CLINFO – a successful technology transfer. Proc SCAMC. 1984:557–60.

  • Whiting-O’Keefe Q, Strong PC, Simborg DW. An automated system for coding data from summary time oriented record (STOR). Proc SCAMC. 1983:735–7.

  • Williams GZ, Williams RL. Clinical laboratory subsystem. In: Collen MF, editor. Hospital computer systems. New York: Wiley; 1974. p. 148–93.

  • Wynden R. Providing a high security environment for the Integrated Data Repository lead institution. Proc AMIA STB. 2010:123.

  • Wynden R, Weiner MG, Sim I, et al. Ontology mapping and data discovery for the translational investigator. Proc AMIA STB. 2010:66–70.

  • Xu H, Friedman C. Facilitating research in pathology using natural language processing. Proc AMIA. 2003:1057.

  • Yang H, Spasic I, Keane JA, Nenadic G. A text mining approach to the prediction of disease status from clinical discharge summaries. J Am Med Inform Assoc. 2009;16:596–600.

  • Yianilos PN, Harbort RA, Buss SR, Tuttle EP. The application of a pattern matching algorithm to searching medical record text. Proc SCAMC. 1978:308–13.

  • Zacks MP, Hersh WR. Developing search strategies for detecting high quality reviews in a hypertext test collection. Proc AMIA. 1998:663–7.

  • Zeng Q, Cimino JJ. Mapping medical vocabularies to the Unified Medical Language System. Proc AMIA. 1996:105–9.

  • Zeng Q, Cimino JJ. Evaluation of a system to identify relevant patient information and its impact on clinical information retrieval. Proc AMIA. 1999:642–6.

  • Zhang G, Siegler T, Saxman P, et al. VISAGE: A query interface for clinical research. Proc AMIA CRI. 2010:76–80.

  • Zhou L, Tao Y, Cimino JJ, et al. Terminology model discovery using natural language processing and visualization techniques. J Biomed Inform. 2006;39:626–36.

Copyright information

© 2012 Springer-Verlag London Limited

About this chapter

Cite this chapter

Collen, M.F. (2012). Processing Text in Medical Databases. In: Computer Medical Databases. Health Informatics. Springer, London. https://doi.org/10.1007/978-0-85729-962-8_3

  • DOI: https://doi.org/10.1007/978-0-85729-962-8_3

  • Publisher Name: Springer, London

  • Print ISBN: 978-0-85729-961-1

  • Online ISBN: 978-0-85729-962-8

  • eBook Packages: Medicine, Medicine (R0)