Evaluation

Part of the Health Informatics book series (HI)

Abstract

This chapter focuses on the evaluation of operational biomedical and health information retrieval (IR) systems. It follows a framework that reviews studies of whether systems are used, what they are used for, how satisfied users are, how well the systems are used, which factors are associated with successful use, and what impact the systems have. The chapter finishes with a discussion of relevance judgments and their role and limitations in IR evaluation research.

Keywords

  • Usage
  • User satisfaction
  • System-oriented evaluation
  • User-oriented evaluation
  • Failure analysis
  • Relevance
  • Kappa statistic

Notes

  1. https://www.nlm.nih.gov/bsd/medline_pubmed_production_stats.html

  2. https://www.ncbi.nlm.nih.gov/books/NBK3827/#pubmedhelp.Clinical_Queries_Filters

  3. https://www.epistemonikos.org/

Author information

Corresponding author

Correspondence to William Hersh.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Hersh, W. (2020). Evaluation. In: Information Retrieval: A Biomedical and Health Perspective. Health Informatics. Springer, Cham. https://doi.org/10.1007/978-3-030-47686-1_7

  • DOI: https://doi.org/10.1007/978-3-030-47686-1_7

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-47685-4

  • Online ISBN: 978-3-030-47686-1

  • eBook Packages: Medicine (R0)