Evaluating Interactive Question Answering

  • William Hersh
Part of the Text, Speech and Language Technology book series (TLTB, volume 32)

This volume is filled with a variety of innovative approaches to helping users answer questions. In much of the research, however, one part of the solution is missing: the user. This chapter describes the evaluation of interactive question answering, focusing on two initiatives: the Text REtrieval Conference (TREC) Interactive Track and studies in the medical domain. As will be seen, there is considerable overlap between the two, both in the model underlying the research and in the methods used.

Keywords

Medical Student, Information Retrieval, Relevance Feedback, Mean Average Precision, Information Retrieval System

Copyright information

© Springer 2008

Authors and Affiliations

  • William Hersh
    1. Oregon Health & Science University, Portland, USA