Information Retrieval in Literature-Based Discovery


Part of the book series: Information Science and Knowledge Management (ISKM, volume 15)

Abstract

Finding and accessing relevant information is essential for wider use of literature-based discovery (LBD). This chapter provides an overview of information retrieval (IR) with a focus on its role in LBD. It covers the major approaches to indexing and retrieval, followed by a description of research evaluating them. The chapter concludes with an overview of IR techniques used for LBD and promising directions for the future.



Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Hersh, W. (2008). Information Retrieval in Literature-Based Discovery. In: Bruza, P., Weeber, M. (eds) Literature-based Discovery. Information Science and Knowledge Management, vol 15. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68690-3_10

  • DOI: https://doi.org/10.1007/978-3-540-68690-3_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68685-9

  • Online ISBN: 978-3-540-68690-3

  • eBook Packages: Computer Science, Computer Science (R0)
