Semantic Search with GoPubMed

  • Chapter
Semantic Techniques for the Web

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5500)

Abstract

Searching for relevant information on the web is a main occupation of researchers nowadays. Classical keyword-based search engines have limits. Inconsistent vocabulary used by authors is not handled. Relevant information spread over multiple documents cannot be found. An overview of an entire document collection cannot be given by means of ranked lists. Question answering that requires semantic disambiguation of the terminology is not possible. Trends in the literature cannot be followed if the vocabulary evolves over time.

GoPubMed is a semantic search engine that uses the background knowledge of ontologies to index the biomedical literature. In this chapter we discuss how semantic search can help overcome the limits of classical search paradigms.
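
To make the contrast between keyword search and ontology-based indexing concrete, the following is a minimal sketch only. It is not GoPubMed's implementation: the three-concept ontology, the sample documents, and all function names (`annotate`, `build_index`, `descendants`, `semantic_search`, `keyword_search`) are hypothetical and chosen purely for illustration. It assumes that documents are annotated with concepts by simple synonym matching and that a query for a concept also retrieves documents annotated with its more specific child concepts.

```python
from collections import defaultdict

# Hypothetical toy ontology: concept -> (synonyms, parent concept or None).
ONTOLOGY = {
    "cell death": ({"cell death"}, None),
    "apoptosis": ({"apoptosis", "programmed cell death"}, "cell death"),
    "necrosis": ({"necrosis"}, "cell death"),
}

# Hypothetical sample documents standing in for abstracts.
DOCS = {
    1: "Apoptosis is induced by p53.",
    2: "Programmed cell death in neurons.",
    3: "Necrosis after ischemia.",
    4: "Keyword search engines rank documents.",
}


def annotate(text):
    """Return the concepts whose synonyms occur in the text (substring match)."""
    lowered = text.lower()
    return {
        concept
        for concept, (synonyms, _parent) in ONTOLOGY.items()
        if any(synonym in lowered for synonym in synonyms)
    }


def build_index(docs):
    """Map each concept to the ids of the documents annotated with it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for concept in annotate(text):
            index[concept].add(doc_id)
    return index


def descendants(concept):
    """Return the concept together with all concepts below it in the hierarchy."""
    found = {concept}
    changed = True
    while changed:
        changed = False
        for child, (_synonyms, parent) in ONTOLOGY.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def semantic_search(concept, index):
    """Retrieve documents annotated with the concept or any of its descendants."""
    hits = set()
    for c in descendants(concept):
        hits |= index.get(c, set())
    return hits


def keyword_search(term, docs):
    """Baseline: retrieve documents that contain the literal term."""
    return {doc_id for doc_id, text in docs.items() if term.lower() in text.lower()}


INDEX = build_index(DOCS)
print(keyword_search("apoptosis", DOCS))     # {1}
print(semantic_search("apoptosis", INDEX))   # {1, 2}
print(semantic_search("cell death", INDEX))  # {1, 2, 3}
```

In this sketch, the literal query "apoptosis" misses document 2, which uses the synonym "programmed cell death", while the concept-based query finds both. Querying the parent concept "cell death" additionally returns the documents on its child concepts, which is one way an ontology can give an overview of a collection that a ranked keyword list does not.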

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Doms, A., Schroeder, M. (2009). Semantic Search with GoPubMed. In: Bry, F., Małuszyński, J. (eds) Semantic Techniques for the Web. Lecture Notes in Computer Science, vol 5500. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04581-3_7

  • DOI: https://doi.org/10.1007/978-3-642-04581-3_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04580-6

  • Online ISBN: 978-3-642-04581-3

  • eBook Packages: Computer Science (R0)