Abstract
Finding relevant publications in the large and rapidly growing body of biomedical literature is challenging. Search queries on PubMed often return thousands of publications, and filtering out the irrelevant ones to arrive at a manageable reading set can be tedious. We have developed a visual analytics system, named Bio-Jigsaw, which acts as a visual index on a document collection and supports biologists in investigating and understanding connections between biological entities. We apply natural language processing techniques to identify biological entities such as genes and pathways, and we visualize connections among them via multiple representations. Connections are based on co-occurrence in abstracts and are also drawn from ontologies or annotations in digital libraries. We demonstrate how Bio-Jigsaw can be used to analyze the results of a PubMed search query on a gene related to breast cancer, which returns over 1500 primary papers.
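The notion of connecting entities by co-occurrence in abstracts can be illustrated with a minimal sketch. This is not Bio-Jigsaw's implementation; the function name `cooccurrence_counts` and the entity labels are hypothetical, and the sketch assumes entity recognition has already produced one list of entity names per abstract.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(doc_entities):
    """Count, for each unordered pair of entities, the number of
    documents in which both entities appear."""
    counts = Counter()
    for entities in doc_entities:
        # Deduplicate and sort so each pair is counted once per document
        # and always appears in the same order.
        for pair in combinations(sorted(set(entities)), 2):
            counts[pair] += 1
    return counts


# Made-up placeholder entities, one list per abstract (assumption).
abstracts = [
    ["GENE_A", "GENE_B", "PATHWAY_X"],
    ["GENE_A", "PATHWAY_X", "GENE_A"],
    ["GENE_B"],
]
counts = cooccurrence_counts(abstracts)
```

Here `counts[("GENE_A", "PATHWAY_X")]` is 2, because the two entities share two abstracts. Such counts could serve as edge weights in a connection view, although the abstract does not state how the system weights connections.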
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Görg, C. et al. (2010). Visualization and Language Processing for Supporting Analysis across the Biomedical Literature. In: Setchi, R., Jordanov, I., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based and Intelligent Information and Engineering Systems. KES 2010. Lecture Notes in Computer Science, vol 6279. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15384-6_45
DOI: https://doi.org/10.1007/978-3-642-15384-6_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15383-9
Online ISBN: 978-3-642-15384-6
eBook Packages: Computer Science, Computer Science (R0)