Structured Literature Image Finder: Extracting Information from Text and Images in Biomedical Literature

  • Luís Pedro Coelho
  • Amr Ahmed
  • Andrew Arnold
  • Joshua Kangas
  • Abdul-Saboor Sheikh
  • Eric P. Xing
  • William W. Cohen
  • Robert F. Murphy
Conference paper

DOI: 10.1007/978-3-642-13131-8_4

Part of the Lecture Notes in Computer Science book series (LNCS, volume 6004)
Cite this paper as:
Coelho L.P. et al. (2010) Structured Literature Image Finder: Extracting Information from Text and Images in Biomedical Literature. In: Blaschke C., Shatkay H. (eds) Linking Literature, Information, and Knowledge for Biology. Lecture Notes in Computer Science, vol 6004. Springer, Berlin, Heidelberg

Abstract

SLIF uses a combination of text-mining and image processing to extract information from figures in the biomedical literature. It also uses innovative extensions to traditional latent topic modeling to provide new ways to traverse the literature. SLIF provides a publicly available searchable database (http://slif.cbi.cmu.edu).

SLIF originally focused on fluorescence microscopy images. We have now extended it to classify panels into more image types. We also improved the classification into subcellular classes by building a more representative training set. To get the most out of the human labeling effort, we used active learning to select images to label.

We developed models that take into account the structure of the document (with panels inside figures inside papers) and the multi-modality of the information (free and annotated text, images, information from external databases). This has allowed us to provide new ways to navigate a large collection of documents.
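
As an illustration of the active learning step, the sketch below ranks unlabeled images by the entropy of the current classifier's predicted class probabilities and picks the most uncertain ones for human labeling. The abstract does not say which selection strategy was used, so uncertainty sampling is an assumption here, and the names `entropy`, `select_for_labeling`, and the `panel_*` ids and probabilities are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(predicted, budget):
    """Return the ids of the `budget` most uncertain unlabeled images.

    `predicted` maps a (hypothetical) image id to the class probabilities
    produced by the current classifier. Uncertainty sampling is assumed;
    it is not confirmed as the strategy used in the paper.
    """
    ranked = sorted(
        predicted,
        key=lambda image_id: entropy(predicted[image_id]),
        reverse=True,
    )
    return ranked[:budget]


# Made-up classifier outputs over three subcellular classes.
predicted = {
    "panel_a": [0.98, 0.01, 0.01],  # confident prediction
    "panel_b": [0.34, 0.33, 0.33],  # nearly uniform, so most uncertain
    "panel_c": [0.60, 0.30, 0.10],
}

# Ask a human to label the two images the classifier is least sure about.
to_label = select_for_labeling(predicted, budget=2)
```

In a full loop, the newly labeled images would be added to the training set, the classifier retrained, and the selection repeated until the labeling budget is spent.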

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Luís Pedro Coelho (1, 2, 3)
  • Amr Ahmed (4, 5)
  • Andrew Arnold (4)
  • Joshua Kangas (1, 2, 3)
  • Abdul-Saboor Sheikh (3)
  • Eric P. Xing (1, 2, 3, 4, 5, 6)
  • William W. Cohen (1, 2, 3, 4)
  • Robert F. Murphy (1, 2, 3, 4, 6, 7)

  1. Lane Center for Computational Biology, Carnegie Mellon University
  2. Joint Carnegie Mellon University-University of Pittsburgh Ph.D. Program in Computational Biology
  3. Center for Bioimage Informatics, Carnegie Mellon University
  4. Machine Learning Department, Carnegie Mellon University
  5. Language Technologies Institute, Carnegie Mellon University
  6. Department of Biological Sciences, Carnegie Mellon University
  7. Department of Biomedical Engineering, Carnegie Mellon University
