Gemina: A Web-Based Epidemiology and Genomic Metadata System Designed to Identify Infectious Agents

  • Lynn M. Schriml
  • Aaron Gussman
  • Kathy Phillippy
  • Sam Angiuoli
  • Kumar Hari
  • Alan Goates
  • Ravi Jain
  • Tanja Davidsen
  • Anu Ganapathy
  • Elodie Ghedin
  • Steven Salzberg
  • Owen White
  • Neil Hall
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4506)

Abstract

The Gemina system (http://gemina.tigr.org), developed at TIGR, is a tool for identifying microbial and viral pathogens and their associated genomic sequences from the associated epidemiological data. Gemina has been designed to identify epidemiological factors of disease incidence and to support the design of DNA-based diagnostics, such as DNA signature-based assays. The Gemina database contains the full complement of microbial and viral pathogens enumerated in the Microbial Rosetta Stone database (MRS) [1]. Initial curation efforts in Gemina have focused on the NIAID category A, B, and C priority pathogens [2], identified to the level of strains. For the bacterial NIAID category A-C pathogens, for example, we have included 38 species and 769 strains in Gemina. Representative genomic sequences are selected for each pathogen from NCBI’s GenBank by a three-tiered filtering system and incorporated into TIGR’s Panda DNA sequence database. A single representative sequence is selected for each pathogen, taken in order of preference from complete genome sequences (Tier 1), whole genome shotgun (WGS) data from genome projects (Tier 2), or genomic nucleotide sequences from genome projects (Tier 3). The list of selected accessions is transferred to Insignia when new pathogens are added to Gemina, allowing Insignia’s Signature Pipeline [3] to be run for each pathogen identified in a Gemina query.
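The three-tiered selection described in the abstract can be illustrated with a minimal Python sketch. It is not Gemina's actual implementation: the `SequenceRecord` type, the `kind` labels, the function names, and the placeholder accessions are all hypothetical, and the abstract does not say how ties within a tier are resolved, so the sketch simply keeps the first record it sees in the best available tier.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional

# Hypothetical labels for the three tiers named in the abstract.
# Lower number = higher preference.
TIER_BY_KIND = {
    "complete_genome": 1,     # Tier 1: complete genome sequences
    "wgs": 2,                 # Tier 2: whole genome shotgun data
    "genomic_nucleotide": 3,  # Tier 3: genomic nucleotide sequences
}


@dataclass(frozen=True)
class SequenceRecord:
    accession: str  # placeholder identifier, not a real GenBank accession
    pathogen: str
    kind: str       # one of the keys of TIER_BY_KIND


def select_representative(records: Iterable[SequenceRecord]) -> Optional[SequenceRecord]:
    """Return one record from the best available tier, or None if none qualify."""
    best: Optional[SequenceRecord] = None
    best_tier: Optional[int] = None
    for rec in records:
        tier = TIER_BY_KIND.get(rec.kind)
        if tier is None:
            continue  # record type falls outside the three tiers
        # Assumption: within a tier, the first record encountered is kept.
        if best_tier is None or tier < best_tier:
            best, best_tier = rec, tier
    return best


def select_all(records: Iterable[SequenceRecord]) -> Dict[str, SequenceRecord]:
    """Group records by pathogen and pick a single representative for each."""
    by_pathogen: Dict[str, List[SequenceRecord]] = {}
    for rec in records:
        by_pathogen.setdefault(rec.pathogen, []).append(rec)
    chosen: Dict[str, SequenceRecord] = {}
    for pathogen, recs in by_pathogen.items():
        rep = select_representative(recs)
        if rep is not None:
            chosen[pathogen] = rep
    return chosen


# Usage with made-up records.
example = [
    SequenceRecord("ACC_0001", "pathogen A", "wgs"),
    SequenceRecord("ACC_0002", "pathogen A", "complete_genome"),
    SequenceRecord("ACC_0003", "pathogen B", "genomic_nucleotide"),
]
representatives = select_all(example)
```

In this sketch, pathogen A is represented by its complete genome record even though a WGS record was listed first, while pathogen B falls back to its only available Tier 3 record.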

Keywords

Infection System; Viral Pathogen; Control Vocabulary; Whole Genome Shotgun; Transmission Method
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Ecker, D.J., et al.: The Microbial Rosetta Stone Database: a compilation of global and emerging infectious microorganisms and bioterrorist threat agents. BMC Microbiol. 5, 19 (2005)
  2.
  3.
  4. Smith, B., et al.: Relations in biomedical ontologies. Genome Biol., R46 (2005)
  5. National Center for Biotechnology Information (NCBI) Taxonomy, http://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/

Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Lynn M. Schriml (1)
  • Aaron Gussman (1)
  • Kathy Phillippy (1)
  • Sam Angiuoli (1)
  • Kumar Hari (2)
  • Alan Goates (2)
  • Ravi Jain (3)
  • Tanja Davidsen (1)
  • Anu Ganapathy (1)
  • Elodie Ghedin (4)
  • Steven Salzberg (1, 5)
  • Owen White (1)
  • Neil Hall (1)
  1. The Institute for Genomic Research, a Division of J. Craig Venter Institute
  2. Ibis Biosciences, a Division of Isis Pharmaceuticals, Inc.
  3. cBIO, Inc.
  4. University of Pittsburgh School of Medicine, Division of Infectious Diseases
  5. Center for Bioinformatics and Computational Biology, University of Maryland
