Molecular Medicine, Volume 9, Issue 9–12, pp 185–192

Using Genomic Databases for Sequence-Based Biological Discovery

  • Andreas D Baxevanis
In Overview


The sequence data produced by the International Human Genome Sequencing Consortium and other systematic sequencing projects hold tremendous potential. It is therefore increasingly important that all biologists be able to navigate key publicly available databases and cull important information from them. The continued rapid rise in available sequence information, particularly as model organism data are generated at breakneck speed, further underscores the need for all biologists to learn how to make their way effectively through the expanding “sequence information space.” This review discusses some of the more commonly used tools developed for the effective and efficient mining of sequence information. These include LocusLink, which provides a gene-centric view of sequence-based information, as well as the 3 major genome browsers: the National Center for Biotechnology Information Map Viewer, the University of California Santa Cruz Genome Browser, and the European Bioinformatics Institute’s Ensembl system. The review gives an overview of the types of information available through each of these front-ends, along with pointers to tutorials and other documentation intended to increase the reader’s familiarity with these tools.



Copyright information

© Feinstein Institute for Medical Research 2003

Authors and Affiliations

  1. Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, USA
