Using GenBank

  • Eric W. Sayers
  • Ilene Karsch-Mizrachi
Part of the Methods in Molecular Biology book series (MIMB, volume 1374)

Abstract

GenBank® is a comprehensive database of publicly available DNA sequences for 300,000 named organisms, more than 110,000 of them within the Embryophyta, obtained through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases with taxonomy, genome, mapping, protein structure, and domain information, as well as the biomedical journal literature in PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. This chapter discusses GenBank usage scenarios ranging from local analyses of the data available via FTP to online analyses supported by the NCBI web-based tools. To access GenBank and its related retrieval and analysis services, go to the NCBI home page at www.ncbi.nlm.nih.gov.
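
As an illustration of programmatic access through Entrez, the following is a minimal sketch, not taken from the chapter itself, that builds a request to the NCBI E-utilities EFetch service for a single GenBank flat-file record. The helper names `build_efetch_url` and `fetch_record` are hypothetical, and the accession U49845 is used only as an example. The sketch assumes the `nuccore` database with `rettype=gb` and `retmode=text`.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL of the NCBI E-utilities EFetch service.
EUTILS_EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, rettype="gb", retmode="text"):
    """Build an EFetch URL for one nucleotide record (hypothetical helper)."""
    params = {
        "db": "nuccore",      # Entrez nucleotide database
        "id": accession,      # accession or accession.version
        "rettype": rettype,   # "gb" for GenBank flat file, "fasta" for FASTA
        "retmode": retmode,   # plain-text output
    }
    return EUTILS_EFETCH + "?" + urlencode(params)


def fetch_record(accession, rettype="gb"):
    """Download the record as text; requires network access."""
    with urlopen(build_efetch_url(accession, rettype)) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only prints the URL; call fetch_record("U49845") to retrieve the record.
    print(build_efetch_url("U49845"))
```

For bulk work, the complete releases available by FTP are likely a better fit than many individual requests of this kind.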

Key words

NCBI, Entrez, DNA Sequence, BLAST, MegaBLAST

Notes

Acknowledgements

Funding for this work was provided by the Intramural Research Program of the National Institutes of Health, National Library of Medicine.

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, USA