Update on Genomic Databases and Resources at the National Center for Biotechnology Information

Tatusova, Tatiana

doi:10.1007/978-1-4939-3572-7_1

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1415))

Abstract

The National Center for Biotechnology Information (NCBI), as a primary public repository of genomic sequence data, collects and maintains enormous amounts of heterogeneous data. Data for genomes, genes, gene expression, gene variation, gene families, proteins, and protein domains are integrated with the analytical, search, and retrieval resources through the NCBI website. Entrez, a text-based search and retrieval system, provides a fast and easy way to navigate across diverse biological databases.

Comparative genome analysis tools lead to further understanding of evolutionary processes, quickening the pace of discovery. Recent technological innovations have ignited an explosion in genome sequencing that has fundamentally changed our understanding of the biology of living organisms. This huge increase in DNA sequence data presents new challenges for information management systems and visualization tools. New strategies have been designed to bring order to this genome sequence shockwave and improve the usability of associated data.

Acknowledgements

The authors would like to thank, in alphabetic order, Boris Fedorov and Sergei Resenchuk for their expertise and diligence in the design and maintenance of the databases highlighted in this publication and Stacy Ciufo for the helpful discussion and comments. These projects represent the efforts of many NCBI staff members along with the collective contributions of many dedicated scientists worldwide.

Author information

Authors and Affiliations

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD, 20894, USA
Tatiana Tatusova

Corresponding author

Correspondence to Tatiana Tatusova.

Editor information

Editors and Affiliations

Max F. Perutz Laboratories GmbH, Universität Wien, Wien, Austria
Oliviero Carugo
Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
Frank Eisenhaber

About this protocol

Cite this protocol

Tatusova, T. (2016). Update on Genomic Databases and Resources at the National Center for Biotechnology Information. In: Carugo, O., Eisenhaber, F. (eds) Data Mining Techniques for the Life Sciences. Methods in Molecular Biology, vol 1415. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-3572-7_1

DOI: https://doi.org/10.1007/978-1-4939-3572-7_1
Published: 27 April 2016
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-3570-3
Online ISBN: 978-1-4939-3572-7
eBook Packages: Springer Protocols
