Genomic Databases and Resources at the National Center for Biotechnology Information

Tatusova, Tatiana

doi:10.1007/978-1-60327-241-4_2

Part of the book series: Methods in Molecular Biology (MIMB, volume 609)

Abstract

The National Center for Biotechnology Information (NCBI), as a primary public repository of genomic sequence data, collects and maintains enormous amounts of heterogeneous data. Data on genomes, genes, gene expression, gene variation, gene families, proteins, and protein domains are integrated with analytical, search, and retrieval resources through the NCBI Web site. Entrez, a text-based search and retrieval system, provides a fast and easy way to navigate across diverse biological databases.

Customized genomic BLAST enables sequence similarity searches against a special collection of organism-specific sequence data and displays the resulting alignments within a genomic context using NCBI’s genome browser, Map Viewer.

Comparative genome analysis tools lead to further understanding of evolutionary processes, quickening the pace of discovery.

Acknowledgments

The authors would like to thank, in alphabetical order, Vyacheslav Chetvernin, Boris Fedorov, Andrei Kochergin, Peter Meric, Sergei Resenchuk, and Martin Shumway for their expertise and diligence in the design and maintenance of the databases highlighted in this publication, and Stacy Ciufo for helpful discussion and comments. These projects represent the efforts of many NCBI staff members along with the collective contributions of many dedicated scientists worldwide.

Author information

Authors and Affiliations

National Institutes of Health, Bethesda, MD, USA
Tatiana Tatusova

Editor information

Editors and Affiliations

Max F. Perutz Laboratories GmbH, Universität Wien, Dr. Bohr-Gasse 9, Wien, 1030, Austria
Oliviero Carugo
Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, Singapore, 138671, Singapore
Frank Eisenhaber

About this protocol

Cite this protocol

Tatusova, T. (2010). Genomic Databases and Resources at the National Center for Biotechnology Information. In: Carugo, O., Eisenhaber, F. (eds) Data Mining Techniques for the Life Sciences. Methods in Molecular Biology, vol 609. Humana Press. https://doi.org/10.1007/978-1-60327-241-4_2

DOI: https://doi.org/10.1007/978-1-60327-241-4_2
Published: 30 October 2009
Publisher Name: Humana Press
Print ISBN: 978-1-60327-240-7
Online ISBN: 978-1-60327-241-4
eBook Packages: Springer Protocols
