Abstract
The National Center for Biotechnology Information (NCBI), as a primary public repository of genomic sequence data, collects and maintains enormous amounts of heterogeneous data. Data for genomes, genes, gene expression, gene variation, gene families, proteins, and protein domains are integrated with analytical, search, and retrieval resources through the NCBI Web site. Entrez, a text-based search and retrieval system, provides a fast and easy way to navigate across diverse biological databases.
Customized genomic BLAST enables sequence similarity searches against a special collection of organism-specific sequence data and displays the resulting alignments within a genomic context using NCBI’s genome browser, Map Viewer.
Comparative genome analysis tools lead to further understanding of evolutionary processes, quickening the pace of discovery.
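As an illustration of the text-based Entrez search described above, the following minimal sketch builds a query URL for ESearch, part of NCBI's Entrez Programming Utilities (E-utilities). E-utilities is not named in this abstract; it is assumed here as the programmatic route to Entrez. The function name `build_esearch_url`, the choice of the `genome` database, and the example organism query are illustrative assumptions. The sketch only constructs the URL and does not contact NCBI.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base address of the E-utilities endpoints (assumed interface; not named in the abstract).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Return an ESearch URL for a text query against one Entrez database.

    db     -- Entrez database name, e.g. "genome" (illustrative choice)
    term   -- text query, optionally with field tags such as [Organism]
    retmax -- maximum number of record identifiers to request
    """
    if not db or not term:
        raise ValueError("both db and term are required")
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    # urlencode percent-escapes spaces and brackets in the query term.
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical example query; fetching the URL is left to the reader.
    print(build_esearch_url("genome", "Haemophilus influenzae[Organism]"))
```

Keeping URL construction separate from the network request makes the query easy to inspect before it is sent.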
Acknowledgments
The authors thank, in alphabetical order, Vyacheslav Chetvernin, Boris Fedorov, Andrei Kochergin, Peter Meric, Sergei Resenchuk, and Martin Shumway for their expertise and diligence in the design and maintenance of the databases highlighted in this publication, and Stacy Ciufo for helpful discussion and comments. These projects represent the efforts of many NCBI staff members along with the collective contributions of many dedicated scientists worldwide.
Copyright information
© 2010 Humana Press, a part of Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Tatusova, T. (2010). Genomic Databases and Resources at the National Center for Biotechnology Information. In: Carugo, O., Eisenhaber, F. (eds) Data Mining Techniques for the Life Sciences. Methods in Molecular Biology, vol 609. Humana Press. https://doi.org/10.1007/978-1-60327-241-4_2
DOI: https://doi.org/10.1007/978-1-60327-241-4_2
Publisher Name: Humana Press
Print ISBN: 978-1-60327-240-7
Online ISBN: 978-1-60327-241-4
eBook Packages: Springer Protocols