Next Generation Microarray Bioinformatics pp 41-53
Strategies to Explore Functional Genomics Data Sets in NCBI’s GEO Database
Abstract
The Gene Expression Omnibus (GEO) database is a major repository that stores high-throughput functional genomics data sets that are generated using both microarray-based and sequence-based technologies. Data sets are submitted to GEO primarily by researchers who are publishing their results in journals that require original data to be made freely available for review and analysis. In addition to serving as a public archive for these data, GEO has a suite of tools that allow users to identify, analyze, and visualize data relevant to their specific interests. These tools include sample comparison applications, gene expression profile charts, data set clusters, genome browser tracks, and a powerful search engine that enables users to construct complex queries.
Key words
Database, Microarray, Next-generation sequence, Gene expression, Epigenomics, Functional genomics, Data mining
Notes
Acknowledgments
This chapter is an official contribution of the National Institutes of Health; not subject to copyright in the USA. The authors unreservedly acknowledge the expertise of the whole GEO curation and development team – Pierre Ledoux, Carlos Evangelista, Irene Kim, Kimberly Marshall, Katherine Phillippy, Patti Sherman, Michelle Holko, Dennis Troup, Maxim Tomashevsky, Rolf Muertter, Oluwabukunmi Ayanbule, Andrey Yefanov, and Alexandra Soboleva.
Funding
This research was supported by the Intramural Research Program of the NIH, National Library of Medicine.