Abstract
Microarray technology has been widely adopted by researchers, who use both home-made microarrays and microarrays purchased from commercial vendors. The adoption of this technology has brought a deluge of complex data, both from the microarrays themselves and in the form of associated metadata, such as gene annotation information, the properties and treatment of biological samples, and the data transformation and analysis steps taken downstream. In addition, standards for annotation and data exchange have been proposed and are now being adopted by journals and funding agencies alike. The coupling of large quantities of complex data with extensive and complex standards requires all but the smallest-scale microarray users to have access to a robust and scalable database with various tools. In this review, we discuss some of the desirable properties of such a database and look at the features of several freely available alternatives.
Cite this article
Sherlock, G., Ball, C.A. Storage and retrieval of microarray data and open source microarray database software. Mol Biotechnol 30, 239–251 (2005). https://doi.org/10.1385/MB:30:3:239