The amount of data generated in cancer research is growing rapidly. High-density array-based technologies, such as genome-wide single nucleotide polymorphism (SNP) genotyping and gene expression microarrays, are producing data that is greater not only in size but also in complexity, with regard to both study design and associated metadata. This chapter discusses how the flood of genomic and transcriptomic data is managed in databases, often by large collaborative consortia that develop new informatics approaches to maximize the availability and utility of the data. Genetic variation databases are most often designed for a particular level of detail: single disease-causing variants associated with specific phenotypes; genome-wide variation, covering both SNPs and structural variants; or complete genome-wide association studies held in large repositories. Gene expression microarray data is stored in large repositories, and new services have been developed that take advantage of the increasing number and diversity of stored experiments. Through integrative analysis and association with biological information, this data can be transformed from its high-dimensional form to a summary level that is directly usable by bench biologists.