Abstract
We briefly review some principles of the Bioconductor project for statistical analysis of genome-scale data and show in detail how they are applied to the analysis of high-density genotyping arrays manufactured by Affymetrix, Inc. (Genome-Wide SNP 6.0). Issues addressed include probe design and verification of probe address assertions, preprocessing of scanned intensities via SNPRMA, genotype calling via CRLMM, and downstream analysis of genotype-expression associations.
Additional information
Work supported in part by NIH P41 HG004059-01 and NIH R01 HL086601-01.
Cite this article
Carvalho, B., Irizarry, R.A., Scharpf, R.B. et al. Processing and Analyzing Affymetrix SNP Chips with Bioconductor. Stat Biosci 1, 160–180 (2009). https://doi.org/10.1007/s12561-009-9015-0
DOI: https://doi.org/10.1007/s12561-009-9015-0