Abstract
The analysis of two-colour cDNA microarray data usually involves subtracting background values from foreground values prior to normalization and further analysis. This approach reduces bias but inflates the variance of low-abundance spots. Background subtraction implicitly assumes that background values are locally constant. In practice, this assumption is often not met, which casts doubt on the usefulness of simple background subtraction. To improve background correction, we propose smoothing the local background estimates within the pre-processing pipeline of cDNA microarray data, before background correction is applied. For this purpose, we employ a geostatistical framework based on ordinary kriging, with both isotropic and anisotropic models of spatial correlation, as well as 2-D locally weighted regression. Using data from a self-versus-self experiment in Arabidopsis in which subsets of differentially expressed genes were simulated, we show that local background smoothing prior to background correction is beneficial compared with using raw background estimates. Using locally smoothed background values in conjunction with existing background correction methods increases power, increases accuracy and decreases the number of false positive results.
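The idea of smoothing background values before subtraction can be illustrated with a minimal sketch of 2-D locally weighted (local linear) regression. This is not the authors' R code: the function names, the tricube kernel, the nearest-neighbour `span` of 0.3 and the plain subtraction step are all assumptions chosen for illustration, and the kriging variants mentioned in the abstract are not shown.

```python
import numpy as np


def tricube(u):
    """Tricube kernel: (1 - |u|^3)^3 for |u| < 1, and 0 otherwise."""
    u = np.clip(np.abs(u), 0.0, 1.0)
    return (1.0 - u ** 3) ** 3


def smooth_background(rows, cols, bg, span=0.3):
    """Smooth per-spot background values with a 2-D local linear fit.

    rows, cols : spot coordinates on the array surface
    bg         : raw background estimate for each spot
    span       : fraction of spots used as neighbours in each local fit
                 (an assumed default, not a value taken from the paper)
    """
    xy = np.column_stack([rows, cols]).astype(float)
    bg = np.asarray(bg, dtype=float)
    n = bg.size
    k = min(n, max(6, int(np.ceil(span * n))))  # neighbours per local fit
    smoothed = np.empty(n)
    for i in range(n):
        delta = xy - xy[i]                      # coordinates centred on spot i
        dist = np.hypot(delta[:, 0], delta[:, 1])
        idx = np.argsort(dist)[:k]              # the k nearest spots
        h = dist[idx].max()
        h = h if h > 0 else 1.0                 # guard against zero bandwidth
        sw = np.sqrt(tricube(dist[idx] / h))    # square-root weights for WLS
        X = np.column_stack([np.ones(k), delta[idx]])
        beta = np.linalg.lstsq(X * sw[:, None], bg[idx] * sw, rcond=None)[0]
        smoothed[i] = beta[0]                   # intercept = fitted value at spot i
    return smoothed


def subtract_background(fg, bg_smoothed):
    """Simple background correction using the smoothed background."""
    return np.asarray(fg, dtype=float) - bg_smoothed
```

In this sketch the smoothed values simply replace the raw background estimates in the subtraction step; the same smoothed values could instead be passed to any other existing background correction method.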
Acknowledgments
We thank Caroline Gouhier-Darimont and Philippe Reymond (Department of Plant Molecular Biology, University of Lausanne, 1015 Lausanne, Switzerland) for providing the self-versus-self dataset. This work was funded by the Deutsche Forschungsgemeinschaft (DFG) within the priority program SPP1149-Heterosis in Plants (grant no. PI 377/7-3). We also thank Prof. Uwe Jensen and two anonymous reviewers for helpful and constructive comments on earlier versions of the paper.
Conflict of interest statement
The authors declare that they have no conflict of interest.
Additional information
Communicated by M. Frisch.
Contribution to the special issue “Heterosis in Plants”.
The R-code that we used is available from the authors upon request.
Cite this article
Schützenmeister, A., Piepho, HP. Background correction of two-colour cDNA microarray data using spatial smoothing methods. Theor Appl Genet 120, 475–490 (2010). https://doi.org/10.1007/s00122-009-1210-3