Data Pre-Processing Issues in Microarray Analysis

  • Nicholas A. Tinker
  • Laurian S. Robert
  • Gail Butler
  • Linda J. Harris


Defined broadly, pre-processing involves many potential steps that are essential for successful microarray experimentation. The need for some steps (e.g., experimental design, image analysis) is unquestionable. Other steps are less rigidly prescribed: data transformation, inspection, and filtering should be tailored to individual analytical goals and data management systems. These steps may take on new meaning as different techniques for analysis become widely accepted. A complete and perfect recipe for pre-processing and analyzing microarray experiments does not exist. Therefore, each experimenter must develop systems and procedures that are both appropriate and correct. We hope that the concepts introduced in this chapter will help the reader to better understand the detailed presentations found in later chapters.





Copyright information

© Kluwer Academic Publishers 2003

Authors and Affiliations

  • Nicholas A. Tinker (1)
  • Laurian S. Robert (1)
  • Gail Butler (1)
  • Linda J. Harris (1)

  1. ECORC, Agriculture and Agri-Food Canada, Ottawa, Canada
