Tools and Approaches for an End-to-End Expression Array Analysis

Bioinformatics for Systems Biology

Abstract

Microarray experiments can appear daunting because their analysis draws on several fields of research. Understanding the data that microarrays generate calls for some knowledge of classical statistics and recent complexity theory, while emerging computational techniques such as XML-directed workflows can aid in managing the data. These considerations arise because, as experimental tools, microarrays (arrays) exemplify the recent trend in biological research towards high-dimensionality datasets. Until recently, observations were made on only a few variables at a time and used to support or refute hypotheses; high-dimensionality datasets, by contrast, are generated by observing a very large number of variables (e.g., gene expression measurements) at the same time. The number of expression measurements made on an array is not only high in absolute terms but also very high relative to the size of a typical sample population. This combination of high dimensionality and asymmetry leads to large datasets and to fundamental problems when standard approaches are used to interpret the data. An end-to-end approach provides a general framework in which to place useful considerations when planning an analysis. The framework described here explores the origins of signal and several sources of variance, approaches to representing high-throughput data, the statistical considerations in modeling array data, and the software tools that can aid in carrying out the analysis.
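The asymmetry described in the abstract (many measured variables, few samples) can be illustrated with a small simulation. The sketch below is not taken from the chapter. It assumes 10,000 hypothetical genes with no true differential expression, three samples per group, unit-variance Gaussian noise, and a known-variance z-test chosen only because it needs nothing beyond the Python standard library. All names and parameter values are illustrative.

```python
import random
from statistics import NormalDist, mean


def simulate_null_tests(n_genes=10_000, n_per_group=3, alpha=0.05, seed=0):
    """Test every gene for a group difference when no gene truly differs.

    Assumption: expression values are N(0, 1) in both groups and the
    variance is known, so a two-sided z-test applies to each gene.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Standard error of the difference between two group means (sigma = 1).
    se = (2.0 / n_per_group) ** 0.5

    p_values = []
    for _ in range(n_genes):
        group_a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        group_b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        z = (mean(group_a) - mean(group_b)) / se
        p_values.append(2.0 * (1.0 - std_normal.cdf(abs(z))))

    # Genes called "significant" with no correction for multiple testing.
    naive_hits = sum(p < alpha for p in p_values)
    # Genes called after a Bonferroni correction (alpha divided by test count).
    bonferroni_hits = sum(p < alpha / n_genes for p in p_values)
    return naive_hits, bonferroni_hits


naive_hits, bonferroni_hits = simulate_null_tests()
print(f"genes called at p < 0.05 with no true signal: {naive_hits}")
print(f"genes called after Bonferroni correction:     {bonferroni_hits}")
```

Under these assumptions roughly 5% of the genes (about 500) pass the uncorrected threshold by chance alone, which is one concrete form of the "fundamental problems" that arise when standard per-variable tests are applied to array data. Bonferroni is used here only as the simplest correction; it is not necessarily the method the chapter recommends.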

Notes

  1. This is the one-tailed alternative hypothesis. A two-tailed alternative hypothesis could have multiple conditions, such as H1 and H2 for over- and under-abundance relative to the null condition.
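The distinction in this note can be made concrete with a short sketch. It is an illustration under stated assumptions rather than the chapter's own procedure: the test statistic is assumed to be a standard-normal z-score, and the function name `tail_p_values` and the example value 1.8 are hypothetical.

```python
from statistics import NormalDist


def tail_p_values(z):
    """Return p-values for a z-score under three alternative hypotheses.

    "over"      : one-tailed, H1 = over-abundance relative to the null
    "under"     : one-tailed, H2 = under-abundance relative to the null
    "two_sided" : either direction (H1 or H2)
    """
    phi = NormalDist().cdf(z)  # P(Z <= z) under the null
    p_over = 1.0 - phi
    p_under = phi
    p_two_sided = 2.0 * min(p_over, p_under)
    return {"over": p_over, "under": p_under, "two_sided": p_two_sided}


result = tail_p_values(1.8)
for name, p in result.items():
    print(f"{name:>9}: p = {p:.4f}")
```

For z = 1.8 the one-tailed test for over-abundance gives p of about 0.036, while the two-tailed p-value is about 0.072, so the same observation can fall on either side of a 0.05 threshold depending on which alternative hypothesis was specified in advance.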

Suggested Reading: Background to Microarray Technologies

  1. Zhang W, Shmulevich I, Astola J. Microarray quality control. Hoboken, N.J.: Wiley-Liss; 2004.

  2. Speed TP. Statistical analysis of gene expression microarray data. Boca Raton, FL: Chapman & Hall/CRC; 2003.

  3. Lee ML, Whitmore GA. Power and sample size for DNA microarray studies. Statistics in medicine 2002;21(23):3543–3570.

  4. Do K-A, Müller P, Vannucci M. Bayesian inference for gene expression and proteomics. Cambridge; New York: Cambridge University Press; 2006.

  5. Bentley DR. Whole-genome re-sequencing. Current opinion in genetics & development 2006;16(6):545–552.

  6. Heng HH, Stevens JB, Liu G, et al. Stochastic cancer progression driven by non-clonal chromosome aberrations. Journal of cellular physiology 2006;208(2):461–472.

  7. Martins RP, Krawetz SA. Decondensing the protamine domain for transcription. Proceedings of the National Academy of Sciences of the United States of America 2007;104(20):8340–8345.

  8. Martin S, Pombo A. Transcription factories: quantitative studies of nanostructures in the mammalian nucleus. Chromosome Res 2003;11(5):461–470.

  9. Martins RP, Ostermeier GC, Krawetz SA. Nuclear matrix interactions at the human protamine domain: a working model of potentiation. The Journal of biological chemistry 2004;279(50):51862–51868.

  10. Wilusz CJ, Wilusz J. Bringing the role of mRNA decay in the control of gene expression into focus. Trends Genet 2004;20(10):491–497.

  11. Moore MJ. From birth to death: the complex lives of eukaryotic mRNAs. Science 2005;309(5740):1514–1518.

  12. Wang Y, Liu CL, Storey JD, Tibshirani RJ, Herschlag D, Brown PO. Precision and functional specificity in mRNA decay. Proceedings of the National Academy of Sciences of the United States of America 2002;99(9):5860–5865.

  13. Meizel S. The sperm, a neuron with a tail: ‘neuronal’ receptors in mammalian sperm. Biological reviews of the Cambridge Philosophical Society 2004;79(4):713–732.

  14. Hargrove JL, Schmidt FH. The role of mRNA and protein stability in gene expression. Faseb J. 1989;3(12):2360–2370.

  15. Schwartz DR, Moin K, Yao B, et al. Hu/Mu ProtIn oligonucleotide microarray: dual-species array for profiling protease and protease inhibitor gene expression in tumors and their microenvironment. Mol Cancer Res 2007;5(5):443–454.

  16. Dallas PB, Gottardo NG, Firth MJ, et al. Gene expression levels assessed by oligonucleotide microarray analysis and quantitative real-time RT-PCR — how well do they correlate? BMC genomics 2005;6(1):59.

Signal Analysis and Modeling

  1. Qiu W, Lee ML. SPCalc: A web-based calculator for sample size and power calculations in micro-array studies. Bioinformation 2006;1(7):251–252.

  2. Seo J, Gordish-Dressman H, Hoffman EP. An interactive power analysis tool for microarray hypothesis testing and generation. Bioinformatics (Oxford, England) 2006;22(7):808–814.

  3. ENCODE. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 2007;447(7146):799–816.

  4. Draghici S. Data analysis tools for DNA microarrays. Boca Raton: Chapman & Hall/CRC; 2003.

  5. Choe SE, Boutros M, Michelson AM, Church GM, Halfon MS. Preferred analysis methods for Affymetrix GeneChips revealed by a wholly defined control dataset. Genome biology 2005;6(2):R16.

  6. Hoffmann R, Seidl T, Dugas M. Profound effect of normalization on detection of differentially expressed genes in oligonucleotide microarray data analysis. Genome biology 2002;3(7):RESEARCH0033.

  7. Dabney AR, Storey JD. A new approach to intensity-dependent normalization of two-channel microarrays. Biostatistics (Oxford, England) 2007;8(1):128–139.

  8. Irizarry RA, Wu Z, Jaffee HA. Comparison of Affymetrix GeneChip expression measures. Bioinformatics (Oxford, England) 2006;22(7):789–794.

  9. Online document: http://www.ambion.com/techlib/tn/111/8.html.

  10. Williamson SH, Hubisz MJ, Clark AG, Payseur BA, Bustamante CD, Nielsen R. Localizing Recent Adaptive Evolution in the Human Genome. PLoS Genet 2007;3(6):e90.

  11. Ptitsyn AA, Zvonic S, Gimble JM. Digital Signal Processing Reveals Circadian Baseline Oscillation in Majority of Mammalian Genes. PLoS Comput Biol 2007;3(6):e120.

  12. Tomita H, Vawter MP, Walsh DM, et al. Effect of agonal and postmortem factors on gene expression profile: quality control in microarray analyses of postmortem human brain. Biological psychiatry 2004;55(4):346–352.

Statistical Approaches

  1. Wu B. Differential gene expression detection and sample classification using penalized linear regression models. Bioinformatics (Oxford, England) 2006;22(4):472–476.

  2. Robson B. Clinical and pharmacogenomic data mining: 3. Zeta theory as a general tactic for clinical bioinformatics. Journal of proteome research 2005;4(2):445–455.

  3. Baldi P, Brunak S. Bioinformatics: the machine learning approach. 2nd ed. Cambridge, Mass: MIT Press; 2001.

  4. Carlin BP, Louis TA. Bayes and Empirical Bayes methods for data analysis. 2nd ed. Boca Raton: Chapman & Hall/CRC; 2000.

  5. Benjamini Y, Yekutieli D. The control of the false discovery rate in multiple testing under dependency. The Annals of Statistics 2001;29(4):1165–1188.

  6. Westfall PH, Young SS. Resampling-based multiple testing: examples and methods for P-value adjustment. New York: Wiley; 1993.

  7. Irizarry RA, Warren D, Spencer F, et al. Multiple-laboratory comparison of microarray platforms. Nature methods 2005;2(5):345–350.

  8. Jain AK, Murty MN, Flynn PJ. Data clustering: a review. ACM Computing Surveys 1999;31(3):264–323.

  9. Eisen MB, Spellman PT, Brown PO, Botstein D. Cluster analysis and display of genome-wide expression patterns. Proceedings of the National Academy of Sciences of the United States of America 1998;95(25):14863–14868.

  10. Subramanian A, Tamayo P, Mootha VK, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proceedings of the National Academy of Sciences of the United States of America 2005;102(43):15545–15550.

  11. Tibshirani RJ, Efron B. On testing the significance of sets of genes. The Annals of Applied Statistics 2007;1(1):107–129.

Author information

Corresponding author

Correspondence to Adrian E. Platts.

Copyright information

© 2009 Humana Press, a part of Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Platts, A.E., Krawetz, S.A. (2009). Tools and Approaches for an End-to-End Expression Array Analysis. In: Krawetz, S. (eds) Bioinformatics for Systems Biology. Humana Press. https://doi.org/10.1007/978-1-59745-440-7_13
