
POE: Statistical Methods for Qualitative Analysis of Gene Expression

The Analysis of Gene Expression Data

Part of the book series: Statistics for Biology and Health (SBH)

Abstract

In many gene expression studies, the goals include discovery of novel biological classes and identification of genes whose expression can reliably be associated with these classes. Here we present a statistical analysis approach to facilitate both of these goals. The key idea is to model gene expression using latent categories that can be interpreted as a gene being turned “on” or “off” compared to a baseline level of expression. This three-way categorization (on, off, or baseline) is used for defining a reference in the unsupervised setting, for removing noise prior to clustering, for defining molecular subclasses in a way that is portable across platforms, and for defining easily interpretable probability-based distance measures for visualization, mining, and clustering.
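As a rough illustration of the latent-category idea, the sketch below assigns one expression value a probability of being under-expressed, at baseline, or over-expressed, and then builds a simple probability-based distance between two samples. The abstract does not specify the model, so everything here is an assumption made for illustration: the normal baseline component, the two uniform components, the parameter names (`mu`, `sigma`, `kappa`, `pi_under`, `pi_over`), their fixed values, and the helper names. In practice such parameters would be estimated from data rather than set by hand.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a Normal(mu, sigma) distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def category_probabilities(x, mu=0.0, sigma=1.0, kappa=6.0,
                           pi_under=0.1, pi_over=0.1):
    """Return (P(under), P(baseline), P(over)) for one expression value.

    Assumed (hypothetical) mixture: baseline ~ Normal(mu, sigma),
    under-expression ~ Uniform(mu - kappa, mu),
    over-expression ~ Uniform(mu, mu + kappa).
    """
    pi_base = 1.0 - pi_under - pi_over
    # Weighted component densities evaluated at x.
    w_under = pi_under / kappa if mu - kappa <= x <= mu else 0.0
    w_over = pi_over / kappa if mu <= x <= mu + kappa else 0.0
    w_base = pi_base * normal_pdf(x, mu, sigma)
    total = w_under + w_base + w_over
    return (w_under / total, w_base / total, w_over / total)


def signed_score(x, **params):
    """P(over) - P(under): near +1 for 'on', near -1 for 'off', 0 at baseline."""
    under, _, over = category_probabilities(x, **params)
    return over - under


def score_distance(sample_a, sample_b, **params):
    """Mean absolute difference of signed scores across genes (illustrative)."""
    scores_a = [signed_score(x, **params) for x in sample_a]
    scores_b = [signed_score(x, **params) for x in sample_b]
    return sum(abs(a - b) for a, b in zip(scores_a, scores_b)) / len(scores_a)
```

Because the scores live on a fixed scale from -1 to 1 regardless of the measurement units, a distance built on them does not depend on the raw intensity scale of a particular platform, which is one way to read the portability claim in the abstract.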




Copyright information

© 2003 Springer-Verlag New York, Inc.

About this chapter

Cite this chapter

Garrett, E.S., Parmigiani, G. (2003). POE: Statistical Methods for Qualitative Analysis of Gene Expression. In: Parmigiani, G., Garrett, E.S., Irizarry, R.A., Zeger, S.L. (eds) The Analysis of Gene Expression Data. Statistics for Biology and Health. Springer, New York, NY. https://doi.org/10.1007/0-387-21679-0_16


  • DOI: https://doi.org/10.1007/0-387-21679-0_16

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-0-387-95577-3

  • Online ISBN: 978-0-387-21679-9

  • eBook Packages: Springer Book Archive
