A Semiparametric Bayesian Method of Clustering Genes Using Time-Series of Expression Profiles

Advances in Directional and Linear Statistics

Abstract

An increasing number of microarray experiments measure the expression levels of genes at several points in time. In this article, we present two models for clustering such time series of expression profiles. We use nonparametric Bayesian methods, which make the models robust to misspecification and, through Dirichlet process priors, provide a natural framework for clustering the genes. In contrast to clustering techniques that require the number of clusters to be fixed in advance, the number of clusters here is determined entirely by the data. We demonstrate the effectiveness of our methodology using simulation studies with artificial data as well as an application to a real data set.
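
As a rough illustration of why a Dirichlet process prior leaves the number of clusters open, the following sketch draws a random partition from the Chinese restaurant process, the partition distribution implied by a Dirichlet process. It is not the chapter's model: the function name `crp_partition`, the concentration value `alpha=1.0`, and the choice of 200 items are assumptions made only for this example, and no expression data or likelihood is involved.

```python
import random


def crp_partition(n_items, alpha, rng):
    """Draw cluster labels for n_items from a Chinese restaurant process.

    Hypothetical helper for illustration only; alpha is the concentration
    parameter of the underlying Dirichlet process.
    """
    labels = []  # labels[i] = cluster index assigned to item i
    sizes = []   # sizes[k] = number of items currently in cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight sizes[k];
        # a brand-new cluster is opened with weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)   # open a new cluster
        else:
            sizes[k] += 1     # join an existing cluster
        labels.append(k)
    return labels, sizes


rng = random.Random(0)  # fixed seed so the draw is repeatable
labels, sizes = crp_partition(200, alpha=1.0, rng=rng)
print("number of clusters:", len(sizes))
print("cluster sizes:", sizes)
```

The number of clusters is an outcome of the draw rather than an input. Under this prior its expected value grows roughly like alpha times log(n), so larger values of `alpha` tend to produce more clusters. In a full mixture model the data would further reshape this partition through the likelihood.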

Author information

Corresponding author

Correspondence to Arvind K. Jammalamadaka.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Jammalamadaka, A.K., Ghosh, K. (2011). A Semiparametric Bayesian Method of Clustering Genes Using Time-Series of Expression Profiles. In: Wells, M., SenGupta, A. (eds) Advances in Directional and Linear Statistics. Physica-Verlag HD. https://doi.org/10.1007/978-3-7908-2628-9_6
