A Semiparametric Bayesian Method of Clustering Genes Using Time-Series of Expression Profiles

Jammalamadaka, Arvind K.; Ghosh, Kaushik

doi:10.1007/978-3-7908-2628-9_6

Abstract

A growing number of microarray experiments measure gene expression levels at several points in time. In this article, we present two models for clustering such time series of expression profiles. We use nonparametric Bayesian methods, which make the models robust to misspecification and, through Dirichlet process priors, provide a natural framework for clustering the genes. Unlike clustering techniques that require the number of clusters to be fixed in advance, the number of clusters here is determined entirely by the data. We demonstrate the effectiveness of our methodology through simulation studies with artificial data and through an application to a real data set.
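
As an illustration of why a Dirichlet process prior leaves the number of clusters open, the following is a minimal sketch of the Chinese restaurant process, a standard representation of the partition a Dirichlet process induces. It is not the authors' models, which the abstract does not specify; the function name `crp_partition` and the parameters `n_items`, `alpha`, and `seed` are hypothetical names chosen for this example.

```python
import random


def crp_partition(n_items, alpha, seed=0):
    """Draw cluster labels for n_items from a Chinese restaurant process.

    alpha > 0 is the concentration parameter (an assumed value, not one
    taken from the chapter). Larger alpha tends to produce more clusters.
    """
    rng = random.Random(seed)
    labels = []  # labels[i] = cluster index assigned to item i
    sizes = []   # sizes[k] = number of items currently in cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight sizes[k];
        # a brand-new cluster is opened with weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)   # open a new cluster
        else:
            sizes[k] += 1     # join an existing cluster
        labels.append(k)
    return labels, sizes


if __name__ == "__main__":
    labels, sizes = crp_partition(n_items=200, alpha=1.0)
    print("number of clusters:", len(sizes))
    print("cluster sizes:", sizes)
```

The number of clusters is never passed in: it emerges from the sequential assignments. In a full mixture model, the likelihood of each gene's expression profile would also enter these assignment weights, which is how the data come to determine the clustering.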

Author information

Authors and Affiliations

Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA, 02139, USA
Arvind K. Jammalamadaka

Corresponding author

Correspondence to Arvind K. Jammalamadaka.

Editor information

Editors and Affiliations

Statistical Science, Cornell University, Comstock Hall 1190, Ithaca, NY 14853, USA
Martin T. Wells
Applied Statistics Unit, Indian Statistical Institute, Barrackpore Trunk Road 203, Kolkata, 700035, India
Ashis SenGupta

About this chapter

Cite this chapter

Jammalamadaka, A.K., Ghosh, K. (2011). A Semiparametric Bayesian Method of Clustering Genes Using Time-Series of Expression Profiles. In: Wells, M., SenGupta, A. (eds) Advances in Directional and Linear Statistics. Physica-Verlag HD. https://doi.org/10.1007/978-3-7908-2628-9_6

DOI: https://doi.org/10.1007/978-3-7908-2628-9_6
Published: 27 September 2010
Publisher Name: Physica-Verlag HD
Print ISBN: 978-3-7908-2627-2
Online ISBN: 978-3-7908-2628-9
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
