Abstract
A growing number of microarray experiments measure gene expression levels at several points in time. In this article, we present two models for clustering such time series of expression profiles. We use nonparametric Bayesian methods, which make the models robust to misspecification and, through Dirichlet process priors, provide a natural framework for clustering the genes. Unlike many other clustering techniques, our approach does not fix the number of clusters in advance; it is determined entirely by the data. We demonstrate the effectiveness of our methodology through simulation studies with artificial data and through an application to a real data set.
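To illustrate why a Dirichlet process prior leaves the number of clusters open, the following minimal sketch draws a partition from the Chinese restaurant process, a standard representation of the clustering induced by a Dirichlet process. It is not the chapter's models: it samples from the prior only, with no expression data or likelihood, and the function name `crp_partition` and the concentration parameter `alpha` are assumptions made for this illustration.

```python
import random


def crp_partition(n_items, alpha, seed=0):
    """Draw cluster labels for n_items from a Chinese restaurant process.

    alpha (> 0) is the concentration parameter; larger values tend to
    produce more clusters. Illustrative only: prior draw, no data involved.
    """
    rng = random.Random(seed)
    labels = []
    counts = []  # counts[k] = number of items currently in cluster k
    for i in range(n_items):
        # Item i joins existing cluster k with probability counts[k] / (i + alpha)
        # and opens a new cluster with probability alpha / (i + alpha).
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        labels.append(k)
    return labels, counts


if __name__ == "__main__":
    labels, counts = crp_partition(n_items=50, alpha=1.0)
    print("number of clusters:", len(counts))
    print("cluster sizes:", counts)
```

The number of clusters is an outcome of the draw rather than an input. In a full mixture model, each assignment probability would also be weighted by how well the gene's profile fits the cluster, which is how the data drive the final number of clusters.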
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Jammalamadaka, A.K., Ghosh, K. (2011). A Semiparametric Bayesian Method of Clustering Genes Using Time-Series of Expression Profiles. In: Wells, M., SenGupta, A. (eds) Advances in Directional and Linear Statistics. Physica-Verlag HD. https://doi.org/10.1007/978-3-7908-2628-9_6
DOI: https://doi.org/10.1007/978-3-7908-2628-9_6
Publisher Name: Physica-Verlag HD
Print ISBN: 978-3-7908-2627-2
Online ISBN: 978-3-7908-2628-9
eBook Packages: Mathematics and Statistics; Mathematics and Statistics (R0)