Skip to main content
Log in

Bayesian model-based tight clustering for time course data

  • Original Paper
  • Published:
Computational Statistics Aims and scope Submit manuscript

Abstract

Cluster analysis has been widely used to explore thousands of gene expressions from microarray analysis and identify a small number of similar genes (objects) for further detailed biological investigation. However, most clustering algorithms tend to identify loose clusters with too many genes. In this paper, we propose a Bayesian tight clustering method for time course gene expression data, which selects a small number of closely-related genes and constructs tight clusters only with these closely-related genes.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  • Basford KE, McLachlan GJ (1985) Likelihood estimation with normal mixture models. Appl Stat 34: 282–289

    Article  MathSciNet  Google Scholar 

  • Basford KE, Greenway DR, McLachlan GJ, Peel D (1997) Standard errors of fitted means under normal mixture models. Comput Stat 12: 1–17

    MATH  Google Scholar 

  • Booth J, Casella G, Hobert J (2008) Clustering using objective functions and stochastic search. J R Stat Soc B 70(1): 119–140

    MATH  MathSciNet  Google Scholar 

  • Costa IG, Carvalho FAT, Souto MCP (2004) Comparative analysis of clustering methods for gene expression time course data. Genet Mol Biol 27: 623–631

    Google Scholar 

  • Crowley EM (1997) Product partition models for normal means. J Am Stat Assoc 92: 192–198

    Article  MATH  Google Scholar 

  • Datta S, Datta S (2003) Comparisons and validation of statistical clustering techniques for microarray gene expression data. Bioinformatics 19: 459–466

    Article  Google Scholar 

  • Ghosh D, Chinnaiyan AM (2002) Mixture modelling of gene expression data from microarray experiments. Bioinformatics 18: 275–286

    Article  Google Scholar 

  • Hakamada K, Okamoto M, Hanai T (2006) Novel technique for preprocessing high dimensional time-course data from DNA microarray: mathematical model-based clustering. Bioinformatics 22: 843–848

    Article  Google Scholar 

  • Hartigan JA (1991) Partition models. Commun Stat Theory Methods 19: 2745–2756

    MathSciNet  Google Scholar 

  • Hartigan JA, Wong MA (1979) Algorithm AS 136: a K-means clustering algorithm. Appl Stat 28: 100–108

    Article  MATH  Google Scholar 

  • James GM, Sugar CA (2003) Clustering for sparsely sampled functional data. J Am Stat Assoc 98: 397–408

    Article  MATH  MathSciNet  Google Scholar 

  • Jerrum M, Sinclair A (1996) The Markov Chain Monte Carlo method: an approach to approximate counting and integration. In: Approximation algorithms for NP-hard problems. PWS Publishing, Boston

  • Johnson RA, Wichern DW (2002) Applied multivariate statistical analysis, 5th edn. Prentice Hall, Upper Saddle River

    Google Scholar 

  • Leng X, Muller H (2006) Classification using functional data analysis for temporal gene expression data. Bioinformatics 22: 68–76

    Article  Google Scholar 

  • Luan Y, Li H (2003) Clustering of time-course gene expression data using a mixed-effects model with B-splines. Bioinformatics 19: 474–482

    Article  Google Scholar 

  • Lukashin AV, Fuchs R (2001) Analysis of temporal gene expression profiles: clustering by simulated annealing and determining the optimal number of clusters. Bioinformatics 17: 405–414

    Article  Google Scholar 

  • Ma P, Castillo-Davis CI, Zhong W, Liu JS (2006) A data-driven clustering method for time course gene expression data. Nucleic Acids Res 34: 1261–1269

    Article  Google Scholar 

  • McLachlan GJ, Baford KE (1988) Mixture models: inference and applications to clustering. Marcel Dekker, Inc., New York

    MATH  Google Scholar 

  • McLachlan GJ, Bean RW, Peel D (2002) A mixture model-based approach to the clustering of microarray expression data. Bioinformatics 18: 413–422

    Article  Google Scholar 

  • Ng SK, McLachlan GJ, Wang K, Jones LB, Ng SW (2006) A mixture model with random-effects components for clustering correlated gene-expression profiles. Bioinformatics 22: 1745–1752

    Article  Google Scholar 

  • Ouyang M, Welsh WJ, Georgopoulos P (2004) Gaussian mixture clustering and imputation of microarray data. Bioinformatics 20: 917–923

    Article  Google Scholar 

  • Park T, Yi S, Lee S, Lee SY, Yoo D, Ahn J, Lee Y (2003) Statistical tests for identifying differentially expressed genes in time-course microarray experiments. Bioinformatics 19: 694–703

    Article  Google Scholar 

  • Peddada SD, Lobenhofer EK, Li L, Afshari CA, Weinberg CR, Umbach DM (2003) Gene selection and clustering for time-course and dose response microarray experiments using order-restricted inference. Bioinformatics 19: 834–841

    Article  Google Scholar 

  • Pitman J (1997) Some probabilistic aspects of set partitions. Am Math Mon 104: 201–209

    Article  MATH  MathSciNet  Google Scholar 

  • Ruppert D, Wand MP, Caroll RJ (2003) Semiparametric regression. Cambridge University Press, New York

    MATH  Google Scholar 

  • Schliep A, Schonhuth A, Steinhoff C (2003) Using hidden Markov models to analyze gene expression time course data. Bioinformatics 19: i255–i263

    Article  Google Scholar 

  • Thalamuthu A, Mukhopadhyay I, Zheng X, Tseng GC (2006) Evaluation and comparison of gene clustering methods in microarray analysis. Bioinformatics 22: 2405–2412

    Article  Google Scholar 

  • Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc B 58: 267–288

    MATH  MathSciNet  Google Scholar 

  • Tseng GC, Wong WH (2005) Tight clustering: a resampling-based approach for identifying stable and tight patterns in data. Biometrics 61: 10–16

    Article  MATH  MathSciNet  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Yongsung Joo.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Joo, Y., Casella, G. & Hobert, J. Bayesian model-based tight clustering for time course data. Comput Stat 25, 17–38 (2010). https://doi.org/10.1007/s00180-009-0159-7

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00180-009-0159-7

Keywords

Navigation