
Statistics and Computing, Volume 16, Issue 4, pp 355–362

Bayesian model learning based on a parallel MCMC strategy

  • Jukka Corander
  • Mats Gyllenberg
  • Timo Koski

Abstract

We introduce a novel Markov chain Monte Carlo algorithm for estimation of posterior probabilities over discrete model spaces. Our learning approach is applicable to families of models for which the marginal likelihood can be calculated analytically, either exactly or approximately, given any fixed structure. It is argued that for certain model neighborhood structures, the ordinary reversible Metropolis-Hastings algorithm does not yield an appropriate solution to the estimation problem. Therefore, we develop an alternative, non-reversible algorithm which can avoid the scaling effect of the neighborhood. To explore a model space efficiently, a finite number of interacting parallel stochastic processes are utilized. Our interaction scheme enables simultaneous exploration of several local neighborhoods of a model space, while preventing the absorption of any particular process into a relatively inferior state. We illustrate the advantages of our method by an application to a classification model. In particular, we use an extensive bacterial database and compare our results with results obtained by different methods for the same data.
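The general idea behind the abstract — several Metropolis-type chains searching a discrete model space in parallel, with an interaction rule that steers lagging chains toward better-scoring states — can be caricatured in a few lines. This is an illustrative sketch only, not the authors' non-reversible algorithm: it uses a plain reversible Metropolis step, a toy score function standing in for an analytically computable marginal likelihood, and a simplistic interaction rule (periodically restarting the worst chain at the best state seen so far), all of which are assumptions made here for demonstration.

```python
import math
import random

def log_score(model):
    # Toy stand-in for a log marginal likelihood: peaks (at 0.0) when the
    # model is exactly the feature set {0, 2, 4}, and decreases by 1 for
    # each feature added to or removed from that set.
    target = {0, 2, 4}
    return -float(len(model ^ target))

def propose(model, n_features):
    # Neighborhood move: flip the inclusion of one randomly chosen feature.
    j = random.randrange(n_features)
    return model ^ {j}

def parallel_search(n_chains=4, n_features=6, n_iter=2000, seed=1):
    random.seed(seed)
    # Each chain starts from a random two-feature model.
    chains = [set(random.sample(range(n_features), 2)) for _ in range(n_chains)]
    best_seen = max(chains, key=log_score)
    for t in range(n_iter):
        for i in range(n_chains):
            cand = propose(chains[i], n_features)
            # Metropolis acceptance: always accept improvements, accept
            # worsening moves with probability exp(score difference).
            if random.random() < math.exp(log_score(cand) - log_score(chains[i])):
                chains[i] = cand
                if log_score(cand) > log_score(best_seen):
                    best_seen = set(cand)
        # Crude interaction step (an assumption, not the paper's scheme):
        # every 100 iterations, restart the worst chain at the best state.
        if t % 100 == 0:
            worst = min(range(n_chains), key=lambda i: log_score(chains[i]))
            chains[worst] = set(best_seen)
    return best_seen

best = parallel_search()
```

With the unique score maximum at {0, 2, 4}, the parallel chains locate it quickly in this tiny 64-model space; the point of the interaction step is only to show how sharing information across chains keeps no single process trapped near an inferior state.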

Keywords

Bayesian analysis · Markov chain Monte Carlo · Model learning · Parallel search



Copyright information

© Springer Science + Business Media, LLC 2006

Authors and Affiliations

  1. Rolf Nevanlinna Institute, Department of Mathematics and Statistics, University of Helsinki, Finland
  2. Department of Mathematics, University of Linköping, Linköping, Sweden
