Machine Learning, Volume 50, Issue 1–2, pp. 5–43

An Introduction to MCMC for Machine Learning

  • Christophe Andrieu
  • Nando de Freitas
  • Arnaud Doucet
  • Michael I. Jordan

Abstract

The purpose of this introductory paper is threefold. First, it introduces the Monte Carlo method with emphasis on probabilistic machine learning. Second, it reviews the main building blocks of modern Markov chain Monte Carlo simulation, thereby providing an introduction to the remaining papers of this special issue. Lastly, it discusses interesting new research horizons.
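As a concrete illustration of the kind of building block the paper reviews, the following is a minimal sketch (in Python) of a random-walk Metropolis-Hastings sampler. The algorithm is standard, but the function name, the Gaussian random-walk proposal, the proposal_std parameter, and the standard-Gaussian example target are illustrative assumptions made here rather than details taken from the paper.

    import numpy as np

    def metropolis_hastings(log_target, x0, n_samples, proposal_std=1.0, rng=None):
        """Random-walk Metropolis-Hastings sampler for a 1-D unnormalised log-density."""
        rng = np.random.default_rng() if rng is None else rng
        samples = np.empty(n_samples)
        x, log_p_x = x0, log_target(x0)
        for i in range(n_samples):
            # Propose a move from a symmetric Gaussian random walk around the current state.
            x_new = x + proposal_std * rng.standard_normal()
            log_p_new = log_target(x_new)
            # Accept with probability min(1, p(x_new)/p(x)); the symmetric proposal cancels out.
            if np.log(rng.uniform()) < log_p_new - log_p_x:
                x, log_p_x = x_new, log_p_new
            samples[i] = x
        return samples

    # Illustrative usage: sample from a standard Gaussian target and check its moments.
    draws = metropolis_hastings(lambda x: -0.5 * x**2, x0=0.0, n_samples=5000)
    print(draws.mean(), draws.std())  # roughly 0 and 1 once the chain has mixed

In practice one would discard an initial burn-in portion of the chain and tune proposal_std to obtain a reasonable acceptance rate; these refinements are omitted here for brevity.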

Keywords: Markov chain Monte Carlo, MCMC, sampling, stochastic algorithms


Copyright information

© Kluwer Academic Publishers 2003

Authors and Affiliations

  • Christophe Andrieu (1)
  • Nando de Freitas (2)
  • Arnaud Doucet (3)
  • Michael I. Jordan (4)
  1. Department of Mathematics, Statistics Group, University of Bristol, University Walk, UK
  2. Department of Computer Science, University of British Columbia, Vancouver, Canada
  3. Department of Electrical and Electronic Engineering, University of Melbourne, Parkville, Australia
  4. Departments of Computer Science and Statistics, University of California at Berkeley, Berkeley, USA
