
Augmentation schemes for particle MCMC


Abstract

Particle MCMC involves using a particle filter within an MCMC algorithm. For inference of a model which involves an unobserved stochastic process, the standard implementation uses the particle filter to propose new values for the stochastic process, and MCMC moves to propose new values for the parameters. We show how particle MCMC can be generalised beyond this. Our key idea is to introduce new latent variables. We then use the MCMC moves to update the latent variables, and the particle filter to propose new values for the parameters and stochastic process given the latent variables. A generic way of defining these latent variables is to model them as pseudo-observations of the parameters or of the stochastic process. By choosing the amount of information these latent variables have about the parameters and the stochastic process we can often improve the mixing of the particle MCMC algorithm by trading off the Monte Carlo error of the particle filter and the mixing of the MCMC moves. We show that using pseudo-observations within particle MCMC can improve its efficiency in certain scenarios: dealing with initialisation problems of the particle filter; speeding up the mixing of particle Gibbs when there is strong dependence between the parameters and the stochastic process; and enabling further MCMC steps to be used within the particle filter.
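As a minimal illustration of this trade-off (our notation; a construction of this type, not a definition taken from the abstract): for a scalar parameter \(\theta \), a Gaussian pseudo-observation

$$\begin{aligned} z|\theta \sim N(\theta ,\tau ^2) \end{aligned}$$

interpolates between the two regimes. For small \(\tau \), conditioning on \(z\) almost fixes \(\theta \), so the particle filter faces little parameter uncertainty (small Monte Carlo error) but the MCMC update of \(z\) mixes slowly; for large \(\tau \), \(z\) is nearly uninformative and the standard algorithm is recovered.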



Acknowledgments

The first author was supported by the Engineering and Physical Sciences Research Council Grant EP/K014463/1.


Corresponding author

Correspondence to Paul Fearnhead.

Appendices

Appendix 1: Calculations for the stochastic volatility model

First consider \(Z_{\beta _X}\). Standard calculations give

$$\begin{aligned} p(\beta _X|z_{\beta _X})&\propto p(\beta _X)p(z_{\beta _X}|\beta _X) \\ &\propto \beta _X^{a_x-1}\exp \{-b_x\beta _X\}\left( \beta _X^{n_x}\exp \{-\beta _X z_{\beta _X}\}\right) . \end{aligned}$$

Hence the conditional distribution of \(\beta _X\) given \(z_{\beta _X}\) is gamma with parameters \(n_x+a_x\) and \(b_x+z_{\beta _X}\). Furthermore, the marginal distribution of \(Z_{\beta _X}\) is

$$\begin{aligned} p(z_{\beta _X})&=\int p(\beta _X)p(z_{\beta _X}|\beta _X)\,\mathrm {d}\beta _X \\ &=\frac{b_x^{a_x}z_{\beta _X}^{n_x-1}}{\Gamma (a_x)\Gamma (n_x)}\int \beta _X^{a_x+n_x-1}\exp \{-(b_x+z_{\beta _X})\beta _X\}\,\mathrm {d}\beta _X \\ &=\left( \frac{\Gamma (a_x+n_x)b_x^{a_x}}{\Gamma (a_x)\Gamma (n_x)}\right) \left( \frac{z_{\beta _X}^{n_x-1}}{(z_{\beta _X}+b_x)^{n_x+a_x}}\right) . \end{aligned}$$

The calculations for \(Z_{\beta _Y}\) are identical.
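A quick Monte Carlo check of this conjugacy (our own addition, with hypothetical values for \(a_x\), \(b_x\) and \(n_x\)): draws of \(\beta _X\) retained on the event that \(Z_{\beta _X}\) falls near a fixed value should behave like draws from the gamma distribution derived above.

```python
import numpy as np

rng = np.random.default_rng(1)
a_x, b_x, n_x = 2.0, 1.0, 5          # hypothetical settings

# beta_X ~ Gamma(a_x, b_x); then Z ~ Gamma(n_x, beta_X) given beta_X.
beta = rng.gamma(shape=a_x, scale=1.0 / b_x, size=500_000)
z = rng.gamma(shape=n_x, scale=1.0 / beta)

# Retain beta draws with z near z0; these approximate the conditional
# p(beta_X | z_{beta_X} = z0), which should be Gamma(n_x + a_x, b_x + z0).
z0 = 3.0
kept = beta[np.abs(z - z0) < 0.05]
print(kept.mean(), (n_x + a_x) / (b_x + z0))   # both close to 1.75
```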

Calculations for \(Z_X\) and \(Z_\gamma \) are as for the linear Gaussian model (see Sect. 3).

Appendix 2: PMMH algorithm with particle learning

[Algorithm figure c: the PMMH algorithm with particle learning is given as a figure in the published version and is not reproduced here.]
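As a rough stand-in for the missing figure, the following is a minimal sketch of a generic particle marginal Metropolis–Hastings step, with a Gaussian random-walk proposal and a bootstrap particle filter supplying the likelihood estimate. The state-space model, the parameterisation (`phi`, `log_sigma`) and all tuning constants are our own assumptions for illustration, not the particle-learning construction of the paper.

```python
import numpy as np

def bootstrap_filter_loglik(theta, y, n_particles, rng):
    """Bootstrap particle filter estimate of log p(y | theta) for an
    assumed AR(1)-plus-noise state-space model (illustration only)."""
    phi, sigma = theta["phi"], np.exp(theta["log_sigma"])
    x = rng.normal(0.0, 1.0, size=n_particles)           # assumed initial law
    loglik = 0.0
    for t in range(len(y)):
        x = phi * x + sigma * rng.normal(size=n_particles)   # propagate
        logw = -0.5 * (y[t] - x) ** 2                    # assumed N(x_t, 1) obs
        m = logw.max()
        w = np.exp(logw - m)
        loglik += m + np.log(w.mean())                   # log of the mean weight
        x = rng.choice(x, size=n_particles, p=w / w.sum())   # multinomial resampling
    return loglik

def pmmh(y, theta0, log_prior, n_iter=1000, n_particles=200, step=0.1, seed=0):
    """Particle marginal Metropolis-Hastings with a random-walk proposal."""
    rng = np.random.default_rng(seed)
    theta = dict(theta0)
    ll = bootstrap_filter_loglik(theta, y, n_particles, rng)
    chain = []
    for _ in range(n_iter):
        prop = {k: v + step * rng.normal() for k, v in theta.items()}
        ll_prop = bootstrap_filter_loglik(prop, y, n_particles, rng)
        # Usual MH ratio, with the exact likelihood replaced by its
        # unbiased particle estimate.
        if np.log(rng.uniform()) < ll_prop + log_prior(prop) - ll - log_prior(theta):
            theta, ll = prop, ll_prop
        chain.append(dict(theta))
    return chain
```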

Appendix 3: Calculations for the Dirichlet process mixture model

The conditional distribution of \(Z_x\) given \(x_{1:n}\) can be split into (i) the marginal distribution for \(v\), \(p(v)\); and (ii) the conditional distribution of the sampled individuals, \(i_1,\ldots ,i_v\), given \(v\). Given \(i_1,\ldots ,i_v\), the clustering of these individuals is deterministic, as it is defined by the labels \((x_{i_1},\ldots ,x_{i_v})\).

The marginal distribution of \(Z_x\) thus can be written as

$$\begin{aligned} p(Z_x)=p(v)p(i_1,\ldots ,i_v|v)p(x_{i_1},\ldots ,x_{i_v}). \end{aligned}$$

Here, because the individuals are sampled uniformly at random,

$$\begin{aligned} p(i_1,\ldots ,i_v|v)=\binom{n}{v}^{-1}. \end{aligned}$$

Finally, \(p(x_{i_1},\ldots ,x_{i_v})\) is given by the Dirichlet process prior. If we relabel the populations so that \(x_{i_1}=1\), population 2 is the population of the first individual in \(i_1,\ldots ,i_v\) that is not in population 1, and so on; then for \(v>1\),

$$\begin{aligned} p(x_{i_1},\ldots ,x_{i_v})=\prod _{j=2}^v p(x_{i_j}|x_{i_1},\ldots ,x_{i_{j-1}}), \end{aligned}$$

with \(p(x_{i_j}|x_{i_1},\ldots ,x_{i_{j-1}})\) defined by (4).
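Equation (4) is not reproduced in this extract; it is the standard Pólya-urn (Chinese restaurant) predictive of the Dirichlet process, and under that assumed form the product above can be evaluated as in the following sketch (the concentration parameter `alpha` is our notation):

```python
import numpy as np

def log_crp_prob(labels, alpha):
    """Log-probability of a label sequence under the Polya-urn predictive
    of a Dirichlet process with concentration alpha. Assumes labels are
    relabelled in order of first appearance: 1, then 2, and so on."""
    counts = {}          # cluster label -> number of members so far
    logp = 0.0
    for j, lab in enumerate(labels):
        if j > 0:        # the first individual contributes probability 1
            if lab in counts:
                logp += np.log(counts[lab]) - np.log(alpha + j)
            else:
                logp += np.log(alpha) - np.log(alpha + j)
        counts[lab] = counts.get(lab, 0) + 1
    return logp
```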

Within the PMMH we use a proposal for \(Z_x\) given \(X_{1:n}\) that is its full conditional:

$$\begin{aligned} q(Z_x|x_{1:n})=p(Z_x|x_{1:n})=p(v)p(i_1,\ldots ,i_v|v). \end{aligned}$$

In practice we take the distribution of v to be a Poisson distribution with mean 5, truncated to take values less than n. (Similar results were observed as we varied both the distribution and the mean value.)
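In code, this proposal might look as follows (a sketch; `mean_v = 5` matches the truncated Poisson above, and the truncation to \(\{1,\ldots ,n-1\}\) is our reading of "values less than \(n\)"):

```python
import numpy as np

def propose_Zx(x, mean_v=5.0, rng=None):
    """Propose Z_x from its full conditional given the clustering x_{1:n}:
    draw v from a Poisson(mean_v) truncated to {1, ..., n-1}, then sample
    v distinct individuals uniformly; their labels are read off from x."""
    rng = rng or np.random.default_rng()
    x = np.asarray(x)
    n = len(x)
    v = rng.poisson(mean_v)
    while not 1 <= v < n:                        # rejection step for the truncation
        v = rng.poisson(mean_v)
    idx = rng.choice(n, size=v, replace=False)   # uniform sampling without replacement
    return idx, x[idx]                           # sampled individuals and their labels
```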


Cite this article

Fearnhead, P., Meligkotsidou, L. Augmentation schemes for particle MCMC. Stat Comput 26, 1293–1306 (2016). https://doi.org/10.1007/s11222-015-9603-4
