Adaptive approximate Bayesian computation for complex models

Abstract

We propose a new approximate Bayesian computation (ABC) algorithm that aims to minimize the number of model runs required to reach a given quality of the posterior approximation. This algorithm automatically determines its sequence of tolerance levels and makes use of an easily interpretable stopping criterion. Moreover, it avoids the problem of particle duplication found when using an MCMC kernel. When applied to a toy example and to a complex social model, our algorithm is 2–8 times faster than the three main sequential ABC algorithms currently available.



Notes

  1. http://motive.cemagref.fr/people/maxime.lenormand/script_r_toyex.

  2. PRototypical policy Impacts on Multifunctional Activities in rural municipalities—EU 7th Framework Research Programme; 2008–2011; https://prima.cemagref.fr/the-project.



Acknowledgments

This publication has been funded by the Prototypical policy impacts on multifunctional activities in rural municipalities collaborative project, European Union 7th Framework Programme (ENV 2007-1), contract no. 212345. The work of the first author has been funded by the Auvergne region.

Author information


Corresponding author

Correspondence to Maxime Lenormand.

Appendices

Appendix 1: Description of the algorithms

Figures a–e: pseudocode listings of the algorithms.
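Appendix 1's algorithm listings appear as figures. As a rough illustration of the adaptive scheme summarized in the abstract (the tolerance is set to a quantile of the current distances, and the loop stops once the acceptance rate of newly proposed particles falls below a threshold), here is a minimal 1-D sketch. The toy model, parameter values, and names (`toy_model`, `apmc`) are illustrative assumptions, not the paper's reference code.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_model(theta):
    """Toy stochastic simulator: one Gaussian draw centred on theta (illustrative)."""
    return rng.normal(theta, 1.0)

def apmc(n=1000, alpha=0.5, p_acc_min=0.05, obs=0.0,
         prior_lo=-10.0, prior_hi=10.0):
    """Sketch of an adaptive ABC population Monte Carlo scheme: the tolerance
    is the alpha-quantile of the current distances, and the loop stops once
    the acceptance rate of new particles falls below p_acc_min."""
    n_alpha = int(alpha * n)
    # Step 0: sample from the uniform prior, keep the n_alpha closest particles.
    theta = rng.uniform(prior_lo, prior_hi, n)
    rho = np.abs(toy_model(theta) - obs)
    keep = np.argsort(rho)[:n_alpha]
    theta, rho, w = theta[keep], rho[keep], np.ones(n_alpha)
    eps = rho.max()
    while True:
        # Gaussian kernel with twice the weighted empirical variance (a common PMC choice).
        mean = np.average(theta, weights=w)
        sigma2 = 2.0 * np.average((theta - mean) ** 2, weights=w)
        # Propose n - n_alpha new particles: resample survivors, then perturb.
        idx = rng.choice(n_alpha, size=n - n_alpha, p=w / w.sum())
        theta_new = rng.normal(theta[idx], np.sqrt(sigma2))
        rho_new = np.abs(toy_model(theta_new) - obs)
        p_acc = np.mean(rho_new <= eps)  # acceptance rate w.r.t. current tolerance
        # Importance weights: uniform prior density over the kernel mixture.
        in_prior = ((theta_new >= prior_lo) & (theta_new <= prior_hi)).astype(float)
        kern = np.exp(-(theta_new[:, None] - theta[None, :]) ** 2 / (2.0 * sigma2)) \
               / np.sqrt(2.0 * np.pi * sigma2)
        w_new = in_prior / (w * kern).sum(axis=1)
        # Merge old and new, keep the n_alpha best; new tolerance = worst kept distance.
        theta = np.concatenate([theta, theta_new])
        rho = np.concatenate([rho, rho_new])
        w = np.concatenate([w, w_new])
        keep = np.argsort(rho)[:n_alpha]
        theta, rho, w = theta[keep], rho[keep], w[keep]
        eps = rho.max()
        if p_acc <= p_acc_min:
            return theta, w, eps
```

Because the surviving particles are carried over from one population to the next (rather than moved with an MCMC kernel), no particle is simulated twice and the tolerance sequence adapts itself to the problem.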

Appendix 2: Proof that the algorithm stops

We know that there exists \(\varepsilon _\infty \ge 0\) such that \(\varepsilon _t \underset{t\rightarrow +\infty }{\longrightarrow } \varepsilon _\infty \), because, by construction of the algorithm, \((\varepsilon _t)\) is a decreasing sequence bounded below by 0.

For each \(\theta \in \varTheta \), we regard the distance \(\rho (x,y)\) given \(\theta \) as a random variable \(\rho (\theta )\), and let \(f_{\rho (\theta )}\) denote its probability density function.

The probability \(\mathbb P [\rho (\theta ) \ge \varepsilon _{t}] \) that the distance drawn for parameter \(\theta \) exceeds the current tolerance \(\varepsilon _{t}\) satisfies:

$$\begin{aligned} \mathbb P [\rho (\theta ) \ge \varepsilon _{t}] &= 1 - \mathbb P [\rho (\theta ) < \varepsilon _{t}]\\ &= 1 - \displaystyle \int \limits _{\varepsilon _\infty }^{\varepsilon _{t}} f_{\rho (\theta )}(x)\, dx \end{aligned}$$

We define:

$$\begin{aligned} \mathbb P _{max}=\sup _{\theta \in \varTheta }\left\{ \sup _{x \in \mathbb R ^+}\left\{ f_{\rho (\theta )}(x)\right\} \right\} \end{aligned}$$

Since \(f_{\rho (\theta )}\) is bounded by \(\mathbb P _{max}\) on the integration interval, which has length \(\varepsilon _{t}-\varepsilon _\infty \), we have:

$$\begin{aligned} \mathbb P [\rho (\theta ) \ge \varepsilon _{t}] \ge 1 -\mathbb P _{max}(\varepsilon _{t}-\varepsilon _\infty ) \end{aligned}$$

The \(N-N_\alpha \) new particles are independent and identically distributed according to \(\pi _{t+1}\), the density defined by the algorithm. Hence the probability \( \mathbb P [p_{acc}(t+1)=0] \) that no particle is accepted at step \(t+1\) satisfies:

$$\begin{aligned} \mathbb P [p_{acc}(t+1)=0] \ge \left( 1 - \mathbb P _{max}(\varepsilon _{t}-\varepsilon _\infty )\right) ^{N-N_\alpha } \end{aligned}$$

If \(\mathbb P _{max} < +\infty \), then, since \(\varepsilon _t - \varepsilon _{\infty }\underset{t\rightarrow +\infty }{\longrightarrow } 0\), we have:

$$\begin{aligned} \mathbb P [p_{acc}(t+1)=0]\underset{t\rightarrow +\infty }{\longrightarrow } 1 \end{aligned}$$

We conclude that \(p_{acc}(t)\) converges in probability to 0 whenever \(\mathbb P _{max}< +\infty \). This ensures that the algorithm stops for any chosen value \(p_{acc_{min}} > 0\).
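The per-particle bound \(\mathbb P [\rho (\theta ) \ge \varepsilon _{t}] \ge 1 -\mathbb P _{max}(\varepsilon _{t}-\varepsilon _\infty )\) can be checked numerically. A minimal sketch, under the illustrative assumption \(\rho (\theta ) = |x|\) with \(x \sim \mathcal N (\theta ,1)\), so that \(\varepsilon _\infty = 0\) and \(\mathbb P _{max} = 2\varphi (0) = 2/\sqrt{2\pi }\):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative assumption: rho(theta) = |x| with x ~ N(theta, 1), so the
# density of rho is bounded by P_max = 2 * phi(0), attained at theta = 0,
# and eps_inf = 0.
p_max = 2.0 / np.sqrt(2.0 * np.pi)

rho = np.abs(rng.normal(0.0, 1.0, 1_000_000))  # Monte Carlo draws of rho(0)

for eps in [0.5, 0.2, 0.05, 0.01]:
    p_ge = np.mean(rho >= eps)     # estimate of P[rho(theta) >= eps]
    bound = 1.0 - p_max * eps      # the proof's lower bound (eps_inf = 0)
    assert p_ge >= bound - 2e-3    # bound holds, up to Monte Carlo error
```

The bound becomes tight as \(\varepsilon _{t} \rightarrow \varepsilon _\infty \), which is exactly the regime in which the acceptance probability, and hence \(p_{acc}\), vanishes.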



Cite this article

Lenormand, M., Jabot, F. & Deffuant, G. Adaptive approximate Bayesian computation for complex models. Comput Stat 28, 2777–2796 (2013). https://doi.org/10.1007/s00180-013-0428-3


Keywords

  • ABC
  • Population Monte Carlo
  • Sequential Monte Carlo