Adaptive particle allocation in iterated sequential Monte Carlo via approximating meta-models

Statistics and Computing

Abstract

Sequential Monte Carlo (SMC) filters (also known as particle filters) are widely used in the analysis of non-linear and non-Gaussian time series models in diverse application areas such as engineering, finance, and epidemiology. When a time series contains an observation that is very unlikely given the previous observations, evaluation of its conditional log likelihood by SMC can suffer from high variance. The presence of one or more such observations can result in a poor Monte Carlo estimate of the overall likelihood. In this article, we develop a novel strategy of particle allocation for off-line iterated SMC-based filters, in order to reduce the overall variance of the likelihood estimate and enable efficient computation. The complications arising from the intractability of the actual SMC variance are handled via an approximating meta-model, in which we model the SMC errors in the evaluation of the conditional log likelihood of the observations as an autoregressive process. We demonstrate numerical results on both simulated and real data sets, where adaptive particle allocation results in 54% lower overall variance than the naïve equal allocation of particles at all time points in simulations, and 53% lower variance on a real time series model of epidemic malaria transmission. The approximating model approach presented in this article is novel in the context of SMC and offers a computationally attractive procedure for practical analysis of a broad class of time series models.


References

  • Andrieu, C., Doucet, A., Holenstein, R.: Particle Markov chain Monte Carlo methods. J. R. Stat. Soc. Ser. B (Stat. Method.) 72, 269–342 (2010)

  • Arulampalam, M.S., Maskell, S., Gordon, N., Clapp, T.: A tutorial on particle filters for online nonlinear, non-Gaussian Bayesian tracking. IEEE Trans. Signal Process. 50, 174–188 (2002)

  • Bérard, J., Del-Moral, P., Doucet, A.: A lognormal central limit theorem for particle approximations of normalizing constants (2013). arXiv:1307.0181

  • Berzuini, C., Gilks, W.: RESAMPLE-MOVE filtering with cross-model jumps. In: Doucet, A., de Freitas, N., Gordon, N.J. (eds.) Sequential Monte Carlo Methods in Practice, pp. 117–138. Springer, New York (2001)

  • Bhadra, A., Ionides, E.L., Laneri, K., Bouma, M., Dhiman, R.C., Pascual, M.: Malaria in Northwest India: data analysis via partially observed stochastic differential equation models driven by Lévy noise. J. Am. Stat. Assoc. 106, 440–451 (2011a)

  • Bhadra, A., Ionides, E.L., Laneri, K., Bouma, M., Dhiman, R.C., Pascual, M.: Online supplement to: Malaria in Northwest India: data analysis via partially observed stochastic differential equation models driven by Lévy noise. J. Am. Stat. Assoc. 106 (2011). http://pubs.amstat.org/doi/suppl/10.1198/jasa.2011.ap10323

  • Cappé, O., Godsill, S., Moulines, E.: An overview of existing methods and recent advances in sequential Monte Carlo. Proc. IEEE 95, 899–924 (2007)

  • Cappé, O., Moulines, E., Rydén, T.: Inference in Hidden Markov Models. Springer, New York (2005)

  • Del Moral, P., Doucet, A., Jasra, A.: On adaptive resampling strategies for sequential Monte Carlo methods. Bernoulli 18, 252–278 (2012)

  • Doucet, A., Johansen, A.M.: A tutorial on particle filtering and smoothing: fifteen years later. In: Crisan, D., Rozovsky, B. (eds.) Oxford Handbook of Nonlinear Filtering. Oxford University Press, Oxford (2011)

  • Floudas, C., Gounaris, C.: A review of recent advances in global optimization. J. Glob. Optim. 45, 3–38 (2009)

  • Fox, D.: Adapting the sample size in particle filters through KLD-sampling. Int. J. Robotics Res. 22, 985–1003 (2003)

  • Gordon, N.J., Salmond, D.J., Smith, A.F.M.: Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proc. F (Radar Signal Process.) 140, 107–113 (1993)

  • He, D., Ionides, E.L., King, A.A.: Plug-and-play inference for disease dynamics: measles in large and small towns as a case study. J. R. Soc. Interface 7, 271–283 (2010)

  • Ionides, E.L., Bhadra, A., Atchadé, Y., King, A.A.: Iterated filtering. Ann. Stat. 39, 1776–1802 (2011)

  • Ionides, E.L., Bretó, C., King, A.A.: Inference for nonlinear dynamical systems. Proc. Natl. Acad. Sci. USA 103, 18438–18443 (2006)

  • Jin, Y., Branke, J.: Evolutionary optimization in uncertain environments—a survey. IEEE Trans. Evol. Comput. 9, 303–317 (2005)

  • King, A.A., Ionides, E.L., Bretó, C.M., Ellner, S., Kendall, B.: pomp: Statistical inference for partially observed Markov processes. R package (2009). Available at http://www.r-project.org

  • Koller, D., Fratkina, R.: Using learning for approximation in stochastic processes. In: Proceedings of the 15th International Conference on Machine Learning (ICML), pp. 287–295 (1998)

  • Laneri, K., Bhadra, A., Ionides, E.L., Bouma, M., Dhiman, R.C., Yadav, R.S., Pascual, M.: Forcing versus feedback: epidemic malaria and monsoon rains in Northwest India. PLoS Comput. Biol. 6, e1000898 (2010)

  • Lanz, O.: An information theoretic rule for sample size adaptation in particle filtering. In: Proceedings of the 14th International Conference on Image Analysis and Processing (ICIAP), pp. 317–322. Washington DC (2007)

  • Liu, J.S.: Monte Carlo Strategies in Scientific Computing. Springer, New York (2001)

  • Liu, J.S., Chen, R.: Blind deconvolution via sequential imputations. J. Am. Stat. Assoc. 90, 567–576 (1995)

  • Nocedal, J., Wright, S.J.: Numerical Optimization. Springer, New York (1999)

  • Pan, P., Schonfeld, D.: Dynamic proposal variance and optimal particle allocation in particle filtering for video tracking. IEEE Trans. Circuits Syst. Video Technol. 18, 1268–1279 (2008)

  • Pitt, M.K., Shephard, N.: Filtering via simulation: auxiliary particle filters. J. Am. Stat. Assoc. 94, 590–599 (1999)

  • Soto, A.: Self adaptive particle filter. In: Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI), pp. 1398–1406 (2005)

  • van der Vaart, A.W.: Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge (1998)

  • Whiteley, N., Lee, A.: Twisted particle filters. Ann. Stat. 42, 115–141 (2014)

Acknowledgments

The authors thank two anonymous referees for their constructive suggestions.

Author information


Correspondence to Anindya Bhadra.

Appendices

Appendix 1: Proof of Proposition 1

We have a single inequality constraint \(1 - \sum _{k=1}^{K} p_k \ge 0\) and the Lagrangian function (Sect. 12.1, Nocedal and Wright 1999) is given by

$$\begin{aligned} {\mathcal {L}}(\mathbf{p},\lambda )=\sum _{k=1}^{K} \phi _{k}/p_{k} - \lambda \left( 1 - \sum _{k=1}^{K} p_k\right) . \end{aligned}$$

Thus, the gradient of the function \({\mathcal {L}}\) with respect to \(\mathbf{p}\) is given by

$$\begin{aligned} \nabla _p {\mathcal {L}}(\mathbf{p},\lambda ) = \left( -\frac{\phi _1}{p_1^2} + \lambda , \ldots , -\frac{\phi _K}{p_K^2} + \lambda \right) ^T, \end{aligned}$$

while the Hessian is given by

$$\begin{aligned} \nabla _{pp} {\mathcal {L}}(\mathbf{p},\lambda ) = {\mathrm {diag}} \left( 2\phi _k/p_k^3\right) . \end{aligned}$$
(27)

Solving \(\nabla _p {\mathcal {L}}(\mathbf{p}^*, \lambda ^*)=0\) we get

$$\begin{aligned} \frac{\phi _{k}}{{p_{k}^{*}}^{2}} - {\lambda ^{*}}&= 0,\nonumber \\ \hbox {i.e.,} \quad p_k^{*}&= \sqrt{\frac{\phi _k}{\lambda ^{*}}}. \end{aligned}$$
(28)

Let us choose \(\lambda ^{*}\) such that

$$\begin{aligned} \lambda ^{*} = (\sum _{k=1}^{K}\sqrt{\phi _k})^2 \ge 0. \end{aligned}$$

Using this in Eq. (28) we get

$$\begin{aligned} p_k^*=\frac{\sqrt{\phi _k}}{\sum _{\ell =1}^{K}\sqrt{\phi _\ell }}. \end{aligned}$$

This choice of \((\mathbf{p}^*, \lambda ^*)\) satisfies the first order Karush–Kuhn–Tucker (KKT) conditions and we also have \(\sum _{k=1}^{K} p_k^* - 1 =0\), satisfying the strict complementarity condition (Theorem 12.1 and Definition 12.2, Nocedal and Wright 1999). Plugging in \(p_k^*\) in Eq. (27) we see the Hessian is positive definite. By Theorem 12.6 of Nocedal and Wright (1999) \(\mathbf{p}^*\) is then the strict unique local minimizer for \(\sum _{k=1}^{K} \phi _k/p_k\). Evaluating the function \(\sum _{k=1}^{K} \phi _k/p_k\) at \(\mathbf{p}^*\), we see the minimized value of the objective function is \((\sum _{k=1}^{K}\sqrt{\phi _k})^2\). This completes the proof of part A.
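As a numerical illustration of part A, the following minimal sketch evaluates the closed-form allocation \(p_k^*\) and compares the objective \(\sum _k \phi _k/p_k\) against randomly drawn feasible allocations. The values of \(\phi _k\) and the helper names `optimal_proportions` and `objective` are hypothetical and chosen only for illustration.

```python
import numpy as np

def optimal_proportions(phi):
    """Closed-form minimizer: p_k* = sqrt(phi_k) / sum_l sqrt(phi_l)."""
    root = np.sqrt(np.asarray(phi, dtype=float))
    return root / root.sum()

def objective(phi, p):
    """Objective sum_k phi_k / p_k for an allocation p on the simplex."""
    return float(np.sum(np.asarray(phi, dtype=float) / p))

# Hypothetical variance coefficients phi_k (illustration only).
phi = np.array([0.5, 2.0, 8.0, 0.1])

p_star = optimal_proportions(phi)
best = objective(phi, p_star)

# Minimized value stated in the proof: (sum_k sqrt(phi_k))^2.
closed_form = float(np.sqrt(phi).sum() ** 2)

# Compare with equal allocation and with random points on the simplex.
equal = objective(phi, np.full(phi.size, 1.0 / phi.size))
rng = np.random.default_rng(0)
trials = rng.dirichlet(np.ones(phi.size), size=2000)
smallest_gap = min(objective(phi, p) for p in trials) - best

print(p_star, best, closed_form, equal, smallest_gap)
```

None of the random allocations should beat \(\mathbf{p}^*\), so `smallest_gap` is expected to be non-negative, and `best` should agree with `closed_form` up to rounding.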

To prove part B, we make use of the following series expansion of the floor function:

$$\begin{aligned} \lfloor x \rfloor = {\left\{ \begin{array}{ll} x &{} \text {if } x \text{ is } \text{ an } \text{ integer },\\ x -\frac{1}{2} +\frac{1}{\pi } \sum _{i=1}^{\infty } \frac{\sin (2 \pi i x)}{i} &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$

Noting the definition of \(\widetilde{p}_k = \lfloor p^*_k M \rfloor /\{ \sum _{\ell =1}^{K} \lfloor p^*_\ell M \rfloor \}\) and using the relation specified above yields after some algebra that for each \(k=1, \ldots , K\)

$$\begin{aligned} | \widetilde{p}_k - p^*_k| = O(1/M). \end{aligned}$$

Part B then follows from the mean value theorem, under the assumption that \(\nabla _p {\mathrm{V}}(\mathbf{p})\) is bounded.
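The \(O(1/M)\) rate in part B can be checked numerically. The sketch below builds \(\widetilde{p}_k\) from the floor of \(p^*_k M\) for increasing budgets \(M\) and reports \(M \max _k |\widetilde{p}_k - p^*_k|\), which should stay bounded. The \(\phi _k\) values, the budgets and the function name `floor_allocation` are assumptions made for illustration.

```python
import numpy as np

def floor_allocation(p_star, M):
    """Integer-valued allocation: p~_k = floor(p_k* M) / sum_l floor(p_l* M)."""
    counts = np.floor(np.asarray(p_star, dtype=float) * M)
    return counts / counts.sum()

# Hypothetical variance coefficients and the corresponding optimal proportions.
phi = np.array([0.5, 2.0, 8.0, 0.1])
p_star = np.sqrt(phi) / np.sqrt(phi).sum()
K = p_star.size

# Scaled error M * max_k |p~_k - p_k*| for a range of total particle budgets.
budgets = [100, 1_000, 10_000, 100_000]
scaled_errors = [
    M * float(np.max(np.abs(floor_allocation(p_star, M) - p_star)))
    for M in budgets
]

for M, err in zip(budgets, scaled_errors):
    print(M, err)
```

Since each floor discards less than one particle, the scaled error is at most \(KM/(M-K)\), which is the bound used in the check.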

Appendix 2: Details of calculations in Sect. 4

From Eq. (11) we have

$$\begin{aligned} \mathrm{Var}(x_k)&= q^2 \mathrm{Var}( x_{k-1}) + \frac{\phi _k}{M_k}\\&= q^2 \mathrm{Var}(q x_{k-2} + \epsilon _{k-1}) + \frac{\phi _k}{M_k}\\&= \cdots \\&= \sum _{m=1}^{k} \frac{\phi _m q^{2(k-m)}}{M_m}. \end{aligned}$$

and for \(i>k\)

$$\begin{aligned} \mathrm{Cov}(x_i, x_k)&= \mathrm{Cov}(\mu _i + q(x_{i-1} - \mu _{i-1}) + \epsilon _{i} , x_k)\\&= q \mathrm{Cov}(x_{i-1}, x_k)\\&= \cdots \\&= q^{i-k} \mathrm{Var}(x_k)\\&= q^{i-k} \sum _{m=1}^{k} \frac{\phi _m q^{2(k-m)}}{M_m}. \end{aligned}$$

Note the definitions of \(L_m, C_1, C_2, A_m\) and \(B_m\) from Eqs. (12)–(16). Now

$$\begin{aligned}&\mathrm{Var}(\sum _{k=1}^{K} x_k)= \sum _{k=1}^{K} \mathrm{Var}(x_k) + 2 {\sum _{i=2}^{K} \sum _{j =1}^{i-1}} \mathrm{Cov}(x_i , x_j)\\&\quad = \sum _{k=1}^{K} \sum _{m=1}^{k} \frac{\phi _m q^{2(k-m)}}{M_m} + 2 {\sum _{i=2}^{K} \sum _{j =1}^{i-1}} q^{i-j} \sum _{m=1}^{j} \frac{\phi _m q^{2(j-m)}}{M_m}\\&\quad = \sum _{m=1}^{K} \sum _{k=m}^{K} \frac{\phi _m q^{2(k-m)}}{M_m} + 2 {\sum _{i=2}^{K} \sum _{j =1}^{i-1}} \sum _{m=1}^{j} \frac{\phi _m q^{(i+j-2m)}}{M_m}\\&\quad = \sum _{m=1}^{K} \frac{\phi _m q^{-2m}}{M_m}\sum _{k=m}^{K} q^{2k} + 2 {\sum _{m=1}^{K} \frac{\phi _m q^{-2m}}{M_m} \sum _{j =1}^{K-1}} q^j\sum _{i=j+1}^{K} q^{i}\\&\quad = \frac{1}{1-q^2} \sum _{m=1}^{K} \frac{\phi _m (1 - q^{2(K-m+1)})}{M_m} \\&\qquad + \frac{2q}{1\,{-}\,q} \Bigg (\frac{1\,{-}\,q^{2(K-1)}}{1\,{-}\,q^2} \,{-}\, q^K\frac{1\,{-}\,q^{(K-1)}}{1\,{-}\,q} \Bigg ) \sum _{m=1}^{K} \frac{\phi _m q^{-2m}}{M_m} \\&\quad = C_1\sum _{m=1}^{K} \frac{A_m}{M_m} + C_2 \sum _{m=1}^{K} \frac{B_m}{M_m}\\&\quad = \sum _{m=1}^{K}\frac{L_m}{M_m}. \end{aligned}$$
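The variance and covariance formulas above can be checked numerically. The sketch below computes \(\mathrm{Var}(x_k)\) from the recursion, compares it with the closed-form sum, assembles the covariance matrix from \(\mathrm{Cov}(x_i, x_k) = q^{i-k}\mathrm{Var}(x_k)\), and confirms that the total variance is linear in \(1/M_m\). The coefficients are obtained here by evaluating the total variance at unit vectors rather than through Eqs. (12)–(16); all numerical values and function names are hypothetical, and zero variance before the first time point is assumed.

```python
import numpy as np

def ar1_variances(phi, inv_M, q):
    """Var(x_k) via the recursion Var(x_k) = q^2 Var(x_{k-1}) + phi_k / M_k."""
    var = np.zeros(len(phi))
    prev = 0.0  # assumption: no variance before the first time point
    for k in range(len(phi)):
        prev = q**2 * prev + phi[k] * inv_M[k]
        var[k] = prev
    return var

def closed_form_variances(phi, inv_M, q):
    """Var(x_k) = sum_{m<=k} phi_m q^{2(k-m)} / M_m (0-based indices)."""
    K = len(phi)
    return np.array([
        sum(phi[m] * q ** (2 * (k - m)) * inv_M[m] for m in range(k + 1))
        for k in range(K)
    ])

def total_variance(phi, inv_M, q):
    """Var(sum_k x_k): sum of all entries of the covariance matrix."""
    var = ar1_variances(phi, inv_M, q)
    idx = np.arange(len(phi))
    lag = np.abs(idx[:, None] - idx[None, :])
    earlier = np.minimum(idx[:, None], idx[None, :])
    cov = q ** lag * var[earlier]  # Cov(x_i, x_k) = q^{|i-k|} Var(x_min(i,k))
    return float(cov.sum())

# Hypothetical inputs (illustration only).
phi = np.array([1.0, 4.0, 0.5, 2.0, 9.0])
M = np.array([200.0, 800.0, 100.0, 400.0, 1500.0])
q = 0.6
K = len(phi)

recursive = ar1_variances(phi, 1.0 / M, q)
closed = closed_form_variances(phi, 1.0 / M, q)
total = total_variance(phi, 1.0 / M, q)

# Coefficient of 1/M_m, found numerically by setting 1/M to a unit vector.
coeffs = np.array([total_variance(phi, np.eye(K)[m], q) for m in range(K)])
print(recursive, total, coeffs)
```

Because the total variance has the form \(\sum _m (\text{coefficient})_m / M_m\), it is the same kind of objective as in Proposition 1.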

Appendix 3: Malaria model description

We describe the \({\hbox {VS}^{2}\hbox {EI}^{2}}\) model of malaria with rainfall used in Sect. 6. A schematic diagram is presented in Fig. 8. The purpose here is to describe the state transition equations and the observation equation that were used in plug-and-play particle filtering. For a detailed scientific background on the model, we refer the interested reader to Laneri et al. (2010) and Bhadra et al. (2011a). At a given time point \(t\), each human in the system is in one of the classes \(S_1\) (susceptible), \(S_2\) (partially protected), \(E\) (exposed and carrying a latent infection), \(I_1\) (infected and infectious to uninfected mosquitoes) and \(I_2\) (carrying an asymptomatic infection and infectious to uninfected mosquitoes, with a lower level of infectivity than class \(I_1\)). The total population at time \(t\) is denoted by \(P(t)\) and is assumed known through an interpolation of the decennial population census. Thus we have the constraint \(P(t) = S_1 (t) + S_2 (t) + E(t) + I_1 (t) + I_2 (t)\). The class \(\lambda (t)\) represents the latent force of infection, capturing the mosquito survival and growth of the malaria parasite inside the body of an infected mosquito. We denote by \(\mu _{XY}\) the rate of transition from class \(X\) to class \(Y\), where \(X, Y \in \{S_1, S_2, E, I_1,I_2\}\). We assume newborns are susceptible to the infection and put them in class \(S_1\). The per capita birth rate \(\mu _{BS_1}\) is calculated via interpolation, and the natural death rate \(\mu _{XD} = \delta \) for \(X\in \{S_1, S_2, E, I_1, I_2\}\) is assumed known. Disease-related mortality is not considered separately. Table 3 gives a description of the symbols used in the model, along with the values used in the filtering operations in Sect. 6. We now present the state transition equations of this hidden Markov model, which define the dynamics.

$$\begin{aligned}&{d{{S_1}}/dt}=\mu _{B{S_1}}P-\mu _{{S_1}E}(t){S_1}+ \mu _{{I_1}{S_1}} {I_1}\nonumber \\&\qquad \qquad \quad \; +\, \mu _{{S_2}{S_1}}{S_2}-\mu _{{S_1}D} {S_1},\end{aligned}$$
(29)
$$\begin{aligned}&{d{{S_2}}/dt} = \mu _{{I_2}{S_2}}{I_2}-\mu _{{S_2}{S_1}}{S_2}- \mu _{{S_2}{I_2}} {S_2}- \mu _{{S_2}D} {S_2}, \end{aligned}$$
(30)
$$\begin{aligned}&{d{E}/dt} = \mu _{{S_1}E}{S_1}- \mu _{E{I_1}}E - \mu _{ED} E, \end{aligned}$$
(31)
$$\begin{aligned}&{d{{I_1}}/dt} = \mu _{E{I_1}}E -\mu _{{I_1}{S_1}}{I_1}- \mu _{{I_1}{I_2}} {I_1}- \mu _{{I_1}D} {I_1}, \end{aligned}$$
(32)
$$\begin{aligned}&{d{{I_2}}/dt} = \mu _{{I_1}{I_2}} {I_1}+ \mu _{{S_2}{I_2}}{S_2}- \mu _{{I_2}{S_2}} {I_2}-\mu _{{I_2}D} {I_2},\end{aligned}$$
(33)
$$\begin{aligned}&\lambda (t) = \frac{{I_1}(t)+q{I_2}(t)}{P(t)} \, \bar{\beta } \, \exp \Big \{ \sum _{i=1}^{n_s}{\beta _i s_i(t)} + Z_t \beta \Big \},\end{aligned}$$
(34)
$$\begin{aligned}&\mu _{{S_1}E}(t) = \int _{-\infty }^{t} \gamma (t-s)\lambda (s)d\Gamma (s)\quad \nonumber \\&\quad \hbox {for} \quad \gamma (t) = \frac{(k/\tau )^{k}t^{k-1}}{(k-1)! }\exp \{-kt/\tau \}. \end{aligned}$$
(35)

Here, \(q\) represents the transmissibility, relative to full-blown infections, from asymptomatic infections in partially immune individuals; the seasonality of disease transmission is modeled by the coefficients \(\{\beta _i\}\) corresponding to a periodic cubic B-spline basis \(\{s_i(t), i=1,\dots ,n_s\}\) constructed using \(n_s\) evenly spaced knots; time-varying covariates enter via the row vector \(Z_t\) with coefficients in a column vector \(\beta \); the dimensional constant \(\bar{\beta }\) is required to give \(\lambda (t)\) units of \(t^{-1}\), and we set \(\bar{\beta } = 1\,\hbox {yr}^{-1}\). Here \(\Gamma (t)\) is a gamma process with intensity \(\sigma ^2\), i.e. \(\Gamma (t) - \Gamma (s) \sim \hbox {Gamma} ((t-s)/\sigma ^2, \sigma ^2)\), where \(\hbox {Gamma}(a,b)\) is the gamma distribution with mean \(ab\) and variance \(ab^2\). The gamma-distributed delay with mean \(\tau \) and variance \(\tau ^2/k\) allows us to define \(\lambda _1 (t), \ldots , \lambda _k (t)\) to satisfy the following:

$$\begin{aligned}&d\lambda _1/dt= (\lambda d\Gamma /dt - \lambda _1)k\tau ^{-1},\end{aligned}$$
(36)
$$\begin{aligned}&d\lambda _i/dt= (\lambda _{i-1} - \lambda _i)k\tau ^{-1}\qquad \hbox {for} \quad i=2, \ldots , k. \end{aligned}$$
(37)
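To make the chain \(\lambda _1, \ldots , \lambda _k\) in Eqs. (36)–(37) concrete, the following sketch advances it by one Euler step driven by a gamma-process increment. It is an illustration under stated assumptions rather than the authors' implementation: the function names, the parameter values (\(k\), \(\tau \), \(\sigma \), a constant \(\lambda \)) and the use of years as the time unit are all hypothetical.

```python
import numpy as np

def gamma_increment(rng, dt, sigma, size=None):
    """Increment of the gamma process over dt: Gamma(dt / sigma^2, sigma^2),
    in the shape/scale convention, so the mean is dt and the variance is dt * sigma^2."""
    return rng.gamma(shape=dt / sigma**2, scale=sigma**2, size=size)

def step_delay_chain(chain, lam, d_gamma, dt, k, tau):
    """One Euler step of Eqs. (36)-(37) for the vector (lambda_1, ..., lambda_k)."""
    rate = k / tau
    new = chain.copy()
    # d lambda_1 = (lambda dGamma - lambda_1 dt) k / tau
    new[0] += rate * (lam * d_gamma - chain[0] * dt)
    # d lambda_i = (lambda_{i-1} - lambda_i) k / tau dt, for i = 2, ..., k
    new[1:] += rate * (chain[:-1] - chain[1:]) * dt
    return new

# Hypothetical settings: time in years, daily step.
k, tau, sigma = 2, 0.1, 0.1
dt = 1.0 / 365.0
lam = 3.0  # constant force of infection, for illustration only

# Noise-free check: with dGamma = dt the chain should relax to lambda.
noise_free = np.zeros(k)
for _ in range(2 * 365):
    noise_free = step_delay_chain(noise_free, lam, dt, dt, k, tau)

# Stochastic run: the last component plays the role of mu_{S1 E}(t).
rng = np.random.default_rng(1)
noisy = np.zeros(k)
for _ in range(2 * 365):
    noisy = step_delay_chain(noisy, lam, gamma_increment(rng, dt, sigma), dt, k, tau)

print(noise_free, noisy)
```

In the noise-free run every stage settles at the constant input \(\lambda \), which is a quick sanity check that the chain only delays and smooths its input.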

We see that Eqs. (36)–(37) are equivalent to Eq. (35) if we set \(\mu _{{S_1}E}(t)=\lambda _k (t)\). The system is solved via a stochastic Euler method with time step \(\varDelta =1\) day. Here, we take \(Z_t\) to be a scalar covariate measuring the thresholded rainfall integrated over a time interval \([t-u,t]\). Specifically, from the accumulated rainfall data \(\{r_n,n=1,\dots ,N\}\) at times \(t_1,\dots ,t_N\) we interpolated a continuous-time cubic spline \(r(t)\) and then set

$$\begin{aligned} {\widetilde{Z}}_t = \max \big \{\, {\textstyle \int _{t-u}^t r(s)\,ds} - v\; , \; 0 \big \}. \end{aligned}$$
(38)

The covariate is standardized by setting \(Z_t=({\widetilde{Z}}_t - {\overline{Z}})/\sigma _Z\), where \({\overline{Z}}= (t_N-t_0)^{-1}\int _{t_0}^{t_N}{\widetilde{Z}}_s \, ds\) and \(\sigma ^2_Z=(t_N-t_0)^{-1}\int _{t_0}^{t_N}\big ({\widetilde{Z}}_s-{\overline{Z}}\big )^2\, ds\). This standardization makes the coefficient \(\beta \) a dimensionless quantity which is expected to vary on a unit scale.
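The construction of the rainfall covariate can be sketched as follows. The example assumes that \(r(t)\) has already been evaluated on a fine regular grid (the cubic-spline interpolation step is not reproduced), uses the trapezoidal rule for the integrals, truncates the window \([t-u,t]\) at the start of the record, and takes the variance of the unstandardized \({\widetilde{Z}}\). The synthetic rainfall curve, the values of \(u\) and \(v\), and all function names are hypothetical.

```python
import numpy as np

def trapezoid(y, t):
    """Trapezoidal approximation of the integral of y over the grid t."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

def thresholded_rainfall(t, r, u, v):
    """Z~_t = max(integral of r over [t-u, t] minus v, 0), as in Eq. (38)."""
    # Running integral R(t) from the start of the record up to each grid time.
    cum = np.concatenate(([0.0], np.cumsum(0.5 * (r[1:] + r[:-1]) * np.diff(t))))
    # R(t - u); np.interp returns R(t[0]) = 0 when t - u precedes the record.
    cum_lagged = np.interp(t - u, t, cum)
    return np.maximum(cum - cum_lagged - v, 0.0)

def standardize(t, z_tilde):
    """Z_t = (Z~_t - mean) / sd, with time averages over the whole record."""
    span = t[-1] - t[0]
    mean = trapezoid(z_tilde, t) / span
    var = trapezoid((z_tilde - mean) ** 2, t) / span
    return (z_tilde - mean) / np.sqrt(var)

# Synthetic seasonal rainfall rate on a daily grid over 10 years (illustration only).
t = np.arange(0.0, 10.0, 1.0 / 365.0)
r = 50.0 * (1.0 + np.sin(2.0 * np.pi * t))

z_tilde = thresholded_rainfall(t, r, u=0.5, v=20.0)
z = standardize(t, z_tilde)

span = t[-1] - t[0]
time_mean = trapezoid(z, t) / span
time_var = trapezoid(z**2, t) / span
print(time_mean, time_var)
```

After standardization the time-averaged mean and variance of \(Z_t\) are 0 and 1 by construction, which is what makes \(\beta \) comparable across choices of \(u\) and \(v\).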

Fig. 8 Flow diagram for a compartment model of malaria transmission, from Bhadra et al. (2011a). Human classes are \({S_1}\) (susceptible), \({S_2}\) (partially protected), \(E\) (exposed, carrying a latent infection), \({I_1}\) (infected and infectious) and \({I_2}\) (asymptomatic, with reduced infectivity). The possibility of transition between class \(X\) and \(Y\) is denoted by a solid arrow, with the corresponding rate written as \(\mu _{XY}\). The dotted arrows represent interactions between the human and mosquito stages of the parasite. Mosquito dynamics are modeled via the two stages \(\kappa \) (the latent force of infection) and \(\lambda \) (the current force of infection), with \(\tau \) being the mean latency time. The model, which we call VS\(^2\)EI\(^2\) with ‘V’ for ‘vector-borne pathogen’ followed by a list of the human classes with their multiplicities as superscripts, is formalized by Eqs. (29)–(35)

Table 3 List of parameters used in the malaria model along with a brief description, their units and the values used in the filtering operation in Sect. 6

1.1 Observation equation

We write \(\{t_n, n=1,\dots ,N\}\) for the times of the \(N\) observations (\(y_1,\ldots ,y_N\)), and we suppose that the model is initialized at some time \(t_0<t_1\). We define

$$\begin{aligned}&C_n = \int _{t_{n-1}}^{t_n}\mu _{E{I_1}} E(s)\,ds,\\&y_n\mid C_n \sim {\hbox {Negbin}}(\rho C_n,\psi ^2), \end{aligned}$$

where \({\hbox {Negbin}}(\alpha ,\beta )\) is the negative binomial distribution with mean \(\alpha \) and variance \(\alpha + \alpha ^2\beta \). This distribution allows for the possibility of over- or under-reporting, and can be viewed as an over-dispersed Poisson distribution with dispersion parameter \(\psi \).
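The mean-and-dispersion parameterization of the negative binomial used here differs from the (number of successes, success probability) form used by many software libraries. The sketch below shows one way to convert between them with NumPy, so that the simulated reports have mean \(\rho C_n\) and variance \(\rho C_n + (\rho C_n)^2\psi ^2\). The function names and the values of \(C_n\), \(\rho \) and \(\psi \) are hypothetical.

```python
import numpy as np

def negbin_params(mean, psi):
    """Convert (mean, psi) to NumPy's (n, p): n = 1 / psi^2, p = n / (n + mean)."""
    n = 1.0 / psi**2
    p = n / (n + mean)
    return n, p

def simulate_reports(rng, C, rho, psi, size=None):
    """Draw y | C ~ Negbin(rho * C, psi^2) in the mean/dispersion convention."""
    n, p = negbin_params(rho * C, psi)
    return rng.negative_binomial(n, p, size=size)

# Hypothetical values: 400 new cases in the interval, 10% reporting, psi = 0.5.
C, rho, psi = 400.0, 0.1, 0.5
n, p = negbin_params(rho * C, psi)

# Moments implied by (n, p): mean = n(1-p)/p, variance = n(1-p)/p^2.
implied_mean = n * (1.0 - p) / p
implied_var = n * (1.0 - p) / p**2

rng = np.random.default_rng(3)
y = simulate_reports(rng, C, rho, psi, size=100_000)
print(implied_mean, implied_var, y.mean(), y.var())
```

With these values the implied mean is \(\rho C_n = 40\) and the implied variance is \(40 + 40^2 \times 0.25 = 440\), well above the Poisson variance of 40.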


Cite this article

Bhadra, A., Ionides, E.L. Adaptive particle allocation in iterated sequential Monte Carlo via approximating meta-models. Stat Comput 26, 393–407 (2016). https://doi.org/10.1007/s11222-014-9513-x
