
Monte Carlo algorithms for computing \(\alpha \)-permanents


Abstract

We consider the computation of the \(\alpha \)-permanent of a non-negative \(n \times n\) matrix. This quantity appears in a wide variety of real applications in statistics, physics and computer science. It is well known that exact computation is a #P-complete problem. This has resulted in a large collection of simulation-based methods that produce randomized solutions whose complexity is only polynomial in \(n\). This paper reviews and develops algorithms for the computation of both the permanent (\(\alpha =1\)) and the \(\alpha \)-permanent for general \(\alpha >0\). In the context of binary \(n \times n\) matrices, a variety of Markov chain Monte Carlo (MCMC) computational algorithms have been introduced in the literature whose cost, in order to achieve a given level of accuracy, is \(\mathcal {O}(n^7\log ^4(n))\); see Bezakova (Faster Markov chain Monte Carlo algorithms for the permanent and binary contingency tables. University of Chicago, Chicago, 2006), Jerrum et al. (J Assoc Comput Mach 51:671–697, 2004). These algorithms use a particular collection of probability distributions, the ‘ideal’ of which (in some sense) is not known and needs to be approximated. In this paper we propose an adaptive sequential Monte Carlo (SMC) algorithm that can estimate both the permanent and the ideal sequence of probabilities on the fly, with little user input. We provide theoretical results associated with the SMC estimate of the permanent, establishing its convergence. We also analyze the relative variance of the estimate associated with an ‘ideal’ algorithm (related to, but not the same as, the one we develop), in particular computing explicit bounds on the relative variance which depend upon \(n\). As this analysis is for an ideal algorithm, it gives a lower bound on the computational cost required to achieve an arbitrarily small relative variance; we find that this cost is \(\mathcal {O}(n^4\log ^4(n))\). For the \(\alpha \)-permanent, perhaps the gold-standard algorithm is the importance sampling algorithm of Kou and McCullagh (Biometrika 96:635–644, 2009); in this paper we develop new algorithms and compare them to this method. A priori, one expects, due to the weight-degeneracy problem, that the method of Kou and McCullagh (Biometrika 96:635–644, 2009) might perform very badly in comparison to the more advanced SMC methods we consider. We also present a statistical application of the \(\alpha \)-permanent to the estimation of boson point processes, together with MCMC methods to fit the associated model to data.
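For reference, the object computed throughout is the following (cf. Kou and McCullagh 2009): for an \(n \times n\) matrix \(A = (a_{ij})\) and \(\alpha > 0\),

$$\begin{aligned} \mathrm{per}_{\alpha }(A) = \sum _{\sigma \in S_n} \alpha ^{c(\sigma )} \prod _{i=1}^n a_{i\sigma (i)} \end{aligned}$$

where the sum runs over all permutations \(\sigma \) of \(\{1,\ldots ,n\}\) and \(c(\sigma )\) is the number of cycles of \(\sigma \); taking \(\alpha = 1\) recovers the standard permanent.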


References

  • Andrieu, C., Roberts, G.O.: The pseudo-marginal approach for efficient Monte Carlo computations. Ann. Stat. 37, 697–725 (2009)

  • Beskos, A., Jasra, A., Kantas, N., Thiery, A.: On the convergence of adaptive sequential Monte Carlo methods. arXiv preprint arXiv:1306.6462 (2014)

  • Bezakova, I.: Faster Markov chain Monte Carlo algorithms for the permanent and binary contingency tables. Ph.D. thesis, University of Chicago, Chicago (2006)

  • Bezakova, I., Stefankovic, D., Vazirani, V., Vigoda, E.: Accelerating simulated annealing for the permanent and combinatorial counting problems. SIAM J. Comput. 37, 1429–1454 (2008)


  • Bezakova, I., Sinclair, A., Stefankovic, D., Vigoda, E.: Negative examples for sequential importance sampling of binary contingency tables. In: Azar, Y., Erlebach, T. (eds.) Algorithms – ESA 2006, vol. 4168, pp. 136–147. Springer, Berlin (2006)


  • Bezakova, I., Bhatnagar, N., Vigoda, E.: Sampling binary contingency tables with a greedy start. Random Struct. Algorithms 30, 168–205 (2007)


  • Chen, Y., Diaconis, P., Holmes, S., Liu, J.S.: Sequential Monte Carlo methods for statistical analysis of tables. J. Am. Stat. Assoc. 100, 109–120 (2005)


  • Daley, D., Vere-Jones, D.: An Introduction to the Theory of Point Processes, 2nd edn. Springer, New York (2003)


  • Del Moral, P.: Feynman–Kac Formulae. Springer, New York (2004)


  • Del Moral, P., Doucet, A., Jasra, A.: Sequential Monte Carlo samplers. J. R. Stat. Soc. B 68, 411–436 (2006)

  • Diaconis, P., Stroock, D.: Geometric bounds for eigenvalues of Markov chains. Ann. Appl. Probab. 1, 36–61 (1991)


  • Doucet, A., Johansen, A.: A tutorial on particle filtering and smoothing: Fifteen years later. In: Crisan, D., Rozovsky, B. (eds.) Handbook of Nonlinear Filtering. Oxford University Press, Oxford (2011)


  • Fearnhead, P.: Sequential Monte Carlo methods in filter theory. Ph.D. thesis, University of Oxford, Oxford (1998)

  • Harrison, M., Miller, J.: Importance sampling for weighted binary random matrices. arXiv preprint arXiv:1301.3928 (2013)

  • Jasra, A., Stephens, D.A., Holmes, C.C.: On population-based simulation. Stat. Comput. 17, 263–279 (2007)


  • Jerrum, M., Sinclair, A., Vigoda, E.: A polynomial-time approximation algorithm for the permanent of a matrix with non-negative entries. J. Assoc. Comput. Mach. 51, 671–697 (2004)


  • Kasteleyn, P.W.: The statistics of dimers on a lattice I: the number of dimer arrangements on a quadratic lattice. Physica 27, 1664–1672 (1961)


  • Kou, S.C., McCullagh, P.: Approximating the \(\alpha \)-permanent. Biometrika 96, 635–644 (2009)


  • McCullagh, P., Møller, J.: The permanental process. Adv. Appl. Probab. 38, 873–888 (2006)


  • Schweizer, N.: Non-asymptotic error bounds for sequential MCMC and stability of Feynman–Kac propagators. arXiv preprint arXiv:1204.2382 (2012)

  • Wang, J.: Sequential Monte Carlo methods for problems on finite state-spaces. Ph.D. thesis (in progress), National University of Singapore (2014)

  • Wang, J., Jasra, A., De Iorio, M.: Computational methods for a class of network models. J. Comput. Biol. 21, 141–161 (2014)


  • Whiteley, N.P., Andrieu, C., Doucet, A.: Efficient Bayesian inference for switching state-space models using discrete particle Markov chain Monte Carlo methods. arXiv preprint arXiv:1011.2437 (2010)


Acknowledgments

The second author was supported by an MOE Singapore Grant R-155-000-119-133 and is affiliated with the Risk Management Institute and Centre for Quantitative Finance at NUS. We thank two referees for very useful comments that have vastly improved the paper. We also thank the editor, Prof. Mark Girolami for his assistance on the article.


Corresponding author

Correspondence to Ajay Jasra.

Technical results for Section 2

We use the Feynman–Kac notation established in Sect. 2.5; the reader should be familiar with that section before proceeding. Recall, from Sect. 2.3, that for \(0\le p \le r-1\)

$$\begin{aligned} G_{p,N}(M) = \frac{\Phi _{p+1}^N(M)}{\Phi _{p}^N(M)} \end{aligned}$$

and recall that \(\Phi _0^N(M)\) is deterministic and known. In addition, for \((u,v)\in U\times V\)

$$\begin{aligned} w_p^N(u,v) = \frac{\delta + \eta _{p-1}^N\left( \mathbb {I}_{\mathcal {M}}\frac{\phi _{p+1}}{\phi _{p}}\right) }{\delta + \left[ \eta _{p-1}^N\left( \mathbb {I}_{\mathcal {N}(u,v)}\frac{\phi _{p+1}}{\phi _{p}}\right) \right] \frac{1}{w_p^N(u,v)}} \end{aligned}$$

where for \(\varphi \in \mathcal {B}_b(\mathsf {M})\), \(0\le p\le r\)

$$\begin{aligned} \eta _{p}^N(\varphi ) = \frac{1}{N}\sum _{i=1}^N\varphi (M_p^i) \end{aligned}$$

is the SMC approximation of \(\eta _p\) (recall that, in this analysis, one resamples at every time point). By a simple inductive argument, it follows that one can find a \(0<c(n)<\infty \) such that for any \(0\le p\le r\), \(N\ge 1\), \((u,v)\in U\times V\)

$$\begin{aligned} c(n) \le w_p^N(u,v) \le \frac{\delta +1}{\delta }. \end{aligned}$$

Using the above formulation, for any \(N\ge 1\)

$$\begin{aligned} \sup _{M\in \mathsf {M}}|G_{p,N}(M)| \le 1\vee \Big \{\frac{\delta + 1}{\delta c(n)}\Big \} \end{aligned}$$
(16)

which will be used later. Note that

$$\begin{aligned} \gamma _r^N(1) = \prod _{p=0}^{r-1} \eta _p^N(G_{p,N}) = \prod _{p=0}^{r-1} \bigg [\frac{1}{N} \sum _{i=1}^N G_{p,N}(M_p^i)\bigg ]. \end{aligned}$$
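In code, the estimator just displayed is the running product of average potential values over the resample–mutate sweeps. The following is a minimal generic sketch, not the paper's adaptive scheme: the names `sample_initial`, `potentials` and `kernels` are illustrative assumptions standing in for \(\eta _0\), \(G_{p,N}\) and \(K_{p,N}\).

```python
import numpy as np

def smc_normalizing_constant(sample_initial, potentials, kernels, N, seed=0):
    """Generic SMC estimate gamma_r^N(1) = prod_p (1/N) sum_i G_p(M_p^i).

    sample_initial(N): returns a list of N initial states drawn i.i.d. from eta_0.
    potentials: list of r callables; potentials[p](M) is the potential G_p(M) >= 0.
    kernels: list of r callables; kernels[p](M) applies one MCMC step leaving
        the next target invariant.
    """
    rng = np.random.default_rng(seed)
    particles = sample_initial(N)
    estimate = 1.0
    for potential, kernel in zip(potentials, kernels):
        weights = np.array([potential(M) for M in particles], dtype=float)
        estimate *= weights.mean()                   # accumulate eta_p^N(G_p)
        idx = rng.choice(N, size=N, p=weights / weights.sum())  # multinomial resampling
        particles = [kernel(particles[i]) for i in idx]          # mutation
    return estimate
```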

Given \(G_{p-1,N}\), define \(Q_{p,N}(M,M') = G_{p-1,N}(M) K_{p,N}(M,M')\), where \(K_{p,N}\) is the MCMC kernel in Jerrum et al. (2004) with invariant measure proportional to \(\Phi _p^N\), and let \(G_{p-1}\), \(Q_p\) denote the limiting versions (that is, on replacing \(\eta _p^N\) with \(\eta _p\) and so forth). Recall the definition of \(\gamma _t(1)\) in (5), which uses the limiting versions \(G_{p-1}\) and \(K_p\).

Proof of Theorem 2.1

We start with the following decomposition

$$\begin{aligned} \gamma _r^N(1) - \gamma _r(1)&= \prod _{p=0}^{r-1} \eta _p^N(G_{p,N})-\prod _{p=0}^{r-1} \eta _p^N(G_{p})\\&+ \prod _{p=0}^{r-1} \eta _p^N(G_{p})- \prod _{p=0}^{r-1} \eta _p(G_{p}) \end{aligned}$$

where one can show that \(\gamma _r(1)=\prod _{p=0}^{r-1} \eta _p(G_{p})\); see Del Moral (2004). By Theorem 6.1, the second term on the R.H.S. goes to zero. Hence we will focus on \(\prod _{p=0}^{r-1} \eta _p^N(G_{p,N}) - \prod _{p=0}^{r-1} \eta _p^N(G_{p})\).

We have the following collapsing sum representation

$$\begin{aligned}&\prod _{p=0}^{r-1} \eta _p^N(G_{p,N}) - \prod _{p=0}^{r-1}\eta _p^N(G_{p})\\&= \sum _{q=0}^{r-1}\left( \left[ \prod _{s=0}^{q-1}\eta _s^N(G_s)\right] \left[ \eta _q^N(G_{q,N}) - \eta _q^N(G_{q})\right] \right. \\&\quad \left. \left[ \prod _{s=q+1}^{r-1} \eta _s^N(G_{s,N})\right] \right) \end{aligned}$$

where we are using the convention \(\prod _{\emptyset } = 1\) (an explicit \(r=2\) instance of this telescoping is written out after the proof). We can consider each summand separately. By Theorem 6.1, \(\prod _{s=0}^{q-1}\eta _s^N(G_s)\) will converge in probability to a constant. By the proof of Theorem 6.1 (see (20)), \(\eta _q^N(G_{q,N}) - \eta _q^N(G_{q})\) converges to zero in probability and \(\prod _{s=q+1}^{r-1} \eta _s^N(G_{s,N})\) converges in probability to a constant; this completes the proof of the theorem.\(\square \)
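To make the telescoping explicit, write \(a_p = \eta _p^N(G_{p,N})\) and \(b_p = \eta _p^N(G_p)\); the case \(r=2\) of the collapsing sum reads

$$\begin{aligned} a_0 a_1 - b_0 b_1 = (a_0 - b_0)\,a_1 + b_0\,(a_1 - b_1) \end{aligned}$$

with the \(q=0\) and \(q=1\) summands appearing as the two terms on the R.H.S.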

Throughout, \(\mathbb {E}\) denotes expectation w.r.t. the probability law associated with the SMC algorithm.

Theorem 6.1

For any \(0\le p \le r-1\), \((\varphi _0,\ldots ,\varphi _{p})\in \mathcal {B}_b(\mathsf {M})^{p+1}\) and \(((u_{1},v_1),\ldots ,(u_{p+1},v_{p+1}))\in (U\times V)^{p+1}\), we have

$$\begin{aligned}&(\eta _0^N(\varphi _0),w_1^N(u_1,v_1),\ldots ,\eta _p^N(\varphi _p),w_{p+1}^N(u_{p+1},v_{p+1})) \rightarrow _{\mathbb {P}}\\&(\eta _0(\varphi _0),w_1^*(u_1,v_1),\ldots ,\eta _p(\varphi _p),w_{p+1}^*(u_{p+1},v_{p+1})). \end{aligned}$$

Proof

Our proof proceeds via strong induction. For \(p=0\), by the WLLN for i.i.d. random variables, \(\eta _0^N(\varphi _0)\rightarrow _{\mathbb {P}}\eta _0(\varphi _0)\). Then, by the continuous mapping theorem, it follows that for any fixed \((u_1,v_1)\), \(w_1^N(u_1,v_1)\rightarrow _{\mathbb {P}}w_1^*(u_1,v_1)\) and, indeed, that for any \(M_0\in \mathsf {M}\), \(G_{0,N}(M_0)\rightarrow _{\mathbb {P}}G_0(M_0)\), which will be used later on. Thus the initialization follows easily.

Now assume the result for \(p-1\) and consider the proof at rank \(p\). We have that

$$\begin{aligned}&\eta _p^N(\varphi _p) - \eta _p(\varphi _p) = \eta _p^N(\varphi _p)\nonumber \\&\quad -\mathbb {E}[\eta _p^N(\varphi _p)|\mathcal {F}_{p-1}] +\mathbb {E}[\eta _p^N(\varphi _p)|\mathcal {F}_{p-1}] - \eta _p(\varphi _p) \end{aligned}$$
(17)

where \(\mathcal {F}_{p-1}\) is the filtration generated by the particle system up to time \(p-1\). We focus on the second term on the R.H.S.; noting that \(\mathbb {E}[\eta _p^N(\varphi _p)|\mathcal {F}_{p-1}] = \eta _{p-1}^N(Q_{p,N}(\varphi _p))/\eta _{p-1}^N(G_{p-1,N})\) and \(\eta _p(\varphi _p) = \eta _{p-1}(Q_p(\varphi _p))/\eta _{p-1}(G_{p-1})\), it can be written as:

$$\begin{aligned}&\mathbb {E}[\eta _p^N(\varphi _p)|\mathcal {F}_{p-1}] - \eta _p(\varphi _p)\nonumber \\&\quad = \frac{\eta _{p-1}^N(Q_{p}(\varphi _p))}{\eta _{p-1}(G_{p-1})}-\frac{\eta _{p-1}(Q_{p}(\varphi _p))}{\eta _{p-1}(G_{p-1})}\nonumber \\&\qquad +\eta _{p-1}^N(Q_{p}(\varphi _p))\bigg [\frac{1}{\eta _{p-1}^N(G_{p-1,N})} -\frac{1}{\eta _{p-1}(G_{p-1})}\bigg ]\nonumber \\&\qquad +\frac{\eta _{p-1}^N[\{Q_{p,N}-Q_p\}(\varphi _p)]}{\eta _{p-1}^N(G_{p-1,N})}. \end{aligned}$$
(18)

By the induction hypothesis, as \(Q_p(\varphi _p)\in \mathcal {B}_b(\mathsf {M})\), the first term on the R.H.S. of (18) converges in probability to zero. We now consider the remaining two terms on the R.H.S. of (18) in turn, starting with the second.

Second Term on R.H.S. of (18). Consider

$$\begin{aligned}&\mathbb {E}[|\eta _{p-1}^N(G_{p-1,N}) - \eta _{p-1}(G_{p-1})|]\\&\quad = \mathbb {E}[|\eta _{p-1}^N(G_{p-1,N}\!-\!G_{p-1})+ \eta _{p-1}^N(G_{p-1})\\&\qquad -\eta _{p-1}(G_{p-1})|]\\&\quad \le \mathbb {E}[|\eta _{p-1}^N(G_{p-1,N}-G_{p-1})|]+\mathbb {E}[|\eta _{p-1}^N(G_{p-1})\\&\qquad - \eta _{p-1}(G_{p-1})|]. \end{aligned}$$

For the second term on the R.H.S. of the inequality: by the induction hypothesis, \(|\eta _{p-1}^N(G_{p-1})- \eta _{p-1}(G_{p-1})|\rightarrow _{\mathbb {P}} 0\) and, as \(G_{p-1}\) is a bounded function, \(\mathbb {E}[|\eta _{p-1}^N(G_{p-1})- \eta _{p-1}(G_{p-1})|]\) converges to zero. For the first term, we have

$$\begin{aligned} \mathbb {E}[|\eta _{p-1}^N(G_{p-1,N}-G_{p-1})|]&\le \mathbb {E}[|G_{p-1,N}(M_{p-1}^1)\\&\quad - G_{p-1}(M_{p-1}^1)|] \end{aligned}$$

where we have used the exchangeability of the particle system (the marginal law of any sample \(M_{p-1}^i\) is the same for each \(i\in [N]\)). Then, noting that the inductive hypothesis implies that for any fixed \(M_{p-1}\in \mathsf {M}\)

$$\begin{aligned} G_{p-1,N}(M_{p-1}) \rightarrow _{\mathbb {P}} G_{p-1}(M_{p-1}) \end{aligned}$$
(19)

by essentially the arguments above (note (16)), we have that \(\mathbb {E}[|\eta _{p-1}^N(G_{p-1,N}-G_{p-1})|]\rightarrow 0\). This establishes

$$\begin{aligned} \eta _{p-1}^N(G_{p-1,N})\rightarrow _{\mathbb {P}} \eta _{p-1}(G_{p-1}). \end{aligned}$$
(20)

Thus, using the induction hypothesis, as \(Q_p(\varphi _p)\in \mathcal {B}_b(\mathsf {M})\), \(\eta _{p-1}^N(Q_{p}(\varphi _p))\) converges in probability to a constant. This fact, combined with the above argument and the continuous mapping theorem, shows that the second term on the R.H.S. of (18) converges to zero in probability.

Third Term on R.H.S. of (18). We would like to show that \(\mathbb {E}[|\eta _{p-1}^N[\{Q_{p,N}-Q_p\}(\varphi _p)]|]\) goes to zero. By the exchangeability of the particle system,

$$\begin{aligned} \mathbb {E}[|\eta _{p-1}^N[\{Q_{p,N}-Q_p\}(\varphi _p)]|]&\le \mathbb {E}[|Q_{p,N}(\varphi _p)(M_{p-1}^1)\\&\quad -Q_{p}(\varphi _p)(M_{p-1}^1)|]. \end{aligned}$$

As the term inside the expectation on the R.H.S. of the inequality is bounded (note (16)), it suffices to prove that it converges to zero in probability. We have, for any fixed \(M\in \mathsf {M}\)

$$\begin{aligned}&Q_{p,N}(\varphi _p)(M) - Q_{p}(\varphi _p)(M) \\&\quad = [G_{p-1,N}(M) - G_{p-1}(M)] K_{p,N}(\varphi _p)(M)\\&\qquad + G_{p-1}(M)[K_{p,N}(\varphi _p)(M)-K_{p}(\varphi _p)(M)]. \end{aligned}$$

As \(K_{p,N}(\varphi _p)(M)\) is bounded, it follows via the induction hypothesis (note (19)) that \([G_{p-1,N}(M) - G_{p-1}(M)] K_{p,N}(\varphi _p)(M)\) converges to zero in probability. To deal with the second part, we consider only the ‘acceptance’ part of the M–H kernel; dealing with the ‘rejection’ part is very similar and omitted for brevity:

$$\begin{aligned} \sum _{M'\in \mathsf {M}} q_p(M,M')\varphi _p(M')\Bigg [1\wedge \Bigg (\frac{\Phi _p^N(M')}{\Phi _p^N(M)}\Bigg ) - 1\wedge \Bigg (\frac{\Phi _p(M')}{\Phi _p(M)}\Bigg )\Bigg ] \end{aligned}$$
(21)

where \(q_p(M,M')\) is the symmetric proposal probability. For any fixed \(M,M'\), \(1\wedge \Big (\frac{\Phi _p^N(M')}{\Phi _p^N(M)}\Big )\) is a continuous function of \(\eta _{p-1}^N(\cdot )\) and \(w_p^N\) (where they appear), so, by the induction hypothesis, it follows that for any \(M,M'\in \mathsf {M}\),

$$\begin{aligned} \Bigg [1\wedge \Bigg (\frac{\Phi _p^N(M')}{\Phi _p^N(M)}\Bigg ) - 1\wedge \Bigg (\frac{\Phi _p(M')}{\Phi _p(M)}\Bigg )\Bigg ] \rightarrow _{\mathbb {P}} 0 \end{aligned}$$

and hence so does (21) (recall \(\mathsf {M}\) is finite). By (20), \(\eta _{p-1}^N(G_{p-1,N})\) converges in probability to \(\eta _{p-1}(G_{p-1})\) and hence the third term on the R.H.S. of (18) converges to zero in probability.

Now, following the proof of Beskos et al. (2014, Theorem 4.1) and the above arguments, the first term on the R.H.S. of (17) converges to zero in probability. Thus we have shown that \(\eta _{p}^N(\varphi _p)-\eta _{p}(\varphi _p)\) converges to zero in probability. Then, by this latter result and the induction hypothesis, along with the continuous mapping theorem, it follows that for arbitrary \((u_{p+1},v_{p+1})\in U\times V\), \(w_{p+1}^N(u_{p+1},v_{p+1})\rightarrow _{\mathbb {P}} w_{p+1}^*(u_{p+1},v_{p+1})\) and, indeed, that \(G_{p,N}(M_p)\) converges in probability to \(G_{p}(M_p)\) for any fixed \(M_p\in \mathsf {M}\). From here one can conclude the proof with standard results in probability.\(\square \)
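For concreteness, the following is a minimal sketch of a Metropolis–Hastings step of the type discussed above, with a symmetric proposal and acceptance probability \(1\wedge (\Phi _p(M')/\Phi _p(M))\); the names `log_phi` and `propose` are illustrative assumptions, not the implementation of Jerrum et al. (2004).

```python
import math
import random

def mh_step(M, log_phi, propose):
    """One Metropolis-Hastings step on a finite space, targeting a
    distribution proportional to exp(log_phi(.)). The proposal
    M' ~ q_p(M, .) is assumed symmetric, so the acceptance probability
    reduces to 1 ^ (Phi_p(M') / Phi_p(M))."""
    M_prime = propose(M)                        # draw a candidate state
    log_ratio = log_phi(M_prime) - log_phi(M)   # log of Phi_p(M')/Phi_p(M)
    if log_ratio >= 0 or random.random() < math.exp(log_ratio):
        return M_prime                          # accept the move
    return M                                    # reject: remain at M
```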


Cite this article

Wang, J., Jasra, A. Monte Carlo algorithms for computing \(\alpha \)-permanents. Stat Comput 26, 231–248 (2016). https://doi.org/10.1007/s11222-014-9491-z

