
Monte Carlo algorithms for computing \(\alpha \)-permanents


Abstract

We consider the computation of the \(\alpha \)-permanent of a non-negative \(n \times n\) matrix. This quantity appears in a wide variety of real applications in statistics, physics and computer science. It is well known that exact computation is a #P-complete problem. This has resulted in a large collection of simulation-based methods that produce randomized solutions whose complexity is only polynomial in \(n\). This paper reviews and develops algorithms for the computation of both the permanent (\(\alpha =1\)) and the \(\alpha \)-permanent for general \(\alpha >0\). In the context of binary \(n \times n\) matrices, a variety of Markov chain Monte Carlo (MCMC) computational algorithms have been introduced in the literature whose cost, in order to achieve a given level of accuracy, is \(\mathcal {O}(n^7\log ^4(n))\); see Bezakova (Faster Markov chain Monte Carlo algorithms for the permanent and binary contingency tables. University of Chicago, Chicago, 2006), Jerrum et al. (J Assoc Comput Mach 51:671–697, 2004). These algorithms use a particular collection of probability distributions, the ‘ideal’ of which (in some sense) is not known and needs to be approximated. In this paper we propose an adaptive sequential Monte Carlo (SMC) algorithm that can estimate both the permanent and the ideal sequence of probabilities on the fly, with little user input. We provide theoretical results associated with the SMC estimate of the permanent, establishing its convergence. We also analyze the relative variance of the estimate associated with an ‘ideal’ algorithm (related to, but not the same as, the one we develop), in particular computing explicit bounds on the relative variance which depend upon \(n\). As this analysis is for an ideal algorithm, it gives a lower bound on the computational cost required to achieve an arbitrarily small relative variance; we find that this cost is \(\mathcal {O}(n^4\log ^4(n))\). For the \(\alpha \)-permanent, perhaps the gold-standard algorithm is the importance sampling algorithm of Kou and McCullagh (Biometrika 96:635–644, 2009); in this paper we develop new algorithms and compare them to this method. A priori, one expects, due to the weight-degeneracy problem, that the method of Kou and McCullagh (Biometrika 96:635–644, 2009) might perform very badly in comparison to the more advanced SMC methods we consider. We also present a statistical application of the \(\alpha \)-permanent to the estimation of boson point processes, together with MCMC methods to fit the associated model to data.
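For reference, the object computed throughout is the following (cf. Kou and McCullagh 2009): for an \(n \times n\) matrix \(A = (a_{ij})\) and \(\alpha > 0\),

$$\begin{aligned} \mathrm{per}_{\alpha }(A) = \sum _{\sigma \in S_n} \alpha ^{c(\sigma )} \prod _{i=1}^n a_{i\sigma (i)} \end{aligned}$$

where the sum runs over all permutations \(\sigma \) of \(\{1,\ldots ,n\}\) and \(c(\sigma )\) is the number of cycles of \(\sigma \); taking \(\alpha = 1\) recovers the standard permanent.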


References

  • Andrieu, C., Roberts, G.O.: The pseudo-marginal approach for efficient Monte Carlo computations. Ann. Stat. 37, 697–725 (2009)

  • Beskos, A., Jasra, A., Kantas, N., Thiery, A.: On the convergence of adaptive sequential Monte Carlo methods. arXiv preprint arXiv:1306.6462 (2014)

  • Bezakova, I.: Faster Markov chain Monte Carlo algorithms for the permanent and binary contingency tables. Ph.D. thesis, University of Chicago, Chicago (2006)

  • Bezakova, I., Stefankovic, D., Vazirani, V., Vigoda, E.: Accelerating simulated annealing for the permanent and combinatorial counting problems. SIAM J. Comput. 37, 1429–1454 (2008)


  • Bezakova, I., Sinclair, A., Stefankovic, D., Vigoda, E.: Negative examples for sequential importance sampling of binary contingency tables. In: Azar, Y., Erlebach, T. (eds.) Algorithms – ESA 2006, vol. 4168, pp. 136–147. Springer, Berlin (2006)


  • Bezakova, I., Bhatnagar, N., Vigoda, E.: Sampling binary contingency tables with a greedy start. Random Struct. Algorithms 30, 168–205 (2007)


  • Chen, Y., Diaconis, P., Holmes, S., Liu, J.S.: Sequential Monte Carlo methods for statistical analysis of tables. J. Am. Stat. Assoc. 100, 109–120 (2005)


  • Daley, D., Vere-Jones, D.: An Introduction to the Theory of Point Processes, 2nd edn. Springer, New York (2003)


  • Del Moral, P.: Feynman–Kac Formulae. Springer, New York (2004)


  • Del Moral, P., Doucet, A., Jasra, A.: Sequential Monte Carlo samplers. J. R. Stat. Soc. B 68, 411–436 (2006)

  • Diaconis, P., Stroock, D.: Geometric bounds for eigenvalues of Markov chains. Ann. Appl. Probab. 1, 36–61 (1991)


  • Doucet, A., Johansen, A.: A tutorial on particle filtering and smoothing: Fifteen years later. In: Crisan, D., Rozovsky, B. (eds.) Handbook of Nonlinear Filtering. Oxford University Press, Oxford (2011)


  • Fearnhead, P.: Sequential Monte Carlo methods in filter theory. Ph.D. thesis, University of Oxford, Oxford (1998)

  • Harrison, M., Miller, J.: Importance sampling for weighted binary random matrices. arXiv preprint arXiv:1301.3928 (2013)

  • Jasra, A., Stephens, D.A., Holmes, C.C.: On population-based simulation. Stat. Comput. 17, 263–279 (2007)


  • Jerrum, M., Sinclair, A., Vigoda, E.: A polynomial-time approximation algorithm for the permanent of a matrix with non-negative entries. J. Assoc. Comput. Mach. 51, 671–697 (2004)


  • Kasteleyn, P.W.: The statistics of dimers on a lattice I: the number of dimer arrangements on a quadratic lattice. Physica 27, 1664–1672 (1961)


  • Kou, S.C., McCullagh, P.: Approximating the \(\alpha \)-permanent. Biometrika 96, 635–644 (2009)


  • McCullagh, P., Møller, J.: The permanental process. Adv. Appl. Probab. 38, 873–888 (2006)


  • Schweizer, N.: Non-asymptotic error bounds for sequential MCMC and stability of Feynman–Kac propagators. arXiv preprint arXiv:1204.2382 (2012)

  • Wang, J.: Sequential Monte Carlo methods for problems on finite state-spaces. Ph.D. thesis (in progress), National University of Singapore (2014)

  • Wang, J., Jasra, A., De Iorio, M.: Computational methods for a class of network models. J. Comput. Biol. 21, 141–161 (2014)


  • Whiteley, N.P., Andrieu, C., Doucet, A.: Efficient Bayesian inference for switching state-space models using discrete particle Markov chain Monte Carlo methods. arXiv preprint arXiv:1011.2437 (2010)


Acknowledgments

The second author was supported by an MOE Singapore Grant R-155-000-119-133 and is affiliated with the Risk Management Institute and Centre for Quantitative Finance at NUS. We thank two referees for very useful comments that have vastly improved the paper. We also thank the editor, Prof. Mark Girolami for his assistance on the article.


Corresponding author

Correspondence to Ajay Jasra.

Technical results for Section 2

We use the Feynman–Kac notation established in Sect. 2.5; the reader should be familiar with that section before proceeding. Recall, from Sect. 2.3, that for \(0\le p \le r-1\)

$$\begin{aligned} G_{p,N}(M) = \frac{\Phi _{p+1}^N(M)}{\Phi _{p}^N(M)} \end{aligned}$$

and recall that \(\Phi _0^N(M)\) is deterministic and known. In addition, for \((u,v)\in U\times V\)

$$\begin{aligned} w_p^N(u,v) = \frac{\delta + \eta _{p-1}^N\left( \mathbb {I}_{\mathcal {M}}\frac{\phi _{p+1}}{\phi _{p}}\right) }{\delta + \left[ \eta _{p-1}^N\left( \mathbb {I}_{\mathcal {N}(u,v)}\frac{\phi _{p+1}}{\phi _{p}}\right) \right] \frac{1}{w_p^N(u,v)}} \end{aligned}$$

where for \(\varphi \in \mathcal {B}_b(\mathsf {M})\), \(0\le p\le r\)

$$\begin{aligned} \eta _{p}^N(\varphi ) = \frac{1}{N}\sum _{i=1}^N\varphi (M_p^i) \end{aligned}$$

is the SMC approximation of \(\eta _p\) (recall that, in this analysis, one resamples at every time point). By a simple inductive argument, it follows that one can find a \(0<c(n)<\infty \) such that for any \(0\le p\le r\), \(N\ge 1\), \((u,v)\in U\times V\)

$$\begin{aligned} c(n) \le w_p^N(u,v) \le \frac{\delta +1}{\delta }. \end{aligned}$$

Using the above formulation, for any \(N\ge 1\)

$$\begin{aligned} \sup _{M\in \mathsf {M}}|G_{p,N}(M)| \le 1\vee \Big \{\frac{\delta + 1}{\delta c(n)}\Big \} \end{aligned}$$
(16)

which will be used later. Note that

$$\begin{aligned} \gamma _r^N(1) = \prod _{p=0}^{r-1} \eta _p^N(G_{p,N}) = \prod _{p=0}^{r-1} \bigg [\frac{1}{N} \sum _{i=1}^N G_{p,N}(M_p^i)\bigg ]. \end{aligned}$$
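In code, the estimator just displayed is the running product of average potential values over the resample–mutate sweeps. The following is a minimal generic sketch, not the paper's adaptive scheme: the names `sample_initial`, `potentials` and `kernels` are illustrative assumptions standing in for \(\eta _0\), \(G_{p,N}\) and \(K_{p,N}\).

```python
import numpy as np

def smc_normalizing_constant(sample_initial, potentials, kernels, N, seed=0):
    """Generic SMC estimate gamma_r^N(1) = prod_p (1/N) sum_i G_p(M_p^i).

    sample_initial(N): returns a list of N initial states drawn i.i.d. from eta_0.
    potentials: list of r callables; potentials[p](M) is the potential G_p(M) >= 0.
    kernels: list of r callables; kernels[p](M) applies one MCMC step leaving
        the next target invariant.
    """
    rng = np.random.default_rng(seed)
    particles = sample_initial(N)
    estimate = 1.0
    for potential, kernel in zip(potentials, kernels):
        weights = np.array([potential(M) for M in particles], dtype=float)
        estimate *= weights.mean()                   # accumulate eta_p^N(G_p)
        idx = rng.choice(N, size=N, p=weights / weights.sum())  # multinomial resampling
        particles = [kernel(particles[i]) for i in idx]          # mutation
    return estimate
```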

Given \(G_{p-1,N}\), define \(Q_{p,N}(M,M') = G_{p-1,N}(M) K_{p,N}(M,M')\), where \(K_{p,N}\) is the MCMC kernel in Jerrum et al. (2004) with invariant measure proportional to \(\Phi _p^N\), and let \(G_{p-1}\), \(Q_p\) denote the limiting versions (that is, on replacing \(\eta _p^N\) with \(\eta _p\) and so forth). Recall the definition of \(\gamma _t(1)\) in (5), which uses the limiting versions \(G_{p-1}\) and \(K_p\).

Proof of Theorem 2.1

We start with the following decomposition

$$\begin{aligned} \gamma _r^N(1) - \gamma _r(1)&= \prod _{p=0}^{r-1} \eta _p^N(G_{p,N})-\prod _{p=0}^{r-1} \eta _p^N(G_{p})\\&+ \prod _{p=0}^{r-1} \eta _p^N(G_{p})- \prod _{p=0}^{r-1} \eta _p(G_{p}) \end{aligned}$$

where one can show that \(\gamma _r(1)=\prod _{p=0}^{r-1} \eta _p(G_{p})\); see Del Moral (2004). By Theorem 6.1, the second term on the R.H.S. goes to zero. Hence we will focus on \(\prod _{p=0}^{r-1} \eta _p^N(G_{p,N}) - \prod _{p=0}^{r-1} \eta _p^N(G_{p})\).

We have the following collapsing sum representation

$$\begin{aligned}&\prod _{p=0}^{r-1} \eta _p^N(G_{p,N}) - \prod _{p=0}^{r-1}\eta _p^N(G_{p})\\&= \sum _{q=0}^{r-1}\left( \left[ \prod _{s=0}^{q-1}\eta _s^N(G_s)\right] \left[ \eta _q^N(G_{q,N}) - \eta _q^N(G_{q})\right] \right. \\&\quad \left. \left[ \prod _{s=q+1}^{r-1} \eta _s^N(G_{s,N})\right] \right) \end{aligned}$$

where we are using the convention \(\prod _{\emptyset } = 1\) (an explicit \(r=2\) instance of this telescoping is written out after the proof). We can consider each summand separately. By Theorem 6.1, \(\prod _{s=0}^{q-1}\eta _s^N(G_s)\) will converge in probability to a constant. By the proof of Theorem 6.1 (see (20)), \(\eta _q^N(G_{q,N}) - \eta _q^N(G_{q})\) converges to zero in probability and \(\prod _{s=q+1}^{r-1} \eta _s^N(G_{s,N})\) converges in probability to a constant; this completes the proof of the theorem.\(\square \)
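To make the telescoping explicit, write \(a_p = \eta _p^N(G_{p,N})\) and \(b_p = \eta _p^N(G_p)\); the case \(r=2\) of the collapsing sum reads

$$\begin{aligned} a_0 a_1 - b_0 b_1 = (a_0 - b_0)\,a_1 + b_0\,(a_1 - b_1) \end{aligned}$$

with the \(q=0\) and \(q=1\) summands appearing as the two terms on the R.H.S.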

Throughout, \(\mathbb {E}\) denotes expectation w.r.t. the probability law associated with the SMC algorithm.

Theorem 6.1

For any \(0\le p \le r-1\), \((\varphi _0,\ldots ,\varphi _{p})\in \mathcal {B}_b(\mathsf {M})^{p+1}\) and \(((u_{1},v_1),\ldots ,(u_{p+1},v_{p+1}))\in (U\times V)^{p+1}\), we have

$$\begin{aligned}&(\eta _0^N(\varphi _0),w_1^N(u_1,v_1),\ldots ,\eta _p^N(\varphi _p),w_{p+1}^N(u_{p+1},v_{p+1})) \rightarrow _{\mathbb {P}}\\&(\eta _0(\varphi _0),w_1^*(u_1,v_1),\ldots ,\eta _p(\varphi _p),w_{p+1}^*(u_{p+1},v_{p+1})). \end{aligned}$$

Proof

Our proof proceeds via strong induction. For \(p=0\), by the WLLN for i.i.d. random variables, \(\eta _0^N(\varphi _0)\rightarrow _{\mathbb {P}}\eta _0(\varphi _0)\). Then, by the continuous mapping theorem, it follows that for any fixed \((u_1,v_1)\), \(w_1^N(u_1,v_1)\rightarrow _{\mathbb {P}}w_1^*(u_1,v_1)\) and, indeed, that for any \(M_0\in \mathsf {M}\), \(G_{0,N}(M_0)\rightarrow _{\mathbb {P}}G_0(M_0)\), which will be used later on. Thus the initialization follows easily.

Now assume the result for \(p-1\) and consider the proof at rank \(p\). We have that

$$\begin{aligned}&\eta _p^N(\varphi _p) - \eta _p(\varphi _p) = \eta _p^N(\varphi _p)\nonumber \\&\quad -\mathbb {E}[\eta _p^N(\varphi _p)|\mathcal {F}_{p-1}] +\mathbb {E}[\eta _p^N(\varphi _p)|\mathcal {F}_{p-1}] - \eta _p(\varphi _p) \end{aligned}$$
(17)

where \(\mathcal {F}_{p-1}\) is the filtration generated by the particle system up to time \(p-1\). We focus on the second term on the R.H.S.; noting that \(\mathbb {E}[\eta _p^N(\varphi _p)|\mathcal {F}_{p-1}] = \eta _{p-1}^N(Q_{p,N}(\varphi _p))/\eta _{p-1}^N(G_{p-1,N})\) and \(\eta _p(\varphi _p) = \eta _{p-1}(Q_p(\varphi _p))/\eta _{p-1}(G_{p-1})\), it can be written as:

$$\begin{aligned}&\mathbb {E}[\eta _p^N(\varphi _p)|\mathcal {F}_{p-1}] - \eta _p(\varphi _p)\nonumber \\&\quad = \frac{\eta _{p-1}^N(Q_{p}(\varphi _p))}{\eta _{p-1}(G_{p-1})}-\frac{\eta _{p-1}(Q_{p}(\varphi _p))}{\eta _{p-1}(G_{p-1})}\nonumber \\&\qquad +\eta _{p-1}^N(Q_{p}(\varphi _p))\bigg [\frac{1}{\eta _{p-1}^N(G_{p-1,N})} -\frac{1}{\eta _{p-1}(G_{p-1})}\bigg ]\nonumber \\&\qquad +\frac{\eta _{p-1}^N[\{Q_{p,N}-Q_p\}(\varphi _p)]}{\eta _{p-1}^N(G_{p-1,N})}. \end{aligned}$$
(18)

By the induction hypothesis, as \(Q_p(\varphi _p)\in \mathcal {B}_b(\mathsf {M})\), the first term on the R.H.S. of (18) converges in probability to zero. We now consider the remaining two terms on the R.H.S. of (18) in turn, starting with the second.

Second Term on R.H.S. of (18). Consider

$$\begin{aligned}&\mathbb {E}[|\eta _{p-1}^N(G_{p-1,N}) - \eta _{p-1}(G_{p-1})|]\\&\quad = \mathbb {E}[|\eta _{p-1}^N(G_{p-1,N}\!-\!G_{p-1})+ \eta _{p-1}^N(G_{p-1})\\&\qquad -\eta _{p-1}(G_{p-1})|]\\&\quad \le \mathbb {E}[|\eta _{p-1}^N(G_{p-1,N}-G_{p-1})|]+\mathbb {E}[|\eta _{p-1}^N(G_{p-1})\\&\qquad - \eta _{p-1}(G_{p-1})|]. \end{aligned}$$

For the second term on the R.H.S. of the inequality: by the induction hypothesis, \(|\eta _{p-1}^N(G_{p-1})- \eta _{p-1}(G_{p-1})|\rightarrow _{\mathbb {P}} 0\) and, as \(G_{p-1}\) is a bounded function, \(\mathbb {E}[|\eta _{p-1}^N(G_{p-1})- \eta _{p-1}(G_{p-1})|]\) converges to zero. For the first term, we have

$$\begin{aligned} \mathbb {E}[|\eta _{p-1}^N(G_{p-1,N}-G_{p-1})|]&\le \mathbb {E}[|G_{p-1,N}(M_{p-1}^1)\\&\quad - G_{p-1}(M_{p-1}^1)|] \end{aligned}$$

where we have used the exchangeability of the particle system (the marginal law of any sample \(M_{p-1}^i\) is the same for each \(i\in [N]\)). Then, noting that the inductive hypothesis implies that for any fixed \(M_{p-1}\in \mathsf {M}\)

$$\begin{aligned} G_{p-1,N}(M_{p-1}) \rightarrow _{\mathbb {P}} G_{p-1}(M_{p-1}) \end{aligned}$$
(19)

by essentially the arguments above (note (16)), we have that \(\mathbb {E}[|\eta _{p-1}^N(G_{p-1,N}-G_{p-1})|]\rightarrow 0\). This establishes

$$\begin{aligned} \eta _{p-1}^N(G_{p-1,N})\rightarrow _{\mathbb {P}} \eta _{p-1}(G_{p-1}). \end{aligned}$$
(20)

Thus, using the induction hypothesis, as \(Q_p(\varphi _p)\in \mathcal {B}_b(\mathsf {M})\), \(\eta _{p-1}^N(Q_{p}(\varphi _p))\) converges in probability to a constant. This fact, combined with the above argument and the continuous mapping theorem, shows that the second term on the R.H.S. of (18) converges to zero in probability.

Third Term on R.H.S. of (18). We would like to show that \(\mathbb {E}[|\eta _{p-1}^N[\{Q_{p,N}-Q_p\}(\varphi _p)]|]\) goes to zero. By the exchangeability of the particle system,

$$\begin{aligned} \mathbb {E}[|\eta _{p-1}^N[\{Q_{p,N}-Q_p\}(\varphi _p)]|]&\le \mathbb {E}[|Q_{p,N}(\varphi _p)(M_{p-1}^1)\\&\quad -Q_{p}(\varphi _p)(M_{p-1}^1)|]. \end{aligned}$$

As the term inside the expectation on the R.H.S. of the inequality is bounded (note (16)), it suffices to prove that it converges to zero in probability. We have, for any fixed \(M\in \mathsf {M}\)

$$\begin{aligned}&Q_{p,N}(\varphi _p)(M) - Q_{p}(\varphi _p)(M) \\&\quad = [G_{p-1,N}(M) - G_{p-1}(M)] K_{p,N}(\varphi _p)(M)\\&\qquad + G_{p-1}(M)[K_{p,N}(\varphi _p)(M)-K_{p}(\varphi _p)(M)]. \end{aligned}$$

As \(K_{p,N}(\varphi _p)(M)\) is bounded, it follows via the induction hypothesis (note (19)) that \([G_{p-1,N}(M) - G_{p-1}(M)] K_{p,N}(\varphi _p)(M)\) converges to zero in probability. To deal with the second part, we consider only the ‘acceptance’ part of the M–H kernel; dealing with the ‘rejection’ part is very similar and omitted for brevity:

$$\begin{aligned} \sum _{M'\in \mathsf {M}} q_p(M,M')\varphi _p(M')\Bigg [1\wedge \Bigg (\frac{\Phi _p^N(M')}{\Phi _p^N(M)}\Bigg ) - 1\wedge \Bigg (\frac{\Phi _p(M')}{\Phi _p(M)}\Bigg )\Bigg ] \end{aligned}$$
(21)

where \(q_p(M,M')\) is the symmetric proposal probability. For any fixed \(M,M'\), \(1\wedge \Big (\frac{\Phi _p^N(M')}{\Phi _p^N(M)}\Big )\) is a continuous function of \(\eta _{p-1}^N(\cdot )\) and \(w_p^N\) (where they appear), so, by the induction hypothesis, it follows that for any \(M,M'\in \mathsf {M}\),

$$\begin{aligned} \Bigg [1\wedge \Bigg (\frac{\Phi _p^N(M')}{\Phi _p^N(M)}\Bigg ) - 1\wedge \Bigg (\frac{\Phi _p(M')}{\Phi _p(M)}\Bigg )\Bigg ] \rightarrow _{\mathbb {P}} 0 \end{aligned}$$

and hence so does (21) (recall \(\mathsf {M}\) is finite). By (20), \(\eta _{p-1}^N(G_{p-1,N})\) converges in probability to \(\eta _{p-1}(G_{p-1})\) and hence the third term on the R.H.S. of (18) converges to zero in probability.

Now, following the proof of Beskos et al. (2014, Theorem 4.1) and the above arguments, the first term on the R.H.S. of (17) converges to zero in probability. Thus we have shown that \(\eta _{p}^N(\varphi _p)-\eta _{p}(\varphi _p)\) converges to zero in probability. Then, by this latter result and the induction hypothesis, along with the continuous mapping theorem, it follows that for arbitrary \((u_{p+1},v_{p+1})\in U\times V\), \(w_{p+1}^N(u_{p+1},v_{p+1})\rightarrow _{\mathbb {P}} w_{p+1}^*(u_{p+1},v_{p+1})\) and, indeed, that \(G_{p,N}(M_p)\) converges in probability to \(G_{p}(M_p)\) for any fixed \(M_p\in \mathsf {M}\). From here one can conclude the proof with standard results in probability.\(\square \)
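For concreteness, the following is a minimal sketch of a Metropolis–Hastings step of the type discussed above, with a symmetric proposal and acceptance probability \(1\wedge (\Phi _p(M')/\Phi _p(M))\); the names `log_phi` and `propose` are illustrative assumptions, not the implementation of Jerrum et al. (2004).

```python
import math
import random

def mh_step(M, log_phi, propose):
    """One Metropolis-Hastings step on a finite space, targeting a
    distribution proportional to exp(log_phi(.)). The proposal
    M' ~ q_p(M, .) is assumed symmetric, so the acceptance probability
    reduces to 1 ^ (Phi_p(M') / Phi_p(M))."""
    M_prime = propose(M)                        # draw a candidate state
    log_ratio = log_phi(M_prime) - log_phi(M)   # log of Phi_p(M')/Phi_p(M)
    if log_ratio >= 0 or random.random() < math.exp(log_ratio):
        return M_prime                          # accept the move
    return M                                    # reject: remain at M
```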


Cite this article

Wang, J., Jasra, A. Monte Carlo algorithms for computing \(\alpha \)-permanents. Stat Comput 26, 231–248 (2016). https://doi.org/10.1007/s11222-014-9491-z

