Abstract
In this paper we consider computations with the multivariate Student density, truncated to a set described by a linear system of inequalities. Our goal is both to simulate from this truncated density and to estimate its normalizing constant. To this end we consider an exponentially tilted sequential importance sampling (IS) density. We prove that the corresponding IS estimator of the normalizing constant, a rare-event probability, has bounded relative error under certain conditions. Along the way, we establish a multivariate extension of Mills' ratio for the Student distribution. We present applications of the proposed sampling and estimation algorithms in Bayesian inference. In particular, we construct efficient rejection samplers for the posterior densities of the Bayesian Constrained Linear Regression model, the Bayesian Tobit model, and the Bayesian smoothing spline for non-negative functions. Typically, sampling from such posterior densities is viable only via approximate Markov chain Monte Carlo (MCMC). Finally, we propose a novel Reject-Regenerate sampler, a hybrid between rejection sampling and MCMC. The Reject-Regenerate sampler creates a Markov chain whose states are, with a certain probability, flagged as commencing a new regenerative (renewal) cycle. Whenever a state initiates a new regenerative cycle, a further biased coin flip decides whether or not the state is an exact draw from the target. We show that the proposed MCMC algorithm is strongly efficient in a rare-event regime and provide a numerical example.
References
Botev, Z.I.: The normal law under linear restrictions: simulation and estimation via minimax tilting. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 79(1), 125–148 (2017)
Botev, Z.I., L’Ecuyer, P.: Efficient probability estimation and simulation of the truncated multivariate student-t distribution. In: 2015 Winter Simulation Conference (WSC), pp. 380–391. IEEE (2015)
Botev, Z., L’Ecuyer, P.: Simulation from the normal distribution truncated to an interval in the tail. In: Proceedings of the 10th EAI International Conference on Performance Evaluation Methodologies and Tools, pp. 23–29 (2017)
Botev, Z.I., Mackinlay, D., Chen, Y.L.: Logarithmically efficient estimation of the tail of the multivariate normal distribution. In: 2017 Winter Simulation Conference (WSC), pp. 1903–1913. IEEE (2017)
Botev, Z.I., Chen, Y.L., L’Ecuyer, P., MacNamara, S., Kroese, D.P.: Exact posterior simulation from the linear lasso regression. In: 2018 Winter Simulation Conference (WSC), pp. 1706–1717. IEEE (2018)
Chen, M.H., Deely, J.J.: Bayesian analysis for a constrained linear multiple regression problem for predicting the new crop of apples. J. Agric. Biol. Environ. Stat. 1(4), 467–489 (1996)
Chen, M.H., Ibrahim, J.G., Shao, Q.M.: Monte Carlo Methods in Bayesian Computation. Springer (2000)
Chib, S.: Bayes inference in the Tobit censored regression model. J. Econom. 51(1–2), 79–99 (1992)
Gelfand, A.E., Smith, A.F., Lee, T.M.: Bayesian analysis of constrained parameter and truncated data problems using Gibbs sampling. J. Am. Stat. Assoc. 87(418), 523–532 (1992)
Genz, A., Bretz, F.: Numerical computation of multivariate t-probabilities with application to power calculation of multiple contrasts. J. Stat. Comput. Simul. 63(4), 103–117 (1999)
Hashorva, E., Hüsler, J.: On multivariate Gaussian tails. Ann. Inst. Stat. Math. 55(3), 507–522 (2003)
Kroese, D.P., Botev, Z.I., Taimre, T., Vaisman, R.: Data Science and Machine Learning: Mathematical and Statistical Methods. Chapman and Hall/CRC (2019)
Kroese, D.P., Taimre, T., Botev, Z.I.: Handbook of Monte Carlo Methods. Wiley (2011)
L’Ecuyer, P., Blanchet, J.H., Tuffin, B., Glynn, P.W.: Asymptotic robustness of estimators in rare-event simulation. ACM Trans. Model. Comput. Simul. (TOMACS) 20(1), 1–41 (2010)
Mengersen, K.L., Tweedie, R.L.: Rates of convergence of the Hastings and Metropolis algorithms. Ann. Stat. 24(1), 101–121 (1996)
Mills, J.P.: Table of the ratio: area to bounding ordinate, for any portion of normal curve. Biometrika, pp. 395–400 (1926)
Mroz, T.A.: The sensitivity of an empirical model of married women’s hours of work to economic and statistical assumptions. Econom. J. Econom. Soc. 55(4), 765–799 (1987)
Nummelin, E.: A splitting technique for Harris recurrent Markov chains. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 43(4), 309–318 (1978)
Pakman, A., Paninski, L.: Exact Hamiltonian Monte Carlo for truncated multivariate Gaussians. J. Comput. Graph. Stat. 23(2), 518–542 (2014)
Soms, A.P.: An asymptotic expansion for the tail area of the t-distribution. J. Am. Stat. Assoc. 71(355), 728–730 (1976)
Soms, A.P.: Rational bounds for the t-tail area. J. Am. Stat. Assoc. 75(370), 438–440 (1980)
Appendix
1.1 Proof of Theorem 2
Proof
First, we use the normal scale-mixture representation of \(\boldsymbol{Y}\sim \textsf{t}_\nu (\textbf{0},\Sigma )\) as \(\boldsymbol{Y}=\sqrt{\nu }\boldsymbol{Z}/R\), where \( \boldsymbol{Z}\sim \mathcal {N}\left( \textbf{0},\Sigma \right) \) is independent of \( R\sim c_\nu (r)=\frac{\exp \left( -\frac{r^2}{2}+(\nu -1)\ln r\right) }{2^{\nu /2-1}\Gamma (\nu /2)}, \quad r>0. \) We can thus write \(\ell \) as a conditional expectation: \( {\ell }(\gamma )={\mathbb {P}}\left[ \frac{\sqrt{\nu }\boldsymbol{Z}}{R}\ge \boldsymbol{l}(\gamma )\right] = \mathbb {E}\left[ \mathbb {P}\left[ \frac{\sqrt{\nu }\boldsymbol{Z}}{R}\ge \boldsymbol{l}(\gamma )\,\Big |\,R\right] \right] .\) Next, condition on \(R=r\), and let \(\boldsymbol{\mu }=r\boldsymbol{x}^*/\sqrt{\nu }\), where \(\boldsymbol{x}^*\) is the solution of the QPP. Denoting \(\boldsymbol{t}=[\boldsymbol{t}_1^\top ,\boldsymbol{t}_2^\top ]^\top =: r\boldsymbol{l}/\sqrt{\nu }\), and making a change of variable \(\boldsymbol{z}\leftarrow \boldsymbol{z}-\boldsymbol{\mu }\), we obtain \(\mathbb {P}\left[ \frac{\sqrt{\nu }\boldsymbol{Z}}{R}\ge \boldsymbol{l}(\gamma )\,\Big |\,R=r\right] =\mathbb {P}[\boldsymbol{Z}\ge \boldsymbol{t}]=\)
In other words, we have:
Let \(\mathfrak {D} \equiv \{\boldsymbol{z} :\boldsymbol{z}_1\ge {\textbf{0}},\boldsymbol{z}_2\ge \frac{r(\boldsymbol{l}_2-\Sigma _{21}\Sigma _{11}^{-1}\boldsymbol{l}_1)}{\sqrt{\nu }}\}\). We can now rewrite (10) as an integral and integrate over r. This gives \({\ell }(\gamma )=\):
where the third line follows from the change of variable \( u=r\sqrt{1+\frac{\boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{l}_1}{\nu }}\;.\) Next, using formula (10) we rewrite the last expression as:
We now seek to apply the dominated convergence theorem to the expectation in the last displayed equation. For this we need the upper bound (recall that \(\Sigma _{11}^{-1}\boldsymbol{l}_1\ge {\textbf{0}}\))
The last expression is integrable in the sense that \(\int _0^\infty c_\nu (r)\exp (r^2/2)\overline{\Phi }\left( r\right) \textrm{d} r=\)
In addition, as \(\gamma \uparrow \infty \), by Lemma 1 we have the pointwise limits:
Therefore, by the dominated convergence theorem
This concludes the proof. \(\square \)
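As a numerical illustration (not part of the proof), the scale-mixture representation \(\boldsymbol{Y}=\sqrt{\nu }\boldsymbol{Z}/R\) used at the start of the proof is easy to verify by simulation. The following Python sketch uses the fact that \(R\sim c_\nu \) means \(R^2\sim \chi ^2_\nu \), with the arbitrary example choices \(\nu =5\) and \(\Sigma =1\) in one dimension:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
nu, n = 5, 200_000  # example degrees of freedom and sample size

# R ~ c_nu is the chi distribution with nu degrees of freedom, i.e. R^2 ~ chi^2_nu
Z = rng.standard_normal(n)
R = np.sqrt(rng.chisquare(df=nu, size=n))
Y = np.sqrt(nu) * Z / R  # scale-mixture representation of the Student-t

# The empirical CDF of Y should agree with the t_nu CDF
x = 1.3
ecdf = (Y <= x).mean()
assert abs(ecdf - stats.t.cdf(x, df=nu)) < 5e-3
```

The same construction, with \(\boldsymbol{Z}\sim \mathcal {N}(\textbf{0},\Sigma )\) drawn from a multivariate normal, yields draws from \(\textsf{t}_\nu (\textbf{0},\Sigma )\).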
Lemma 1
(Continuity of Gaussian tail) Suppose that \(\boldsymbol{Z}\sim \mathcal {N}({\textbf{0}},\Sigma )\) for some positive definite matrix \(\Sigma \), and \(\boldsymbol{a}_n\rightarrow \boldsymbol{a}\) as \(n\uparrow \infty \). Then, the tail of the multivariate Gaussian is continuous: \( \lim _{n\uparrow \infty }\mathbb {P}[\boldsymbol{Z}\ge \boldsymbol{a}_n]=\mathbb {P}[\boldsymbol{Z}\ge \boldsymbol{a}]. \)
Proof
The proof is another application of the dominated convergence theorem, this time to show that \( \int _{[{\textbf{0}},\boldsymbol{\infty })}\phi _{\Sigma }(\boldsymbol{z}+\boldsymbol{a}_n)\textrm{d} \boldsymbol{z}\rightarrow \int _{[{\textbf{0}},\boldsymbol{\infty })}\phi _{\Sigma }(\boldsymbol{z}+\boldsymbol{a})\textrm{d} \boldsymbol{z}=\mathbb {P}[\boldsymbol{Z}\ge \boldsymbol{a}]. \) Since \(\Sigma \) is a positive definite matrix, \(\Vert \boldsymbol{x}\Vert ^2_{\Sigma }:=\boldsymbol{x}^\top \Sigma ^{-1}\boldsymbol{x}\) defines a norm satisfying \(\Vert \boldsymbol{z}\Vert _{\Sigma }^2\le 2(\Vert \boldsymbol{z}+\boldsymbol{a}_n\Vert _{\Sigma }^2+\Vert \boldsymbol{a}_n\Vert _{\Sigma }^2)\), so that \(\Vert \boldsymbol{z}+\boldsymbol{a}_n\Vert _{\Sigma }^2\ge \frac{1}{2}\Vert \boldsymbol{z}\Vert _{\Sigma }^2-\Vert \boldsymbol{a}_n\Vert _{\Sigma }^2\). Therefore, \( \phi _{\Sigma }(\boldsymbol{z}+\boldsymbol{a}_n)\le \exp \left( \tfrac{1}{2}\sup _n\Vert \boldsymbol{a}_n\Vert ^2_{\Sigma }\right) \exp \left( -\tfrac{1}{4}\Vert \boldsymbol{z}\Vert ^2_{\Sigma }\right) \big /\sqrt{|2\pi \Sigma |}, \) where the supremum is finite because \(\boldsymbol{a}_n\rightarrow \boldsymbol{a}\). The right-hand side is proportional to \(\phi _{2\Sigma }(\boldsymbol{z})\) and hence integrable over \([{\textbf{0}},\boldsymbol{\infty })\), so the conditions for the dominated convergence theorem are met. \(\square \)
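A quick numerical illustration of the lemma (not part of the proof), with an arbitrarily chosen covariance \(\Sigma \) and the example sequence \(\boldsymbol{a}_n=\boldsymbol{a}+\textbf{1}/n\); the upper tail is computed via the symmetry \(\mathbb {P}[\boldsymbol{Z}\ge \boldsymbol{b}]=\mathbb {P}[\boldsymbol{Z}\le -\boldsymbol{b}]\) of the centered Gaussian:

```python
import numpy as np
from scipy.stats import multivariate_normal

Sigma = np.array([[1.0, 0.5], [0.5, 2.0]])  # example covariance matrix
a = np.array([0.3, -0.7])                   # example limit point
mvn = multivariate_normal(mean=np.zeros(2), cov=Sigma)

def upper_tail(b):
    # P[Z >= b] = P[Z <= -b] for a centered Gaussian
    return mvn.cdf(-b)

# a_n = a + 1/n converges to a, and the tail probabilities converge as well
p_lim = upper_tail(a)
p_n = [upper_tail(a + 1.0 / n) for n in (10, 100, 1000)]
assert abs(p_n[-1] - p_lim) < 1e-3
```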
1.2 Proof of Theorem 3
Proof
First, note that the second moment is \(\int g(\boldsymbol{z},r;\boldsymbol{\mu }^*,\eta ^*) \exp (2\psi (\boldsymbol{z},r;\boldsymbol{\mu }^*,\eta ^*))\) \(\textrm{d}\boldsymbol{z}\textrm{d} r =\)
Since the properties of \(\psi \) imply that
bounded relative error will follow if we can show that \( \frac{(r^*)^{\nu -1}\Phi (\eta ^*)\exp (\frac{(\eta ^*)^2}{2}-r^*\eta ^*)}{\ell (\gamma )} \) remains bounded in \(\gamma \). The pair \((r^*,\eta ^*)\) is determined by solving (3), namely by finding the saddle point of \(\max _{r,\boldsymbol{z}}\min _{\eta ,\boldsymbol{\mu }}\psi (\boldsymbol{z},r;\boldsymbol{\mu },\eta )\). This can be obtained by setting the gradient of \(\psi \) with respect to the vector \((\boldsymbol{z},r,\boldsymbol{\mu },\eta )\) to zero: \(\nabla \psi ={\textbf{0}}\). We now introduce notation that allows us to express \(\nabla \psi ={\textbf{0}}\) explicitly. Let \( L\) be the lower triangular Cholesky factor of \(\Sigma = L L^\top \). Define \(D=\textrm{diag}( L)\), \(\breve{ L}= D^{-1} L\), \( \tilde{\boldsymbol{l}}=\frac{r}{\sqrt{\nu }} D^{-1} \boldsymbol{l}(\gamma )-(\breve{ L}- I)\boldsymbol{z}, \) and the vector \(\boldsymbol{\Psi }\) with elements \( \Psi _k=\phi (\tilde{l}_k-\mu _k)/\overline{\Phi }(\tilde{l}_k-\mu _k)\). Then, \(\nabla \psi ={\textbf{0}}\) can be written as
Next, we verify via substitution that the solution of (11) as \(\gamma \uparrow \infty \) satisfies \( r^*=\mathcal {O}(\gamma ^{-1})\), \(\boldsymbol{z}^*=\mathcal {O}(\textbf{1})\), \(\eta ^*=\mathcal {O}(-\gamma )\), \(\boldsymbol{\mu }^*=\mathcal {O}(\textbf{1}).\) First, equations one and three in (11) are trivially satisfied and we can deduce that \(\boldsymbol{\Psi }=\mathcal {O}(\textbf{1})\). Second, since \(\tilde{\boldsymbol{l}}=\mathcal {O}(r\boldsymbol{l}(\gamma ))=\mathcal {O}(\textbf{1})\), it follows that equation two in (11) is equivalent to
Finally, note that the Mills' ratio expansion \( \frac{\Phi (\eta )}{\phi (\eta )}\simeq -\frac{1}{\eta }+\frac{1}{\eta ^3}\) as \(\eta \downarrow -\infty \) implies that equation four is asymptotically equivalent to \(r\eta ^2+\eta -r\simeq 0 \). The solution of this quadratic equation in turn implies that \( \eta \simeq (-1-\sqrt{1+4r^2})/(2r) \simeq -1/r\). In other words, \(\eta ^* r^*=\mathcal {O}(1)\), as desired. Therefore, if \(\tilde{\psi }\) denotes the value of \(\psi \) at the solution of (11), we have
By the Mills' ratio inequality \( \ln \overline{\Phi }(-\eta )\le -\eta ^2/2-\frac{1}{2}\ln (2\pi )-\ln (-\eta ), \) we obtain: \( \tilde{\psi }\lesssim \mathcal {O}(1)-\ln (-\eta ^*)-\frac{1}{2}\ln (2\pi )+(\nu -1)\ln r^*=-\nu \log (\gamma )+\mathcal {O}(1). \) In other words, there exist constants \(c_1,c_2>0\) such that \( \exp (\tilde{\psi })\le c_1\gamma ^{-\nu } \) for every \(\gamma >c_2\). Therefore,
and since by Theorem 2
we have \(\limsup _{\gamma \uparrow \infty }\textrm{Var}(\hat{\ell })/\ell ^{2}<\infty .\) \(\square \)
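Two of the asymptotic ingredients above lend themselves to a quick numerical check (an illustration only): the Mills'-ratio expansion of \(\Phi (\eta )/\phi (\eta )\) as \(\eta \downarrow -\infty \), and the root \(\eta \simeq (-1-\sqrt{1+4r^2})/(2r)\simeq -1/r\) of the quadratic \(r\eta ^2+\eta -r=0\):

```python
import numpy as np
from scipy.stats import norm

# Mills'-ratio expansion: Phi(eta)/phi(eta) ~ -1/eta + 1/eta^3 as eta -> -inf
eta = -10.0
ratio = norm.cdf(eta) / norm.pdf(eta)
assert abs(ratio - (-1 / eta + 1 / eta**3)) < 1e-4

# The root eta = (-1 - sqrt(1 + 4 r^2)) / (2 r) of r*eta^2 + eta - r = 0
# behaves like -1/r for small r, so that eta* r* stays bounded
r = 1e-3
eta_root = (-1 - np.sqrt(1 + 4 * r**2)) / (2 * r)
assert abs(r * eta_root**2 + eta_root - r) < 1e-6  # it solves the quadratic
assert abs(eta_root * r + 1) < 1e-2                # eta * r is close to -1
```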
1.3 Proof of Theorem 4
Proof
Ignoring the \(B_i\) variables in Algorithm 2, the state \(\boldsymbol{X}_n\) has the marginal distribution of an independence Metropolis-Hastings sampler. From [15, Theorem 2.1] we know that, for an independence sampler with proposal \(g(\boldsymbol{x})\) and target \(f(\boldsymbol{x})\) such that \(\sup _{\boldsymbol{x}}f(\boldsymbol{x})/g(\boldsymbol{x})<c\) for some constant \(c>0\), the Markov chain is uniformly ergodic with convergence rate
Thus, to ensure the total variation bound remains below \(\epsilon \), we need to run the independence sampler for \(t^*\) steps such that
In other words, we have \( t^*\ge \left\lceil -c\ln (\epsilon )\right\rceil \) and the length of the chain will remain bounded in the rarity parameter \(\gamma \) provided that \(c(\gamma )\) remains bounded in \(\gamma \). In Algorithm 2 we have
where \(\psi ^*=\psi (\boldsymbol{z}^*,r^*;\boldsymbol{\mu }^*,\eta ^*)\). However, from the proof of Theorem 3 we know that \(\frac{\exp (\psi ^*)}{\ell (\gamma )}\) remains bounded as \(\gamma \uparrow \infty \). Hence, the Markov chain in Algorithm 2 is strongly efficient. \(\square \)
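As a concrete illustration of the step-count bound in the proof (with hypothetical example values of \(c\) and \(\epsilon \)), note that \((1-1/c)^t\le \exp (-t/c)\), so \(t^*=\lceil -c\ln \epsilon \rceil \) steps suffice:

```python
import math

c, eps = 5.0, 1e-6  # hypothetical uniform-ergodicity constant and TV target

# t* = ceil(-c * ln(eps)) steps drive the TV bound (1 - 1/c)^t below eps
t_star = math.ceil(-c * math.log(eps))
assert t_star == 70
assert (1 - 1 / c) ** t_star <= eps
```

When \(c(\gamma )\) is bounded in \(\gamma \), as established here, this step count does not grow with the rarity parameter, which is the sense in which the chain is strongly efficient.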
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this chapter
Botev, Z.I., Chen, Y.L. (2022). Truncated Multivariate Student Computations via Exponential Tilting. In: Botev, Z., Keller, A., Lemieux, C., Tuffin, B. (eds) Advances in Modeling and Simulation. Springer, Cham. https://doi.org/10.1007/978-3-031-10193-9_4
DOI: https://doi.org/10.1007/978-3-031-10193-9_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-10192-2
Online ISBN: 978-3-031-10193-9
eBook Packages: Mathematics and Statistics (R0)