
Truncated Multivariate Student Computations via Exponential Tilting

Chapter in: Advances in Modeling and Simulation

Abstract

In this paper we consider computations with the multivariate Student density, truncated to a set described by a linear system of inequalities. Our goal is both to simulate from this truncated density and to estimate its normalizing constant. To this end we consider an exponentially tilted sequential importance sampling (IS) density. We prove that the corresponding IS estimator of the normalizing constant, a rare-event probability, has bounded relative error under certain conditions. Along the way, we establish a multivariate extension of Mills' ratio for the Student distribution. We present applications of the proposed sampling and estimation algorithms in Bayesian inference. In particular, we construct efficient rejection samplers for the posterior densities of the Bayesian Constrained Linear Regression model, the Bayesian Tobit model, and the Bayesian smoothing spline for non-negative functions. Typically, sampling from such posterior densities is viable only via approximate Markov chain Monte Carlo (MCMC). Finally, we propose a novel Reject-Regenerate sampler, a hybrid between rejection sampling and MCMC. The Reject-Regenerate sampler creates a Markov chain whose states are, with a certain probability, flagged as commencing a new regenerative (renewal) cycle. Whenever a state initiates a new regenerative cycle, we can further flip a biased coin to decide whether or not the state is an exact draw from the target. We show that the proposed MCMC algorithm is strongly efficient in a rare-event regime and provide a numerical example.


Notes

  1. https://cran.r-project.org/web/packages/TruncatedNormal/index.html.

  2. https://www.mathworks.com/matlabcentral/fileexchange/53796-truncated-normal-and-student-s-t-distribution-toolbox.

References

  1. Botev, Z.I.: The normal law under linear restrictions: simulation and estimation via minimax tilting. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 79(1), 125–148 (2017)


  2. Botev, Z.I., L’Ecuyer, P.: Efficient probability estimation and simulation of the truncated multivariate student-t distribution. In: 2015 Winter Simulation Conference (WSC), pp. 380–391. IEEE (2015)


  3. Botev, Z.I., L'Ecuyer, P.: Simulation from the normal distribution truncated to an interval in the tail. In: Proceedings of the 10th EAI International Conference on Performance Evaluation Methodologies and Tools, pp. 23–29 (2017)


  4. Botev, Z.I., Mackinlay, D., Chen, Y.L.: Logarithmically efficient estimation of the tail of the multivariate normal distribution. In: 2017 Winter Simulation Conference (WSC), pp. 1903–1913. IEEE (2017)


  5. Botev, Z.I., Chen, Y.L., L’Ecuyer, P., MacNamara, S., Kroese, D.P.: Exact posterior simulation from the linear lasso regression. In: 2018 Winter Simulation Conference (WSC), pp. 1706–1717. IEEE (2018)


  6. Chen, M.H., Deely, J.J.: Bayesian analysis for a constrained linear multiple regression problem for predicting the new crop of apples. J. Agric. Biol. Environ. Stat. 1(4), 467–489 (1996)


  7. Chen, M.H., Ibrahim, J.G., Shao, Q.M.: Monte Carlo Methods in Bayesian Computation. Springer (2000)


  8. Chib, S.: Bayes inference in the Tobit censored regression model. J. Econom. 51(1–2), 79–99 (1992)


  9. Gelfand, A.E., Smith, A.F., Lee, T.M.: Bayesian analysis of constrained parameter and truncated data problems using Gibbs sampling. J. Am. Stat. Assoc. 87(418), 523–532 (1992)


  10. Genz, A., Bretz, F.: Numerical computation of multivariate t-probabilities with application to power calculation of multiple contrasts. J. Stat. Comput. Simul. 63(4), 103–117 (1999)


  11. Hashorva, E., Hüsler, J.: On multivariate Gaussian tails. Ann. Inst. Stat. Math. 55(3), 507–522 (2003)


  12. Kroese, D.P., Botev, Z.I., Taimre, T., Vaisman, R.: Data Science and Machine Learning: Mathematical and Statistical Methods. Chapman and Hall/CRC (2019)


  13. Kroese, D.P., Taimre, T., Botev, Z.I.: Handbook of Monte Carlo Methods. Wiley (2011)


  14. L’Ecuyer, P., Blanchet, J.H., Tuffin, B., Glynn, P.W.: Asymptotic robustness of estimators in rare-event simulation. ACM Trans. Model. Comput. Simul. (TOMACS) 20(1), 1–41 (2010)


  15. Mengersen, K.L., Tweedie, R.L.: Rates of convergence of the Hastings and Metropolis algorithms. Ann. Stat. 24(1), 101–121 (1996)


  16. Mills, J.P.: Table of the ratio: area to bounding ordinate, for any portion of normal curve. Biometrika, pp. 395–400 (1926)


  17. Mroz, T.A.: The sensitivity of an empirical model of married women’s hours of work to economic and statistical assumptions. Econometrica 55(4), 765–799 (1987)


  18. Nummelin, E.: A splitting technique for Harris recurrent Markov chains. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 43(4), 309–318 (1978)


  19. Pakman, A., Paninski, L.: Exact Hamiltonian Monte Carlo for truncated multivariate Gaussians. J. Comput. Graph. Stat. 23(2), 518–542 (2014)


  20. Soms, A.P.: An asymptotic expansion for the tail area of the t-distribution. J. Am. Stat. Assoc. 71(355), 728–730 (1976)


  21. Soms, A.P.: Rational bounds for the t-tail area. J. Am. Stat. Assoc. 75(370), 438–440 (1980)



Author information

Correspondence to Zdravko I. Botev.


Appendix

1.1 Proof of Theorem 2

Proof

First, we use the normal scale-mixture representation of \(\boldsymbol{Y}\sim \textsf{t}_\nu (\textbf{0},\Sigma )\) as \(\boldsymbol{Y}=\sqrt{\nu }\boldsymbol{Z}/R\), where \( \boldsymbol{Z}\sim \mathcal {N}\left( \textbf{0},\Sigma \right) \) is independent of \( R\sim c_\nu (r)=\frac{\exp \left( -\frac{r^2}{2}+(\nu -1)\ln r\right) }{2^{\nu /2-1}\Gamma (\nu /2)}, \quad r>0. \) We can thus write \(\ell \) as a conditional expectation: \( {\ell }(\gamma )={\mathbb {P}}\left[ \frac{\sqrt{\nu }\boldsymbol{Z}}{R}\ge \boldsymbol{l}(\gamma )\right] = \mathbb {E}\left[ \mathbb {P}\left[ \frac{\sqrt{\nu }\boldsymbol{Z}}{R}\ge \boldsymbol{l}(\gamma )\,\Big |\,R\right] \right] .\) Next, condition on \(R=r\), and let \(\boldsymbol{\mu }=r\boldsymbol{x}^*/\sqrt{\nu }\), where \(\boldsymbol{x}^*\) is the solution of the QPP. Denoting \(\boldsymbol{t}=[\boldsymbol{t}_1^\top ,\boldsymbol{t}_2^\top ]^\top := r\boldsymbol{l}/\sqrt{\nu }\), and making a change of variable \(\boldsymbol{z}\leftarrow \boldsymbol{z}-\boldsymbol{\mu }\), we obtain \(\mathbb {P}\left[ \frac{\sqrt{\nu }\boldsymbol{Z}}{R}\ge \boldsymbol{l}(\gamma )\,\Big |\,R=r\right] =\mathbb {P}[\boldsymbol{Z}\ge \boldsymbol{t}]=\)

$$\begin{aligned}&=\mathbb {E} \exp (-\frac{\boldsymbol{\mu }^\top \Sigma ^{-1}\boldsymbol{\mu }}{2}-\boldsymbol{Z}^\top \Sigma ^{-1}\boldsymbol{\mu }){\mathbbm {1}}\{\boldsymbol{Z}\ge \boldsymbol{t}-\boldsymbol{\mu }\}\\&=\exp (-\frac{\boldsymbol{\mu }^\top \Sigma ^{-1}\boldsymbol{\mu }}{2}) \mathbb {E} \exp (-\boldsymbol{Z}_1^\top \Sigma _{11}^{-1}\boldsymbol{t}_1){\mathbbm {1}}\{\boldsymbol{Z}_1\ge \boldsymbol{t}_1-\boldsymbol{\mu }_1,\boldsymbol{Z}_2\ge \boldsymbol{t}_2-\boldsymbol{\mu }_2\}\\&=\exp (-\boldsymbol{t}_1^\top \Sigma _{11}^{-1}\boldsymbol{t}_1/2)\mathbb {E} \exp (-\boldsymbol{Z}_1^\top \Sigma _{11}^{-1}\boldsymbol{t}_1){\mathbbm {1}}\{\boldsymbol{Z}_1\ge {\textbf{0}}, \boldsymbol{Z}_2\ge \boldsymbol{t}_2-\boldsymbol{\mu }_2\}. \end{aligned}$$

In other words, we have:

$$\begin{aligned} { \mathbb {P}[\boldsymbol{Z}\ge \boldsymbol{t}]=\exp (-\frac{r^{2}\boldsymbol{l}_{1}^{\top }\Sigma _{11}^{-1}\boldsymbol{l}_{1}}{2\nu }) \mathbb {E} \exp (-\frac{r\boldsymbol{Z}_{1}^{\top }\Sigma _{11}^{-1}\boldsymbol{l}_{1}}{\sqrt{\nu }}){\mathbbm {1}}\{\boldsymbol{Z}_{1}\ge \textbf{0},\boldsymbol{Z}_{2}\ge \frac{r(\boldsymbol{l}_{2}-\Sigma _{21}\Sigma _{11}^{-1}\boldsymbol{l}_{1})}{\sqrt{\nu }}\}}\end{aligned}$$
(10)
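Identity (10) lends itself to a quick Monte Carlo sanity check in the special case where the second block is empty (all constraints active), in which it reduces to \(\mathbb {P}[\boldsymbol{Z}\ge \boldsymbol{t}]=\exp (-\boldsymbol{t}^\top \Sigma ^{-1}\boldsymbol{t}/2)\,\mathbb {E}\,\exp (-\boldsymbol{Z}^\top \Sigma ^{-1}\boldsymbol{t}){\mathbbm {1}}\{\boldsymbol{Z}\ge {\textbf{0}}\}\). A minimal sketch (the dimension, \(\Sigma \), and \(\boldsymbol{t}\) below are illustrative choices, not taken from the paper):

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(1)
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
Sinv = np.linalg.inv(Sigma)
t = np.array([1.0, 1.0])

# Left-hand side: by symmetry of N(0, Sigma), P[Z >= t] = P[Z <= -t].
lhs = multivariate_normal(cov=Sigma).cdf(-t)

# Right-hand side: the tilted expectation, estimated by plain Monte Carlo.
Z = rng.multivariate_normal(np.zeros(2), Sigma, size=400_000)
w = np.exp(-Z @ Sinv @ t) * np.all(Z >= 0, axis=1)
rhs = np.exp(-t @ Sinv @ t / 2) * w.mean()

print(abs(lhs - rhs) / lhs)  # small relative Monte Carlo error
```

The agreement confirms that tilting by \(\boldsymbol{\mu }=\boldsymbol{t}\) moves the constraint boundary to the origin without changing the probability.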

Let \(\mathfrak {D} \equiv \{\boldsymbol{z} :\boldsymbol{z}_1\ge {\textbf{0}},\boldsymbol{z}_2\ge \frac{r(\boldsymbol{l}_2-\Sigma _{21}\Sigma _{11}^{-1}\boldsymbol{l}_1)}{\sqrt{\nu }}\}\). We can now rewrite (10) as an integral and integrate over r. This gives \({\ell }(\gamma )=\):

$$\begin{aligned}&{}=\textstyle {\int }_{0}^{\infty }{\int }_{\mathfrak {D}} c_{\nu }(r){\phi }_{\Sigma }\left( \boldsymbol{z}\right) \exp \left( -r^{2}\boldsymbol{l}_{1}^{\top }\Sigma _{11}^{-1}\boldsymbol{l}_{1}/(2\nu )-r\boldsymbol{z}_{1}^{\top }\Sigma _{11}^{-1}\boldsymbol{l}_{1}/\sqrt{\nu }\right) \text {d}\boldsymbol{z}\text {d} r\\&{}=\textstyle \frac{2^{1-(\nu +d)/2} \pi ^{-d/2}}{\Gamma (\frac{\nu }{2})|\Sigma |^{1/2}}{\int }_{0}^{\infty }\!\!\!{\int }_{\mathfrak {D}} \exp \left( -\frac{r^{2}}{2}\left( 1+\frac{\boldsymbol{l}_{1}^{\top }\Sigma _{11}^{-1}\boldsymbol{l}_1}{\nu }\right) -\frac{\boldsymbol{z}^{\top }\Sigma ^{-1}\boldsymbol{z}}{2}-\frac{r\boldsymbol{z}_{1}^{\top }\Sigma _{11}^{-1}\boldsymbol{l}_{1}}{\sqrt{\nu }}+(\nu -1)\ln r\right) \textrm{d}\boldsymbol{z}\text {d} r\\&{}=\textstyle \frac{2^{1-(\nu +d)/2} \pi ^{-d/2}}{\Gamma (\frac{\nu }{2})|\Sigma |^{1/2}\left( 1+\frac{\boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{l}_{1}}{\nu }\right) ^{\nu /2}}{\int }_{0}^{\infty }\!\!\!{\int }_{\mathfrak {D}} \textstyle \exp \left( -\frac{u^{2}}{2}-\frac{\boldsymbol{z}^{\top }\Sigma ^{-1}\boldsymbol{z}}{2}-\frac{u\;\boldsymbol{z}_{1}^{\top }\Sigma _{11}^{-1}\boldsymbol{l}_{1}}{\sqrt{\nu +\boldsymbol{l}_{1}^{\top }\Sigma _{11}^{-1}\boldsymbol{l}_{1}}}+(\nu -1)\ln u\right) \text {d}\boldsymbol{z}\text {d} u\\&{}=\textstyle \frac{1}{\left( 1+\frac{\boldsymbol{l}_{1}^{\top }\Sigma _{11}^{-1}\boldsymbol{l}_{1}}{\nu }\right) ^{\nu /2}}{\int }_{0}^{\infty }\!\!\!{\int }_{\mathbb {R}^d} c_{\nu }(u)\phi _{\Sigma }(\boldsymbol{z})\exp \left( \textstyle -\frac{u\;\boldsymbol{z}_{1}^{\top }\Sigma _{11}^{-1}\boldsymbol{l}_{1}}{\sqrt{\nu +\boldsymbol{l}_{1}^{\top }\Sigma _{11}^{-1}\boldsymbol{l}_{1}}}\right) {\mathbbm {1}}\left\{ \boldsymbol{z}_{1}\ge \textbf{0}, \boldsymbol{z}_{2}\ge \frac{u(\boldsymbol{l}_{2}-\Sigma _{21}\Sigma _{11}^{-1}\boldsymbol{l}_{1})}{\sqrt{\nu +\boldsymbol{l}_{1}^{\top }\Sigma _{11}^{-1}\boldsymbol{l}_{1}}}\right\} \text {d}\boldsymbol{z}\text {d} u\\&{}=\textstyle \left( 1+\frac{\boldsymbol{l}_{1}^{\top }\Sigma _{11}^{-1}\boldsymbol{l}_{1}}{\nu }\right) ^{-\nu /2}\mathbb {E} \exp \left( -\frac{R\;\boldsymbol{Z}_{1}^{\top }\Sigma _{11}^{-1}\boldsymbol{l}_{1}}{\sqrt{\nu +\boldsymbol{l}_{1}^{\top }\Sigma _{11}^{-1}\boldsymbol{l}_{1}}}\right) {\mathbbm {1}}\left\{ \boldsymbol{Z}_{1}\ge \textbf{0}, \boldsymbol{Z}_{2}\ge \frac{R(\boldsymbol{l}_{2}-\Sigma _{21}\Sigma _{11}^{-1}\boldsymbol{l}_{1})}{\sqrt{\nu +\boldsymbol{l}_{1}^{\top }\Sigma _{11}^{-1}\boldsymbol{l}_{1}}}\right\} ,\end{aligned}$$

where the third line follows from the change of variable \( u=r\sqrt{1+\frac{\boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{l}_1}{\nu }}\;.\) Next, using formula (10) we rewrite the last expression as:

$$ \left( 1+\frac{\boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{l}_1}{\nu }\right) ^{-\nu /2}\mathbb {E}\exp \left( \frac{R^2\boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{l}_1}{2(\nu +\boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{l}_1)}\right) \mathbb {P}\left[ \boldsymbol{Z}\ge \frac{R\boldsymbol{l}}{\sqrt{\nu +\boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{l}_1}}\, \Big |\,R\right] . $$

We now seek to apply the dominated convergence theorem to the expectation in the last displayed equation. For this we need the upper bound (recall that \(\Sigma _{11}^{-1}\boldsymbol{l}_1\ge {\textbf{0}}\))

$$\begin{aligned} \exp \left( \frac{r^2\boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{l}_1}{2(\nu +\boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{l}_1)}\right) \mathbb {P}\left[ \boldsymbol{Z}\ge \frac{r\boldsymbol{l}}{\sqrt{\nu +\boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{l}_1}}\right]&\le \exp (r^2/2)\mathbb {P}\left[ \boldsymbol{Z}_1\ge \frac{r\boldsymbol{l}_1}{\sqrt{\nu +\boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{l}_1}}\right] \\&\le \exp (r^2/2)\mathbb {P}\left[ \boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{Z}_1\ge \frac{r\boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{l}_1}{\sqrt{\nu +\boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{l}_1}} \right] \\&=\exp (r^2/2)\overline{\Phi }\left[ r\sqrt{\frac{\boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{l}_1}{\nu +\boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{l}_1}}\right] \le \exp (r^2/2)\overline{\Phi }\left( r\right) . \end{aligned}$$

The last expression is integrable in the sense that \(\int _0^\infty c_\nu (r)\exp (r^2/2)\overline{\Phi }\left( r\right) \textrm{d} r=\)

$$ \frac{2^{1-\nu /2}}{\Gamma (\nu /2)}\int _0^\infty r^{\nu -1}\overline{\Phi }\left( r\right) \textrm{d} r= \frac{2^{1-\nu /2}}{\Gamma (\nu /2)2\nu }\int _{-\infty }^\infty |u|^{\nu } \phi (u)\textrm{d} u=\frac{2^{1-\nu /2}\Gamma ((\nu +1)/2)2^{\nu /2}}{\sqrt{\pi }\Gamma (\nu /2)2\nu }=\frac{\Gamma ((\nu +1)/2)}{\sqrt{\pi }\Gamma (\nu /2)\nu }<\infty . $$
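The closed-form value of this integral is easy to confirm numerically. A small check with SciPy quadrature (the choice \(\nu =5\) is arbitrary):

```python
import numpy as np
from scipy import integrate, special
from scipy.stats import norm

nu = 5
# Numerical quadrature of r^{nu-1} * Phibar(r) over (0, inf).
lhs, _ = integrate.quad(lambda r: r ** (nu - 1) * norm.sf(r), 0, np.inf)
# Closed form: Gamma((nu+1)/2) * 2^{nu/2 - 1} / (sqrt(pi) * nu).
rhs = special.gamma((nu + 1) / 2) * 2 ** (nu / 2 - 1) / (np.sqrt(np.pi) * nu)
print(lhs, rhs)  # the two values agree to quadrature accuracy
```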

In addition, as \(\gamma \uparrow \infty \), by Lemma 1 we have the pointwise limits:

$$ \exp \left[ \frac{r^2\boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{l}_1}{2(\nu +\boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{l}_1)}\right] \mathbb {P}\left( \boldsymbol{Z}\ge \frac{r\boldsymbol{l}}{\sqrt{\nu +\boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{l}_1}}\right) \rightarrow \exp (r^2/2)\mathbb {P}[\boldsymbol{Z}\ge r\boldsymbol{l}_\infty ]. $$

Therefore, by the dominated convergence theorem

$$ \lim _{\gamma \uparrow \infty }\mathbb {E}\exp \left( \frac{R^2\boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{l}_1}{2(\nu +\boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{l}_1)}\right) \mathbb {P}\left[ \boldsymbol{Z}\ge \frac{R\boldsymbol{l}}{\sqrt{\nu +\boldsymbol{l}_1^\top \Sigma _{11}^{-1}\boldsymbol{l}_1}}\, \Big |\,R\right] = \frac{2^{1-\nu /2}}{\Gamma (\nu /2)}\int _0^\infty r^{\nu -1}\mathbb {P}[\boldsymbol{Z}\ge r\boldsymbol{l}_\infty ]\textrm{d} r. $$

This concludes the proof. \(\square \)
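The scale-mixture representation at the start of the proof also yields a direct sampler for \(\textsf{t}_\nu (\textbf{0},\Sigma )\): draw \(\boldsymbol{Z}\sim \mathcal {N}(\textbf{0},\Sigma )\) and, independently, \(R\sim c_\nu \) (a chi distribution with \(\nu \) degrees of freedom), then return \(\sqrt{\nu }\boldsymbol{Z}/R\). A minimal sketch (the values of \(\nu \), \(\Sigma \), and the sample size are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
nu, d, n = 5, 2, 200_000
Sigma = np.array([[1.0, 0.5], [0.5, 2.0]])

# Y = sqrt(nu) * Z / R with Z ~ N(0, Sigma) and R ~ chi(nu), independently.
Z = rng.multivariate_normal(np.zeros(d), Sigma, size=n)
R = stats.chi.rvs(df=nu, size=n, random_state=rng)
Y = np.sqrt(nu) * Z / R[:, None]

# Marginal check: Var(Y_k) = nu / (nu - 2) * Sigma_kk for nu > 2.
print(Y.var(axis=0), nu / (nu - 2) * np.diag(Sigma))
```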

Lemma 1

(Continuity of Gaussian tail) Suppose that \(\boldsymbol{Z}\sim \mathcal {N}({\textbf{0}},\Sigma )\) for some positive definite matrix \(\Sigma \), and \(\boldsymbol{a}_n\rightarrow \boldsymbol{a}\) as \(n\uparrow \infty \). Then, the tail of the multivariate Gaussian is continuous: \( \lim _{n\uparrow \infty }\mathbb {P}[\boldsymbol{Z}\ge \boldsymbol{a}_n]=\mathbb {P}[\boldsymbol{Z}\ge \boldsymbol{a}]. \)

Proof

The proof is yet another application of the dominated convergence theorem, this time to show that \( \int _{[{\textbf{0}},\boldsymbol{\infty })}\phi _{\Sigma }(\boldsymbol{z}+\boldsymbol{a}_n)\textrm{d} \boldsymbol{z}\rightarrow \int _{[{\textbf{0}},\boldsymbol{\infty })}\phi _{\Sigma }(\boldsymbol{z}+\boldsymbol{a})\textrm{d} \boldsymbol{z}=\mathbb {P}[\boldsymbol{Z}\ge \boldsymbol{a}]. \) Since \(\Sigma \) is positive definite, \(\Vert \boldsymbol{x}\Vert _{\Sigma }:=\sqrt{\boldsymbol{x}^\top \Sigma ^{-1}\boldsymbol{x}}\) is a norm, and the triangle inequality gives \(\Vert \boldsymbol{z}\Vert _{\Sigma }^2\le 2(\Vert \boldsymbol{z}+\boldsymbol{a}_n\Vert _{\Sigma }^2+\Vert \boldsymbol{a}_n\Vert _{\Sigma }^2)\), that is, \(\Vert \boldsymbol{z}+\boldsymbol{a}_n\Vert _{\Sigma }^2\ge \Vert \boldsymbol{z}\Vert _{\Sigma }^2/2-\Vert \boldsymbol{a}_n\Vert _{\Sigma }^2\). Therefore, \( \phi _{\Sigma }(\boldsymbol{z}+\boldsymbol{a}_n)\le \exp (\Vert \boldsymbol{a}_n\Vert ^2_{\Sigma }/2)\,2^{d/2}\phi _{2\Sigma }(\boldsymbol{z}), \) and since \(\sup _n\Vert \boldsymbol{a}_n\Vert _{\Sigma }<\infty \) (the sequence converges), the integrand is dominated by an integrable function and the conditions for the dominated convergence theorem are met. \(\square \)
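A numeric illustration of the lemma (with an arbitrary \(\Sigma \) and limit point \(\boldsymbol{a}\)): by the symmetry of \(\mathcal {N}({\textbf{0}},\Sigma )\), the orthant tail \(\mathbb {P}[\boldsymbol{Z}\ge \boldsymbol{a}]\) equals the Gaussian cdf evaluated at \(-\boldsymbol{a}\), so convergence along \(\boldsymbol{a}_n=\boldsymbol{a}+\textbf{1}/n\) can be observed directly:

```python
import numpy as np
from scipy.stats import multivariate_normal

Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
mvn = multivariate_normal(mean=np.zeros(2), cov=Sigma)
a = np.array([1.0, 0.5])

tail = lambda v: mvn.cdf(-v)  # P[Z >= v] by symmetry of N(0, Sigma)
limit = tail(a)
errors = [abs(tail(a + 1.0 / n) - limit) for n in (1, 10, 100, 1000)]
print(limit, errors)  # errors shrink as a_n -> a
```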

1.2 Proof of Theorem 3

Proof

First, note that the second moment is \(\int g(\boldsymbol{z},r;\boldsymbol{\mu }^*,\eta ^*) \exp (2\psi (\boldsymbol{z},r;\boldsymbol{\mu }^*,\eta ^*))\) \(\textrm{d}\boldsymbol{z}\textrm{d} r =\)

$$ =\int _{\mathfrak {R}}c_\nu (r)\phi _{\Sigma }(\boldsymbol{z})\exp (\psi (\boldsymbol{z},r;\boldsymbol{\mu }^*,\eta ^*))\textrm{d}\boldsymbol{z}\textrm{d} r\le \ell (\gamma ) \exp (\psi (\boldsymbol{z}^*,r^*;\boldsymbol{\mu }^*,\eta ^*)). $$

Since the properties of \(\psi \) imply that

$$ \psi (\boldsymbol{z}^*,r^*;\boldsymbol{\mu }^*,\eta ^*)\le \psi (\boldsymbol{z}^*,r^*;{\textbf{0}},\eta ^*)\le \frac{(\eta ^*)^2}{2}-r^*\eta ^*+(\nu -1)\ln r^*+\ln \Phi (\eta ^*), $$

bounded relative error will follow if we can show that \( \frac{(r^*)^{\nu -1}\Phi (\eta ^*)\exp (\frac{(\eta ^*)^2}{2}-r^*\eta ^*)}{\ell (\gamma )} \) remains bounded in \(\gamma \). The pair \((r^*,\eta ^*)\) is determined from the solution to (3), namely from finding the saddle-point solution of: \(\max _{r,\boldsymbol{z}}\min _{\eta ,\boldsymbol{\mu }}\psi (\boldsymbol{z},r;\boldsymbol{\mu },\eta )\). This can be obtained by setting the gradient of \(\psi \) with respect to the vector \((\boldsymbol{z},r,\boldsymbol{\mu },\eta )\) to zero: \(\nabla \psi ={\textbf{0}}\). We now introduce the following notation that will allow us to express \(\nabla \psi ={\textbf{0}}\) explicitly. Let L be the lower triangular Cholesky factor of \(\Sigma = L L^\top \). Define \(D=\textrm{diag}( L)\;,\breve{ L}= D^{-1} L\), \( \tilde{\boldsymbol{l}}=\frac{r}{\sqrt{\nu }} D^{-1} \boldsymbol{l}(\gamma )-(\breve{ L}- I)\boldsymbol{z}, \) and vector \(\boldsymbol{\Psi }\) with elements \( \Psi _k=\phi (\tilde{l}_k-\mu _k)/\overline{\Phi }(\tilde{l}_k-\mu _k)\). Then, \(\nabla \psi ={\textbf{0}}\) can be written as

$$\begin{aligned} (\breve{ L}^\top -I)\boldsymbol{\Psi }-\boldsymbol{\mu }&{}={\textbf{0}}\\ \frac{\nu -1}{r}-\eta -\frac{1}{\sqrt{\nu }} \boldsymbol{\Psi }^\top D^{-1} \boldsymbol{l}(\gamma ) &{}=0\\ \boldsymbol{\mu }+\boldsymbol{\Psi }-\boldsymbol{z}&{}={\textbf{0}} \\ \eta +\frac{\phi (\eta )}{\Phi (\eta )}-r&{}=0. \end{aligned}$$
(11)

Next, we verify via substitution that the solution of (11) as \(\gamma \uparrow \infty \) satisfies \( r^*=\mathcal {O}(\gamma ^{-1})\), \(\boldsymbol{z}^*=\mathcal {O}(\textbf{1})\), \(\eta ^*=\mathcal {O}(-\gamma )\), \(\boldsymbol{\mu }^*=\mathcal {O}(\textbf{1}).\) First, equations one and three in (11) are trivially satisfied and we can deduce that \(\boldsymbol{\Psi }=\mathcal {O}(\textbf{1})\). Second, since \(\tilde{\boldsymbol{l}}=\mathcal {O}(r\boldsymbol{l}(\gamma ))=\mathcal {O}(\textbf{1})\), it follows that equation two in (11) is equivalent to

$$ r^*\eta ^*=\nu -1-\frac{ r^*}{\sqrt{\nu }} \boldsymbol{\Psi }^\top D^{-1}\boldsymbol{l}(\gamma )=\mathcal {O} (1). $$

Finally, note that Mills' ratio \( \frac{\Phi (\eta )}{\phi (\eta )}\simeq -\frac{1}{\eta }+\frac{1}{\eta ^3}\) as \(\eta \downarrow -\infty \) implies that equation four is asymptotically equivalent to \(r\eta ^2+\eta -r\simeq 0 \). The negative root of this quadratic equation in turn gives \( \eta \simeq (-1-\sqrt{1+4r^2})/(2r) \simeq -1/r\). In other words, \(\eta ^* r^*=\mathcal {O}(1)\), as desired. Therefore, if \(\tilde{\psi }\) denotes the value of \(\psi \) at the solution of (11), we have

$$\begin{aligned} \tilde{\psi }&=\frac{\Vert \boldsymbol{\mu }^*\Vert ^2}{2}-(\boldsymbol{z}^*)^\top \boldsymbol{\mu }^* + \frac{(\eta ^*)^2}{2}-r^*\eta ^*+ (\nu -1)\ln r^*+\ln \Phi (\eta ^*) +\sum _{k=1}^d\ln \overline{\Phi }( \tilde{l}_k-\mu _k^*)\\&= \mathcal {O}(1)+\frac{(\eta ^*)^2}{2}+(\nu -1)\ln r^*+\ln \overline{\Phi }(-\eta ^*). \end{aligned}$$
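The claim \(\eta ^*\simeq -1/r^*\) above is easy to check by solving equation four of (11), \(\eta +\phi (\eta )/\Phi (\eta )=r\), for a few small values of r. A sketch (the root bracket is an assumption that holds for the r values used here):

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def hazard(eta):
    # phi(eta)/Phi(eta), computed stably via log-densities for very negative eta
    return np.exp(norm.logpdf(eta) - norm.logcdf(eta))

def eq4(eta, r):
    # Fourth saddle-point equation in (11): eta + phi(eta)/Phi(eta) - r = 0
    return eta + hazard(eta) - r

for r in (0.5, 0.1, 0.01):
    eta = brentq(eq4, -10.0 / r, -1e-6, args=(r,))
    print(f"r={r:5.2f}  eta*r={eta * r:+.4f}")  # approaches -1 as r -> 0
```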

By Mills' ratio inequality, \( \ln \overline{\Phi }(-\eta )\le -\eta ^2/2-\frac{1}{2}\ln (2\pi )-\ln (-\eta ) \) for \(\eta <0\), we obtain \( \tilde{\psi }\lesssim \mathcal {O}(1)-\ln (-\eta ^*)-\frac{1}{2}\ln (2\pi )+(\nu -1)\ln r^*=-\nu \ln (\gamma )+\mathcal {O}(1), \) because \(-\eta ^*=\mathcal {O}(\gamma )\) and \(r^*=\mathcal {O}(\gamma ^{-1})\). In other words, there exist constants \(c_1,c_2>0\) such that \( \exp (\tilde{\psi })\le c_1\gamma ^{-\nu } \) for every \(\gamma >c_2\). Therefore,

$$ \text {Var}(\hat{\ell })=\mathbb {E} \exp (\psi (\boldsymbol{Z},R;\boldsymbol{\mu }^{*},\eta ^{*}))-\ell ^2\lesssim \ell (\gamma )\exp (\tilde{\psi })-\ell ^{2}\le c_{1}\gamma ^{-2\nu }-\ell ^{2}(\gamma ) $$

and since by Theorem 2

$$ \ell (\gamma )\simeq c \times \bigg (1+\gamma ^2 \times \underbrace{\frac{\boldsymbol{l}^\top _1\Sigma _{11}^{-1}\boldsymbol{l}_1}{\nu \times \gamma ^2}}_{\Theta (1)}\bigg )^{-\nu /2}=\Theta (\gamma ^{-\nu }),\quad \gamma \uparrow \infty , $$

we have \(\text {lim sup}_{\gamma \uparrow \infty }\text {Var}(\hat{\ell })/\ell ^{2}< {\infty }.\) \(\square \)

1.3 Proof of Theorem 4

Proof

Ignoring the \(B_i\) variable in Algorithm 2 yields a state \(\boldsymbol{X}_n\) whose marginal distribution is that of an independence Metropolis–Hastings sampler. From [15, Theorem 2.1] we know that, for an independence Metropolis sampler with proposal \(g(\boldsymbol{x})\) and target \(f(\boldsymbol{x})\) such that \(\sup _{\boldsymbol{x}}f(\boldsymbol{x})/g(\boldsymbol{x})<c\) for some constant \(c>0\), the Markov chain is uniformly ergodic with convergence rate

$$ \sup _A \left| \kappa _t(A|\boldsymbol{x})-f(A)\right| \le (1-c^{-1})^t. $$

Thus, to ensure the total variation bound remains below \(\epsilon \), we need to run the independence sampler for \(t^*\) steps such that

$$ (1-c^{-1})^{t^*}\le \exp (-t^*/c) \le \epsilon . $$

In other words, we have \( t^*\ge \left\lceil -c\ln (\epsilon )\right\rceil \) and the length of the chain will remain bounded in the rarity parameter \(\gamma \) provided that \(c(\gamma )\) remains bounded in \(\gamma \). In Algorithm 2 we have

$$ c(\gamma )=\sup _{\boldsymbol{x}}\frac{f(\boldsymbol{x})}{g(\boldsymbol{x})}=\sup _{\boldsymbol{x}}\frac{p(\boldsymbol{x})}{g(\boldsymbol{x})\ell (\gamma )}\le \frac{\exp (\psi ^*)}{\ell (\gamma )}\le \frac{(r^*)^{\nu -1}\Phi (\eta ^*)\exp (\frac{(\eta ^*)^2}{2}-r^*\eta ^*)}{\ell (\gamma )}, $$

where \(\psi ^*=\psi (\boldsymbol{z}^*,r^*;\boldsymbol{\mu }^*,\eta ^*)\). However, from the proof of Theorem 3 we know that \(\frac{\exp (\psi ^*)}{\ell (\gamma )}\) remains bounded as \(\gamma \uparrow \infty \). Hence, the Markov chain in Algorithm 2 is strongly efficient. \(\square \)
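The resulting prescription \(t^*=\left\lceil -c\ln (\epsilon )\right\rceil \) for the chain length is a one-liner; the values of c and \(\epsilon \) below are illustrative:

```python
import math

def chain_length(c, eps):
    # Smallest integer t with exp(-t / c) <= eps, hence (1 - 1/c)^t <= eps.
    return math.ceil(-c * math.log(eps))

for eps in (1e-2, 1e-4, 1e-8):
    print(eps, chain_length(2.5, eps))
```

Because the bound grows only logarithmically in \(1/\epsilon \) and linearly in c, a bounded \(c(\gamma )\) keeps the required chain length bounded in the rarity parameter.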


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter


Cite this chapter

Botev, Z.I., Chen, YL. (2022). Truncated Multivariate Student Computations via Exponential Tilting. In: Botev, Z., Keller, A., Lemieux, C., Tuffin, B. (eds) Advances in Modeling and Simulation. Springer, Cham. https://doi.org/10.1007/978-3-031-10193-9_4
