
Sampling large hyperplane-truncated multivariate normal distributions


Abstract

Sampling from multivariate normal distributions is widely used in various fields, including engineering, statistics, finance and machine learning. In this paper, simulating large multivariate normal distributions truncated on the intersection of a set of hyperplanes is investigated. Specifically, the proposed methodology focuses on cases where the prior multivariate normal is extracted from a stationary Gaussian process (GP). It combines Karhunen-Loève expansions (KLE) with Matheron's update rules (MUR). The KLE requires the eigendecomposition of the covariance matrix of the random variables, which becomes expensive when the random vector is large. To address this issue, the input domain is split into smaller subdomains on which the eigendecomposition can be computed. Owing to the stationarity property, only the eigendecomposition of the first subdomain is required, which drastically reduces the computational complexity. The mean-square truncation and block errors are quantified. The efficiency of the proposed approach is demonstrated through both synthetic and real data studies.


References

  • Aune E, Eidsvik J, Pokern Y (2013) Iterative numerical methods for sampling from high dimensional Gaussian distributions. Stat Comput 23(4):501–521

  • Chevalier C, Emery X, Ginsbourger D (2015) Fast update of conditional simulation ensembles. Math Geosci 47(7):771–789

  • Cho H, Venturi D, Karniadakis G (2013) Karhunen-Loève expansion for multi-correlated stochastic processes. Probab Eng Mech 34:157–167

  • Cong Y, Chen B, Zhou M (2017) Fast simulation of hyperplane-truncated multivariate normal distributions. Bayesian Anal 12(4):1017–1037. https://doi.org/10.1214/17-BA1052

  • Cousin A, Maatouk H, Rullière D (2016) Kriging of financial term-structures. Eur J Oper Res 255(2):631–648

  • Davis M (1987) Generating large stochastic simulations - The matrix polynomial approximation method. Math Geol 19(2):99–107

  • Golub G, Van Loan C (1996) Matrix computations. The Johns Hopkins University Press, Baltimore

  • Hoffman Y, Ribak E (1991) Constrained realizations of Gaussian fields: a simple algorithm. Astrophys J 380:L5. https://doi.org/10.1086/186160

  • Journel A, Huijbregts C (1976) Mining geostatistics. Academic Press, Cambridge

  • Kirby M, Sirovich L (1990) Application of the Karhunen-Loève procedure for the characterization of human faces. IEEE Trans Pattern Anal Mach Intell 12(1):103–108

  • Lenk PJ, Choi T (2017) Bayesian analysis of shape-restricted functions using Gaussian process priors. Stat Sin 23:43–69

  • Loève M (1977) Elementary probability theory. Springer, Berlin

  • Maatouk H, Bay X (2017) Gaussian process emulators for computer experiments with inequality constraints. Math Geosci 49(5):557–582

  • Maatouk H, Bay X, Rullière D (2022) A note on simulating hyperplane-truncated multivariate normal distributions. Stat Probab Lett 191:109650. https://doi.org/10.1016/j.spl.2022.109650

  • Matérn B (1986) Spatial variation, vol 36. Springer Science & Business Media, Berlin

  • Panunzio A, Cottereau R, Puel G (2018) Large scale random fields generation using localized Karhunen-Loève expansion. Adv Model Simul Eng Sci 5(1):1–29

  • Ray P, Pati D, Bhattacharya A (2020) Efficient Bayesian shape-restricted function estimation with constrained Gaussian process priors. Stat Comput 30(4):839–853

  • Rue H (2001) Fast sampling of Gaussian Markov random fields. J R Stat Soc Ser B 63(2):325–338

  • Wang L (2008) Karhunen-Loève expansions and their applications. London School of Economics and Political Science, London

  • Williams C, Rasmussen C (2006) Gaussian processes for machine learning, vol 2. MIT Press, Cambridge

  • Wilson J, Borovitskiy V, Terenin A et al (2021) Pathwise conditioning of Gaussian processes. J Mach Learn Res 22(105):1–47

  • Wood AT, Chan G (1994) Simulation of stationary Gaussian processes in \([0, 1]^d\). J Comput Graph Stat 3(4):409–432

  • Zhang J, Ellingwood B (1994) Orthogonal series expansions of random fields in reliability analysis. J Eng Mech 120(12):2660–2677

  • Zhou S, Giuliani P, Piekarewicz J et al (2019) Reexamining the proton-radius problem using constrained Gaussian processes. Phys Rev C 99:055202. https://doi.org/10.1103/PhysRevC.99.055202


Acknowledgements

The authors would like to express their gratitude to the Editor, the Associate Editor, and the two reviewers for their valuable and constructive comments, which have significantly enhanced the presentation and accuracy of the manuscript. Part of this research was conducted with the support of the consortium in Applied Mathematics CIROQUO (https://doi.org/10.5281/zenodo.6581217), gathering partners in technological research and academia in the development of advanced methods for Computer Experiments.

Author information


Corresponding author

Correspondence to Hassan Maatouk.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A Subdomains with different lengths

The methodology presented in Sect. 2.2 can also be extended to cases where the domain is split into subdomains of different lengths. In such cases, the eigendecomposition and the coupling matrix must be computed for each subdomain. For example, suppose that the domain \({\mathcal {D}}=[0,1]\) is divided into two subdomains \([0,S]\cup [S,1]\), as illustrated in Fig. 12 below, where \(S=0.7\). The only modification to the methodology presented before lies in computing both the coupling matrix \({\varvec{K}}\) and the lower triangular matrix \({\varvec{L}}\). Let us denote these two matrices by \({\varvec{K}}^{1,2}\) and \({\varvec{L}}^{1,2}\), respectively, since they depend on the two connected subdomains. The elements of \({\varvec{K}}^{1,2}\in {\mathbb {R}}^{p\times p}\) are defined as

$$\begin{aligned} {\varvec{K}}^{1,2}_{i,j}=\frac{1}{\sqrt{\lambda ^{(1)}_i\lambda _j^{(2)}}} \int _{x=0}^S\int _{t=S}^1C(|x-t|)\phi _i^{(1)}(x)\phi _j^{(2)}(t)\textrm{d}x\textrm{d}t, \end{aligned}$$

where \(\{\lambda _i^{(1)},\phi _i^{(1)}\}_{i=1}^{p}\) and \(\{\lambda _j^{(2)},\phi _j^{(2)}\}_{j=1}^{p}\) are the eigenpairs of C on [0, S] and (S, 1], respectively. The lower triangular matrix \({\varvec{L}}^{1,2}\) is computed from the factorization

$$\begin{aligned} {{\textbf {I}}}_{p}-\left( {\varvec{K}}^{1,2}\right) ^\top {\varvec{K}}^{1,2}={\varvec{L}}^{1,2} \left( {\varvec{L}}^{1,2}\right) ^\top . \end{aligned}$$

As before, the second set of conditional KLE coefficients is generated as follows:

$$\begin{aligned} \varvec{\xi }^{(2)}=\left( {\varvec{K}}^{1,2}\right) ^\top \varvec{\xi }^{(1)}+{\varvec{L}}^{1,2}\varvec{\zeta }^{(2)}, \end{aligned}$$

where \(\varvec{\xi }^{(1)}=\varvec{\zeta }^{(1)}\in {\mathbb {R}}^{p}\) and \(\varvec{\zeta }^{(2)}\in {\mathbb {R}}^{p}\) are two independent replicates of a standard MVN distribution. Finally, the random process defined on the entire domain \({\mathcal {D}}=[0,1]\) is given by

$$\begin{aligned} \left\{ \begin{array}{ll} Y^{p}_{1}(x)=\sum _{i=1}^{p}\sqrt{\lambda _i^{(1)}}\zeta _i^{(1)} \phi _i^{(1)}(x), &{} \quad \text {if }x\in [0,S] \\ Y^p_{2}(x)=\sum _{j=1}^{p}\sqrt{\lambda _j^{(2)}}\xi ^{(2)}_j \phi _j^{(2)}(x), &{} \quad \text {if }x\in (S,1]. \end{array}\right. \end{aligned}$$
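
To make this construction concrete, a minimal numerical sketch in Python is given below. It is illustrative rather than the authors' implementation: the eigenpairs on each subdomain are approximated by a Nyström (midpoint) rule, the coupling matrix \({\varvec{K}}^{1,2}\) is obtained by quadrature, and a clipped eigenvalue square root is used in place of the Cholesky factor of \({{\textbf {I}}}_{p}-({\varvec{K}}^{1,2})^\top {\varvec{K}}^{1,2}\) to guard against round-off (any square root yields the same distribution). The Matérn 5/2 kernel with \(\theta =0.2\), \(S=0.7\), the truncation order \(p=20\) and the quadrature size are illustrative choices.

```python
import numpy as np

def matern52(h, theta=0.2):
    """Matern covariance with smoothness nu = 5/2 and length-scale theta."""
    r = np.sqrt(5.0) * np.abs(h) / theta
    return (1.0 + r + r**2 / 3.0) * np.exp(-r)

def nystrom_kle(a, b, n_quad, p):
    """Leading p eigenpairs of the covariance operator on [a, b] (midpoint rule)."""
    x = a + (b - a) * (np.arange(n_quad) + 0.5) / n_quad    # quadrature nodes
    w = (b - a) / n_quad                                    # equal quadrature weights
    lam, U = np.linalg.eigh(w * matern52(x[:, None] - x[None, :]))
    idx = np.argsort(lam)[::-1][:p]
    return x, w, lam[idx], U[:, idx] / np.sqrt(w)           # phi_i(x_k) = U[k, i] / sqrt(w)

S, p = 0.7, 20
x1, w1, lam1, phi1 = nystrom_kle(0.0, S, 400, p)            # subdomain [0, S]
x2, w2, lam2, phi2 = nystrom_kle(S, 1.0, 400, p)            # subdomain (S, 1]

# Coupling matrix K^{1,2}: discretized double integral of the formula above
M12 = w1 * w2 * (phi1.T @ matern52(x1[:, None] - x2[None, :]) @ phi2)
K12 = M12 / np.sqrt(np.outer(lam1, lam2))

# Square root of I_p - K^T K; eigenvalues are clipped at zero to guard
# against round-off (the paper uses the Cholesky factor, which is one
# particular square root)
evals, evecs = np.linalg.eigh(np.eye(p) - K12.T @ K12)
L12 = evecs * np.sqrt(np.clip(evals, 0.0, None))

# Conditional coefficients and the resulting piecewise sample path
rng = np.random.default_rng(0)
xi1 = rng.standard_normal(p)                                # xi^(1) = zeta^(1)
xi2 = K12.T @ xi1 + L12 @ rng.standard_normal(p)            # xi^(2)
Y1 = phi1 @ (np.sqrt(lam1) * xi1)                           # Y_1^p on [0, S]
Y2 = phi2 @ (np.sqrt(lam2) * xi2)                           # Y_2^p on (S, 1]
```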

The correlation between any two sets of coefficients \(\varvec{\xi }^{(m)}\) and \(\varvec{\xi }^{(m')}\), for any \(m'>m\), is equal to

$$\begin{aligned} {\varvec{K}}^{m,m'}= & {} \textrm{Cov}\left( \varvec{\xi }^{(m)},\varvec{\xi }^{(m')}\right) ={\varvec{K}}^{m,m+1}\times {\varvec{K}}^{m+1,m+2}\times \ldots \times {\varvec{K}}^{m'-1,m'} \\= & {} \prod _{\ell =m}^{m'-1}{\varvec{K}}^{\ell ,\ell +1}, \end{aligned}$$

where \({\varvec{K}}^{\ell ,\ell +1}=\textrm{Cov}\left( \varvec{\xi }^{(\ell )},\varvec{\xi }^{(\ell +1)}\right)\), for any \(\ell \in \{m,\ldots ,m'-1\}\).

Fig. 12: GP sample paths when the domain is split into two (left panel) and three (right panel) subdomains with different lengths

In Fig. 12, the domain \({\mathcal {D}}\) is split into two (resp. three) subdomains of different lengths in the left (resp. right) panel. The solid curves (resp. dashed curves) represent the paths before (resp. after) conditioning. The Matérn covariance function with smoothness parameter \(\nu =5/2\) is used, with the length-scale parameter \(\theta\) fixed at 0.2.

Appendix B Proofs of Propositions

This appendix contains the proofs of the propositions presented in the main body of the manuscript.

Proof of Proposition 2.1

We have for any \(x,x'\in {\mathcal {D}}\)

$$\begin{aligned} C(x,x')=\, & {} \textrm{E}[Y(x)Y(x')]=\sum _{i=1}^{+\infty }\sum _{j=1}^{+\infty } \sqrt{\gamma _i\gamma _j}\varphi _i(x)\varphi _j(x')\textrm{E}[\xi _i\xi _j] \\= & {} \sum _{i=1}^{+\infty }\sum _{j=1}^{+\infty }\sqrt{\gamma _i\gamma _j} \varphi _i(x)\varphi _j(x')\delta _{ij} = \sum _{i=1}^{+\infty }\gamma _i\varphi _i(x)\varphi _i(x'), \end{aligned}$$

which concludes the proof of the proposition. \(\square\)

Proof of Proposition 2.2

  1. (i)

    From right to left: suppose that \(\{\xi _i^{(m)}\}\) are independent and identically distributed \({\mathcal {N}}(0,1)\). The two processes \(Y\) and \(Y_m\) are zero-mean Gaussian. Firstly, from Proposition 2.1 and the stationary property of the covariance function, we have for any \(t,t'\in {\mathcal {D}}^{(m)}\)

    $$\begin{aligned} C(t,t')=\, & {} \textrm{Cov}(Y(t),Y(t'))=\textrm{Cov}\left( Y(t-(m-1)S),Y(t'-(m-1)S)\right) \\= & {} \textrm{Cov}(Y(x),Y(x'))=\sum _{i=1}^{+\infty }\lambda _i\phi _i(x)\phi _i(x'), \end{aligned}$$

    where \(x=t-(m-1)S\) and \(x'=t'-(m-1)S\) are in \({\mathcal {D}}^{(1)}\). Secondly, we have for any \(t,t'\in {\mathcal {D}}^{(m)}\)

    $$\begin{aligned} \textrm{Cov}(Y_m(t),Y_m(t'))= & {} \sum _{i,j=1}^{+\infty }\sqrt{\lambda _i\lambda _j}\phi _i(t-(m-1)S)\phi _j(t'-(m-1)S)\delta _{ij} \\= & {} \sum _{i=1}^{+\infty }\lambda _i\phi _i(t-(m-1)S)\phi _i(t'-(m-1)S) \\= & {} \sum _{i=1}^{+\infty }\lambda _i\phi _i(x)\phi _i(x'), \end{aligned}$$

    where \(x=t-(m-1)S\) and \(x'=t'-(m-1)S\) are in \({\mathcal {D}}^{(1)}\). Now, from left to right: suppose that \(Y_m\) and \(Y\) have the same distribution on \({\mathcal {D}}^{(m)}\). This means that for any \(x,x'\in {\mathcal {D}}^{(m)}\)

    $$\begin{aligned} \textrm{Cov}\left( Y_m(x),Y_{m}(x')\right) =\textrm{Cov}(Y(x),Y(x'))=C(|x-x'|). \end{aligned}$$

Therefore, for any \(i,j\)

    $$\begin{aligned} \textrm{Cov}(\xi _i^{(m)},\xi _j^{(m)})= & {} \frac{1}{\sqrt{\lambda _i\lambda _j}}\iint _{{\mathcal {D}}^{(1)}}\phi _i(t)\phi _j(t')C(|t-t'|)\textrm{d}t\textrm{d}t' \\= & {} \frac{\lambda _j}{\sqrt{\lambda _i\lambda _j}}\int _{{\mathcal {D}}^{(1)}}\phi _i(t)\phi _j(t)\textrm{d}t=\delta _{ij}, \end{aligned}$$

    which concludes the proof of the first item of the proposition.

  2. (ii)

Let us consider the simple case where \(m'=m+1\); for example, \(m=1\) and \(m'=2\). From right to left: on the one hand, we have for any \((s,t)\in {\mathcal {D}}^{(1)}\times {\mathcal {D}}^{(2)}\),

    $$\begin{aligned} \textrm{Cov}(Y(s),Y(t))=\textrm{E}[Y(s)Y(t)]=C(|s-t|). \end{aligned}$$

    On the other hand, we have

    $$\begin{aligned} \textrm{E}[Y_1(s)Y_2(t)]= & {} \sum _{i,j=1}^{+\infty }\sqrt{\lambda _i\lambda _j}\phi _i(s)\phi _j(t-S)\textrm{E}[\xi _i^{(1)}\xi _j^{(2)}]\\= & {} \sum _{i,j=1}^{+\infty }\phi _i(s)\phi _j(t-S)\int _{x=0}^S\int _{x'=S}^{2S}\phi _i(x)\phi _j(x'-S)C(|x-x'|)\textrm{d}x\textrm{d}x'\\= & {} \int _{0}^S\int _S^{2S}\sum _{i=1}^{+\infty }\phi _i(s)\phi _i(x)\sum _{j=1}^{+\infty }\phi _j(t-S)\phi _j(x'-S)C(|x-x'|)\textrm{d}x\textrm{d}x' \\= & {} \int _0^S\int _S^{2S}\delta (s-x)\delta (t-x')C(|x-x'|)\textrm{d}x\textrm{d}x' = C(|s-t|). \end{aligned}$$

    From left to right: we have

    $$\begin{aligned} \textrm{Cov}\left( \xi _{i}^{(1)}, \xi _{j}^{(2)}\right)=\, & {} \textrm{Cov}\left( \frac{1}{\sqrt{\lambda _{i}}} \int _{0}^{S} \phi _{i}(x) Y_{1}(x) \textrm{d}x, \frac{1}{\sqrt{\lambda _{j}}} \int _{t=S}^{2 S} \phi _{j}(t-S) Y_{2}(t) \textrm{d}t\right) \\= & {} \frac{1}{\sqrt{\lambda _{i} \lambda _{j}}} \int _{x=0}^{S} \int _{t=S}^{2 S} C(|x-t|) \phi _{i}(x) \phi _{j}(t-S) \textrm{d}x \textrm{d}t \\= & {} \frac{1}{\sqrt{\lambda _{i} \lambda _{j}}} \int _{x=0}^{S} \int _{x'=0}^{S} C\left( |x-x'-S|\right) \phi _{i}(x) \phi _{j}\left( x'\right) \textrm{d}x \textrm{d}x'. \end{aligned}$$

    The general case can be proved in a similar way.

\(\square\)

Proof of Proposition 2.3

  • From Equation (9), the correlation between \(\varvec{\xi }^{(m-1)}\) and \(\varvec{\xi }^{(m)}\) is imposed in order to satisfy the correlation structure between the connected subdomains \({\mathcal {D}}^{(m-1)}\) and \({\mathcal {D}}^{(m)}\)

    $$\begin{aligned} {\varvec{K}}^{m-1,m}=\textrm{Cov}\left( \varvec{\xi }^{(m-1)},\varvec{\xi }^{(m)}\right) ={\varvec{K}}, \quad \forall m\in \{2,\ldots ,M\}. \end{aligned}$$
    (B1)

    For example, when \(m=3\), we get

    $$\begin{aligned} {\varvec{K}}^{2,3}= \,& {} \textrm{Cov}\left( \varvec{\xi }^{(2)},\varvec{\xi }^{(3)}\right) =\textrm{Cov}\left( {\varvec{K}}^\top \varvec{\zeta }^{(1)}+{\varvec{L}}\varvec{\zeta }^{(2)},{\varvec{K}}^\top \varvec{\xi }^{(2)}+{\varvec{L}}\varvec{\zeta }^{(3)}\right) \\=\, & {} {\varvec{K}}^\top \textrm{Cov}\left( \varvec{\zeta }^{(1)},\varvec{\xi }^{(2)}\right) {\varvec{K}}+{\varvec{L}}\textrm{Cov}\left( \varvec{\zeta }^{(2)},\varvec{\xi }^{(2)}\right) {\varvec{K}}\\=\, & {} {\varvec{K}}^\top {\varvec{K}}{\varvec{K}}+{\varvec{L}}{\varvec{L}}^\top {\varvec{K}}=({\varvec{K}}^\top {\varvec{K}}+{\varvec{L}}{\varvec{L}}^\top ){\varvec{K}}={\varvec{K}}. \end{aligned}$$

    The general case can be proved in a similar way. The second item of Proposition 2.2 concludes the proof.

  • The second item is a simple consequence of Equations (9) and (B1). For example, when \(m=1\) and \(m'=3\), we get

    $$\begin{aligned} {\varvec{K}}^{1,3}=\, & {} \textrm{Cov}\left( \varvec{\xi }^{(1)},\varvec{\xi }^{(3)}\right) =\,\textrm{Cov}\left( \varvec{\xi }^{(1)},{\varvec{K}}^\top \varvec{\xi }^{(2)}+{\varvec{L}}\varvec{\zeta }^{(3)}\right) \\=\, & {} \textrm{Cov}\left( \varvec{\xi }^{(1)},\varvec{\xi }^{(2)}\right) {\varvec{K}}={\varvec{K}}^2. \end{aligned}$$

    In particular, for any \(m\in \{2,\ldots ,M\}\)

    $$\begin{aligned} {\varvec{K}}^{m,m}=\,\textrm{Cov}\left( \varvec{\xi }^{(m)}\right)=\, & {} \textrm{Cov}\left( {\varvec{K}}^\top \varvec{\xi }^{(m-1)}+{\varvec{L}}\varvec{\zeta }^{(m)}\right) \\=\, & {} {\varvec{K}}^\top \textrm{Cov}(\varvec{\xi }^{(m-1)}){\varvec{K}}+{\varvec{L}}\textrm{Cov}(\varvec{\zeta }^{(m)}){\varvec{L}}^\top \\=\, & {} {\varvec{K}}^\top {\varvec{K}}+{\varvec{L}}{\varvec{L}}^\top ={{\textbf {I}}}_{p}, \end{aligned}$$

    and it is evident for \(m=1\). This concludes the proof of the second item and thus of the proposition.

\(\square\)
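
The identities above can be checked numerically. The short sketch below is purely illustrative: the coupling matrix \({\varvec{K}}\) is a synthetic matrix with spectral norm below one (not one computed from a covariance function), the recursion \(\varvec{\xi }^{(m)}={\varvec{K}}^\top \varvec{\xi }^{(m-1)}+{\varvec{L}}\varvec{\zeta }^{(m)}\) is propagated over \(M\) subdomains, and the empirical covariances are compared with \({{\textbf {I}}}_{p}\) and \({\varvec{K}}^{2}\).

```python
import numpy as np

rng = np.random.default_rng(1)
p, M, n_rep = 5, 4, 200_000

A = rng.standard_normal((p, p))
K = A / (1.1 * np.linalg.norm(A, 2))          # spectral norm < 1, so I_p - K^T K is positive definite
L = np.linalg.cholesky(np.eye(p) - K.T @ K)

xi = [rng.standard_normal((n_rep, p))]        # xi^(1) = zeta^(1), one sample per row
for m in range(1, M):
    zeta = rng.standard_normal((n_rep, p))
    xi.append(xi[-1] @ K + zeta @ L.T)        # xi^(m+1) = K^T xi^(m) + L zeta^(m+1), row-wise

print(np.allclose(np.cov(xi[-1].T), np.eye(p), atol=2e-2))           # Cov(xi^(M)) ~ I_p
emp = (xi[0].T @ xi[2]) / n_rep                                       # empirical Cov(xi^(1), xi^(3))
print(np.allclose(emp, np.linalg.matrix_power(K, 2), atol=2e-2))      # ~ K^2
```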

Proof of Proposition 2.4

For the mean-square truncation error (12), on the one hand,

$$\begin{aligned}{} & {} \textrm{E}\left[ \int _{{\mathcal {D}}^{(m)}} Y_m(x)^2\textrm{d}x\right] \\{} & {} \quad =\,\textrm{E}\left[ \int _{{\mathcal {D}}^{(m)}}\sum _{i,j=1}^{+\infty }\sqrt{\lambda _i\lambda _j}\phi _i(x-(m-1)S)\phi _j(x-(m-1)S)\xi ^{(m)}_i\xi ^{(m)}_j\right] \\{} & {} \quad = \sum _{i,j=1}^{+\infty }\sqrt{\lambda _i\lambda _j}\textrm{E}[\xi ^{(m)}_i\xi ^{(m)}_j]\int _{{\mathcal {D}}^{(m)}}\phi _i(x-(m-1)S)\phi _j(x-(m-1)S)\textrm{d}x \\{} & {} \quad = \sum _{i,j=1}^{+\infty }\sqrt{\lambda _i\lambda _j}\textrm{E}[\xi ^{(m)}_i\xi ^{(m)}_j]\delta _{i,j}=\sum _{i=1}^{+\infty }\lambda _i. \end{aligned}$$

On the other hand,

$$\begin{aligned} \textrm{E}\left[ \displaystyle \int _{{\mathcal {D}}^{(m)}}(Y_m(x)-Y^p_m(x))^2\textrm{d}x\right]= & {} \textrm{E}\left[ \displaystyle \int _{{\mathcal {D}}^{(m)}}\left( \sum _{i=p+1}^{+\infty } \sqrt{\lambda _i}\phi _i(x-(m-1)S)\xi ^{(m)}_i\right) ^2\right] \\= & {} \sum _{i,j=p+1}^{+\infty }\sqrt{\lambda _i\lambda _j}\textrm{E}[\xi _i^{(m)} \xi _j^{(m)}]\delta _{i,j}=\sum _{i=p+1}^{+\infty }\lambda _i. \end{aligned}$$

Thus,

$$\begin{aligned} \epsilon _T^2=\frac{\sum _{i=p+1}^{+\infty }\lambda _i}{\sum _{i=1}^{ +\infty }\lambda _i}=\frac{\sum _{i=1}^{+\infty }\lambda _i -\sum _{i=1}^{p}\lambda _i}{\sum _{i=1}^{+\infty }\lambda _i}= 1-\frac{\sum _{i=1}^p\lambda _i}{\sum _{i=1}^{+\infty }\lambda _i}. \end{aligned}$$

For the mean-square global block error, the result is a simple consequence of the fact that any Gaussian vector can be written as follows (Maatouk et al. 2022)

$$\begin{aligned} \begin{pmatrix} Y(x_1) \\ \vdots \\ Y(x_{N}) \end{pmatrix}=S_{Y}\times \epsilon \quad \text {and} \quad \begin{pmatrix} Y_{1:M}(x_1) \\ \vdots \\ Y_{1:M}(x_{N}) \end{pmatrix}=S_{Y_{1:M}}\times \epsilon , \end{aligned}$$

where \(\epsilon\) is an N-dimensional standard Gaussian vector, chosen to be the same for \(Y\) and \(Y_{1:M}\) in order to induce a specific dependence structure. The two matrices \(S_{Y}\) and \(S_{Y_{1:M}}\) are the Cholesky factors of the covariance matrices of \(Y\) and \(Y_{1:M}\) on the grid \({\mathcal {G}}=\{x_1,\ldots ,x_{N}\}\), respectively. \(\square\)
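
In practice, the truncation error derived above can be used to select the order \(p\). The sketch below (same illustrative Matérn 5/2 kernel and subdomain length as in Appendix A; the tolerance is an arbitrary choice) approximates the eigenvalues on the first subdomain by a Nyström rule and returns the smallest \(p\) for which \(\epsilon _T^2\) falls below the tolerance; for a stationary kernel with variance \(C(0)\), the denominator \(\sum _{i=1}^{+\infty }\lambda _i\) equals \(S\,C(0)\).

```python
import numpy as np

def matern52(h, theta=0.2):
    r = np.sqrt(5.0) * np.abs(h) / theta
    return (1.0 + r + r**2 / 3.0) * np.exp(-r)

S, n_quad, tol = 0.7, 400, 1e-4
x = S * (np.arange(n_quad) + 0.5) / n_quad              # midpoint quadrature nodes on [0, S]
w = S / n_quad
lam = np.sort(np.linalg.eigvalsh(w * matern52(x[:, None] - x[None, :])))[::-1]

trace = S * matern52(0.0)                               # sum of all eigenvalues for a stationary kernel
eps2 = 1.0 - np.cumsum(lam) / trace                     # eps_T^2 as a function of p = 1, 2, ...
p = int(np.argmax(eps2 < tol)) + 1                      # smallest p with eps_T^2 below the tolerance
print(p, eps2[p - 1])
```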

The following lemma is needed for the proof of Proposition 3.2.

Lemma 4.1

Let \(W_1, W_2\) and \(W_3\) be three Gaussian random vectors such that \(W_2\) is independent of \(W_3\) and

$$\begin{aligned} W_1\overset{d}{=}f(W_2)+W_3, \end{aligned}$$

where f is a measurable function of \(W_2\). Then

$$\begin{aligned} \{W_1\ \text {s.t.} \ W_2=\beta \}\overset{d}{=}f(\beta )+W_3. \end{aligned}$$

Proof of Lemma 4.1

The proof is given in Wilson et al. (2021). \(\square\)

Proof of Proposition 3.2

We have \(\textrm{E}[\varvec{\eta }\ \text {s.t.} \ {\varvec{A}}\varvec{\eta }]=\Sigma _{\varvec{\eta },{\varvec{A}}\varvec{\eta }}\Sigma _{{\varvec{A}}\varvec{\eta },{\varvec{A}}\varvec{\eta }}^{-1}{\varvec{A}}\varvec{\eta }\). Let \(W_3=\varvec{\eta }-\Sigma _{\varvec{\eta },{\varvec{A}}\varvec{\eta }}\Sigma _{{\varvec{A}}\varvec{\eta },{\varvec{A}}\varvec{\eta }}^{-1}{\varvec{A}}\varvec{\eta }\). The two vectors \({\varvec{A}}\varvec{\eta }\) and \(W_3\) are uncorrelated:

$$\begin{aligned} \textrm{Cov}({\varvec{A}}\varvec{\eta },W_3)=\, & {} \textrm{Cov}({\varvec{A}}\varvec{\eta },\varvec{\eta }-\Sigma _{\varvec{\eta },{\varvec{A}}\varvec{\eta }}\Sigma _{{\varvec{A}}\varvec{\eta },{\varvec{A}}\varvec{\eta }}^{-1}{\varvec{A}}\varvec{\eta }) \\=\, & {} \textrm{Cov}({\varvec{A}}\varvec{\eta },\varvec{\eta })-\textrm{Cov}({\varvec{A}}\varvec{\eta },\varvec{\Gamma }{\varvec{A}}^\top ({\varvec{A}}\varvec{\Gamma }{\varvec{A}}^\top )^{-1}{\varvec{A}}\varvec{\eta }) \\= \,& {} {\varvec{A}}\varvec{\Gamma }-{\varvec{A}}\varvec{\Gamma }{\varvec{A}}^\top ({\varvec{A}}\varvec{\Gamma }{\varvec{A}}^\top )^{-1}{\varvec{A}}\varvec{\Gamma }=\, {\varvec{A}}\varvec{\Gamma }-{\varvec{A}}\varvec{\Gamma }={\varvec{0}}_{n,N}. \end{aligned}$$

Thus, \({\varvec{A}}\varvec{\eta }\) and \(W_3\) are independent (since Gaussian). Applying Lemma 4.1, we get

$$\begin{aligned} \{\varvec{\eta }\ \text {s.t.} \ {\varvec{A}}\varvec{\eta }=\,{\varvec{y}}\}&\overset{d}{=} f({\varvec{y}})+W_3=\Sigma _{\varvec{\eta },{\varvec{A}}\varvec{\eta }}\Sigma _{{\varvec{A}}\varvec{\eta },{\varvec{A}}\varvec{\eta }}^{-1}{\varvec{y}}+\varvec{\eta }-\Sigma _{\varvec{\eta },{\varvec{A}}\varvec{\eta }}\Sigma _{{\varvec{A}}\varvec{\eta },{\varvec{A}}\varvec{\eta }}^{-1}{\varvec{A}}\varvec{\eta }\\&\overset{d}{=} \varvec{\eta }+\Sigma _{\varvec{\eta },{\varvec{A}}\varvec{\eta }}\Sigma _{{\varvec{A}}\varvec{\eta },{\varvec{A}}\varvec{\eta }}^{-1}({\varvec{y}}-{\varvec{A}}\varvec{\eta }) \overset{d}{=}\varvec{\eta }+\varvec{\Gamma }{\varvec{A}}^\top ({\varvec{A}}\varvec{\Gamma }{\varvec{A}}^\top )^{-1}({\varvec{y}}-{\varvec{A}}\varvec{\eta })\\&\overset{d}{=}\varvec{\eta }+({\varvec{A}}\varvec{\Gamma })^\top ({\varvec{A}}\varvec{\Gamma }{\varvec{A}}^\top )^{-1}({\varvec{y}}-{\varvec{A}}\varvec{\eta }) \end{aligned}$$

where \(\Sigma _{\varvec{\eta },{\varvec{A}}\varvec{\eta }}=\textrm{Cov}(\varvec{\eta },{\varvec{A}}\varvec{\eta })=\textrm{Cov}(\varvec{\eta },\varvec{\eta }){\varvec{A}}^\top =\varvec{\Gamma }{\varvec{A}}^\top =({\varvec{A}}\varvec{\Gamma })^\top\) and \(\Sigma _{{\varvec{A}}\varvec{\eta },{\varvec{A}}\varvec{\eta }}=\textrm{Cov}({\varvec{A}}\varvec{\eta },{\varvec{A}}\varvec{\eta })={\varvec{A}}\textrm{Cov}(\varvec{\eta },\varvec{\eta }){\varvec{A}}^\top ={\varvec{A}}\varvec{\Gamma }{\varvec{A}}^\top\). \(\square\)
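
The identity just established is Matheron's update rule, and it translates directly into a sampling routine for the hyperplane-truncated MVN. The sketch below is a minimal illustration under assumed settings (zero-mean prior, dense covariance \(\varvec{\Gamma }\), full row rank constraint matrix \({\varvec{A}}\)); the covariance, the constraint matrix and the dimensions are toy choices, not those used in the paper's experiments.

```python
import numpy as np

def sample_hyperplane_truncated_mvn(Gamma, A, y, n_samples, rng):
    """Draw eta ~ N(0, Gamma) conditioned on A eta = y via Matheron's update rule."""
    N = Gamma.shape[0]
    eta = rng.multivariate_normal(np.zeros(N), Gamma, size=n_samples)   # unconstrained prior draws
    AG = A @ Gamma                                                       # A Gamma, shape (n, N)
    resid = y[None, :] - eta @ A.T                                       # y - A eta, one row per draw
    # correction = (A Gamma)^T (A Gamma A^T)^{-1} (y - A eta), computed row-wise
    correction = np.linalg.solve(AG @ A.T, resid.T).T @ AG
    return eta + correction

# Toy usage: N = 50 variables, n = 3 hyperplane constraints
rng = np.random.default_rng(2)
N, n = 50, 3
x = np.linspace(0.0, 1.0, N)
Gamma = np.exp(-0.5 * (x[:, None] - x[None, :])**2 / 0.2**2) + 1e-8 * np.eye(N)  # jittered SE covariance
A = rng.standard_normal((n, N))
y = rng.standard_normal(n)
samples = sample_hyperplane_truncated_mvn(Gamma, A, y, 1000, rng)
print(np.max(np.abs(samples @ A.T - y)))   # constraints satisfied up to round-off
```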

Appendix C Computational complexity of Algorithms 1 and 2

Table 5 The computational complexity of Algorithm 1. The parameters n and N are respectively the dimension of the set of constraints and of the prior MVN \(\varvec{\eta }\)
Table 6 The computational complexity of Algorithm 2 when the domain \({\mathcal {D}}\) is split into two subdomains (\(M=2\)). The parameters n and N represent the dimension of the set of constraints and of the prior MVN \(\varvec{\eta }\), respectively, while \(p\) denotes the number of terms retained in the expansion


Cite this article

Maatouk, H., Rullière, D. & Bay, X. Sampling large hyperplane-truncated multivariate normal distributions. Comput Stat 39, 1779–1806 (2024). https://doi.org/10.1007/s00180-023-01416-7
