An O(N) algorithm for computing expectation of N-dimensional truncated multi-variate normal distribution I: fundamentals

Huang, Jingfang; Cao, Jian; Fang, Fuhui; Genton, Marc G.; Keyes, David E.; Turkiyyah, George

doi:10.1007/s10444-021-09888-1

An O(N) algorithm for computing expectation of N-dimensional truncated multi-variate normal distribution I: fundamentals

Published: 01 September 2021

Volume 47, article number 65, (2021)
Cite this article

Advances in Computational Mathematics Aims and scope Submit manuscript

Jingfang Huang ORCID: orcid.org/0000-0002-6720-8389¹,
Jian Cao²,
Fuhui Fang¹,
Marc G. Genton²,
David E. Keyes² &
…
George Turkiyyah³

171 Accesses
1 Citation
Explore all metrics

Abstract

In this paper, we present the fundamentals of a hierarchical algorithm for computing the N-dimensional integral $\phi (\mathbf {a}, \mathbf {b}; A) = {\int \limits }_{\mathbf {a}}^{\mathbf {b}} H(\mathbf {x}) f(\mathbf {x} | A) \text {d} \mathbf {x}$ representing the expectation of a function H(X) where f(x|A) is the truncated multi-variate normal (TMVN) distribution with zero mean, x is the vector of integration variables for the N-dimensional random vector X, A is the inverse of the covariance matrix Σ, and a and b are constant vectors. The algorithm assumes that H(x) is “low-rank” and is designed for properly clustered X so that the matrix A has “low-rank” blocks and “low-dimensional” features. We demonstrate the divide-and-conquer idea when A is a symmetric positive definite tridiagonal matrix and present the necessary building blocks and rigorous potential theory–based algorithm analysis when A is given by the exponential covariance model. The algorithm overall complexity is O(N) for N-dimensional problems, with a prefactor determined by the rank of the off-diagonal matrix blocks and number of effective variables. Very high accuracy results for N as large as 2048 are obtained on a desktop computer with 16G memory using the fast Fourier transform (FFT) and non-uniform FFT to validate the analysis. The current paper focuses on the ideas using the simple yet representative examples where the off-diagonal matrix blocks are rank 1 and the number of effective variables is bounded by 2, to allow concise notations and easier explanation. In a subsequent paper, we discuss the generalization of current scheme using the sparse grid technique for higher rank problems and demonstrate how all the moments of k_th order or less (a total of O(N^k) integrals) can be computed using O(N^k) operations for k ≥ 2 and $O(N \log N)$ operations for k = 1.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

An O(N) algorithm for computing expectation of N-dimensional truncated multi-variate normal distribution II: computing moments and sparse grid acceleration

Article 02 November 2022

Hierarchical-block conditioning approximations for high-dimensional multivariate normal probabilities

Article 30 July 2018

Improved Bounds on Sample Size for Implicit Matrix Trace Estimators

Article 09 September 2014

References

Arellano-Valle, R.B., Azzalini, A.: On the unification of families of skew-normal distributions. Scand. J. Stat. 33(3), 561–574 (2006)
Article MathSciNet MATH Google Scholar
Arellano-Valle, R.B., Branco, M.D., Genton, M.G.: A unified view on skewed distributions arising from selections. Canadian Journal of Statistics 34 (4), 581–601 (2006)
Article MathSciNet MATH Google Scholar
Arellano-Valle, R.B., Genton, M.G.: On the exact distribution of the maximum of absolutely continuous dependent random variables. Statistics & Probability Letters 78(1), 27–35 (2008)
Article MathSciNet MATH Google Scholar
Azzalini, A., Capitanio, A.: The skew-normal and related families, vol. 3. Cambridge University Press, Cambridge (2014)
MATH Google Scholar
Castruccio, S., Huser, R., Genton, M.G.: High-order composite likelihood inference for max-stable distributions and processes. J. Comput. Graph. Stat. 25(4), 1212–1229 (2016)
Article MathSciNet Google Scholar
Huser, R., Dombry, C., Ribatet, M., Genton, M.G.: Full likelihood inference for max-stable data. Stat 8(1), e218 (2019)
Article MathSciNet Google Scholar
Genton, M.G.: Skew-elliptical distributions and their applications: a journey beyond normality. CRC Press (2004)
Genton, M.G., Keyes, D.E., Turkiyyah, G.: Hierarchical decompositions for the computation of high-dimensional multivariate normal probabilities. J. Comput. Graph. Stat. 27(2), 268–277 (2018)
Article MathSciNet Google Scholar
Stephenson, A., Tawn, J.: Exploiting occurrence times in likelihood inference for componentwise maxima. Biometrika 92(1), 213–227 (2005)
Article MathSciNet MATH Google Scholar
Barthelmann, V., Novak, E., Ritter, K.: High dimensional polynomial interpolation on sparse grids. Adv. Comput. Math. 12(4), 273–288 (2000)
Article MathSciNet MATH Google Scholar
Gerstner, T., Griebel, M.: Numerical integration using sparse grids. Numerical algorithms 18(3), 209–232 (1998)
Article MathSciNet MATH Google Scholar
Shen, J., Yu, H.: Efficient spectral sparse grid methods and applications to high-dimensional elliptic problems. SIAM J. Sci. Comput. 32(6), 3228–3250 (2010)
Article MathSciNet MATH Google Scholar
Genz, A., Bretz, F.: Computation of multivariate normal and t probabilities, vol. 195. Springer Science & Business Media, New York (2009)
Book MATH Google Scholar
Azzimonti, D., Ginsbourger, D.: Estimating orthant probabilities of high-dimensional Gaussian vectors with an application to set estimation. J. Comput. Graph. Stat. 27(2), 255–267 (2018)
Article MathSciNet Google Scholar
Botev, Z.I.: The normal law under linear restrictions: simulation and estimation via minimax tilting. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 79(1), 125–148 (2017)
Article MathSciNet MATH Google Scholar
Botev, Z.I., L’Ecuyer, P.: Efficient probability estimation and simulation of the truncated multivariate student-t distribution. In: Proceedings of the 2015 Winter Simulation Conference, pp 380–391. IEEE Press (2015)
Craig, P.: A new reconstruction of multivariate normal orthant probabilities. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 70(1), 227–243 (2008)
MathSciNet MATH Google Scholar
Fayed, H., Atiya, A.: A novel series expansion for the multivariate normal probability integrals based on Fourier series. Math. Comput. 83(289), 2385–2402 (2014)
Article MathSciNet MATH Google Scholar
Genz, A.: Numerical computation of multivariate normal probabilities. Journal of computational and graphical statistics 1(2), 141–149 (1992)
Google Scholar
Genz, A., Bretz, F., Miwa, T., Mi, X., Leisch, F., Scheipl, F., Bornkamp, B., Maechler, M., Hothorn, T.: Multivariate normal and t distributions (2014)
Geweke, J.: Efficient simulation from the multivariate normal and student-t distributions subject to linear constraints and the evaluation of constraint probabilities. Seattle, USA (1991)
Hajivassiliou, V., McFadden, D., Ruud, P.: Simulation of multivariate normal rectangle probabilities and their derivatives theoretical and computational results. Journal of econometrics 72(1-2), 85–134 (1996)
Article MathSciNet MATH Google Scholar
Keane, M.P.: 20 simulation estimation for panel data models with limited dependent variables. Handbook of Statistics 11, 545–571 (1993)
Article Google Scholar
Kuo, F., Sloan, I., Woźniakowski, H: Multivariate integration for analytic functions with Gaussian kernels. Math. Comput. 86(304), 829–853 (2017)
Article MathSciNet MATH Google Scholar
Meyer, C.: Recursive numerical evaluation of the cumulative bivariate normal distribution. arXiv preprint arXiv:1004.3616 (2010)
Miwa, T., Hayter, A.J., Kuriki, S.: The evaluation of general non-centred orthant probabilities. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 65(1), 223–234 (2003)
Article MathSciNet MATH Google Scholar
Pakman, A., Paninski, L.: Exact Hamiltonian Monte Carlo for truncated multivariate Gaussians. J. Comput. Graph. Stat. 23(2), 518–542 (2014)
Article MathSciNet Google Scholar
Phinikettos, I., Gandy, A.: Fast computation of high-dimensional multivariate normal probabilities. Computational Statistics & Data Analysis 55(4), 1521–1529 (2011)
Article MathSciNet MATH Google Scholar
Ridgway, J.: Computation of Gaussian orthant probabilities in high dimension. Statistics and computing 26(4), 899–916 (2016)
Article MathSciNet MATH Google Scholar
Wang, X.: Strong tractability of multivariate integration using quasi–Monte Carlo algorithms. Math. Comput. 72(242), 823–838 (2003)
Article MathSciNet MATH Google Scholar
Floros, D., Liu, T., Pitsianis, N., Sun, X.: Sparse dual of the density peaks algorithm for cluster analysis of high-dimensional data. In: 2018 IEEE High Performance extreme Computing Conference (HPEC), pp 1–14. IEEE (2018)
Yu, C.D., Reiz, S., Biros, G.: Distributed-memory hierarchical compression of dense SPD matrices. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, p 15. IEEE Press (2018)
Greengard, L., Gueyffier, D., Martinsson, P.-G., Rokhlin, V.: Fast direct solvers for integral equations in complex three-dimensional domains. Acta Numerica 18, 243–275 (2009)
Article MathSciNet MATH Google Scholar
Hackbusch, W.: A sparse matrix arithmetic based on ${\mathscr{H}}$-matrices. Part I: Introduction to ${\mathscr{H}}$-matrices. Computing 62(2), 89–108 (1999)
Article MathSciNet MATH Google Scholar
Hackbusch, W., Khoromskij, B.N.: A sparse ${\mathscr{H}}$-matrix arithmetic. Computing 64(1), 21–47 (2000)
Article MathSciNet MATH Google Scholar
Ho, K.L., Greengard, L.: A fast direct solver for structured linear systems by recursive skeletonization. SIAM J. Sci. Comput. 34(5), A2507–A2532 (2012)
Article MathSciNet MATH Google Scholar
Cooley, J.W., Tukey, J.W.: An algorithm for the machine calculation of complex Fourier series. Mathematics of computation 19(90), 297–301 (1965)
Article MathSciNet MATH Google Scholar
Brandt, A.: Multi-level adaptive solutions to boundary-value problems. Mathematics of computation 31(138), 333–390 (1977)
Article MathSciNet MATH Google Scholar
Hackbusch, W.: Multi-grid methods and applications, vol. 4. Springer Science & Business Media, New York (2013)
Google Scholar
Greengard, L., Rokhlin, V.: A fast algorithm for particle simulations. Journal of computational physics 73(2), 325–348 (1987)
Article MathSciNet MATH Google Scholar
Greengard, L., Rokhlin, V.: A new version of the fast multipole method for the Laplace equation in three dimensions. Acta numerica 6, 229–269 (1997)
Article MathSciNet MATH Google Scholar
Frigo, M., Johnson, S.G.: FFTW: An adaptive software architecture for the FFT. In: Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on, 3, pp. 1381–1384. IEEE (1998)
Barnett, A.H., Magland, J., af Klinteberg, L.: A parallel nonuniform fast fourier transform library based on an “exponential of semicircle” kernel. SIAM J. Sci. Comput. 41(5), C479–C504 (2019)
Article MathSciNet MATH Google Scholar
Greengard, L., Lee, J-Y: Accelerating the nonuniform fast Fourier transform. SIAM review 46(3), 443–454 (2004)
Article MathSciNet MATH Google Scholar
Lee, J-Y, Greengard, L.: The type 3 nonuniform FFT and its applications. J. Comput. Phys. 206(1), 1–5 (2005)
Article MathSciNet MATH Google Scholar
Bebendorf, M.: Hierarchical Matrices: A Means to Efficiently Solve Elliptic Boundary Value Problems. Lecture Notes in Computational Science and Engineering, vol. 63. Springer, New York (2008)
Li, K.-C.: Sliced inverse regression for dimension reduction. J. Am. Stat. Assoc. 86(414), 316–327 (1991)
Article MathSciNet MATH Google Scholar
Ho, K.L., Ying, L.: Hierarchical interpolative factorization for elliptic operators: differential equations. Commun. Pure Appl. Math. 69(8), 1415–1451 (2016)
Article MathSciNet MATH Google Scholar
Ho, K.L., Ying, L.: Hierarchical interpolative factorization for elliptic operators: integral equations. Commun. Pure Appl. Math. 69(7), 1314–1353 (2016)
Article MathSciNet MATH Google Scholar
Halko, N., Martinsson, P.-G., Tropp, J.A.: Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM review 53(2), 217–288 (2011)
Article MathSciNet MATH Google Scholar
Boyd, J.P.: Asymptotic Fourier coefficients for a $c^{\infty }$ bell (smoothed-“top-hat”) and the Fourier extension problem. J. Sci. Comput. 29(1), 1–24 (2006)
Article MathSciNet MATH Google Scholar
Abrarov, S.M., Quine, B.M.: Efficient algorithmic implementation of the Voigt/complex error function based on exponential series approximation. Appl. Math. Comput. 218(5), 1894–1902 (2011)
MathSciNet MATH Google Scholar
Abrarov, S.M., Quine, B.M.: On the Fourier expansion method for highly accurate computation of the Voigt/complex error function in a rapid algorithm. arXiv preprint arXiv:1205.1768 (2012)
Gautschi, W.: Efficient computation of the complex error function. SIAM J. Numer. Anal. 7(1), 187–198 (1970)
Article MathSciNet MATH Google Scholar
Karbach, T.M., Raven, G., Schiller, M.: Decay time integrals in neutral meson mixing and their efficient evaluation. arXiv preprint arXiv:1407.0748 (2014)
Gilbert, A.C., Indyk, P., Iwen, M., Schmidt, L.: Recent developments in the sparse Fourier transform: A compressed Fourier transform for big data. IEEE Signal Process. Mag. 31(5), 91–100 (2014)
Article Google Scholar
Nobile, F., Tempone, R., Webster, C.G.: A sparse grid stochastic collocation method for partial differential equations with random input data. SIAM J. Numer. Anal. 46(5), 2309–2345 (2008)
Article MathSciNet MATH Google Scholar
Gander, M.J.: An eigendecomposition updating algorithm for large Hermitian matrices under low rank perturbations (1998)

Download references

Acknowledgements

J. Huang was supported by the NSF grant DMS1821093, and the work was finished while he was a visiting professor at the King Abdullah University of Science and Technology, National Center for Theoretical Sciences (NCTS) in Taiwan, Mathematical Center for Interdisciplinary Research of Soochow University, and Institute for Mathematical Sciences of the National University of Singapore.

Author information

Authors and Affiliations

University of North Carolina, Chapel Hill, North Carolina, USA
Jingfang Huang & Fuhui Fang
King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
Jian Cao, Marc G. Genton & David E. Keyes
American University of Beirut, Beirut, Lebanon
George Turkiyyah

Authors

Jingfang Huang
View author publications
You can also search for this author in PubMed Google Scholar
Jian Cao
View author publications
You can also search for this author in PubMed Google Scholar
Fuhui Fang
View author publications
You can also search for this author in PubMed Google Scholar
Marc G. Genton
View author publications
You can also search for this author in PubMed Google Scholar
David E. Keyes
View author publications
You can also search for this author in PubMed Google Scholar
George Turkiyyah
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jingfang Huang.

Additional information

Communicated by: Leslie Greengard

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Potential theory–based analysis

The covariance matrix and covariance functions are often related to the solutions of elliptic partial differential equations. In the following, we apply the potential theory from the analysis of ordinary and partial differential equations and show how the divide-and-conquer strategy can be successfully performed on the hierarchical tree structure when A is the exponential matrix. A statistical analysis–based approach for a general ${\mathscr{H}}$-matrix with rank 1 off-diagonals is presented in Theorem 4. Purely numerical linear algebra–based approaches for more general cases will be addressed in subsequent papers.

Green’s functions: We present the results for b_z = 1 to simplify the notations and assume z_j ∈ [0, 1]. We start from the observation that

$$G(z,y)=\frac12 e^{-|z-y|} = \left\{ \begin{array}{l} coef \cdot g_{r}(z) \cdot g_{l}(y), \quad z\geq y, \\ coef \cdot g_{r}(y) \cdot g_{l}(z), \quad z<y \end{array} \right. $$

is the domain Green’s function of the ordinary differential equation (ODE) two-point boundary value problem

$$ \left\{ \begin{array}{l} u(z)-u^{\prime\prime}(z)=f(z), \quad z \in [0,1], \\ u(0)=u^{\prime}(0), \quad u(1)=-u^{\prime}(1), \end{array} \right. $$

(18)

where $coef=\frac 12$, g_l(z) = e^z− 1 and g_r(z) = e^1−z. The proof is a straightforward validation that $u(z)= {{\int \limits }_{0}^{1}} G(z,y) f(y) dy$ satisfies both the ODE and boundary conditions.

In the following discussions, we consider the continuous version of the original matrix problem, where the matrix A is the discretized Green’s function G(z, y), the two off-diagonal submatrices A_1,2 and A_1,3 are the discretized g_r(z) ⋅ g_l(y) and g_r(y) ⋅ g_l(z), respectively. Some simple algebra manipulations show that the submatrices $A_{1,1}- \frac {\lambda }{\gamma ^{2}} \mathbf {v} \mathbf {v}^{T}$ and $A_{1,4}-\gamma ^{2} \lambda \mathbf {u} \mathbf {u}^{T}$ can be considered as the discretized $G(z,y)-\tilde {\gamma }^{2} \cdot g_{l}(z) \cdot g_{l}(y)$ and $G(z,y)-\frac {1}{\tilde {\gamma }^{2}} \cdot g_{r}(z) \cdot g_{r}(y)$, and the coefficients $i t \frac {\sqrt {2\lambda }}{\gamma } \mathbf {v}^{T}$ and $i t \sqrt {2\lambda } \gamma \mathbf {u}^{T}$ for the linear terms of the x-variables x_1,1 and x_1,2 in Eq. (??) are the discretized $i t \tilde {\gamma } g_{l}(z)$ and $ i t \frac {1}{\tilde {\gamma }} g_{r}(z)$, respectively.

Remark: The observation also allows easy proof of the positive definiteness of the matrix A, which is the discretized Green’s function G(z, y). In order to show that for any vector f ≠ 0, the quadratic form satisfies $\frac 12 \mathbf {f}^{T} A \mathbf {f} >0$, we consider its continuous version defined as

$${{\int}_{0}^{1}} f(z) \left( {{\int}_{0}^{1}} G(z,y) f(y) dy \right) dz = {{\int}_{0}^{1}} f(z) u(z) dz$$

where $u(z)= {{\int \limits }_{0}^{1}} G(z,y) f(y) dy$ and f(y) is the continuous version of the (discretized) vector f. As $f(z)=u(z)-u^{\prime \prime }(z)$, applying the integration by parts, we have

$$ {{\int}_{0}^{1}} f(z) u(z) dz = {{\int}_{0}^{1}} \left( u^{2}(z) + (u^{\prime}(z))^{2}\right) dz -u^{\prime}(1) \cdot u(1) + u^{\prime}(0) \cdot u(0). $$

As f(z) ≠ 0, therefore u(z) ≠ 0, and applying the boundary conditions of the ODE, we have ${{\int \limits }_{0}^{1}} f(z) u(z) dz>0$. We refer to the two-variable function G(z, y) as a positive definite function. The positive definiteness of the matrix A can be proved in a similar way using the discretized integration by parts.

A particular choice of γ can be determined by considering the corresponding child ODE problems as follows. We first study the root problem and define its two children as the left child and right child, and the locations z_i of the left child and z_j of the right child satisfy the condition z_i < z_j as the z-locations of the x-variables in the two child problems are separated and ordered. We pick a location ζ between the two clusters of z-locations. Note that the choice of ζ is not unique, and a particular choice is the midpoint of the two sets. We have the following results for the root node.

Theorem 2

If we choose $\tilde {\gamma } = \frac {e^{-\zeta }}{\sqrt {2}}$, then

1.
For the left child, the new function $G_{l}(z,y)=G(z,y)-\tilde {\gamma }^{2} \cdot g_{l}(z) \cdot g_{l}(y)$ is the Green function of the ODE problem
$$ \left\{ \begin{array}{l} \mathbf{u}_1(z)-\mathbf{u}_1^{\prime\prime}(z)=f(z), \quad z \in [0,\zeta], \\ \mathbf{u}_1(0)=\mathbf{u}_1^{\prime}(0), \quad \mathbf{u}_1(\zeta)=0. \end{array} \right. $$
(19)
The function G_l(z, y) is positive definite.
2.
For the right child, the new function $G_{r}(z,y)=G(z,y)-\frac {1}{\tilde {\gamma }^{2}} \cdot g_{r}(z) \cdot g_{r}(y)$ is the Green function of the ODE problem
$$ \left\{ \begin{array}{l} \mathbf{u}_2(z)-\mathbf{u}_2^{\prime\prime}(z)=f(z), \quad z \in [\zeta, 1], \\ \mathbf{u}_2(\zeta)=0, \quad \mathbf{u}_2(1)=-\mathbf{u}_2^{\prime}(1). \end{array} \right. $$
(20)
The function G_r(z, y) is positive definite.
3.
The two child ODE problem solutions u₁(z) and u₂(z) can be derived by subtracting a single-layer potential defined at z = ζ from the parent’s solution u(z) of Eq. (18), so that solutions u₁(z) and u₂(z) satisfy the zero interface condition at z = ζ. The other boundary condition for each child ODE problem is the same as its parent’s boundary condition.

These results can be easily validated by plugging in the functions to the ODE problems. The positive definiteness of the child Green’s function can be proved using the same integration by part technique as we did for the parent’s Green’s function.

For a general parent node on the tree structure, we have the following generalized results.

Theorem 3

Consider a parent node with the corresponding function G_p(z, y) defined on the interval [a, b], and ζ is a point separating the two children’s z-locations. Then there exists a number $\tilde {\gamma }$ which depends on ζ, such that

1.
For the left child, the new function $G_{l}(z,y)=G(z,y)-\tilde {\gamma }^{2} \cdot g_{l}(z) \cdot g_{l}(y)$ is the Green function of the ODE problem
$$ \left\{ \begin{array}{l} \mathbf{u}_1(z)-\mathbf{u}_1^{\prime\prime}(z)=f(z), \quad z \in [a,\zeta], \\ \text{same boundary condition as parent at } x=a, \text{ and } \mathbf{u}_1(\zeta)=0. \end{array} \right. $$
(21)
The function G_l(z, y) is positive definite.
2.
For the right child, the new function $G_{r}(z,y)=G(z,y)-\frac {1}{\tilde {\gamma }^{2}} \cdot g_{r}(z) \cdot g_{r}(y)$ is the Green function of the ODE problem
$$ \left\{ \begin{array}{l} \mathbf{u}_2(z)-\mathbf{u}_2^{\prime\prime}(z)=f(z), \quad z \in [\zeta, b], \\ \mathbf{u}_2(\zeta)=0, \text{ and same boundary condition as parent at } x=b. \end{array} \right. $$
(22)
The function G_r(z, y) is positive definite.
3.
The two child ODE problem solutions u₁(z) and u₂(z) can be derived by subtracting a single-layer potential defined at z = ζ from the parent’s solution u(z) of Eq. (18), so that solutions u₁(z) and u₂(z) satisfy the zero interface condition at z = ζ. The other boundary condition for each child ODE problem is the same as its parent’s boundary condition.

The detailed formulas for the number $\tilde {\gamma }$ and Green’s functions are presented next. The proof of the theorem is simply validations of the formulas.

The Green function G_p(z, y) of a parent node p and the functions G₁(z, y) and G₂(z, y) of p’s left child 1 and right child 2 are defined as

$$G_{p}(z,y) = \left\{ \begin{array}{l} coef^{p} \cdot {g_{r}^{p}}(z) \cdot {g_{l}^{p}}(y), \quad y<z, \\ coef^{p} \cdot {g_{l}^{p}}(z) \cdot {g_{r}^{p}}(y), \quad y>z, \end{array} \right. $$

$$G_{1}(z,y) = \left\{ \begin{array}{l} coef^{1} \cdot {g_{r}^{1}}(z) \cdot {g_{l}^{1}}(y), \quad y<z, \\ coef^{1} \cdot {g_{l}^{1}}(z) \cdot {g_{r}^{1}}(y), \quad y>z, \end{array} \right. $$

$$G_{2}(z,y) = \left\{ \begin{array}{l} coef^{2} \cdot {g_{r}^{2}}(z) \cdot {g_{2}^{p}}(y), \quad y<z, \\ coef^{2} \cdot {g_{l}^{2}}(z) \cdot {g_{2}^{p}}(y), \quad y>z. \end{array} \right. $$

We assume parent’s z-locations satisfy z ∈ [a, b]. We choose ζ = c to separate the parent’s locations, and the child intervals are therefore [a, c] and [c, b], respectively.

Case 1: p is the root node (a = 0, b = 1): The functions are

$$ \begin{array}{lll} g_l^p(z)=e^{z-1}, & g_r^p(z)=e^{1-z}, & \quad coef^p=\frac12; \\ g_l^1(z)=e^{z-c}, & g_r^1(z)=\sinh(c-z), & \quad coef^1=1; \\ g_l^2(z)=\sinh(z-c), & g_r^2(z)=e^{c-z}, & \quad coef^2=1; \end{array} $$

(23)

Case 2: p is a left boundary node(a = 0): The functions are

$$ \begin{array}{lll} g_l^p(z)=e^{z-b}, & g_r^p(z)=\sinh(b-z), & \quad coef^p=1; \\ g_l^1(z)=e^{z-c}, & g_r^1(z)=\sinh(c-z), & \quad coef^1=1; \\ g_l^2(z)=\sinh(z-c), & g_r^2(z)=\sinh(b-z), & \quad coef^2=\frac{2 e^{b+c}}{e^{2 b}-e^{2 c}}; \end{array} $$

(24)

Case 3: p is a right boundary node(b = 1): The functions are

$$ \begin{array}{lll} g_l^p(z)=\sinh(z-a), & g_r^p(z)=e^{a-z}, & \quad coef^p=1; \\ g_l^1(z)=\sinh(z-a), & g_r^1(z)=\sinh(c-z), & \quad coef^1=\frac{2 e^{a+c}}{e^{2 c}-e^{2 a}}; \\ g_l^2(z)=\sinh(z-c), & g_r^2(z)=e^{c-z}, & \quad coef^2=1; \end{array} $$

(25)

Case 4: p is an interior node: The functions are

$$ \begin{array}{lll} g_l^p(z)=\sinh(z-a), & g_r^p(z)=\sinh(b-z), & \quad coef^p=\frac{2 e^{a+b}}{e^{2 b}-e^{2 a}}; \\ g_l^1(z)=\sinh(z-a), & g_r^1(z)=\sinh(c-z), & \quad coef^1=\frac{2 e^{a+c}}{e^{2 c}-e^{2 a}}; \\ g_l^2(z)=\sinh(z-c), & g_r^2(z)=\sinh(b-z), & \quad coef^2=\frac{2 e^{b+c}}{e^{2 b}-e^{2 c}}; \end{array} $$

(26)

Next, we present the relations of the parent p’s two w-variables ${w_{1}^{p}}$ and ${w_{2}^{p}}$ with the left child 1’s two w-variables {${w_{1}^{1}}$, ${w_{2}^{1}}$} and right child 2’s two w-variables {${w_{1}^{2}}$, ${w_{2}^{2}}$}. We use t_new to represent the new t-variable introduced to divide the parent problem into two sub-problems of child 1 and child 2. We use a unified set of basis functions for each node on the hierarchical tree structure. For the parent node, the basis functions are $\{ {{{\varPhi }}_{1}^{p}}=\cosh (z-c), {{{\varPhi }}_{2}^{p}}=\frac {\sinh (z-c)}{b-a} \}$. The basis functions for the left and right children are $\{ {{{\varPhi }}_{1}^{1}} = \cosh (z-p), {{{\varPhi }}_{2}^{1}}=\frac {\sinh (z-p)}{c-a} \}$ and $\{ {{{\varPhi }}_{1}^{2}}=\cosh (z-q), {{{\varPhi }}_{2}^{2}}=\frac {\sinh (z-q)}{b-c} \}$, respectively, where p and q are either the interface ζ points when further subdividing the two child problems, or the mid-point of the child intervals when they become leaf nodes.

Case 1: p is the root node (a = 0, b = 1): Parent has no effective w-variables.

$$ \begin{array}{l} w_1^1=\frac{t_{new} e^{p-c}}{\sqrt{2}}, \quad w_2^1=-\frac{t_{new} (a-c) e^{p-c}}{\sqrt{2}}; \\ w_1^2=\frac{t_{new} e^{c-q}}{\sqrt{2}}, \quad w_2^2=-\frac{t_{new} (b-c) e^{c-q}}{\sqrt{2}}. \end{array} $$

(27)

Case 2: p is a left boundary node(a = 0):

$$ \begin{array}{l} w_1^1=\frac{w_2^p \sinh (c-p)}{a-b}+\frac{e^p t_{new} \sqrt{e^{-2 c}-e^{-2 b}}}{\sqrt{2}}+w_1^p \cosh (c-p), \\ w_2^1=(a-c) \left( \frac{w_2^p \cosh (c-p)}{a-b}+ w_1^p \sinh (c-p)\right)-\frac{e^p t_{new} (a-c) \sqrt{e^{-2 c}-e^{-2 b}}}{\sqrt{2}}; \\ w_1^2= \frac{w_2^p \sinh (c-q)}{a-b}+t_{new} \sqrt{\coth (b-c)-1} \sinh (b-q)+w_1^p \cosh (c-q), \\ w_2^2 = \frac{(b-c) (w_1^p (b-a) \sinh (c-q)-w_2^p \cosh (c-q))}{a-b}+t_{new} (c - b) \sqrt{\coth (b - c) - 1} \cosh (b - q). \end{array} $$

(28)

Case 3: p is a right boundary node(b = 1):

$$ \begin{array}{rcl} w_1^1 &=& \frac{w_2^p \sinh (c-p)}{a-b}+t_{new} \sqrt{-\coth (a-c)-1} \sinh (p-a)+w_1^p \cosh (c-p), \\ w_2^1 &=& (a-c) \left( \frac{w_2^p \cosh (c-p)}{a-b}+w_1^p \sinh (c-p)\right) \\ & & +t_{new} (c-a) \sqrt{-\coth (a-c)-1} \cosh (a-p); \\ w_1^2 &=& \frac{w_2^p \sinh (c-q)}{a-b}+\frac{e^{-q} t_{new} \sqrt{e^{2 c}-e^{2 a}}}{\sqrt{2}}+w_1^p \cosh (c-q), \\ w_2^2 &=& \frac{e^{-q} t_{new} \sqrt{e^{2 c}-e^{2 a}} (c-b)}{\sqrt{2}}+\frac{(b-c) (w_1^p (b-a) \sinh (c-q)-w_2^p \cosh (c-q))}{a-b}. \end{array} $$

(29)

Case 4: p is an interior node:

$$ \begin{array}{rcl} w_1^1 &=& t_{new} \sinh (p-a) \sqrt{\text{csch}(a-b) \text{csch}(a-c) \sinh (b-c)} \\ & & +\frac{w_2^p \sinh (c-p)}{a-b}+w_1^p \cosh (c-p) , \\ w_2^1 &=& t_{new} (c-a) \cosh (a-p) \sqrt{\text{csch}(a-b) \text{csch}(a-c) \sinh (b-c)} \\ & & +(a-c) \left( \frac{w_2^p \cosh (c-p)}{a-b}+w_1^p \sinh (c-p)\right); \\ w_1^2 &=& t_{new} \sinh (b-q) \sqrt{\text{csch}(a-b) \sinh (a-c) \text{csch}(b-c)} \\ & & +\frac{w_2^p \sinh (c-q)}{a-b}+w_1^p \cosh (c-q), \\ w_2^2 &=& t_{new} (c-b) \cosh (b-q) \sqrt{\text{csch}(a-b) \sinh (a-c) \text{csch}(b-c)} \\ & & +\frac{(b-c) (w_1^p (b-a) \sinh (c-q)-w_2^p \cosh (c-q))}{a-b}. \end{array} $$

(30)

Mathematica files for computing these formulas are available.

Finally, we present a theorem which guarantees the positive definiteness of the child problems when a proper parameter is chosen in the divide-and-conquer scheme for a ${\mathscr{H}}$-matrix with rank 1 off-diagonal blocks. The theorem applies to both the tridiagonal and exponential cases.

Theorem 4

Consider a positive definite covariance matrix of two random vectors X and Y with rank one off-diagonal blocks in the form

(31)

where c is a constant and u = [u₁, u₂, ⋯ , u_n] and v = [v₁, v₂, ⋯ , v_n] are row vectors. Then, there exists an interval of γ values such that the matrices

$$ \left( \begin{matrix} \lambda_1^2 & & 0 \\ &\ddots&\\ 0 & & \lambda_n^2 \end{matrix} \right) - c \gamma {\mathbf{v}^t \mathbf{v}} \qquad \text{ and } \qquad \left( \begin{matrix} \sigma_1^2 & & 0 \\ &\ddots&\\ 0 & & \sigma_n^2 \end{matrix} \right) - {c} \frac{1}{\gamma} {\mathbf{u}^t \mathbf{u}} $$

(32)

are both symmetric positive definite.

Proof

Without loss of generality, we assume c > 0. Note that any symmetric positive definite matrix $\left (\begin {array}{cc} A_{1,1} & A_{1,2} \\ A_{2,1} & A_{2,2} \end {array} \right ) $ can be reduced to the standard form in Eq. (31) using an orthogonal transformation $\left (\begin {array}{cc} U & \huge 0 \\ \huge 0 & V \end {array} \right )$ where U and V are the normalized eigenvectors for A_1,1 and A_2,2, respectively. Apply Lemma 2.2 from [58], we need to find γ such that

$$ 0 \leq {c} \gamma \leq \frac{1}{{\sum}_{k=1}^{n} \frac{{v_{i}^{2}}}{{\lambda_{i}^{2}}}} \quad \text{ and } \quad 0 \leq {c} \frac{1}{\gamma} \leq \frac{1}{{\sum}_{k=1}^{n} \frac{{u_{i}^{2}}}{{\sigma_{i}^{2}}}}. $$

(33)

We consider the conditional variance

$$ \begin{array}{rcl}{{\varSigma}}_{Y|X} & = & {{\varSigma}}_{YY} - {{\varSigma}}_{YX} {{\varSigma}}_{XX}^{-1} {{\varSigma}}_{XY} \\ &=& \left( \begin{array}{rcl} {\sigma_{1}^{2}} & & 0 \\&\ddots&\\ 0 & & {\sigma_{n}^{2}} \end{array} \right) - {c \mathbf{u}^{t} }\left( \mathbf{v} \left( \begin{array}{rcl} \frac{1}{{\lambda_{1}^{2}}} & & 0 \\ &\ddots&\\ 0 & & \frac{1}{{\lambda_{n}^{2}}} \end{array} \right) {c \mathbf{v}^{t}} \right) \mathbf{u} \\ & = & \left( \begin{array}{rcl} {\sigma_{1}^{2}} & & 0 \\ &\ddots&\\ 0 & & {\sigma_{n}^{2}} \end{array} \right) - {c}^{2} \left( {\sum}_{k=1}^{n} \frac{{v_{i}^{2}}}{{\lambda_{i}^{2}}} \right) \mathbf{u}^{t} \mathbf{u} \end{array} $$

which is symmetric positive definite from statistics theory (or linear algebra). Therefore using Lemma 2.2 from [58], we have

$$0 \leq {c}^{2} \left( \sum\limits_{k=1}^{n} \frac{{v_{i}^{2}}}{{\lambda_{i}^{2}}} \right) \leq \frac{1}{ \left( {\sum}_{k=1}^{n} \frac{{u_{i}^{2}}}{{\sigma_{i}^{2}}} \right) } $$

which is equivalent to $ c {\sum }_{k=1}^{n} \frac {{u_{i}^{2}}}{{\sigma _{i}^{2}}} \leq \frac {1}{c {\sum }_{k=1}^{n} \frac {{v_{i}^{2}}}{{\lambda _{i}^{2}}}} $. Clearly, any γ value in the interval $[ c {\sum }_{k=1}^{n} \frac {{u_{i}^{2}}}{{\sigma _{i}^{2}}}, \frac {1}{c {\sum }_{k=1}^{n} \frac {{v_{i}^{2}}}{{\lambda _{i}^{2}}}} ]$ will satisfy the conditions in Eqs. (16), e.g., one can choose the mid-point of this interval to “balance” the positive definiteness of the two child problems. This completes the proof. □

Rights and permissions

Reprints and permissions

About this article

Cite this article

Huang, J., Cao, J., Fang, F. et al. An O(N) algorithm for computing expectation of N-dimensional truncated multi-variate normal distribution I: fundamentals. Adv Comput Math 47, 65 (2021). https://doi.org/10.1007/s10444-021-09888-1

Download citation

Received: 29 December 2020
Accepted: 22 July 2021
Published: 01 September 2021
DOI: https://doi.org/10.1007/s10444-021-09888-1

Keywords

Mathematics Subject Classification (2010)

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

An O(N) algorithm for computing expectation of N-dimensional truncated multi-variate normal distribution I: fundamentals

Abstract

Access this article

Similar content being viewed by others

An O(N) algorithm for computing expectation of N-dimensional truncated multi-variate normal distribution II: computing moments and sparse grid acceleration

Hierarchical-block conditioning approximations for high-dimensional multivariate normal probabilities

Improved Bounds on Sample Size for Implicit Matrix Trace Estimators

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher’s note

Appendix: Potential theory–based analysis

Theorem 2

Theorem 3

Theorem 4

Proof

Rights and permissions

About this article

Cite this article

Keywords

Mathematics Subject Classification (2010)

Navigation

An O(N) algorithm for computing expectation of N-dimensional truncated multi-variate normal distribution I: fundamentals

Abstract

Access this article

Similar content being viewed by others

An O(N) algorithm for computing expectation of N-dimensional truncated multi-variate normal distribution II: computing moments and sparse grid acceleration

Hierarchical-block conditioning approximations for high-dimensional multivariate normal probabilities

Improved Bounds on Sample Size for Implicit Matrix Trace Estimators

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher’s note

Appendix: Potential theory–based analysis

Appendix: Potential theory–based analysis

Theorem 2

Theorem 3

Theorem 4

Proof

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification (2010)

Search

Navigation