Skip to main content

An O(N) algorithm for computing expectation of N-dimensional truncated multi-variate normal distribution I: fundamentals

Abstract

In this paper, we present the fundamentals of a hierarchical algorithm for computing the N-dimensional integral \(\phi (\mathbf {a}, \mathbf {b}; A) = {\int \limits }_{\mathbf {a}}^{\mathbf {b}} H(\mathbf {x}) f(\mathbf {x} | A) \text {d} \mathbf {x}\) representing the expectation of a function H(X) where f(x|A) is the truncated multi-variate normal (TMVN) distribution with zero mean, x is the vector of integration variables for the N-dimensional random vector X, A is the inverse of the covariance matrix Σ, and a and b are constant vectors. The algorithm assumes that H(x) is “low-rank” and is designed for properly clustered X so that the matrix A has “low-rank” blocks and “low-dimensional” features. We demonstrate the divide-and-conquer idea when A is a symmetric positive definite tridiagonal matrix and present the necessary building blocks and rigorous potential theory–based algorithm analysis when A is given by the exponential covariance model. The algorithm overall complexity is O(N) for N-dimensional problems, with a prefactor determined by the rank of the off-diagonal matrix blocks and number of effective variables. Very high accuracy results for N as large as 2048 are obtained on a desktop computer with 16G memory using the fast Fourier transform (FFT) and non-uniform FFT to validate the analysis. The current paper focuses on the ideas using the simple yet representative examples where the off-diagonal matrix blocks are rank 1 and the number of effective variables is bounded by 2, to allow concise notations and easier explanation. In a subsequent paper, we discuss the generalization of current scheme using the sparse grid technique for higher rank problems and demonstrate how all the moments of kth order or less (a total of O(Nk) integrals) can be computed using O(Nk) operations for k ≥ 2 and \(O(N \log N)\) operations for k = 1.

This is a preview of subscription content, access via your institution.

References

  1. 1.

    Arellano-Valle, R.B., Azzalini, A.: On the unification of families of skew-normal distributions. Scand. J. Stat. 33(3), 561–574 (2006)

    MathSciNet  MATH  Article  Google Scholar 

  2. 2.

    Arellano-Valle, R.B., Branco, M.D., Genton, M.G.: A unified view on skewed distributions arising from selections. Canadian Journal of Statistics 34 (4), 581–601 (2006)

    MathSciNet  MATH  Article  Google Scholar 

  3. 3.

    Arellano-Valle, R.B., Genton, M.G.: On the exact distribution of the maximum of absolutely continuous dependent random variables. Statistics & Probability Letters 78(1), 27–35 (2008)

    MathSciNet  MATH  Article  Google Scholar 

  4. 4.

    Azzalini, A., Capitanio, A.: The skew-normal and related families, vol. 3. Cambridge University Press, Cambridge (2014)

    MATH  Google Scholar 

  5. 5.

    Castruccio, S., Huser, R., Genton, M.G.: High-order composite likelihood inference for max-stable distributions and processes. J. Comput. Graph. Stat. 25(4), 1212–1229 (2016)

    MathSciNet  Article  Google Scholar 

  6. 6.

    Huser, R., Dombry, C., Ribatet, M., Genton, M.G.: Full likelihood inference for max-stable data. Stat 8(1), e218 (2019)

    MathSciNet  Article  Google Scholar 

  7. 7.

    Genton, M.G.: Skew-elliptical distributions and their applications: a journey beyond normality. CRC Press (2004)

  8. 8.

    Genton, M.G., Keyes, D.E., Turkiyyah, G.: Hierarchical decompositions for the computation of high-dimensional multivariate normal probabilities. J. Comput. Graph. Stat. 27(2), 268–277 (2018)

    MathSciNet  Article  Google Scholar 

  9. 9.

    Stephenson, A., Tawn, J.: Exploiting occurrence times in likelihood inference for componentwise maxima. Biometrika 92(1), 213–227 (2005)

    MathSciNet  MATH  Article  Google Scholar 

  10. 10.

    Barthelmann, V., Novak, E., Ritter, K.: High dimensional polynomial interpolation on sparse grids. Adv. Comput. Math. 12(4), 273–288 (2000)

    MathSciNet  MATH  Article  Google Scholar 

  11. 11.

    Gerstner, T., Griebel, M.: Numerical integration using sparse grids. Numerical algorithms 18(3), 209–232 (1998)

    MathSciNet  MATH  Article  Google Scholar 

  12. 12.

    Shen, J., Yu, H.: Efficient spectral sparse grid methods and applications to high-dimensional elliptic problems. SIAM J. Sci. Comput. 32(6), 3228–3250 (2010)

    MathSciNet  MATH  Article  Google Scholar 

  13. 13.

    Genz, A., Bretz, F.: Computation of multivariate normal and t probabilities, vol. 195. Springer Science & Business Media, New York (2009)

    MATH  Book  Google Scholar 

  14. 14.

    Azzimonti, D., Ginsbourger, D.: Estimating orthant probabilities of high-dimensional Gaussian vectors with an application to set estimation. J. Comput. Graph. Stat. 27(2), 255–267 (2018)

    MathSciNet  Article  Google Scholar 

  15. 15.

    Botev, Z.I.: The normal law under linear restrictions: simulation and estimation via minimax tilting. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 79(1), 125–148 (2017)

    MathSciNet  MATH  Article  Google Scholar 

  16. 16.

    Botev, Z.I., L’Ecuyer, P.: Efficient probability estimation and simulation of the truncated multivariate student-t distribution. In: Proceedings of the 2015 Winter Simulation Conference, pp 380–391. IEEE Press (2015)

  17. 17.

    Craig, P.: A new reconstruction of multivariate normal orthant probabilities. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 70(1), 227–243 (2008)

    MathSciNet  MATH  Google Scholar 

  18. 18.

    Fayed, H., Atiya, A.: A novel series expansion for the multivariate normal probability integrals based on Fourier series. Math. Comput. 83(289), 2385–2402 (2014)

    MathSciNet  MATH  Article  Google Scholar 

  19. 19.

    Genz, A.: Numerical computation of multivariate normal probabilities. Journal of computational and graphical statistics 1(2), 141–149 (1992)

    Google Scholar 

  20. 20.

    Genz, A., Bretz, F., Miwa, T., Mi, X., Leisch, F., Scheipl, F., Bornkamp, B., Maechler, M., Hothorn, T.: Multivariate normal and t distributions (2014)

  21. 21.

    Geweke, J.: Efficient simulation from the multivariate normal and student-t distributions subject to linear constraints and the evaluation of constraint probabilities. Seattle, USA (1991)

  22. 22.

    Hajivassiliou, V., McFadden, D., Ruud, P.: Simulation of multivariate normal rectangle probabilities and their derivatives theoretical and computational results. Journal of econometrics 72(1-2), 85–134 (1996)

    MathSciNet  MATH  Article  Google Scholar 

  23. 23.

    Keane, M.P.: 20 simulation estimation for panel data models with limited dependent variables. Handbook of Statistics 11, 545–571 (1993)

    Article  Google Scholar 

  24. 24.

    Kuo, F., Sloan, I., Woźniakowski, H: Multivariate integration for analytic functions with Gaussian kernels. Math. Comput. 86(304), 829–853 (2017)

    MathSciNet  MATH  Article  Google Scholar 

  25. 25.

    Meyer, C.: Recursive numerical evaluation of the cumulative bivariate normal distribution. arXiv preprint arXiv:1004.3616 (2010)

  26. 26.

    Miwa, T., Hayter, A.J., Kuriki, S.: The evaluation of general non-centred orthant probabilities. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 65(1), 223–234 (2003)

    MathSciNet  MATH  Article  Google Scholar 

  27. 27.

    Pakman, A., Paninski, L.: Exact Hamiltonian Monte Carlo for truncated multivariate Gaussians. J. Comput. Graph. Stat. 23(2), 518–542 (2014)

    MathSciNet  Article  Google Scholar 

  28. 28.

    Phinikettos, I., Gandy, A.: Fast computation of high-dimensional multivariate normal probabilities. Computational Statistics & Data Analysis 55(4), 1521–1529 (2011)

    MathSciNet  MATH  Article  Google Scholar 

  29. 29.

    Ridgway, J.: Computation of Gaussian orthant probabilities in high dimension. Statistics and computing 26(4), 899–916 (2016)

    MathSciNet  MATH  Article  Google Scholar 

  30. 30.

    Wang, X.: Strong tractability of multivariate integration using quasi–Monte Carlo algorithms. Math. Comput. 72(242), 823–838 (2003)

    MathSciNet  MATH  Article  Google Scholar 

  31. 31.

    Floros, D., Liu, T., Pitsianis, N., Sun, X.: Sparse dual of the density peaks algorithm for cluster analysis of high-dimensional data. In: 2018 IEEE High Performance extreme Computing Conference (HPEC), pp 1–14. IEEE (2018)

  32. 32.

    Yu, C.D., Reiz, S., Biros, G.: Distributed-memory hierarchical compression of dense SPD matrices. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, p 15. IEEE Press (2018)

  33. 33.

    Greengard, L., Gueyffier, D., Martinsson, P.-G., Rokhlin, V.: Fast direct solvers for integral equations in complex three-dimensional domains. Acta Numerica 18, 243–275 (2009)

    MathSciNet  MATH  Article  Google Scholar 

  34. 34.

    Hackbusch, W.: A sparse matrix arithmetic based on \({\mathscr{H}}\)-matrices. Part I: Introduction to \({\mathscr{H}}\)-matrices. Computing 62(2), 89–108 (1999)

    MathSciNet  MATH  Article  Google Scholar 

  35. 35.

    Hackbusch, W., Khoromskij, B.N.: A sparse \({\mathscr{H}}\)-matrix arithmetic. Computing 64(1), 21–47 (2000)

    MathSciNet  MATH  Google Scholar 

  36. 36.

    Ho, K.L., Greengard, L.: A fast direct solver for structured linear systems by recursive skeletonization. SIAM J. Sci. Comput. 34(5), A2507–A2532 (2012)

    MathSciNet  MATH  Article  Google Scholar 

  37. 37.

    Cooley, J.W., Tukey, J.W.: An algorithm for the machine calculation of complex Fourier series. Mathematics of computation 19(90), 297–301 (1965)

    MathSciNet  MATH  Article  Google Scholar 

  38. 38.

    Brandt, A.: Multi-level adaptive solutions to boundary-value problems. Mathematics of computation 31(138), 333–390 (1977)

    MathSciNet  MATH  Article  Google Scholar 

  39. 39.

    Hackbusch, W.: Multi-grid methods and applications, vol. 4. Springer Science & Business Media, New York (2013)

    Google Scholar 

  40. 40.

    Greengard, L., Rokhlin, V.: A fast algorithm for particle simulations. Journal of computational physics 73(2), 325–348 (1987)

    MathSciNet  MATH  Article  Google Scholar 

  41. 41.

    Greengard, L., Rokhlin, V.: A new version of the fast multipole method for the Laplace equation in three dimensions. Acta numerica 6, 229–269 (1997)

    MathSciNet  MATH  Article  Google Scholar 

  42. 42.

    Frigo, M., Johnson, S.G.: FFTW: An adaptive software architecture for the FFT. In: Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on, 3, pp. 1381–1384. IEEE (1998)

  43. 43.

    Barnett, A.H., Magland, J., af Klinteberg, L.: A parallel nonuniform fast fourier transform library based on an “exponential of semicircle” kernel. SIAM J. Sci. Comput. 41(5), C479–C504 (2019)

    MathSciNet  MATH  Article  Google Scholar 

  44. 44.

    Greengard, L., Lee, J-Y: Accelerating the nonuniform fast Fourier transform. SIAM review 46(3), 443–454 (2004)

    MathSciNet  MATH  Article  Google Scholar 

  45. 45.

    Lee, J-Y, Greengard, L.: The type 3 nonuniform FFT and its applications. J. Comput. Phys. 206(1), 1–5 (2005)

    MathSciNet  MATH  Article  Google Scholar 

  46. 46.

    Bebendorf, M.: Hierarchical Matrices: A Means to Efficiently Solve Elliptic Boundary Value Problems. Lecture Notes in Computational Science and Engineering, vol. 63. Springer, New York (2008)

  47. 47.

    Li, K.-C.: Sliced inverse regression for dimension reduction. J. Am. Stat. Assoc. 86(414), 316–327 (1991)

    MathSciNet  MATH  Article  Google Scholar 

  48. 48.

    Ho, K.L., Ying, L.: Hierarchical interpolative factorization for elliptic operators: differential equations. Commun. Pure Appl. Math. 69(8), 1415–1451 (2016)

    MathSciNet  MATH  Article  Google Scholar 

  49. 49.

    Ho, K.L., Ying, L.: Hierarchical interpolative factorization for elliptic operators: integral equations. Commun. Pure Appl. Math. 69(7), 1314–1353 (2016)

    MathSciNet  MATH  Article  Google Scholar 

  50. 50.

    Halko, N., Martinsson, P.-G., Tropp, J.A.: Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM review 53(2), 217–288 (2011)

    MathSciNet  MATH  Article  Google Scholar 

  51. 51.

    Boyd, J.P.: Asymptotic Fourier coefficients for a \(c^{\infty }\) bell (smoothed-“top-hat”) and the Fourier extension problem. J. Sci. Comput. 29(1), 1–24 (2006)

    MathSciNet  MATH  Article  Google Scholar 

  52. 52.

    Abrarov, S.M., Quine, B.M.: Efficient algorithmic implementation of the Voigt/complex error function based on exponential series approximation. Appl. Math. Comput. 218(5), 1894–1902 (2011)

    MathSciNet  MATH  Google Scholar 

  53. 53.

    Abrarov, S.M., Quine, B.M.: On the Fourier expansion method for highly accurate computation of the Voigt/complex error function in a rapid algorithm. arXiv preprint arXiv:1205.1768 (2012)

  54. 54.

    Gautschi, W.: Efficient computation of the complex error function. SIAM J. Numer. Anal. 7(1), 187–198 (1970)

    MathSciNet  MATH  Article  Google Scholar 

  55. 55.

    Karbach, T.M., Raven, G., Schiller, M.: Decay time integrals in neutral meson mixing and their efficient evaluation. arXiv preprint arXiv:1407.0748 (2014)

  56. 56.

    Gilbert, A.C., Indyk, P., Iwen, M., Schmidt, L.: Recent developments in the sparse Fourier transform: A compressed Fourier transform for big data. IEEE Signal Process. Mag. 31(5), 91–100 (2014)

    Article  Google Scholar 

  57. 57.

    Nobile, F., Tempone, R., Webster, C.G.: A sparse grid stochastic collocation method for partial differential equations with random input data. SIAM J. Numer. Anal. 46(5), 2309–2345 (2008)

    MathSciNet  MATH  Article  Google Scholar 

  58. 58.

    Gander, M.J.: An eigendecomposition updating algorithm for large Hermitian matrices under low rank perturbations (1998)

Download references

Acknowledgements

J. Huang was supported by the NSF grant DMS1821093, and the work was finished while he was a visiting professor at the King Abdullah University of Science and Technology, National Center for Theoretical Sciences (NCTS) in Taiwan, Mathematical Center for Interdisciplinary Research of Soochow University, and Institute for Mathematical Sciences of the National University of Singapore.

Author information

Affiliations

Authors

Corresponding author

Correspondence to Jingfang Huang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Communicated by: Leslie Greengard

Appendix: Potential theory–based analysis

Appendix: Potential theory–based analysis

The covariance matrix and covariance functions are often related to the solutions of elliptic partial differential equations. In the following, we apply the potential theory from the analysis of ordinary and partial differential equations and show how the divide-and-conquer strategy can be successfully performed on the hierarchical tree structure when A is the exponential matrix. A statistical analysis–based approach for a general \({\mathscr{H}}\)-matrix with rank 1 off-diagonals is presented in Theorem 4. Purely numerical linear algebra–based approaches for more general cases will be addressed in subsequent papers.

Green’s functions: We present the results for bz = 1 to simplify the notations and assume zj ∈ [0, 1]. We start from the observation that

$$G(z,y)=\frac12 e^{-|z-y|} = \left\{ \begin{array}{l} coef \cdot g_{r}(z) \cdot g_{l}(y), \quad z\geq y, \\ coef \cdot g_{r}(y) \cdot g_{l}(z), \quad z<y \end{array} \right. $$

is the domain Green’s function of the ordinary differential equation (ODE) two-point boundary value problem

$$ \left\{ \begin{array}{l} u(z)-u^{\prime\prime}(z)=f(z), \quad z \in [0,1], \\ u(0)=u^{\prime}(0), \quad u(1)=-u^{\prime}(1), \end{array} \right. $$
(18)

where \(coef=\frac 12\), gl(z) = ez− 1 and gr(z) = e1−z. The proof is a straightforward validation that \(u(z)= {{\int \limits }_{0}^{1}} G(z,y) f(y) dy\) satisfies both the ODE and boundary conditions.

In the following discussions, we consider the continuous version of the original matrix problem, where the matrix A is the discretized Green’s function G(z, y), the two off-diagonal submatrices A1,2 and A1,3 are the discretized gr(z) ⋅ gl(y) and gr(y) ⋅ gl(z), respectively. Some simple algebra manipulations show that the submatrices \(A_{1,1}- \frac {\lambda }{\gamma ^{2}} \mathbf {v} \mathbf {v}^{T}\) and \(A_{1,4}-\gamma ^{2} \lambda \mathbf {u} \mathbf {u}^{T}\) can be considered as the discretized \(G(z,y)-\tilde {\gamma }^{2} \cdot g_{l}(z) \cdot g_{l}(y)\) and \(G(z,y)-\frac {1}{\tilde {\gamma }^{2}} \cdot g_{r}(z) \cdot g_{r}(y)\), and the coefficients \(i t \frac {\sqrt {2\lambda }}{\gamma } \mathbf {v}^{T}\) and \(i t \sqrt {2\lambda } \gamma \mathbf {u}^{T}\) for the linear terms of the x-variables x1,1 and x1,2 in Eq. (??) are the discretized \(i t \tilde {\gamma } g_{l}(z)\) and \( i t \frac {1}{\tilde {\gamma }} g_{r}(z)\), respectively.

Remark: The observation also allows easy proof of the positive definiteness of the matrix A, which is the discretized Green’s function G(z, y). In order to show that for any vector f0, the quadratic form satisfies \(\frac 12 \mathbf {f}^{T} A \mathbf {f} >0\), we consider its continuous version defined as

$${{\int}_{0}^{1}} f(z) \left( {{\int}_{0}^{1}} G(z,y) f(y) dy \right) dz = {{\int}_{0}^{1}} f(z) u(z) dz$$

where \(u(z)= {{\int \limits }_{0}^{1}} G(z,y) f(y) dy\) and f(y) is the continuous version of the (discretized) vector f. As \(f(z)=u(z)-u^{\prime \prime }(z)\), applying the integration by parts, we have

$$ {{\int}_{0}^{1}} f(z) u(z) dz = {{\int}_{0}^{1}} \left( u^{2}(z) + (u^{\prime}(z))^{2}\right) dz -u^{\prime}(1) \cdot u(1) + u^{\prime}(0) \cdot u(0). $$

As f(z) ≠ 0, therefore u(z) ≠ 0, and applying the boundary conditions of the ODE, we have \({{\int \limits }_{0}^{1}} f(z) u(z) dz>0\). We refer to the two-variable function G(z, y) as a positive definite function. The positive definiteness of the matrix A can be proved in a similar way using the discretized integration by parts.

A particular choice of γ can be determined by considering the corresponding child ODE problems as follows. We first study the root problem and define its two children as the left child and right child, and the locations zi of the left child and zj of the right child satisfy the condition zi < zj as the z-locations of the x-variables in the two child problems are separated and ordered. We pick a location ζ between the two clusters of z-locations. Note that the choice of ζ is not unique, and a particular choice is the midpoint of the two sets. We have the following results for the root node.

Theorem 2

If we choose \(\tilde {\gamma } = \frac {e^{-\zeta }}{\sqrt {2}}\), then

  1. 1.

    For the left child, the new function \(G_{l}(z,y)=G(z,y)-\tilde {\gamma }^{2} \cdot g_{l}(z) \cdot g_{l}(y)\) is the Green function of the ODE problem

    $$ \left\{ \begin{array}{l} \mathbf{u}_1(z)-\mathbf{u}_1^{\prime\prime}(z)=f(z), \quad z \in [0,\zeta], \\ \mathbf{u}_1(0)=\mathbf{u}_1^{\prime}(0), \quad \mathbf{u}_1(\zeta)=0. \end{array} \right. $$
    (19)

    The function Gl(z, y) is positive definite.

  2. 2.

    For the right child, the new function \(G_{r}(z,y)=G(z,y)-\frac {1}{\tilde {\gamma }^{2}} \cdot g_{r}(z) \cdot g_{r}(y)\) is the Green function of the ODE problem

    $$ \left\{ \begin{array}{l} \mathbf{u}_2(z)-\mathbf{u}_2^{\prime\prime}(z)=f(z), \quad z \in [\zeta, 1], \\ \mathbf{u}_2(\zeta)=0, \quad \mathbf{u}_2(1)=-\mathbf{u}_2^{\prime}(1). \end{array} \right. $$
    (20)

    The function Gr(z, y) is positive definite.

  3. 3.

    The two child ODE problem solutions u1(z) and u2(z) can be derived by subtracting a single-layer potential defined at z = ζ from the parent’s solution u(z) of Eq. (18), so that solutions u1(z) and u2(z) satisfy the zero interface condition at z = ζ. The other boundary condition for each child ODE problem is the same as its parent’s boundary condition.

These results can be easily validated by plugging in the functions to the ODE problems. The positive definiteness of the child Green’s function can be proved using the same integration by part technique as we did for the parent’s Green’s function.

For a general parent node on the tree structure, we have the following generalized results.

Theorem 3

Consider a parent node with the corresponding function Gp(z, y) defined on the interval [a, b], and ζ is a point separating the two children’s z-locations. Then there exists a number \(\tilde {\gamma }\) which depends on ζ, such that

  1. 1.

    For the left child, the new function \(G_{l}(z,y)=G(z,y)-\tilde {\gamma }^{2} \cdot g_{l}(z) \cdot g_{l}(y)\) is the Green function of the ODE problem

    $$ \left\{ \begin{array}{l} \mathbf{u}_1(z)-\mathbf{u}_1^{\prime\prime}(z)=f(z), \quad z \in [a,\zeta], \\ \text{same boundary condition as parent at } x=a, \text{ and } \mathbf{u}_1(\zeta)=0. \end{array} \right. $$
    (21)

    The function Gl(z, y) is positive definite.

  2. 2.

    For the right child, the new function \(G_{r}(z,y)=G(z,y)-\frac {1}{\tilde {\gamma }^{2}} \cdot g_{r}(z) \cdot g_{r}(y)\) is the Green function of the ODE problem

    $$ \left\{ \begin{array}{l} \mathbf{u}_2(z)-\mathbf{u}_2^{\prime\prime}(z)=f(z), \quad z \in [\zeta, b], \\ \mathbf{u}_2(\zeta)=0, \text{ and same boundary condition as parent at } x=b. \end{array} \right. $$
    (22)

    The function Gr(z, y) is positive definite.

  3. 3.

    The two child ODE problem solutions u1(z) and u2(z) can be derived by subtracting a single-layer potential defined at z = ζ from the parent’s solution u(z) of Eq. (18), so that solutions u1(z) and u2(z) satisfy the zero interface condition at z = ζ. The other boundary condition for each child ODE problem is the same as its parent’s boundary condition.

The detailed formulas for the number \(\tilde {\gamma }\) and Green’s functions are presented next. The proof of the theorem is simply validations of the formulas.

The Green function Gp(z, y) of a parent node p and the functions G1(z, y) and G2(z, y) of p’s left child 1 and right child 2 are defined as

$$G_{p}(z,y) = \left\{ \begin{array}{l} coef^{p} \cdot {g_{r}^{p}}(z) \cdot {g_{l}^{p}}(y), \quad y<z, \\ coef^{p} \cdot {g_{l}^{p}}(z) \cdot {g_{r}^{p}}(y), \quad y>z, \end{array} \right. $$
$$G_{1}(z,y) = \left\{ \begin{array}{l} coef^{1} \cdot {g_{r}^{1}}(z) \cdot {g_{l}^{1}}(y), \quad y<z, \\ coef^{1} \cdot {g_{l}^{1}}(z) \cdot {g_{r}^{1}}(y), \quad y>z, \end{array} \right. $$
$$G_{2}(z,y) = \left\{ \begin{array}{l} coef^{2} \cdot {g_{r}^{2}}(z) \cdot {g_{2}^{p}}(y), \quad y<z, \\ coef^{2} \cdot {g_{l}^{2}}(z) \cdot {g_{2}^{p}}(y), \quad y>z. \end{array} \right. $$

We assume parent’s z-locations satisfy z ∈ [a, b]. We choose ζ = c to separate the parent’s locations, and the child intervals are therefore [a, c] and [c, b], respectively.

Case 1: p is the root node (a = 0, b = 1): The functions are

$$ \begin{array}{lll} g_l^p(z)=e^{z-1}, & g_r^p(z)=e^{1-z}, & \quad coef^p=\frac12; \\ g_l^1(z)=e^{z-c}, & g_r^1(z)=\sinh(c-z), & \quad coef^1=1; \\ g_l^2(z)=\sinh(z-c), & g_r^2(z)=e^{c-z}, & \quad coef^2=1; \end{array} $$
(23)

Case 2: p is a left boundary node(a = 0): The functions are

$$ \begin{array}{lll} g_l^p(z)=e^{z-b}, & g_r^p(z)=\sinh(b-z), & \quad coef^p=1; \\ g_l^1(z)=e^{z-c}, & g_r^1(z)=\sinh(c-z), & \quad coef^1=1; \\ g_l^2(z)=\sinh(z-c), & g_r^2(z)=\sinh(b-z), & \quad coef^2=\frac{2 e^{b+c}}{e^{2 b}-e^{2 c}}; \end{array} $$
(24)

Case 3: p is a right boundary node(b = 1): The functions are

$$ \begin{array}{lll} g_l^p(z)=\sinh(z-a), & g_r^p(z)=e^{a-z}, & \quad coef^p=1; \\ g_l^1(z)=\sinh(z-a), & g_r^1(z)=\sinh(c-z), & \quad coef^1=\frac{2 e^{a+c}}{e^{2 c}-e^{2 a}}; \\ g_l^2(z)=\sinh(z-c), & g_r^2(z)=e^{c-z}, & \quad coef^2=1; \end{array} $$
(25)

Case 4: p is an interior node: The functions are

$$ \begin{array}{lll} g_l^p(z)=\sinh(z-a), & g_r^p(z)=\sinh(b-z), & \quad coef^p=\frac{2 e^{a+b}}{e^{2 b}-e^{2 a}}; \\ g_l^1(z)=\sinh(z-a), & g_r^1(z)=\sinh(c-z), & \quad coef^1=\frac{2 e^{a+c}}{e^{2 c}-e^{2 a}}; \\ g_l^2(z)=\sinh(z-c), & g_r^2(z)=\sinh(b-z), & \quad coef^2=\frac{2 e^{b+c}}{e^{2 b}-e^{2 c}}; \end{array} $$
(26)

Next, we present the relations of the parent p’s two w-variables \({w_{1}^{p}}\) and \({w_{2}^{p}}\) with the left child 1’s two w-variables {\({w_{1}^{1}}\), \({w_{2}^{1}}\)} and right child 2’s two w-variables {\({w_{1}^{2}}\), \({w_{2}^{2}}\)}. We use tnew to represent the new t-variable introduced to divide the parent problem into two sub-problems of child 1 and child 2. We use a unified set of basis functions for each node on the hierarchical tree structure. For the parent node, the basis functions are \(\{ {{{\varPhi }}_{1}^{p}}=\cosh (z-c), {{{\varPhi }}_{2}^{p}}=\frac {\sinh (z-c)}{b-a} \}\). The basis functions for the left and right children are \(\{ {{{\varPhi }}_{1}^{1}} = \cosh (z-p), {{{\varPhi }}_{2}^{1}}=\frac {\sinh (z-p)}{c-a} \}\) and \(\{ {{{\varPhi }}_{1}^{2}}=\cosh (z-q), {{{\varPhi }}_{2}^{2}}=\frac {\sinh (z-q)}{b-c} \}\), respectively, where p and q are either the interface ζ points when further subdividing the two child problems, or the mid-point of the child intervals when they become leaf nodes.

Case 1: p is the root node (a = 0, b = 1): Parent has no effective w-variables.

$$ \begin{array}{l} w_1^1=\frac{t_{new} e^{p-c}}{\sqrt{2}}, \quad w_2^1=-\frac{t_{new} (a-c) e^{p-c}}{\sqrt{2}}; \\ w_1^2=\frac{t_{new} e^{c-q}}{\sqrt{2}}, \quad w_2^2=-\frac{t_{new} (b-c) e^{c-q}}{\sqrt{2}}. \end{array} $$
(27)

Case 2: p is a left boundary node(a = 0):

$$ \begin{array}{l} w_1^1=\frac{w_2^p \sinh (c-p)}{a-b}+\frac{e^p t_{new} \sqrt{e^{-2 c}-e^{-2 b}}}{\sqrt{2}}+w_1^p \cosh (c-p), \\ w_2^1=(a-c) \left( \frac{w_2^p \cosh (c-p)}{a-b}+ w_1^p \sinh (c-p)\right)-\frac{e^p t_{new} (a-c) \sqrt{e^{-2 c}-e^{-2 b}}}{\sqrt{2}}; \\ w_1^2= \frac{w_2^p \sinh (c-q)}{a-b}+t_{new} \sqrt{\coth (b-c)-1} \sinh (b-q)+w_1^p \cosh (c-q), \\ w_2^2 = \frac{(b-c) (w_1^p (b-a) \sinh (c-q)-w_2^p \cosh (c-q))}{a-b}+t_{new} (c - b) \sqrt{\coth (b - c) - 1} \cosh (b - q). \end{array} $$
(28)

Case 3: p is a right boundary node(b = 1):

$$ \begin{array}{rcl} w_1^1 &=& \frac{w_2^p \sinh (c-p)}{a-b}+t_{new} \sqrt{-\coth (a-c)-1} \sinh (p-a)+w_1^p \cosh (c-p), \\ w_2^1 &=& (a-c) \left( \frac{w_2^p \cosh (c-p)}{a-b}+w_1^p \sinh (c-p)\right) \\ & & +t_{new} (c-a) \sqrt{-\coth (a-c)-1} \cosh (a-p); \\ w_1^2 &=& \frac{w_2^p \sinh (c-q)}{a-b}+\frac{e^{-q} t_{new} \sqrt{e^{2 c}-e^{2 a}}}{\sqrt{2}}+w_1^p \cosh (c-q), \\ w_2^2 &=& \frac{e^{-q} t_{new} \sqrt{e^{2 c}-e^{2 a}} (c-b)}{\sqrt{2}}+\frac{(b-c) (w_1^p (b-a) \sinh (c-q)-w_2^p \cosh (c-q))}{a-b}. \end{array} $$
(29)

Case 4: p is an interior node:

$$ \begin{array}{rcl} w_1^1 &=& t_{new} \sinh (p-a) \sqrt{\text{csch}(a-b) \text{csch}(a-c) \sinh (b-c)} \\ & & +\frac{w_2^p \sinh (c-p)}{a-b}+w_1^p \cosh (c-p) , \\ w_2^1 &=& t_{new} (c-a) \cosh (a-p) \sqrt{\text{csch}(a-b) \text{csch}(a-c) \sinh (b-c)} \\ & & +(a-c) \left( \frac{w_2^p \cosh (c-p)}{a-b}+w_1^p \sinh (c-p)\right); \\ w_1^2 &=& t_{new} \sinh (b-q) \sqrt{\text{csch}(a-b) \sinh (a-c) \text{csch}(b-c)} \\ & & +\frac{w_2^p \sinh (c-q)}{a-b}+w_1^p \cosh (c-q), \\ w_2^2 &=& t_{new} (c-b) \cosh (b-q) \sqrt{\text{csch}(a-b) \sinh (a-c) \text{csch}(b-c)} \\ & & +\frac{(b-c) (w_1^p (b-a) \sinh (c-q)-w_2^p \cosh (c-q))}{a-b}. \end{array} $$
(30)

Mathematica files for computing these formulas are available.

Finally, we present a theorem which guarantees the positive definiteness of the child problems when a proper parameter is chosen in the divide-and-conquer scheme for a \({\mathscr{H}}\)-matrix with rank 1 off-diagonal blocks. The theorem applies to both the tridiagonal and exponential cases.

Theorem 4

Consider a positive definite covariance matrix of two random vectors X and Y with rank one off-diagonal blocks in the form

(31)

where c is a constant and u = [u1, u2, ⋯ , un] and v = [v1, v2, ⋯ , vn] are row vectors. Then, there exists an interval of γ values such that the matrices

$$ \left( \begin{matrix} \lambda_1^2 & & 0 \\ &\ddots&\\ 0 & & \lambda_n^2 \end{matrix} \right) - c \gamma {\mathbf{v}^t \mathbf{v}} \qquad \text{ and } \qquad \left( \begin{matrix} \sigma_1^2 & & 0 \\ &\ddots&\\ 0 & & \sigma_n^2 \end{matrix} \right) - {c} \frac{1}{\gamma} {\mathbf{u}^t \mathbf{u}} $$
(32)

are both symmetric positive definite.

Proof

Without loss of generality, we assume c > 0. Note that any symmetric positive definite matrix \(\left (\begin {array}{cc} A_{1,1} & A_{1,2} \\ A_{2,1} & A_{2,2} \end {array} \right ) \) can be reduced to the standard form in Eq. (31) using an orthogonal transformation \(\left (\begin {array}{cc} U & \huge 0 \\ \huge 0 & V \end {array} \right )\) where U and V are the normalized eigenvectors for A1,1 and A2,2, respectively. Apply Lemma 2.2 from [58], we need to find γ such that

$$ 0 \leq {c} \gamma \leq \frac{1}{{\sum}_{k=1}^{n} \frac{{v_{i}^{2}}}{{\lambda_{i}^{2}}}} \quad \text{ and } \quad 0 \leq {c} \frac{1}{\gamma} \leq \frac{1}{{\sum}_{k=1}^{n} \frac{{u_{i}^{2}}}{{\sigma_{i}^{2}}}}. $$
(33)

We consider the conditional variance

$$ \begin{array}{rcl}{{\varSigma}}_{Y|X} & = & {{\varSigma}}_{YY} - {{\varSigma}}_{YX} {{\varSigma}}_{XX}^{-1} {{\varSigma}}_{XY} \\ &=& \left( \begin{array}{rcl} {\sigma_{1}^{2}} & & 0 \\&\ddots&\\ 0 & & {\sigma_{n}^{2}} \end{array} \right) - {c \mathbf{u}^{t} }\left( \mathbf{v} \left( \begin{array}{rcl} \frac{1}{{\lambda_{1}^{2}}} & & 0 \\ &\ddots&\\ 0 & & \frac{1}{{\lambda_{n}^{2}}} \end{array} \right) {c \mathbf{v}^{t}} \right) \mathbf{u} \\ & = & \left( \begin{array}{rcl} {\sigma_{1}^{2}} & & 0 \\ &\ddots&\\ 0 & & {\sigma_{n}^{2}} \end{array} \right) - {c}^{2} \left( {\sum}_{k=1}^{n} \frac{{v_{i}^{2}}}{{\lambda_{i}^{2}}} \right) \mathbf{u}^{t} \mathbf{u} \end{array} $$

which is symmetric positive definite from statistics theory (or linear algebra). Therefore using Lemma 2.2 from [58], we have

$$0 \leq {c}^{2} \left( \sum\limits_{k=1}^{n} \frac{{v_{i}^{2}}}{{\lambda_{i}^{2}}} \right) \leq \frac{1}{ \left( {\sum}_{k=1}^{n} \frac{{u_{i}^{2}}}{{\sigma_{i}^{2}}} \right) } $$

which is equivalent to \( c {\sum }_{k=1}^{n} \frac {{u_{i}^{2}}}{{\sigma _{i}^{2}}} \leq \frac {1}{c {\sum }_{k=1}^{n} \frac {{v_{i}^{2}}}{{\lambda _{i}^{2}}}} \). Clearly, any γ value in the interval \([ c {\sum }_{k=1}^{n} \frac {{u_{i}^{2}}}{{\sigma _{i}^{2}}}, \frac {1}{c {\sum }_{k=1}^{n} \frac {{v_{i}^{2}}}{{\lambda _{i}^{2}}}} ]\) will satisfy the conditions in Eqs. (16), e.g., one can choose the mid-point of this interval to “balance” the positive definiteness of the two child problems. This completes the proof. □

Rights and permissions

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Huang, J., Cao, J., Fang, F. et al. An O(N) algorithm for computing expectation of N-dimensional truncated multi-variate normal distribution I: fundamentals. Adv Comput Math 47, 65 (2021). https://doi.org/10.1007/s10444-021-09888-1

Download citation

Keywords

  • Exponential covariance model
  • Fourier transform
  • Hierarchical algorithm
  • Low-dimensional structure
  • Low-rank structure
  • Truncated multi-variate normal distribution

Mathematics Subject Classification (2010)

  • 03D20
  • 34B27
  • 62H10
  • 65C60
  • 65D30
  • 65T40