Statistics and Computing, Volume 23, Issue 4, pp 501–521

Iterative numerical methods for sampling from high dimensional Gaussian distributions

  • Erlend Aune
  • Jo Eidsvik
  • Yvo Pokern
Article

Abstract

Many applications require efficient sampling from Gaussian distributions. The method of choice depends on the dimension of the problem as well as the structure of the covariance matrix (Σ) or precision matrix (Q). The most common black-box routine for computing a sample is based on Cholesky factorization. In high dimensions, computing the Cholesky factor of Σ or Q may be prohibitive because the factor accumulates more non-zero entries than can be stored in memory. We compare different methods for computing the samples iteratively, adapting ideas from numerical linear algebra. These methods assume that matrix-vector products, Qv, are fast to compute. We show that some of the methods are competitive with and faster than Cholesky sampling, and that a parallel version of one method on a Graphics Processing Unit (GPU) using CUDA can introduce a speed-up of up to 30x. Moreover, one method is used to sample from the posterior distribution of petroleum reservoir parameters in a North Sea field, given seismic reflection data on a large 3D grid.
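
To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It draws a sample x with covariance Q⁻¹ in two ways: via a Cholesky factor of Q, and via a standard Lanczos approximation of Q^(-1/2) z that touches Q only through products Qv. The helper names (`tridiag_precision`, `sample_cholesky`, `sample_lanczos`), the tridiagonal test matrix, and the choice of m = 40 iterations are all assumptions made for illustration.

```python
import numpy as np

def tridiag_precision(n, kappa=0.5):
    # Hypothetical test precision matrix: 1D second-difference stencil plus kappa * I.
    # Symmetric positive definite for kappa > 0; stored densely only to keep the sketch short.
    return (2.0 + kappa) * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def sample_cholesky(Q, z):
    # Q = L L^T with L lower triangular; solving L^T x = z gives Cov(x) = Q^{-1}.
    L = np.linalg.cholesky(Q)
    return np.linalg.solve(L.T, z)  # generic solve; a triangular solver would be cheaper

def sample_lanczos(matvec, z, m):
    # Approximate Q^{-1/2} z from an m-dimensional Krylov space built on z,
    # using Q only through matvec(v) = Q v.
    n = z.shape[0]
    norm_z = np.linalg.norm(z)
    V = np.zeros((n, m))          # orthonormal Lanczos basis
    alpha = np.zeros(m)           # diagonal of the tridiagonal matrix T
    beta = np.zeros(max(m - 1, 0))  # off-diagonal of T
    V[:, 0] = z / norm_z
    k = m                         # number of basis vectors actually used
    for j in range(m):
        w = matvec(V[:, j])
        alpha[j] = V[:, j] @ w
        # Full reorthogonalization against all basis vectors built so far.
        w = w - V[:, :j + 1] @ (V[:, :j + 1].T @ w)
        if j == m - 1:
            break
        beta[j] = np.linalg.norm(w)
        if beta[j] < 1e-12:       # Krylov space is exhausted; stop early
            k = j + 1
            break
        V[:, j + 1] = w / beta[j]
    T = np.diag(alpha[:k]) + np.diag(beta[:k - 1], 1) + np.diag(beta[:k - 1], -1)
    theta, S = np.linalg.eigh(T)
    coeffs = S @ (S[0, :] / np.sqrt(theta))  # T^{-1/2} e_1 via the eigendecomposition of T
    return norm_z * (V[:, :k] @ coeffs)

rng = np.random.default_rng(0)
Q = tridiag_precision(200)
z = rng.standard_normal(200)      # z ~ N(0, I)
x_chol = sample_cholesky(Q, z)
x_kry = sample_lanczos(lambda v: Q @ v, z, m=40)
```

The two outputs are different vectors for the same z, because L⁻ᵀ and Q^(-1/2) are different square roots of Q⁻¹; both target the same distribution, with the Krylov sample being an approximation whose accuracy depends on m and on the conditioning of Q. In a large sparse setting, `matvec` would wrap a sparse or matrix-free product rather than a dense array.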

Keywords

Gaussian distribution, Krylov methods, Numerical linear algebra, Sampling

Notes

Acknowledgements

We thank Statoil for permission to use the Norne dataset, François Alouges for helpful discussion of the deformation method and Daniel P. Simpson for insightful discussions on the use of Krylov methods for computing matrix functions.

Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  1. Norwegian University of Science and Technology, Trondheim, Norway
  2. University College London, London, UK
