Statistics and Computing, Volume 24, Issue 2, pp 247–263

Parameter estimation in high dimensional Gaussian distributions

  • Erlend Aune
  • Daniel P. Simpson
  • Jo Eidsvik

Abstract

In order to compute the log-likelihood for high dimensional Gaussian models, it is necessary to compute the determinant of the large, sparse, symmetric positive definite precision matrix. Traditional methods for evaluating the log-likelihood, which are typically based on Cholesky factorisations, are not feasible for very large models due to the massive memory requirements. We present a novel approach for evaluating such likelihoods that only requires the computation of matrix-vector products. In this approach we utilise matrix functions, Krylov subspaces, and probing vectors to construct an iterative numerical method for computing the log-likelihood.
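
The idea of obtaining the log-determinant from matrix-vector products alone can be illustrated with a small sketch. It rests on the identity log det Q = tr(log Q) and approximates the trace with a handful of probing vectors. This is not the authors' algorithm: a truncated power series for the matrix logarithm is swapped in for the Krylov subspace machinery named above, and the probing vectors use a simple "index modulo k" colouring that only suits banded matrices. All function names, the tridiagonal test matrix, and the parameters `lam_max`, `n_colors` and `n_terms` are assumptions made for illustration.

```python
import math

def tridiag_matvec(x, diag=4.0, off=-1.0):
    """Multiply the tridiagonal matrix Q (diag on the diagonal, off beside it) by x."""
    n = len(x)
    y = [diag * xi for xi in x]
    for i in range(n - 1):
        y[i] += off * x[i + 1]
        y[i + 1] += off * x[i]
    return y

def probing_vectors(n, n_colors):
    """Indicator vectors of the index classes i mod n_colors (hypothetical colouring)."""
    return [[1.0 if i % n_colors == c else 0.0 for i in range(n)]
            for c in range(n_colors)]

def logdet_matvec(matvec, n, lam_max, n_colors=8, n_terms=80):
    """Estimate log det Q = tr(log Q) using only products with Q.

    Uses log Q = log(lam_max) I - sum_{k>=1} B^k / k with B = I - Q / lam_max,
    which converges when every eigenvalue of Q lies in (0, lam_max].
    """
    total = 0.0
    for v in probing_vectors(n, n_colors):
        w = v[:]       # holds B^k v after k updates
        quad = 0.0     # accumulates v^T (sum_k B^k / k) v
        for k in range(1, n_terms + 1):
            qw = matvec(w)
            w = [wi - qi / lam_max for wi, qi in zip(w, qw)]
            quad += sum(vi * wi for vi, wi in zip(v, w)) / k
        total += math.log(lam_max) * sum(vi * vi for vi in v) - quad
    return total

def exact_logdet_tridiag(n, diag=4.0, off=-1.0):
    """Reference value from the pivots of an LDL^T factorisation (small n only)."""
    pivot, logdet = diag, math.log(diag)
    for _ in range(n - 1):
        pivot = diag - off * off / pivot
        logdet += math.log(pivot)
    return logdet

n = 50
# lam_max = 6 is a Gershgorin bound for this particular test matrix.
estimate = logdet_matvec(tridiag_matvec, n, lam_max=6.0)
exact = exact_logdet_tridiag(n)
print(estimate, exact)
```

Eight probing vectors stand in for the fifty unit vectors an exact trace would need. This works here because the entries of log Q decay quickly away from the diagonal, so the off-diagonal terms picked up by each probing vector are small. Setting `n_colors=n` recovers the exact trace up to the truncation of the series.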

Keywords

Gaussian distribution, Krylov methods, Matrix functions, Numerical linear algebra, Estimation

Copyright information

© Springer Science+Business Media New York 2012

Authors and Affiliations

  1. Norwegian University of Science and Technology, Trondheim, Norway
