Statistics and Computing

Volume 24, Issue 2, pp 247–263

Parameter estimation in high dimensional Gaussian distributions


Abstract

In order to compute the log-likelihood for high dimensional Gaussian models, it is necessary to compute the determinant of the large, sparse, symmetric positive definite precision matrix. Traditional methods for evaluating the log-likelihood, which are typically based on Cholesky factorisations, are not feasible for very large models due to the massive memory requirements. We present a novel approach for evaluating such likelihoods that only requires the computation of matrix-vector products. In this approach we utilise matrix functions, Krylov subspaces, and probing vectors to construct an iterative numerical method for computing the log-likelihood.
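As a rough illustration of how these ingredients can fit together, the sketch below uses the identity log det(Q) = tr(log Q) and approximates the trace as a sum of quadratic forms v^T log(Q) v over probing vectors, each evaluated with a short Lanczos (Krylov) recursion that touches Q only through matrix-vector products. It is a minimal sketch under stated assumptions, not the implementation from the article: the function names, the tridiagonal test precision matrix, the simple modular colouring of a one-dimensional chain, and the parameter values are all hypothetical choices made for the example.

```python
import numpy as np


def lanczos_log_quadratic_form(matvec, v, m):
    """Approximate v^T log(Q) v with at most m Lanczos steps.

    Q is symmetric positive definite and is accessed only through matvec.
    No reorthogonalisation is done, which is adequate for small m.
    """
    norm_v = np.linalg.norm(v)
    q = v / norm_v
    q_prev = np.zeros_like(q)
    alphas, betas = [], []
    beta = 0.0
    for _ in range(m):
        w = matvec(q) - beta * q_prev
        alpha = q @ w
        w = w - alpha * q
        alphas.append(alpha)
        beta = np.linalg.norm(w)
        if beta < 1e-12:  # Krylov subspace is invariant; stop early
            break
        betas.append(beta)
        q_prev, q = q, w / beta
    k = len(alphas)
    off = betas[:k - 1]
    # Small tridiagonal matrix whose eigenpairs give the quadrature rule.
    T = np.diag(alphas) + np.diag(off, 1) + np.diag(off, -1)
    theta, S = np.linalg.eigh(T)
    return norm_v**2 * np.sum(S[0, :] ** 2 * np.log(theta))


def probing_logdet(matvec, n, num_colours, m):
    """Estimate log det(Q) = tr(log Q) with one probing vector per colour.

    Assumption: node i gets colour i mod num_colours, which separates
    same-coloured nodes by num_colours steps on a 1D chain graph.
    """
    total = 0.0
    for j in range(num_colours):
        v = np.zeros(n)
        v[j::num_colours] = 1.0  # indicator of colour class j
        total += lanczos_log_quadratic_form(matvec, v, m)
    return total


n = 200


def chain_matvec(x):
    """Matrix-free product with a hypothetical tridiagonal precision matrix
    (3 on the diagonal, -1 on the off-diagonals)."""
    y = 3.0 * x
    y[:-1] -= x[1:]
    y[1:] -= x[:-1]
    return y


estimate = probing_logdet(chain_matvec, n, num_colours=20, m=30)

# Dense reference value, only feasible because this example is small.
Q_dense = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
exact = np.linalg.slogdet(Q_dense)[1]
print(estimate, exact)
```

The probing vectors here are deterministic, so the only errors come from the truncated Lanczos recursion and from entries of log(Q) linking nodes of the same colour. Those entries decay with graph distance for a well-conditioned sparse matrix, which is why a modest number of colours can suffice in this toy setting. For a general sparsity pattern the colouring would have to come from a graph-colouring heuristic rather than the modular rule assumed above.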

Keywords

Gaussian distribution, Krylov methods, Matrix functions, Numerical linear algebra, Estimation


Copyright information

© Springer Science+Business Media New York 2012

Authors and Affiliations

  1. Norwegian University of Science and Technology, Trondheim, Norway
