Efficient reduced-rank methods for Gaussian processes with eigenfunction expansions

Abstract

In this work, we introduce a reduced-rank algorithm for Gaussian process regression. Our numerical scheme converts a Gaussian process on a user-specified interval to its Karhunen–Loève expansion, the \(L^2\)-optimal reduced-rank representation. Numerical evaluation of the Karhunen–Loève expansion is performed once during precomputation and involves computing a numerical eigendecomposition of an integral operator whose kernel is the covariance function of the Gaussian process. The Karhunen–Loève expansion is independent of observed data and depends only on the covariance kernel and the size of the interval on which the Gaussian process is defined. The scheme of this paper does not require translation invariance of the covariance kernel. We also introduce a class of fast algorithms for Bayesian fitting of hyperparameters and demonstrate the performance of our algorithms with numerical experiments in one and two dimensions. Extensions to higher dimensions are mathematically straightforward but suffer from the standard curses of high dimensions.
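For concreteness, the following is a minimal sketch, in Python, of the kind of precomputation described above: a Nyström-style discretization of the covariance integral operator using a Gauss–Legendre rule, followed by an eigendecomposition whose leading eigenpairs form a truncated Karhunen–Loève expansion. This is an illustration of ours, not the authors' code; the squared-exponential kernel, the interval, and all parameter values are assumptions made for the example.

import numpy as np

def kl_expansion(kernel, a, b, n, rank):
    """Truncated Karhunen-Loeve expansion of a GP on [a, b], computed by
    a Nystrom discretization of the covariance integral operator."""
    # Gauss-Legendre nodes and weights on [-1, 1], mapped to [a, b].
    x, w = np.polynomial.legendre.leggauss(n)
    x = 0.5 * (b - a) * x + 0.5 * (b + a)
    w = 0.5 * (b - a) * w
    # Symmetrized Nystrom matrix A = W^{1/2} K W^{1/2}.
    K = kernel(x[:, None], x[None, :])
    s = np.sqrt(w)
    lam, u = np.linalg.eigh(s[:, None] * K * s[None, :])
    lam = np.clip(lam, 0.0, None)        # guard against roundoff negatives
    # eigh returns eigenvalues in ascending order; keep the `rank` largest.
    lam, u = lam[-rank:][::-1], u[:, -rank:][:, ::-1]
    phi = u / s[:, None]                 # eigenfunction values at the nodes
    return lam, phi, x, w

# Illustrative squared-exponential kernel with length scale 0.3.
se = lambda s, t: np.exp(-((s - t) ** 2) / (2 * 0.3**2))
lam, phi, x, w = kl_expansion(se, -1.0, 1.0, n=100, rank=20)

# A sample path of the GP at the nodes: f = sum_i sqrt(lam_i) z_i phi_i.
z = np.random.standard_normal(lam.size)
sample = phi @ (np.sqrt(lam) * z)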

References

  • Abramowitz, M., Stegun, I.A. (eds.): Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. National Bureau of Standards, Washington, D.C (1964)

  • Ambikasaran, S., Foreman-Mackey, D., Greengard, L., Hogg, D.W., O’Neil, M.: Fast direct methods for Gaussian processes. IEEE Trans. Pattern Anal. Mach. Intell. 38(2), 252–265 (2016)

  • Baugh, S., Stein, M.L.: Computationally efficient spatial modeling using recursive skeletonization factorizations. Spat. Stat. 27, 18–30 (2018)

  • Carpenter, B., Gelman, A., Hoffman, M.D., Lee, D., Goodrich, B., Betancourt, M., Brubaker, M., Guo, J., Li, P., Riddell, A.: Stan: a probabilistic programming language. J. Stat. Softw. 76(1), 1–32 (2017)

  • Cressie, N.: Statistics for Spatial Data, Revised Edition. Wiley-Interscience, Hoboken (2015)

  • Datta, A., Banerjee, S., Finley, A.O., Gelfand, A.E.: Hierarchical nearest-neighbor Gaussian process models for large geostatistical datasets. J. Am. Stat. Assoc. 111(514), 800–812 (2016)

  • Driscoll, T.A., Hale, N., Trefethen, L.N.: Chebfun Guide. Pafnuty Publications, Oxford (2014)

  • Filip, S., Javeed, A., Trefethen, L.N.: Smooth random functions, random ODEs, and Gaussian processes. SIAM Rev. 61(1), 185–205 (2019)

  • Foreman-Mackey, D., Agol, E., Ambikasaran, S., Angus, R.: Fast and scalable Gaussian process modeling with applications to astronomical time series. Astron. J. 154(6), 220 (2017)

  • Gelman, A., Carlin, J.B., Stern, H.S., Dunson, D.B., Vehtari, A., Rubin, D.B.: Bayesian Data Analysis, 3rd edn. Chapman and Hall/CRC, New York (2013)

  • Gonzalvez, J., Lezmi, E., Roncalli, T., Xu, J.: Financial applications of Gaussian processes and Bayesian optimization. arXiv:1903.04841 (2019)

  • Greengard, P., Gelman, A., Vehtari, A.: A fast regression via SVD and marginalization. Comput. Stat. 37, 701–720 (2021)

  • Hastie, T., Tibshirani, R., Friedman, J.: Elements of Statistical Learning. Springer Series in Statistics, 2nd edn. Springer, New York (2009)

  • Kress, R.: Linear Integral Equations. Springer, New York (1999)

  • Lalchand, V., Rasmussen, C.E.: Approximate inference for fully Bayesian Gaussian process regression. In: Symposium on Advances in Approximate Bayesian Inference, pp. 1–12. PMLR (2020)

  • Lázaro-Gredilla, M., Quiñonero-Candela, J., Rasmussen, C.E., Figueiras-Vidal, A.R.: Sparse spectrum Gaussian process regression. J. Mach. Learn. Res. 11(63), 1865–1881 (2010)

  • Loève, M.: Probability Theory I. Springer, New York (1977)

  • Minden, V., Damle, A., Ho, K.L., Ying, L.: Fast spatial Gaussian process maximum likelihood estimation via skeletonization factorizations. Multiscale Model. Simul. 15(4), 1584–1611 (2017)

  • Nolan, J.P.: Univariate Stable Distributions. Springer, New York (2020)

  • Quiñonero-Candela, J., Rasmussen, C.E.: Analysis of some methods for reduced rank Gaussian process regression. In: Murray-Smith, R., Shorten, R. (eds.) Switching and Learning in Feedback Systems, pp. 98–127. Springer, Berlin (2005)

  • Rahimi, A., Recht, B.: Random features for large-scale kernel machines. In: Platt, J., Koller, D., Singer, Y., Roweis, S. (eds.) Advances in Neural Information Processing Systems, vol. 20. Curran Associates Inc., Red Hook (2008)

  • Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press, Cambridge (2006)

  • Riesz, F., Sz-Nagy, B.: Functional Analysis. Frederick Ungar Publishing Co., New York (1955)

  • Riutort-Mayol, G., Bürkner, P.-C., Andersen, M.R., Solin, A., Vehtari, A.: Practical Hilbert space approximate Bayesian Gaussian processes for probabilistic programming (2020)

  • Schwab, C., Todor, R.A.: Karhunen–Loève approximation of random fields by generalized fast multipole methods. J. Comput. Phys. 217, 100–122 (2006)

  • Solin, A., Särkkä, S.: Hilbert space methods for reduced-rank Gaussian process regression. Stat. Comput. 30, 419–446 (2020a)

  • Solin, A., Särkkä, S.: Hilbert space methods for reduced-rank Gaussian process regression. https://github.com/AaltoML/hilbert-gp (2020b)

  • Stoer, J., Bulirsch, R.: Introduction to Numerical Analysis, 2nd edn. Springer, New York (1992)

  • Trefethen, L.N.: Approximation Theory and Approximation Practice, Extended Edition. SIAM, Philadelphia (2020)

  • Xiu, D.: Numerical Methods for Stochastic Computations. Princeton University Press, Princeton (2010)

  • Yarvin, N., Rokhlin, V.: Generalized Gaussian quadratures and singular value decompositions of integral operators. SIAM J. Sci. Comput. 20(2), 699–720 (1998)

Acknowledgements

The authors are grateful to Paul Beckman, Dan Foreman-Mackey, Jeremy Hoskins, Manas Rachh, and Vladimir Rokhlin for helpful discussions. The first author is supported by the Alfred P. Sloan Foundation. The second author is supported in part by the Office of Naval Research under award number N00014-21-1-2383 and by the Simons Foundation/SFARI (560651, AB).

Author information

Correspondence to Philip Greengard.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A Legendre polynomials

We now provide a brief overview of Legendre polynomials and Gaussian quadrature (Abramowitz and Stegun 1964). For a more in-depth analysis of these tools and their role in numerical approximation theory see, for example, Trefethen (2020).

In accordance with standard practice, we denote by \(P_i: [-1, 1] \rightarrow \mathbb {R}\) the Legendre polynomial of degree i defined by the three-term recursion

$$\begin{aligned} P_{i+1}(x) = \frac{2i + 1}{i + 1} \, x \, P_{i}(x) - \frac{i}{i + 1} P_{i-1}(x) \end{aligned}$$
(A.1)

with initial conditions

$$\begin{aligned} P_0(x) = 1 \qquad \text {and} \qquad P_1(x) = x. \end{aligned}$$
(A.2)
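For illustration, the recursion (A.1) with initial conditions (A.2) can be implemented directly; the short Python sketch below is ours, not part of the original text.

import numpy as np

def legendre_eval(n, x):
    """Evaluate P_0(x), ..., P_n(x) using the recursion (A.1)-(A.2)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    P = np.empty((n + 1, x.size))
    P[0] = 1.0                     # P_0(x) = 1
    if n >= 1:
        P[1] = x                   # P_1(x) = x
    for i in range(1, n):
        # (A.1): P_{i+1} = ((2i+1) x P_i - i P_{i-1}) / (i+1)
        P[i + 1] = ((2 * i + 1) * x * P[i] - i * P[i - 1]) / (i + 1)
    return P

# Sanity check: P_2(x) = (3x^2 - 1)/2, so P_2(0.5) = -0.125.
assert np.isclose(legendre_eval(2, 0.5)[2, 0], -0.125)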

Legendre polynomials are orthogonal on \([-1, 1]\) and satisfy

$$\begin{aligned} \int _{-1}^{1} P_i(x) \, P_j(x) \, \mathrm{d}x = {\left\{ \begin{array}{ll} 0 &{} i \ne j, \\ \frac{2}{2i+1} &{} i = j. \end{array}\right. } \end{aligned}$$

We denote by \(\overline{P}_i\) the \(L^2\)-normalized Legendre polynomials, defined by

$$\begin{aligned} \overline{P}_i(x) = \sqrt{\frac{2i + 1}{2}} P_i(x). \end{aligned}$$
(A.3)

For each n, the Legendre polynomial \(P_n\) has n distinct roots in \((-1, 1)\), which we denote in what follows by \(x_1,\ldots ,x_n\). Furthermore, for each n, there exist n positive real numbers \(w_1,\ldots ,w_n\) such that for any polynomial p of degree \( \le 2n - 1\),

$$\begin{aligned} \int _{-1}^1 p(x) \, \mathrm{d}x = \sum _{i=1}^n w_i \, p(x_i). \end{aligned}$$
(A.4)

The roots \(x_1,\ldots ,x_n\) are usually referred to as the order-n Gaussian nodes and \(w_1,\ldots ,w_n\) as the associated Gaussian quadrature weights. Classical Gaussian quadratures such as this one exist for many families of orthogonal polynomials: Chebyshev, Hermite, Laguerre, etc. The quadrature above, associated with Legendre polynomials, provides a high-order method for discretizing (i.e., interpolating) and integrating square-integrable functions on a finite interval. Legendre polynomials are the natural orthogonal polynomial basis for square-integrable functions on the interval \([-1, 1]\), and the associated interpolation and quadrature formulae provide nearly optimal approximation tools for such functions, even when they are not, in fact, polynomials.
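As a numerical illustration (ours, using NumPy's built-in Gauss–Legendre routine), the sketch below checks the exactness property (A.4) on an arbitrary polynomial of degree at most \(2n - 1\), as well as the orthogonality relation satisfied by the Legendre polynomials.

import numpy as np

n = 8
# Order-n Gaussian nodes and weights on [-1, 1].
x, w = np.polynomial.legendre.leggauss(n)

# (A.4): the rule is exact for polynomials of degree <= 2n - 1 = 15.
p = lambda t: 3 * t ** 14 - t ** 7 + 2 * t ** 2 + 1
exact = 2 * 3 / 15 + 2 * 2 / 3 + 2      # odd-degree terms integrate to zero
assert np.isclose(w @ p(x), exact)

# Orthogonality: the integral of P_i P_j over [-1, 1] is 2/(2i+1) if i = j,
# else 0. legvander gives P[j, i] = P_i(x_j); each product P_i P_j has degree
# <= 2n - 2, so the same rule computes these integrals exactly.
P = np.polynomial.legendre.legvander(x, n - 1)
gram = P.T @ (w[:, None] * P)
assert np.allclose(gram, np.diag(2.0 / (2 * np.arange(n) + 1)))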

The following well-known theorem regarding interpolation using Legendre polynomials will be used in the numerical schemes discussed in this paper. A proof can be found in Stoer and Bulirsch (1992), for example.

Theorem 4

Let \(x_1,\ldots ,x_n\) be the order-n Gaussian nodes and \(w_1,\ldots ,w_n\) the associated order-n Gaussian weights. Then, there exists an \(n \times n\) matrix \(\varvec{\mathsf {M}}\) that maps the values of a function tabulated at these Gaussian nodes to the coefficients of the corresponding Legendre expansion, i.e., of the interpolating polynomial expressed in terms of Legendre polynomials. That is to say, defining \(\varvec{\mathsf {f}}\) by

$$\begin{aligned} \varvec{\mathsf {f}} = \left( f(x_1) \cdots f(x_n) \right) ^{\mathsf {T}}, \end{aligned}$$
(A.5)

the vector

$$\begin{aligned} {{\varvec{\alpha }}} = \varvec{\mathsf {M}}\varvec{\mathsf {f}} \end{aligned}$$
(A.6)

contains the coefficients of the order-n Legendre expansion p such that

$$\begin{aligned} \begin{aligned} p(x_j)&= \sum _{i=1}^n \alpha _i \, P_{i-1}(x_j)\\&= f(x_j), \end{aligned} \end{aligned}$$
(A.7)

where \(\alpha _i\) denotes the ith entry of the vector \({{\varvec{\alpha }}}\).

From a computational standpoint, algorithms for efficient evaluation of Legendre polynomials and of Gaussian nodes and weights are available in standard software packages (e.g., Driscoll et al. 2014). Furthermore, the entries of the matrix \(\varvec{\mathsf {M}}\) can be computed directly via \(M_{i, j} = \frac{2i - 1}{2} \, w_j \, P_{i-1}(x_j)\), where the factor \(\frac{2i - 1}{2}\) is the square of the normalization constant appearing in (A.3).
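The following sketch (ours) assembles \(\varvec{\mathsf {M}}\) from this formula and verifies the interpolation property (A.7) at the nodes.

import numpy as np

n = 10
x, w = np.polynomial.legendre.leggauss(n)

# V[j, i] = P_i(x_j) with zero-based i, so column i of V corresponds to
# P_{i-1} in (A.7), and M[i, j] = ((2i + 1)/2) * w_j * P_i(x_j).
V = np.polynomial.legendre.legvander(x, n - 1)
M = ((2 * np.arange(n) + 1) / 2)[:, None] * V.T * w[None, :]

f = np.exp(x)       # values of f tabulated at the Gaussian nodes
alpha = M @ f       # coefficients of the interpolating Legendre expansion
assert np.allclose(V @ alpha, f)    # (A.7): p(x_j) = f(x_j)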

Cite this article

Greengard, P., O’Neil, M. Efficient reduced-rank methods for Gaussian processes with eigenfunction expansions. Stat Comput 32, 94 (2022). https://doi.org/10.1007/s11222-022-10124-z
