Abstract
In this work, we introduce a reduced-rank algorithm for Gaussian process regression. Our numerical scheme converts a Gaussian process on a user-specified interval to its Karhunen–Loève expansion, the \(L^2\)-optimal reduced-rank representation. Numerical evaluation of the Karhunen–Loève expansion is performed once during precomputation and involves computing a numerical eigendecomposition of an integral operator whose kernel is the covariance function of the Gaussian process. The Karhunen–Loève expansion is independent of observed data and depends only on the covariance kernel and the size of the interval on which the Gaussian process is defined. The scheme of this paper does not require translation invariance of the covariance kernel. We also introduce a class of fast algorithms for Bayesian fitting of hyperparameters and demonstrate the performance of our algorithms with numerical experiments in one and two dimensions. Extensions to higher dimensions are mathematically straightforward but suffer from the standard curses of high dimensions.
References
Abramowitz, M., Stegun, I.A. (eds.): Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. National Bureau of Standards, Washington, D.C (1964)
Ambikasaran, S., Foreman-Mackey, D., Greengard, L., Hogg, D.W., O’Neil, M.: Fast direct methods for Gaussian processes. IEEE Trans. Pattern Anal. Mach. Intell. 38(2), 252–265 (2016)
Baugh, S., Stein, M.L.: Computationally efficient spatial modeling using recursive skeletonization factorizations. Spat. Stat. 27, 18–30 (2018)
Carpenter, B., Gelman, A., Hoffman, M.D., Lee, D., Goodrich, B., Betancourt, M., Brubaker, M., Guo, J., Li, P., Riddell, A.: Stan: a probabilistic programming language. J. Stat. Softw. 76(1), 1–32 (2017)
Cressie, N.: Statistics for Spatial Data, Revised edn. Wiley-Interscience, Hoboken (2015)
Datta, A., Banerjee, S., Finley, A.O., Gelfand, A.E.: Hierarchical nearest-neighbor Gaussian process models for large geostatistical datasets. J. Am. Stat. Assoc. 111(514), 800–812 (2016)
Driscoll, T.A., Hale, N., Trefethen, L.N.: Chebfun Guide. Pafnuty Publications, Oxford (2014)
Filip, S., Javeed, A., Trefethen, L.N.: Smooth random functions, random ODEs, and Gaussian processes. SIAM Rev. 61(1), 185–205 (2019)
Foreman-Mackey, D., Agol, E., Ambikasaran, S., Angus, R.: Fast and scalable Gaussian process modeling with applications to astronomical time series. Astron. J. 154(6), 220 (2017)
Gelman, A., Carlin, J.B., Stern, H.S., Dunson, D.B., Vehtari, A., Rubin, D.B.: Bayesian Data Analysis, 3rd edn. Chapman and Hall/CRC, New York (2013)
Gonzalvez, J., Lezmi, E., Roncalli, T., Xu, J.: Financial applications of Gaussian processes and Bayesian optimization. arXiv:1903.04841 (2019)
Greengard, P., Gelman, A., Vehtari, A.: A fast regression via SVD and marginalization. Comput. Stat. 37, 701–720 (2021)
Hastie, T., Tibshirani, R., Friedman, J.: Elements of Statistical Learning. Springer Series in Statistics, 2nd edn. Springer, New York (2009)
Kress, R.: Linear Integral Equations. Springer, New York (1999)
Lalchand, V., Rasmussen, C.E.: Approximate inference for fully Bayesian Gaussian process regression. In: Symposium on Advances in Approximate Bayesian Inference, pp. 1–12. PMLR (2020)
Lázaro-Gredilla, M., Quiñonero-Candela, J., Rasmussen, C.E., Figueiras-Vidal, A.R.: Sparse spectrum Gaussian process regression. J. Mach. Learn. Res. 11(63), 1865–1881 (2010)
Loève, M.: Probability Theory I. Springer, New York (1977)
Minden, V., Damle, A., Ho, K.L., Ying, L.: Fast spatial Gaussian process maximum likelihood estimation via skeletonization factorizations. Multiscale Model. Simul. 15(4), 1584–1611 (2017)
Nolan, J.P.: Univariate Stable Distributions. Springer, New York (2020)
Quinonero-Candela, J., Rasmussen, C.E.: Analysis of some methods for reduced rank Gaussian process regression. In: Murray-Smith, R., Shorten, R. (eds.) Switching and Learning in Feedback Systems, pp. 98–127. Springer, Berlin (2005)
Rahimi, A., Recht, B.: Random features for large-scale kernel machines. In: Platt, J., Koller, D., Singer, Y., Roweis, S. (eds.) Advances in Neural Information Processing Systems, vol. 20. Curran Associates Inc., Red Hook (2008)
Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press, Cambridge (2006)
Riesz, F., Sz-Nagy, B.: Functional Analysis. Frederick Ungar Publishing Co., New York (1955)
Riutort-Mayol, G., Bürkner, P.-C., Andersen, M.R., Solin, A., Vehtari, A.: Practical Hilbert space approximate Bayesian Gaussian processes for probabilistic programming (2020)
Schwab, C., Todor, R.A.: Karhunen–Loève approximation of random fields by generalized fast multipole methods. J. Comput. Phys. 217, 100–122 (2006)
Solin, A., Särkkä, S.: Hilbert space methods for reduced-rank Gaussian process regression. Stat. Comput. 30, 419–446 (2020a)
Solin, A., Särkkä, S.: Hilbert space methods for reduced-rank Gaussian process regression. https://github.com/AaltoML/hilbert-gp (2020b)
Stoer, J., Bulirsch, R.: Introduction to Numerical Analysis, 2nd edn. Springer, New York (1992)
Trefethen, L.N.: Approximation Theory and Approximation Practice, Extended edn. SIAM, Philadelphia (2020)
Xiu, D.: Numerical Methods for Stochastic Computations. Princeton University Press, Princeton (2010)
Yarvin, N., Rokhlin, V.: Generalized Gaussian quadratures and singular value decompositions of integral operators. SIAM J. Sci. Comput. 20(2), 699–720 (1998)
Acknowledgements
The authors are grateful to Paul Beckman, Dan Foreman-Mackey, Jeremy Hoskins, Manas Rachh, and Vladimir Rokhlin for helpful discussions. The first author is supported by the Alfred P. Sloan Foundation. The second author is supported in part by the Office of Naval Research under award number #N00014-21-1-2383 and by the Simons Foundation/SFARI (560651, AB).
A Legendre polynomials
We now provide a brief overview of Legendre polynomials and Gaussian quadrature (Abramowitz and Stegun 1964). For a more in-depth treatment of these tools and their role in numerical approximation theory, see, for example, Trefethen (2020).
In accordance with standard practice, we denote by \(P_i: [-1, 1] \rightarrow \mathbb {R}\) the Legendre polynomial of degree i, defined by the three-term recursion
\[
(i+1)\, P_{i+1}(x) = (2i+1)\, x\, P_i(x) - i\, P_{i-1}(x)
\]
with initial conditions
\[
P_0(x) = 1, \qquad P_1(x) = x.
\]
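As a brief illustration, the recursion above can be implemented directly; the function below (an illustrative sketch, not part of the paper's software) evaluates \(P_0, \ldots, P_n\) at a point or array of points.

```python
import numpy as np

def legendre_p(n, x):
    """Evaluate P_0(x), ..., P_n(x) via the three-term recursion
    (i+1) P_{i+1}(x) = (2i+1) x P_i(x) - i P_{i-1}(x),
    with P_0(x) = 1 and P_1(x) = x."""
    x = np.asarray(x, dtype=float)
    P = np.zeros((n + 1,) + x.shape)
    P[0] = 1.0          # P_0(x) = 1
    if n > 0:
        P[1] = x        # P_1(x) = x
    for i in range(1, n):
        P[i + 1] = ((2 * i + 1) * x * P[i] - i * P[i - 1]) / (i + 1)
    return P

# Example: P_3(1/2) = (5 x^3 - 3 x) / 2 evaluated at x = 1/2
print(legendre_p(3, 0.5)[3])  # -0.4375
```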
Legendre polynomials are orthogonal on \([-1, 1]\) and satisfy
\[
\int_{-1}^{1} P_i(x)\, P_j(x)\, dx = \frac{2}{2i+1}\, \delta_{ij},
\]
where \(\delta_{ij}\) denotes the Kronecker delta.
We denote by \(\overline{P}_i\) the \(L^2\)-normalized Legendre polynomials, defined by
\[
\overline{P}_i(x) = \sqrt{\frac{2i+1}{2}}\, P_i(x),
\]
so that \(\int_{-1}^{1} \overline{P}_i(x)^2\, dx = 1\).
For each n, the Legendre polynomial \(P_n\) has n distinct roots in \((-1, 1)\), which we denote in what follows by \(x_1, \ldots, x_n\). Furthermore, for all n, there exist n positive real numbers \(w_1, \ldots, w_n\) such that for any polynomial p of degree \(\le 2n - 1\),
\[
\int_{-1}^{1} p(x)\, dx = \sum_{j=1}^{n} w_j\, p(x_j).
\]
The roots \(x_1, \ldots, x_n\) are usually referred to as the order-n Gaussian nodes, and \(w_1, \ldots, w_n\) as the associated Gaussian quadrature weights. Classical Gaussian quadratures such as this one exist for many families of orthogonal polynomials: Chebyshev, Hermite, Laguerre, etc. The quadratures above, associated with Legendre polynomials, provide a high-order method for discretizing (i.e., interpolating) and integrating square-integrable functions on a finite interval. Legendre polynomials form the natural orthogonal polynomial basis for square-integrable functions on the interval \([-1, 1]\), and the associated interpolation and quadrature formulae provide nearly optimal approximation tools for these functions, even when they are not, in fact, polynomials.
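To illustrate the exactness property, the short sketch below uses the order-n Gauss–Legendre nodes and weights provided by NumPy to integrate a degree-9 polynomial exactly with only \(n = 5\) nodes (since \(9 \le 2n - 1\)).

```python
import numpy as np

# Order-n Gauss-Legendre nodes and weights on [-1, 1]; the resulting
# rule is exact for polynomials of degree <= 2n - 1.
n = 5
x, w = np.polynomial.legendre.leggauss(n)

# A degree-9 polynomial: the odd terms integrate to zero over [-1, 1],
# so the exact integral is  int t^4 dt + int 1 dt = 2/5 + 2 = 2.4.
p = lambda t: 3 * t**9 + t**4 - 2 * t + 1

approx = np.dot(w, p(x))
print(approx)  # 2.4 up to rounding error
```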
The following well-known result on interpolation using Legendre polynomials will be used in the numerical schemes discussed in this paper. A proof can be found in Stoer and Bulirsch (1992), for example.
Theorem 4
Let \(x_1, \ldots, x_n\) be the order-n Gaussian nodes and \(w_1, \ldots, w_n\) the associated order-n Gaussian weights. Then, there exists an \(n \times n\) matrix \(\varvec{\mathsf {M}}\) that maps a function tabulated at these Gaussian nodes to the corresponding Legendre expansion, i.e., the interpolating polynomial expressed in terms of Legendre polynomials. That is to say, defining \(\varvec{\mathsf {f}}\) by
\[
\varvec{\mathsf {f}} = \left( f(x_1), \ldots, f(x_n) \right)^T,
\]
the vector
\[
{{\varvec{\alpha }}} = \varvec{\mathsf {M}}\, \varvec{\mathsf {f}}
\]
contains the coefficients of the order-n Legendre expansion p such that
\[
p(x) = \sum_{i=1}^{n} \alpha _i\, \overline{P}_{i-1}(x)
\qquad \text {and} \qquad
p(x_j) = f(x_j), \quad j = 1, \ldots , n,
\]
where \(\alpha _i\) denotes the ith entry of the vector \({{\varvec{\alpha }}}\).
From a computational standpoint, algorithms for efficient evaluation of Legendre polynomials and Gaussian nodes and weights are available in standard software packages (e.g., Driscoll et al. 2014). Furthermore, the entries of the matrix \(\varvec{\mathsf {M}}\) can be computed directly via \(M_{i, j} = w_j \overline{P}_{i-1}(x_j)\).
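The construction of \(\varvec{\mathsf {M}}\) can be sketched in a few lines of NumPy. Here we take \(M_{i,j} = w_j \overline{P}_{i-1}(x_j)\) with the \(L^2\)-normalized polynomials \(\overline{P}_i = \sqrt{(2i+1)/2}\, P_i\), so that \(\varvec{\mathsf {M}}\) returns coefficients in the \(\overline{P}\) basis; this is an illustrative sketch, not the paper's implementation.

```python
import numpy as np
from numpy.polynomial import legendre as L

n = 8
x, w = L.leggauss(n)  # order-n Gaussian nodes and weights

# P[i, j] = P_i(x_j); legvander returns the Vandermonde-like matrix
# with columns indexed by polynomial degree, so we transpose.
P = L.legvander(x, n - 1).T

# Normalize: Pbar_i = sqrt((2i + 1) / 2) * P_i.
scale = np.sqrt((2 * np.arange(n) + 1) / 2.0)
Pbar = scale[:, None] * P

# M maps nodal values f(x_j) to normalized-Legendre coefficients:
# M[i, j] = w_j * Pbar_i(x_j). The quadrature is exact for the
# products Pbar_i * Pbar_k (degree <= 2n - 2), so M inverts Pbar.T.
M = Pbar * w[None, :]

# Check: tabulating a function with known coefficients alpha at the
# nodes and applying M recovers alpha.
alpha = np.arange(1.0, n + 1)
f = Pbar.T @ alpha          # f(x_j) = sum_i alpha_i Pbar_{i-1}(x_j)
print(np.allclose(M @ f, alpha))  # True
```

The key point is that orthonormality plus the exactness of the order-n rule make \(\varvec{\mathsf {M}}\) an exact (not merely approximate) inverse of the evaluation map for polynomials of degree \(\le n - 1\).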
About this article
Greengard, P., O’Neil, M.: Efficient reduced-rank methods for Gaussian processes with eigenfunction expansions. Stat Comput 32, 94 (2022). https://doi.org/10.1007/s11222-022-10124-z