Abstract
In this paper, we propose a data-driven model selection approach for the nonparametric estimation of covariance functions under very general moment assumptions on the stochastic process. Observing i.i.d. replications of the process at fixed observation points, we select the best estimator among a set of candidates using a penalized least squares estimation procedure with a fully data-driven penalty function, extending the work in Bigot et al. (Electron J Stat 4:822–855, 2010). We then provide a practical application of this estimator in a Kriging interpolation procedure to forecast rainfall data.
References
Baraud Y (2000) Model selection for regression on a fixed design. Probab Theory Relat Fields 117(4):467–493
Bigot J, Biscay R, Loubes J-M, Muñiz-Alvarez L (2010) Nonparametric estimation of covariance functions by model selection. Electron J Stat 4:822–855
Bigot J, Biscay R, Loubes J-M, Muñiz-Alvarez L (2011) Group lasso estimation of high-dimensional covariance matrices. J Mach Learn Res 12:3187–3225
Biscay R, Lescornel H, Loubes J-M (2012) Adaptive covariance estimation with model selection. Math Methods Stat 21:283–297
Cressie NAC (1993) Statistics for spatial data. Wiley Series in probability and mathematical statistics: applied probability and statistics. Wiley, New York. Revised reprint of the 1991 edition, A Wiley-Interscience Publication
Gendre X (2008) Simultaneous estimation of the mean and the variance in heteroscedastic Gaussian regression. Electron J Stat 2:1345–1372
Guillot G, Senoussi R, Monestiez P (2000) A positive definite estimator of the non stationary covariance of random fields. In: Monestiez P, Allard D, Froidevaux R (eds) GeoENV2000. Third European conference on geostatistics for environmental applications. Kluwer, Dordrecht
Hall P, Fisher N, Hoffmann B (1994) On the nonparametric estimation of covariance functions. Ann Stat 22(4):2115–2134
Krige DG (1951) A statistical approach to some basic mine valuation problems on the Witwatersrand. J Chem Metall Min Soc S Afr 52(6):119–139
Matsuo T, Nychka D, Paul D (2011) Nonstationary covariance modeling for incomplete data: Monte Carlo EM approach. Comput Stat Data Anal 55:2059–2073
Ripley BD (2004) Spatial statistics. Wiley, Hoboken, NJ
Sampson PD, Guttorp P (1992) Nonparametric representation of nonstationary spatial covariance structure. J Am Stat Assoc 87:108–119
Seber GAF (2008) A matrix handbook for statisticians. Wiley Series in probability and statistics. Wiley-Interscience (Wiley), Hoboken, NJ
Shapiro A, Botha JD (1991) Variogram fitting with a general class of conditionally nonnegative definite functions. Comput Stat Data Anal 11:87–96
Stein ML (1999) Interpolation of spatial data. Some theory for kriging. Springer series in statistics, vol xvii, p 247. Springer, New York, NY
Petrov VV (1995) Limit theorems of probability theory. Sequences of independent random variables. Oxford studies in probability 4. Oxford Science Publications. The Clarendon Press, Oxford University Press, New York
von Bahr B, Esseen CG (1965) Inequalities for the \(r\)th absolute moment of a sum of random variables, \(1\le r\le 2\). Ann Math Stat 36:299–303
Acknowledgments
The authors would like to thank the referees for their valuable comments.
Appendix
In this section we recall some of the inequalities used in the proofs of our results. The next theorem is Proposition 4.3 of Bigot et al. (2010), a \(k\)-variate extension of Corollary 5.1 in Baraud (2000), which is recovered in the particular case \(k=1\).
Theorem 5
Given \(N,k\in \mathbb {N}\), let \(\widetilde{\mathbf {A}}\in \mathbb {R}^{Nk\times Nk}\backslash \{0\}\) be a symmetric non-negative definite matrix, and let \({\varepsilon }_{1},\ldots ,{\varepsilon }_{N}\) be i.i.d. random vectors in \(\mathbb {R}^{k}\) with \(\mathbb {E}({\varepsilon }_{1})=\mathbf {0}\) and \(\mathbb {V}({\varepsilon }_{1})=\varvec{\Phi }\). Denote \({\varepsilon }=({\varepsilon }_{1}^{\top },\ldots ,{\varepsilon }_{N}^{\top })^{\top }\), \(\zeta ({\varepsilon })=\sqrt{{\varepsilon }^{\top }\widetilde{\mathbf {A}}{\varepsilon }}\), and \(\delta _{*}^{2}=\frac{\mathrm {Tr}\left( \widetilde{\mathbf {A}}(\mathbf {I}_{N}\otimes \varvec{\Phi })\right) }{\mathrm {Tr}\left( \widetilde{\mathbf {A}}\right) }\). Then, for all \(p\ge 2\) such that \(\mathbb {E}\Vert {\varepsilon }_{1}\Vert _{l_{2}}^{p}<\infty \), it holds that for all \(x>0\),
where \(\rho \left( \widetilde{\mathbf {A}}\right) \) is the spectral norm of \( \widetilde{\mathbf {A}}\).
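The displayed inequality of Theorem 5 is absent from the text above. The following is a hedged reconstruction, obtained from the scalar statement of Corollary 5.1 in Baraud (2000) by replacing the noise variance with \(\delta _{*}^{2}\). The constant \(C(p)\), assumed to depend only on \(p\), is notation introduced here, and the exact form should be checked against Proposition 4.3 of Bigot et al. (2010):

```latex
\[
\mathbb{P}\left( \zeta^{2}(\varepsilon) \ge
  \delta_{*}^{2}\,\mathrm{Tr}(\widetilde{\mathbf{A}})
  + 2\,\delta_{*}^{2}\sqrt{\mathrm{Tr}(\widetilde{\mathbf{A}})\,\rho(\widetilde{\mathbf{A}})\,x}
  + \delta_{*}^{2}\,\rho(\widetilde{\mathbf{A}})\,x \right)
\le C(p)\,
\frac{\mathbb{E}\Vert \varepsilon_{1}\Vert_{l_{2}}^{p}\,\mathrm{Tr}(\widetilde{\mathbf{A}})}
     {\delta_{*}^{p}\,\rho(\widetilde{\mathbf{A}})\,x^{p/2}} .
\]
```

In this form the threshold equals \(\delta _{*}^{2}\bigl(\sqrt{\mathrm {Tr}(\widetilde{\mathbf {A}})}+\sqrt{\rho (\widetilde{\mathbf {A}})x}\bigr)^{2}\), and since \(\delta _{*}^{2}\,\mathrm {Tr}(\widetilde{\mathbf {A}})=\mathrm {Tr}\bigl(\widetilde{\mathbf {A}}(\mathbf {I}_{N}\otimes \varvec{\Phi })\bigr)=\mathbb {E}\,\zeta ^{2}({\varepsilon })\), the bound controls deviations of \(\zeta ^{2}({\varepsilon })\) above its mean.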
The following result is Corollary 4.2 of Bigot et al. (2010), which is also a natural extension of Corollary 3.1 in Baraud (2000) and provides a bound similar to that of Gendre (2008).
Theorem 6
Let \(q>0\) be given such that there exists \(p>2(1+q)\) satisfying \( \mathbb {E}\Vert \varepsilon _{i}\Vert _{l_{2}}^{p}<\infty \). Then, for some constants \(K(\theta )>1\) we have that
where
Proposition 7
(Hermite-Hadamard inequality) For every convex function \(f:[a,b]\rightarrow \mathbb {R}\) it holds that
\[ f\left( \frac{a+b}{2}\right) \le \frac{1}{b-a}\int _{a}^{b}f(x)\,dx\le \frac{f(a)+f(b)}{2}. \]
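The Hermite-Hadamard bound states that, for convex \(f\), the mean value of \(f\) over \([a,b]\) lies between \(f((a+b)/2)\) and \((f(a)+f(b))/2\). The sketch below is only a numerical illustration, not part of the proofs: it approximates the integral with a composite midpoint rule, and the function name `hermite_hadamard_terms` and the test function \(f=\exp\) on \([0,1]\) are choices made here for the example.

```python
import math

def hermite_hadamard_terms(f, a, b, n=10_000):
    """Return (f(midpoint), mean value of f on [a, b], average of endpoint values)."""
    h = (b - a) / n
    # Composite midpoint rule for the integral of f over [a, b].
    integral = h * sum(f(a + (i + 0.5) * h) for i in range(n))
    return f((a + b) / 2), integral / (b - a), (f(a) + f(b)) / 2

# Illustrative convex function: exp on [0, 1], whose exact mean value is e - 1.
lower, mean_value, upper = hermite_hadamard_terms(math.exp, 0.0, 1.0)
print(f"{lower:.4f} <= {mean_value:.4f} <= {upper:.4f}")
```

For a concave function both inequalities reverse, so the same helper can be used to see the ordering flip.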
We now recall two moment inequalities for sums of independent centered random variables, which are used repeatedly throughout this paper.
Theorem 8
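(Rosenthal’s inequality) Let \(U_{1},U_{2},\ldots ,U_{n}\) be independent centered random variables with values in \(\mathbb {R}\). Then for any \(p\ge 2\) there exists a constant \(C(p)\), depending only on \(p\), such that
\[ \mathbb {E}\left| \sum _{i=1}^{n}U_{i}\right| ^{p}\le C(p)\left( \sum _{i=1}^{n}\mathbb {E}|U_{i}|^{p}+\left( \sum _{i=1}^{n}\mathbb {E}U_{i}^{2}\right) ^{p/2}\right) . \]
For the proof of this inequality, we refer to Petrov (1995). The next result covers the case \(p\in [1,2]\). To our knowledge, it is due to von Bahr and Esseen (1965).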
Theorem 9
Let \(U_{1},U_{2},\ldots ,U_{n}\) be independent centered random variables with values in \(\mathbb {R}\). For any \(p\in [1,2]\) it holds that
\[ \mathbb {E}\left| \sum _{i=1}^{n}U_{i}\right| ^{p}\le 2\sum _{i=1}^{n}\mathbb {E}|U_{i}|^{p}. \]
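A bound of the von Bahr and Esseen type, \(\mathbb {E}|\sum _{i}U_{i}|^{p}\le 2\sum _{i}\mathbb {E}|U_{i}|^{p}\) for \(p\in [1,2]\), can be checked exactly on small discrete examples. The sketch below is an illustration under assumptions made here: each variable is uniform on a short list of values with mean zero, and both sides are computed by enumerating all joint outcomes. The helper names and the particular supports are hypothetical.

```python
from itertools import product

def moment_of_sum(supports, p):
    """Exact E|U_1 + ... + U_n|^p; U_i is uniform over the entries of supports[i]."""
    outcomes = list(product(*supports))  # all joint outcomes, equally likely
    return sum(abs(sum(o)) ** p for o in outcomes) / len(outcomes)

def sum_of_moments(supports, p):
    """Exact sum over i of E|U_i|^p."""
    return sum(sum(abs(v) ** p for v in s) / len(s) for s in supports)

# Hypothetical centered variables; the third one is deliberately asymmetric.
supports = [(-1.0, 1.0), (-2.0, 2.0), (-1.0, -1.0, 2.0), (-0.5, 0.5)]

for p in (1.0, 1.25, 1.5, 1.75, 2.0):
    lhs = moment_of_sum(supports, p)
    bound = 2 * sum_of_moments(supports, p)
    print(f"p={p}: E|S|^p = {lhs:.4f}, bound = {bound:.4f}")
```

At \(p=2\) the left-hand side is the variance of the sum, which equals the sum of the variances by independence, so the factor 2 is not needed there; the enumeration makes the slack for intermediate \(p\) visible.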
Cite this article
Biscay Lirio, R., Camejo, D.G., Loubes, JM. et al. Estimation of covariance functions by a fully data-driven model selection procedure and its application to Kriging spatial interpolation of real rainfall data. Stat Methods Appl 23, 149–174 (2014). https://doi.org/10.1007/s10260-013-0250-7
DOI: https://doi.org/10.1007/s10260-013-0250-7