Volume 24, Issue 1, pp 1–28

Comparing and selecting spatial predictors using local criteria

Invited Paper


Remote sensing technology for the study of Earth and its environment has led to “Big Data” that, paradoxically, have global extent but may be spatially sparse. Furthermore, the variability in the measurement error and the latent process error may not fit conveniently into the Gaussian linear paradigm. In this paper, we consider the problem of selecting a predictor from a finite collection of spatial predictors of a spatial random process defined on \(D\), a subset of \(d\)-dimensional Euclidean space. Critically, we make no statistical distributional assumptions other than additive measurement error. In this nonparametric setting, one could use a criterion based on a validation dataset to select a spatial predictor for all of \(D\). Instead, we propose local criteria based on validation data to select a predictor at each spatial location in \(D\); the result is a hybrid combination of the spatial predictors, which we call a locally selected predictor (LSP). We consider selection from a collection of some of the classical and more recently proposed spatial predictors currently available. In a simulation study, the relative performances of various LSPs, as well as the performance of each of the individual spatial predictors in the collection, are assessed. “Big Data” are always challenging, and here we apply LSP to a very large global spatial dataset of atmospheric \(\mathrm {CO}_{2}\) measurements.
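The local selection idea described above can be sketched in a few lines. This is a minimal illustration only, not the authors' implementation: the abstract does not state the form of the local criteria, so the sketch assumes a simple one (mean squared validation error over the nearest validation locations). The function name `locally_selected_predictor` and the parameter `n_neighbors` are hypothetical.

```python
import numpy as np


def locally_selected_predictor(val_locs, val_obs, val_preds,
                               pred_locs, preds, n_neighbors=10):
    """Pick, at each prediction location, the predictor with the smallest
    local validation error (ASSUMED criterion: mean squared error over the
    n_neighbors nearest validation locations).

    val_locs  : (n_val, d)  validation locations in D
    val_obs   : (n_val,)    validation observations
    val_preds : (K, n_val)  each predictor evaluated at the validation locations
    pred_locs : (n_pred, d) locations where a prediction is wanted
    preds     : (K, n_pred) each predictor evaluated at the prediction locations
    """
    val_locs = np.asarray(val_locs, dtype=float)
    val_obs = np.asarray(val_obs, dtype=float)
    val_preds = np.asarray(val_preds, dtype=float)
    pred_locs = np.asarray(pred_locs, dtype=float)
    preds = np.asarray(preds, dtype=float)

    # Squared validation error of every predictor: shape (K, n_val).
    sq_err = (val_preds - val_obs[None, :]) ** 2

    # Euclidean distances between prediction and validation locations.
    diff = pred_locs[:, None, :] - val_locs[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))          # (n_pred, n_val)

    # Indices of the nearest validation locations for each prediction location.
    m = min(n_neighbors, val_locs.shape[0])
    nearest = np.argsort(dist, axis=1)[:, :m]        # (n_pred, m)

    # Local criterion: average squared error inside each neighborhood.
    local_crit = sq_err[:, nearest].mean(axis=2)     # (K, n_pred)

    # Select the best predictor per location and assemble the hybrid predictor.
    chosen = np.argmin(local_crit, axis=0)           # (n_pred,)
    lsp = preds[chosen, np.arange(pred_locs.shape[0])]
    return lsp, chosen
```

By contrast, a single global selection would average `sq_err` over all validation locations and use one predictor for all of \(D\); the sketch instead lets the chosen predictor vary with location, which is what makes the result a hybrid.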


Information criteria · Model averaging · Model combination · Best linear unbiased predictor

Mathematics Subject Classification

62M30 62H11 62B10 



We would like to thank the referees and editor for their helpful comments. Jonathan Bradley’s and Noel Cressie’s research was supported by NASA’s Earth Science Technology Office through its Advanced Information Systems Technology program Grant NNH08ZDA001N. Tao Shi’s research is partially supported by NSF grants DMS-1007060 and DMS-1308458.



Copyright information

© Sociedad de Estadística e Investigación Operativa 2014

Authors and Affiliations

  1. Department of Statistics, University of Missouri, Columbia, USA
  2. National Institute for Applied Statistics Research Australia (NIASRA), School of Mathematics and Applied Statistics, University of Wollongong, Wollongong, Australia
  3. Department of Statistics, The Ohio State University, Columbus, USA
