Modeling Complex Spatial Dependencies: Low-Rank Spatially Varying Cross-Covariances With Application to Soil Nutrient Data

  • Rajarshi Guhaniyogi
  • Andrew O. Finley
  • Sudipto Banerjee
  • Richard K. Kobe


Advances in geospatial technologies have created data-rich environments that provide extraordinary opportunities to understand the complexity of large, spatially indexed data in ecology and the natural sciences. Our current application concerns the analysis of soil nutrient data collected at La Selva Biological Station, Costa Rica, where inferential interest lies in capturing the spatially varying relationships among the nutrients. The objective is to interpolate not just the nutrients across space, but also the associations among the nutrients, which are posited to vary spatially. This requires spatially varying cross-covariance models. Fully process-based specifications using matrix-variate processes are theoretically attractive but computationally prohibitive. Here we develop fully process-based, low-rank but non-degenerate, spatially varying cross-covariance processes that can effectively interpolate cross-covariances at arbitrary locations. We show how a particular low-rank process, the predictive process, which has been widely used to model large geostatistical datasets, can be effectively deployed to model non-degenerate cross-covariance processes. We produce substantive inferential tools, such as maps of nonstationary cross-covariances, that constitute the premise of further mechanistic modeling and have hitherto not been easily available to environmental scientists and ecologists.
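As a rough illustration of what "low-rank but non-degenerate" can mean, the sketch below builds the covariance of a standard univariate predictive process from a set of knots and then adds a diagonal variance correction. It is not the multivariate cross-covariance construction developed in the paper. The exponential covariance function, the parameter values, the knot layout, and the function names `exp_cov` and `predictive_process_cov` are all assumptions chosen for illustration.

```python
import numpy as np

def exp_cov(a, b, sigma2=1.0, phi=3.0):
    """Exponential covariance sigma2 * exp(-phi * distance) between two point sets.

    The covariance family and parameter values are illustrative assumptions.
    """
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-phi * d)

def predictive_process_cov(locs, knots, sigma2=1.0, phi=3.0, bias_adjust=True):
    """Covariance of a univariate predictive process at `locs` (hypothetical helper).

    The low-rank part is c(s)^T C*^{-1} c(s'), where C* is the knot covariance
    and c(s) is the covariance between location s and the knots.
    """
    c = exp_cov(locs, knots, sigma2, phi)        # n x m cross-covariance to knots
    c_star = exp_cov(knots, knots, sigma2, phi)  # m x m knot covariance
    low_rank = c @ np.linalg.solve(c_star, c.T)  # n x n matrix with rank at most m
    if bias_adjust:
        # Restore the variance lost to the projection onto the knots; the added
        # diagonal term is what makes the covariance matrix non-degenerate.
        resid = sigma2 - np.diag(low_rank)
        low_rank = low_rank + np.diag(resid)
    return low_rank

rng = np.random.default_rng(0)
locs = rng.uniform(size=(50, 2))   # 50 hypothetical sampling locations
knots = rng.uniform(size=(8, 2))   # 8 hypothetical knots

low = predictive_process_cov(locs, knots, bias_adjust=False)
adj = predictive_process_cov(locs, knots, bias_adjust=True)

print("rank of unadjusted covariance:", np.linalg.matrix_rank(low))
print("smallest eigenvalue after adjustment:", np.linalg.eigvalsh(adj).min())
```

With m knots, the unadjusted matrix has rank at most m regardless of how many locations are used, which is where the computational savings come from. The diagonal adjustment keeps the marginal variances equal to `sigma2` and makes the matrix full rank.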

Key Words

Gaussian spatial process; MCMC; Nonstationarity; Predictive process; Tropical soil nutrients





Copyright information

© International Biometric Society 2013

Authors and Affiliations

  • Rajarshi Guhaniyogi (1)
  • Andrew O. Finley (2), email author
  • Sudipto Banerjee (3)
  • Richard K. Kobe (4)

  1. Department of Statistical Science, Duke University, Durham, USA
  2. Departments of Forestry and Geography, Michigan State University, East Lansing, USA
  3. Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, USA
  4. Department of Forestry, Michigan State University, East Lansing, USA
