Statistics and Computing, Volume 12, Issue 4, pp 353–367

A Conditional Autoregressive Gaussian Process for Irregularly Spaced Multivariate Data with Application to Modelling Large Sets of Binary Data

  • A. N. Pettitt
  • I. S. Weir
  • A. G. Hart
Article

Abstract

A Gaussian conditional autoregressive (CAR) formulation is presented that permits the modelling of both the spatial dependence and the dependence between multivariate random variables at irregularly spaced sites, thereby capturing some of the modelling advantages of the geostatistical approach. The model benefits not only from the explicit availability of the full conditionals but also from the computational simplicity of the precision matrix determinant calculation, which uses a closed-form expression involving the eigenvalues of a submatrix of the precision matrix. The introduction of covariates into the model adds little computational complexity to the analysis, and thus the method can be straightforwardly extended to regression models. Because of its computational simplicity, the model is well suited to applications involving the fully Bayesian analysis of large data sets of multivariate measurements with a spatial ordering. An extension to spatio-temporal data is also considered. Here, we demonstrate use of the model in the analysis of bivariate binary data, where the observed data are modelled as the sign of the hidden CAR process. A case study is presented involving over 450 irregularly spaced sites and the presence or absence of each of two species of rain forest trees at each site; Markov chain Monte Carlo (MCMC) methods are implemented to obtain posterior distributions of all unknowns. The MCMC method works well with simulated data and with the tree biodiversity data set.
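The two computational ideas named in the abstract, a determinant obtained from eigenvalues and binary data taken as the sign of a hidden Gaussian field, can be illustrated with a minimal sketch. It is not the paper's multivariate model, which the abstract does not spell out. It assumes a generic univariate proper CAR precision matrix Q = tau (D - rho W) with exponentially decaying distance weights; the function names, the weight function and the parameter values are all hypothetical choices made for illustration.

```python
import numpy as np

def exp_weights(coords, scale):
    """Symmetric weights W_ij = exp(-dist_ij / scale), zero on the diagonal (assumed form)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    W = np.exp(-dist / scale)
    np.fill_diagonal(W, 0.0)
    return W

def scaled_eigenvalues(W):
    """Row sums d and eigenvalues of D^{-1/2} W D^{-1/2}; computed once, reused for every rho."""
    d = W.sum(axis=1)
    s = 1.0 / np.sqrt(d)
    lam = np.linalg.eigvalsh(W * s[:, None] * s[None, :])
    return d, lam

def car_logdet(d, lam, rho, tau):
    """log det of Q = tau * (D - rho * W), using
    det(D - rho W) = prod(d_i) * prod(1 - rho * lam_i)."""
    return len(d) * np.log(tau) + np.log(d).sum() + np.log1p(-rho * lam).sum()

def simulate_binary(W, rho, tau, rng):
    """Draw a hidden field z ~ N(0, Q^{-1}) and return its sign as 0/1 data."""
    d = W.sum(axis=1)
    Q = tau * (np.diag(d) - rho * W)
    L = np.linalg.cholesky(Q)                       # Q = L L^T
    z = np.linalg.solve(L.T, rng.standard_normal(len(d)))  # Cov(z) = Q^{-1}
    return Q, z, (z > 0).astype(int)

rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 1.0, size=(50, 2))        # irregularly spaced sites
W = exp_weights(coords, scale=0.2)
d, lam = scaled_eigenvalues(W)
Q, z, y = simulate_binary(W, rho=0.9, tau=2.0, rng=rng)

logdet_eig = car_logdet(d, lam, rho=0.9, tau=2.0)   # O(n) once lam is known
logdet_direct = np.linalg.slogdet(Q)[1]             # O(n^3) reference value
```

In an MCMC run that updates rho and tau repeatedly, the eigenvalues are computed a single time, so each subsequent log-determinant evaluation costs only a sum over n terms; this is the kind of saving the abstract alludes to, shown here under the simplifying assumptions stated above.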

Bayesian analysis; binary data; conditional autoregression; Markov chain Monte Carlo; multivariate data; spatial statistics; spatio-temporal

Copyright information

© Kluwer Academic Publishers 2002
