Environmental and Ecological Statistics, Volume 14, Issue 3, pp 323–340

Bayesian entropy for spatial sampling design of environmental data

  • Montserrat Fuentes
  • Arin Chaudhuri
  • David M. Holland

Abstract

We develop a spatial statistical methodology for designing national air pollution monitoring networks that predict well while minimizing the cost of monitoring. The complexity of atmospheric processes and the urgent need for credible assessments of environmental risk call for new statistical methodologies. In this work, we present a new method of ranking candidate subnetworks that takes both the environmental cost and the statistical information into account. A Bayesian algorithm is introduced to obtain an optimal subnetwork using an entropy framework. The final network and the accuracy of the spatial predictions depend heavily on the underlying model of spatial correlation. Spatial prediction usually rests on the simplifying assumption of stationarity, in the sense that the spatial dependency structure does not change with location. However, it is not uncommon to find spatial data that show strong signs of nonstationary behavior. We build upon an existing approach that creates a nonstationary covariance as a mixture of a family of stationary processes, and we propose a Bayesian method of estimating the associated parameters using Reversible Jump Markov Chain Monte Carlo. We apply these methods for spatial prediction and network design to ambient ozone data from a monitoring network in the eastern US.
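To make the idea of ranking subnetworks by entropy and cost concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes a Gaussian field with a known stationary exponential covariance (the Matérn family with smoothness 1/2) in place of the nonstationary mixture estimated by Reversible Jump MCMC, and it assumes a utility of the form entropy minus `lam` times total site cost. All function names, the utility form, and the cooling schedule are illustrative assumptions; site coordinates must be distinct so the covariance matrix is positive definite.

```python
import math
import random


def exponential_cov(coords, sigma2=1.0, range_=1.0):
    """Exponential covariance matrix (Matérn family, smoothness 1/2)."""
    return [[sigma2 * math.exp(-math.dist(p, q) / range_) for q in coords]
            for p in coords]


def log_det(matrix):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(matrix)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(chol[i][m] * chol[j][m] for m in range(j))
            chol[i][j] = math.sqrt(s) if i == j else s / chol[j][j]
    return 2.0 * sum(math.log(chol[i][i]) for i in range(n))


def gaussian_entropy(cov, idx):
    """Differential entropy of the Gaussian field restricted to sites idx."""
    sub = [[cov[i][j] for j in idx] for i in idx]
    return 0.5 * (len(idx) * (1.0 + math.log(2.0 * math.pi)) + log_det(sub))


def utility(cov, idx, costs, lam):
    """Assumed utility: entropy of the subnetwork minus a cost penalty."""
    return gaussian_entropy(cov, idx) - lam * sum(costs[i] for i in idx)


def anneal_subnetwork(cov, costs, k, lam=0.1, steps=2000, t0=1.0, seed=0):
    """Simulated annealing over k-site subnetworks (requires k < number of sites)."""
    rng = random.Random(seed)
    n = len(cov)
    current = rng.sample(range(n), k)
    cur_u = utility(cov, current, costs, lam)
    best, best_u = list(current), cur_u
    for step in range(steps):
        temp = t0 * (1.0 - step / steps) + 1e-9  # linear cooling (assumed)
        outside = [i for i in range(n) if i not in current]
        cand = list(current)
        cand[rng.randrange(k)] = rng.choice(outside)  # swap one site
        cand_u = utility(cov, cand, costs, lam)
        # Always accept improvements; accept worse moves with Metropolis probability.
        if cand_u >= cur_u or rng.random() < math.exp((cand_u - cur_u) / temp):
            current, cur_u = cand, cand_u
            if cur_u > best_u:
                best, best_u = list(current), cur_u
    return sorted(best), best_u
```

Under these assumptions, two nearly co-located sites contribute little joint entropy, so the search tends to prefer spread-out subnetworks unless the cost term outweighs the information gain. For example, `anneal_subnetwork(exponential_cov(coords), costs, k=4)` returns the selected site indices and their utility for a list of coordinate tuples `coords` and a list of per-site `costs`.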

Keywords

  • Bayesian inference
  • Matérn covariance
  • National air quality standards
  • Nonstationarity
  • Simulated annealing
  • Spatial statistics

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  • Montserrat Fuentes (1)
  • Arin Chaudhuri (2)
  • David M. Holland (3)
  1. Department of Statistics, North Carolina State University (NCSU), Raleigh, USA
  2. SAS, Cary, USA
  3. U.S. Environmental Protection Agency, Research Triangle Park, USA
