Environmental and Ecological Statistics, Volume 21, Issue 3, pp 411–433

A flexible spatio-temporal model for air pollution with spatial and spatio-temporal covariates

  • Johan Lindström
  • Adam A. Szpiro
  • Paul D. Sampson
  • Assaf P. Oron
  • Mark Richards
  • Tim V. Larson
  • Lianne Sheppard

Abstract

The development of models that provide accurate spatio-temporal predictions of ambient air pollution at small spatial scales is of great importance for the assessment of potential health effects of air pollution. Here we present a spatio-temporal framework that predicts ambient air pollution by combining data from several different monitoring networks and deterministic air pollution model(s) with geographic information system covariates. The model presented in this paper has been implemented in an R package, SpatioTemporal, available on CRAN. The model is used by the EPA-funded Multi-Ethnic Study of Atherosclerosis and Air Pollution (MESA Air) to produce estimates of ambient air pollution; MESA Air uses the estimates to investigate the relationship between chronic exposure to air pollution and cardiovascular disease. In this paper we use the model to predict long-term average concentrations of \(\text {NO}_{x}\) in the Los Angeles area during a 10-year period. Predictions are based on measurements from the EPA Air Quality System, MESA Air-specific monitoring, and output from a source dispersion model for traffic-related air pollution (Caline3QHCR). Accuracy in predicting long-term average concentrations is evaluated using an elaborate cross-validation setup that accounts for a sparse spatio-temporal sampling pattern in the data and adjusts for temporal effects. The predictive ability of the model is good, with a cross-validated \(R^2\) of approximately \(0.7\) at subject sites. Replacing four geographic covariate indicators of traffic density with the Caline3QHCR dispersion model output resulted in very similar prediction accuracy from a more parsimonious and more interpretable model. Adding traffic-related geographic covariates to the model that included Caline3QHCR did not further improve the prediction accuracy.
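
As a reading aid for the cross-validated \(R^2\) quoted in the abstract, the following is a minimal sketch, assuming the common definition \(R^2 = 1 - \text{MSE}/\text{Var}(\text{observed})\) applied to held-out predictions. The function name and the toy numbers are hypothetical; the paper's actual cross-validation setup, which handles the sparse sampling pattern and adjusts for temporal effects, is not reproduced here.

```python
from statistics import fmean


def cross_validated_r2(observed, predicted):
    """Return 1 - MSE / Var(observed) for held-out predictions.

    Assumed definition for illustration only; `observed` holds the
    long-term averages at left-out sites and `predicted` holds the
    model predictions made without using those sites.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long sequences with >= 2 values")
    mean_obs = fmean(observed)
    # Mean squared prediction error over the held-out sites.
    mse = fmean([(o - p) ** 2 for o, p in zip(observed, predicted)])
    # Population variance of the observations (same 1/n scaling as the MSE).
    var_obs = fmean([(o - mean_obs) ** 2 for o in observed])
    if var_obs == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - mse / var_obs


# Toy (made-up) values, not data from the paper.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.5, 2.5, 2.5, 3.5]
r2 = cross_validated_r2(obs, pred)
```

Under this definition a value near 1 means the held-out predictions track the observations closely, while a value of 0 means they do no better than predicting the overall mean everywhere.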

Keywords

Air pollution; Cross-validation; NO\(_{x}\); Spatio-temporal data; Unbalanced data

Notes

Acknowledgments

Although the research described in this article has been funded wholly or in part by the United States Environmental Protection Agency through assistance agreement CR-834077101-0 and grant RD831697 to the University of Washington, it has not been subjected to the Agency’s required peer and policy review and therefore does not necessarily reflect the views of the Agency and no official endorsement should be inferred. Travel for Johan Lindström has been paid by STINT (The Swedish Foundation for International Cooperation in Research and Higher Education) Grant IG2005-2047 and the Royal Physiographic Society in Lund.

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Johan Lindström (1, 2)
  • Adam A. Szpiro (1)
  • Paul D. Sampson (1)
  • Assaf P. Oron (1)
  • Mark Richards (1)
  • Tim V. Larson (1)
  • Lianne Sheppard (1)

  1. University of Washington, Seattle, USA
  2. Lund University, Lund, Sweden
