Air Quality, Atmosphere & Health

Volume 11, Issue 1, pp 11–22

Air pollutant exposure field modeling using air quality model-data fusion methods and comparison with satellite AOD-derived fields: application over North Carolina, USA

  • Ran Huang
  • Xinxin Zhai
  • Cesunica E. Ivey
  • Mariel D. Friberg
  • Xuefei Hu
  • Yang Liu
  • Qian Di
  • Joel Schwartz
  • James A. Mulholland
  • Armistead G. Russell


To generate air pollutant exposure fields for health studies, a data fusion (DF) approach is developed that combines observations from ambient monitors with simulated data from the Community Multiscale Air Quality (CMAQ) model. The resulting fields capture the spatiotemporal information provided by the air quality model as well as the finer temporal-scale variations from the pollutant observations, and they reduce model biases. Here, the approach is applied to develop daily concentration fields for PM2.5 total mass, five major particulate species (OC, EC, SO₄²⁻, NO₃⁻, and NH₄⁺), and three gaseous pollutants (CO, NOx, and NO2) from 2006 to 2008 over North Carolina (USA). Several data withholding methods are then applied to evaluate the data fusion method, and the results suggest that typical approaches may overestimate the ability of spatiotemporal estimation methods to capture pollutant concentrations in areas with limited or no monitors. The results show improvements in capturing spatial and temporal variability compared with CMAQ results. Evaluation tests for PM2.5 led to an R² of 0.95 (no withholding) and 0.82 when using 10% random data withholding. If spatially based data withholding is used, the R² is 0.73. Comparisons of DF-developed PM2.5 total mass concentrations with the spatiotemporal fields derived from two other methods (both of which use satellite aerosol optical depth (AOD) data) find that, in this case, the data fusion fields have slightly less overall error, with an RMSE of 1.28 μg/m³ compared with 3.06 μg/m³ (two-stage statistical model) and 2.74 μg/m³ (neural network-based hybrid model). Applying the Integrated Mobile Source Indicator (IMSI) method shows that the data fusion fields can be used to estimate mobile source impacts.
Overall, the growing availability of chemically detailed air quality model fields and the accuracy of the DF fields suggest that this approach is better able to provide spatiotemporal pollutant fields for gaseous and speciated particulate pollutants for health and planning studies.
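The difference between random and spatially based withholding can be made concrete with a short sketch. The code below is illustrative only and is not the authors' data fusion code: the function names, the region labels, and the `estimate_fn` callback are hypothetical, and R² is assumed here to mean the squared Pearson correlation between withheld observations and their estimates.

```python
import math
import random


def rmse(obs, est):
    """Root-mean-square error between paired observations and estimates."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs))


def r_squared(obs, est):
    """Squared Pearson correlation (the definition of R^2 assumed in this sketch)."""
    n = len(obs)
    mean_o, mean_e = sum(obs) / n, sum(est) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(obs, est))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_e = sum((e - mean_e) ** 2 for e in est)
    return cov * cov / (var_o * var_e)


def random_folds(monitors, fraction=0.10, seed=0):
    """Shuffle monitor IDs and cut them into groups of about `fraction` of the total."""
    ids = list(monitors)
    random.Random(seed).shuffle(ids)
    size = max(1, round(len(ids) * fraction))
    return [ids[i:i + size] for i in range(0, len(ids), size)]


def spatial_folds(monitor_region):
    """Group monitor IDs by region label so that whole areas are withheld together."""
    folds = {}
    for monitor, region in monitor_region.items():
        folds.setdefault(region, []).append(monitor)
    return list(folds.values())


def evaluate(folds, observed, estimate_fn):
    """Withhold each fold in turn, estimate at the withheld monitors, pool the scores.

    `estimate_fn(monitor, training)` is a hypothetical stand-in for any
    spatiotemporal estimator fitted only on the `training` observations.
    """
    obs, est = [], []
    for held_out in folds:
        held = set(held_out)
        training = {m: v for m, v in observed.items() if m not in held}
        for m in held_out:
            obs.append(observed[m])
            est.append(estimate_fn(m, training))
    return r_squared(obs, est), rmse(obs, est)
```

Under random withholding, a withheld monitor usually still has nearby monitors in the training set, whereas spatial withholding removes all monitors in an area at once. That is one plausible reading of why the abstract reports a lower R² for the spatial case.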


Keywords: Ambient air pollution; Spatiotemporal pollutant fields; Data fusion; CMAQ



We gratefully acknowledge the USEPA, especially Valerie Garcia and K. Wyat Appel, for supplying CMAQ modeling results. The work of X. Hu and Y. Liu was supported by NASA Applied Sciences Program (grant numbers NNX11AI53G and NNX14AG01G, principal investigator: Liu). This publication was funded, in part, by USEPA grant number R834799. Its contents are solely the responsibility of the grantee and do not necessarily represent the official views of the US government. Further, the US government does not endorse the purchase of any commercial products or services mentioned in the publication. We also acknowledge the Southern Company and the Electric Power Research Institute (EPRI) for their support.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Supplementary material

ESM 1 (DOC)

Table S1 (DOC)

Fig. S1: PM2.5 monitor sites (each color represents a spatially removed group)

Fig. S2: Probability density distribution of all species from 2006 to 2008

Fig. S3a: Annual average spatial distribution fields from data fusion, 2006

Fig. S3b: Annual average spatial distribution fields from data fusion, 2007

Fig. S4: Normalized monthly average concentration for all species from 2006 to 2008

Fig. S5: Annual trends of IMSI_EB, IMSI_EB,GV, and IMSI_EB,DV from 2006 to 2008 (unitless)

Fig. S6: Annual IMSI_EB, IMSI_EB,GV, and IMSI_EB,DV from 2006 to 2008

Fig. S7: Temporal correlations between IMSI and PM2.5 concentrations from 2006 to 2008

Fig. S8: Temporal correlations between PM2.5 and EC, CO, and NOx from 2006 to 2008

Fig. S9: Comparison of R² between observations and simulated datasets (CMAQ, data fusion, and 10% data-withheld data fusion) for 2006–2008

Fig. S10: Linear regression between observations (OBS) and simulations (CO, data fusion)

Fig. S11: Linear regression between observations (OBS) and simulations (NO2)

Fig. S12: Comparison of RMSE between observations and simulated datasets (CMAQ, data fusion, and 10% data-withheld data fusion) for 2006–2008 (μg/m³: PM2.5, EC, OC, NH₄⁺, NO₃⁻, SO₄²⁻; ppb: NO2, NOx, CO)

Fig. S13a: Maximum RMSD between random leave-out (first run) and data fusion among all random 10% monitor leave-out groups, from 2006 (left) to 2008 (right)

Fig. S13b: Maximum RMSD between random leave-out (second run) and data fusion among all random 10% monitor leave-out groups, from 2006 (left) to 2008 (right)

Fig. S14: Maximum RMSD between spatial leave-out and data fusion among all spatial leave-out groups, from 2006 (left) to 2008 (right)

Fig. S15: Annual average spatial distribution fields from ordinary kriging (2006, 2007, 2008)

Fig. S16a: Linear regression between OBS and ordinary kriging (PM2.5; top: total data; bottom: leave-monitor-out results)

Fig. S16b: Linear regression between OBS and ordinary kriging (CO; left: total data; right: leave-one-out results)

Fig. S17: Linear regression between observations (OBS) and the neural network-based hybrid model (hybrid)

Fig. S18: Annual average spatial distribution fields from the neural network-based hybrid model for PM2.5, 2006–2008 (12 km)

Fig. S19: Annual average spatial distribution fields from the two-stage statistical model for PM2.5, 2006–2008 (12 km)

Fig. S20: Annual average spatial distribution fields from data fusion for PM2.5, 2006–2008 (12 km)



Copyright information

© Springer Science+Business Media B.V. 2017

Authors and Affiliations

  1. Civil and Environmental Engineering Department, Georgia Institute of Technology, Atlanta, USA
  2. Department of Physics, University of Nevada Reno, Reno, USA
  3. Rollins School of Public Health, Emory University, Atlanta, USA
  4. Department of Environmental Health, Harvard T.H. Chan School of Public Health, Harvard University, Boston, USA
