Background

Numerous studies have documented associations between long-term exposure to fine particulate matter air pollution (PM2.5, particles < 2.5 μm in aerodynamic diameter) and risk of mortality. Notable cohort studies have indicated that elevated PM2.5 exposures are associated with increased risks of all-cause and cardiopulmonary mortality [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25]. Several studies have estimated the association between PM2.5 and mortality while controlling for exposures to one or more co-pollutants, such as ozone (O3), nitrogen dioxide (NO2), and sulfur dioxide (SO2) [4, 5, 13, 20]. There remains a need for further multiple-pollutant analyses that control for other common air pollutants, including coarse fraction particulate matter (PM2.5–10, particles 2.5–10 μm in aerodynamic diameter) and carbon monoxide (CO).

Related to multiple-pollutant analyses are models that examine constituents of PM2.5 rather than aggregated PM2.5 treated as a single pollutant. The composition and toxicity of PM2.5 can vary substantially based on when and where it is sampled and the distance from the pollution source [26, 27]. Exposures that occur near a pollution source may include a larger fraction of primary combustion products (black carbon and primary organic aerosol) and other local sources (industrial and road dust). Alternatively, exposure may occur farther from the source, allowing a larger fraction of aged, agglomerated, and secondary particulate matter (sulfates, nitrates, and secondary organic aerosol). This raises the question of whether PM2.5-mortality associations differ across spatial decompositions of PM2.5 pollution.

The composition of PM2.5 not only varies spatially but may also vary temporally as sources of pollution change. Furthermore, ambient pollution levels change over time, and observed health effects of PM2.5 likely depend on the window of exposure assigned to individuals in the cohort. An important question, therefore, is whether observed PM2.5-mortality associations differ across time or across windows of PM2.5 exposure.

This study uses a large, well-documented, and representative cohort of the U.S. [25] to pursue three primary objectives. First, investigate pollution-mortality associations with models that include multiple air pollutants. Second, explore differences in PM2.5-mortality associations across spatially-decomposed PM2.5 as an evaluation of whether the impact of PM2.5 depends on distance from pollution source. Third, estimate PM2.5-mortality associations in temporally-decomposed cohorts, allowing effect estimates to vary across time and for different choices of exposure window.

Methods

Study population

The cohort for this study was constructed using publicly-available National Health Interview Survey (NHIS) data from 1987 to 2014, linked with restricted-use geographic information and mortality follow-up through 2015. The sample was limited to NHIS respondents aged 18–84 at the time of survey for whom information was available regarding age, sex, race-ethnicity, income, education, marital status, smoking status, BMI, census tract, ambient air pollution, survey date, mortality status at the end of 2015, and date of death (if deceased at the end of 2015).

The NHIS is a household survey administered annually by the National Center for Health Statistics (NCHS) and designed to be representative of the civilian noninstitutionalized U.S. population [28]. Survey data were linked with the National Death Index for mortality follow-up through 2015 [29]. The construction of this cohort has been described in a previous study [25], where it was referred to as a “subcohort” of a larger NHIS cohort. This cohort, rather than the larger “full cohort” of the prior study, was chosen for the present analysis because it included information for smoking status and BMI. The NHIS design was altered periodically over the sample period, so some variables required harmonization. Data linkage was performed with permission and assistance from the NCHS. Further details on construction, harmonization, and data linkage for the NHIS cohort are documented elsewhere [25].

Air pollution data

Air pollution exposures were assigned to individuals based on their census tract of residence at the time of survey, using year-2000 Census tracts for individuals surveyed from 1987 to 2010 and year-2010 Census tracts for individuals surveyed from 2011 to 2014. Annual-average estimates of ambient air pollution were calculated for criteria pollutants (PM2.5, PM10, SO2, NO2, O3, and CO) using estimates from the v1 empirical models of Kim et al., 2018 [30], available at www.caces.us. These models employed regulatory monitoring and land-use data, and pollution estimates were calculated starting with the first year for which nationwide monitoring data were available for that pollutant (1979 for SO2, NO2, and O3; 1988 for PM10; 1990 for CO; and 1999 for PM2.5). In the case of O3, annual values are the mean for May through September of the daily maximum eight-hour moving average. O3 monitoring is not widely and routinely conducted from October through April since these months typically experience very low O3 concentrations. Estimates for each pollutant-year through 2015 were generated at the census-block level using year-2010 Census block centroids. Tract-level estimates for year-2000 Census tracts and year-2010 Census tracts were estimated by mapping year-2010 Census blocks to census tracts and then calculating a population-weighted average of the census blocks within a census tract. PM2.5 exposures prior to 1999 were estimated by multiplying a census tract’s PM10 value with the census tract’s mean PM2.5:PM10 ratio from 1999 to 2003, as explained elsewhere [25]. Values for PM2.5–10 were calculated by subtracting PM2.5 from PM10.
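
The three arithmetic steps described above (block-to-tract population weighting, back-casting PM2.5 from PM10, and deriving PM2.5–10 by subtraction) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the example numbers are hypothetical, and the handling of a tract with zero population is an assumption that the text does not address.

```python
def population_weighted_mean(block_values, block_populations):
    """Tract-level estimate as the population-weighted mean of its census blocks."""
    total_population = sum(block_populations)
    if total_population == 0:
        # Assumption: the source does not say how unpopulated tracts were handled.
        raise ValueError("tract has no population to weight by")
    weighted_sum = sum(v * p for v, p in zip(block_values, block_populations))
    return weighted_sum / total_population


def backcast_pm25(tract_pm10, mean_ratio_1999_2003):
    """Pre-1999 PM2.5: tract PM10 times the tract's mean 1999-2003 PM2.5:PM10 ratio."""
    return tract_pm10 * mean_ratio_1999_2003


def coarse_fraction(tract_pm10, tract_pm25):
    """PM2.5-10 obtained by subtracting PM2.5 from PM10."""
    return tract_pm10 - tract_pm25


# Hypothetical tract made of three census blocks (values in ug/m3).
tract_pm25 = population_weighted_mean([8.0, 10.0, 12.0], [100, 300, 600])
```

In this made-up example the most populous block dominates, so the tract value (11.0 μg/m3) sits above the unweighted block mean (10.0 μg/m3).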

In addition, spatially-decomposed PM2.5 data were generated following an approach described elsewhere [26]. Briefly, a census block’s total ambient PM2.5 was decomposed into four components, depending on the spatial variance in PM2.5 surrounding the census block. Estimating spatial decompositions involved finding and subtracting the minimum PM2.5 values within circular buffers around each census block. First, the minimum PM2.5 for census block centroids within a 100 km radius of a given census block centroid was found, and this minimum was designated as regional (> 100 km) PM2.5. After subtracting regional PM2.5, the minimum PM2.5 within a 10 km radius of the census block centroid was found, and this value was designated as mid-range (10–100 km) PM2.5. Next, the minimum value within 1 km of the block centroid was similarly used to calculate neighborhood (1–10 km) PM2.5 by subtracting regional and mid-range PM2.5. Finally, the residual PM2.5 that remained after subtracting regional, mid-range, and neighborhood PM2.5 was called local (< 1 km) PM2.5. The process was repeated for each year-2010 Census block and for each year from 2000 through 2015. Values for census tracts were calculated using population-weighted averages of year-2010 Census blocks.
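
The buffer-minimum procedure can be sketched as below for a handful of census blocks. This is an illustrative sketch under stated assumptions rather than the implementation used in [26]: block centroids are given as planar coordinates in kilometers, buffers are inclusive of their radius and always contain the block itself, and a brute-force search stands in for whatever spatial indexing a nationwide run would need. The function names and the four example blocks are hypothetical.

```python
import math


def buffer_min(index, coords_km, pm25, radius_km):
    """Minimum PM2.5 among block centroids within radius_km of block `index`."""
    x0, y0 = coords_km[index]
    return min(
        value
        for (x, y), value in zip(coords_km, pm25)
        if math.hypot(x - x0, y - y0) <= radius_km  # assumption: inclusive radius
    )


def decompose_pm25(coords_km, pm25):
    """Split each block's total PM2.5 into the four spatial components."""
    results = []
    for i, total in enumerate(pm25):
        regional = buffer_min(i, coords_km, pm25, 100.0)
        mid_range = buffer_min(i, coords_km, pm25, 10.0) - regional
        neighborhood = buffer_min(i, coords_km, pm25, 1.0) - regional - mid_range
        local = total - regional - mid_range - neighborhood
        results.append({
            "regional": regional,          # > 100 km
            "mid_range": mid_range,        # 10-100 km
            "neighborhood": neighborhood,  # 1-10 km
            "local": local,                # < 1 km
        })
    return results


# Hypothetical blocks on a line, with made-up PM2.5 values in ug/m3.
coords = [(0.0, 0.0), (0.5, 0.0), (5.0, 0.0), (50.0, 0.0)]
pm25 = [12.0, 10.0, 9.0, 6.0]
components = decompose_pm25(coords, pm25)
```

By construction the four components are non-negative and sum to the block's total PM2.5. For the first block in this example, the 12 μg/m3 total splits into 6 (regional), 3 (mid-range), 1 (neighborhood), and 2 (local).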

Statistical analyses

Statistical analyses were performed at the NCHS Research Data Center in Hyattsville, MD, using SAS (version 9.3; SAS Institute). Survival analyses were performed for all-cause and cardiopulmonary mortality, with cardiopulmonary mortality defined as mortality due to cardiovascular disease (ICD-10 codes: I00-I09, I11, I13, I20-I51), cerebrovascular disease (I60-I69), chronic lower respiratory disease (J40-J47), and influenza or pneumonia (J09-J18). Mortality hazard ratios (HRs) and 95% confidence intervals (CIs) were estimated using two versions of the Cox proportional hazards (PH) model. The first PH model, referred to as the basic PH model, controlled for age, sex, and race-ethnicity by allowing each combination of age (in one-year increments), sex, and race-ethnicity (Hispanic, non-Hispanic black, non-Hispanic white, other or unknown) its own baseline hazard function using the STRATA statement of the PHREG procedure in SAS. The second PH model, referred to as the complex PH model, controlled for age group (18–24 years and subsequent five-year age groups), sex, and race-ethnicity by including an indicator variable for each interaction of age group, sex, and race-ethnicity. The complex PH model was estimated using the SURVEYPHREG procedure in SAS, adjusting for the NHIS complex survey design, using reported survey stratum, primary sampling unit, and sample weight from mortality follow-up files [28].

Both PH models controlled for covariates by including indicator variables for each value of marital status (never married, married, separated, divorced, widowed), inflation-adjusted household income ($0–35,000; $35,000–50,000; $50,000–75,000; >$75,000), education (<high school graduate, high school graduate, some college, college graduate, >college graduate), smoking status (current, former, never), BMI (< 20, 20–25, 25–30, 30–35, > 35), U.S. Census region, urban versus rural designation, and survey year. Survival time was the number of days between survey and death. For all-cause mortality, censored survival time was the number of days between survey and mortality follow-up (31 Dec 2015). In models that considered cardiopulmonary mortality, censored survival time was the number of days between survey and mortality follow-up, or the number of days between survey and non-cardiopulmonary mortality. Pollution values were included as continuous variables in the regressions.
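
The survival-time and censoring rules above can be sketched as a single function that returns a (days, event) pair for each respondent. This is a hedged illustration only: the analyses themselves were run in SAS, and the function name, argument names, and example dates here are hypothetical.

```python
from datetime import date

FOLLOW_UP_END = date(2015, 12, 31)  # end of mortality follow-up stated in the text


def survival_record(survey_date, death_date=None, cardiopulmonary_death=False,
                    outcome="all_cause"):
    """Return (days, event), where event is 1 for the death of interest and 0 if censored."""
    if death_date is None or death_date > FOLLOW_UP_END:
        # Alive at the end of follow-up: censored at 31 Dec 2015.
        return (FOLLOW_UP_END - survey_date).days, 0
    days = (death_date - survey_date).days
    if outcome == "all_cause":
        return days, 1
    if outcome == "cardiopulmonary":
        # A non-cardiopulmonary death censors the record at the date of death.
        return days, 1 if cardiopulmonary_death else 0
    raise ValueError("outcome must be 'all_cause' or 'cardiopulmonary'")


# Hypothetical respondent surveyed on 1 Jan 2010 who died of another cause 30 days later.
all_cause = survival_record(date(2010, 1, 1), date(2010, 1, 31))
cardio = survival_record(date(2010, 1, 1), date(2010, 1, 31),
                         cardiopulmonary_death=False, outcome="cardiopulmonary")
```

The same death therefore counts as an event in the all-cause model and as a censored observation in the cardiopulmonary model.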

In models using criteria pollutants (PM2.5, PM2.5–10, SO2, NO2, O3, and CO), regressions included one, two, or six pollutants, and were estimated for both all-cause and cardiopulmonary mortality. One- and two-pollutant regression models used the basic PH model. For six-pollutant regression models, both the basic and complex PH models were employed to examine whether results were sensitive to adjusting for the NHIS complex survey design. Basic PH models were also used to estimate the associations between spatial decompositions of PM2.5 and risk of all-cause and cardiopulmonary mortality. Regressions were performed for each of the four decompositions individually and for models that included all four decompositions.

For temporally-decomposed analyses, the NHIS cohort was decomposed into 24 yearly cohorts (1992–2015), beginning in 1992 to allow up to a five-year lagged pollution-exposure window. An individual in the NHIS cohort was included in a particular year’s cohort if they were alive on 1 Jan and had been surveyed by 31 Dec of that year. For example, the 1992 cohort included those surveyed before 1992 who were alive on 1 Jan 1992, as well as those surveyed in 1992. For those who died in 1992, survival time was the number of days between 1 Jan 1992 and date of death (for individuals surveyed before 1992), or the number of days between survey date and date of death (for individuals surveyed in 1992). For those who did not die in 1992, censored survival time was the number of days between 1 Jan and 31 Dec 1992 (for individuals surveyed before 1992), or the number of days between survey date and 31 Dec 1992 (for individuals surveyed in 1992). Analogous cohorts were constructed for each year from 1993 to 2015. The construction of these yearly cohorts is illustrated in Additional file 1: Figure S1.
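
The inclusion and survival-time rules for a yearly cohort can be sketched as follows. This is a minimal illustration with hypothetical function and argument names, not the authors' code; it returns None for individuals who do not belong to the given year's cohort.

```python
from datetime import date


def yearly_cohort_record(year, survey_date, death_date=None):
    """Return (days, event) for one individual in one yearly cohort, or None if excluded."""
    start = date(year, 1, 1)
    end = date(year, 12, 31)
    if survey_date > end:
        return None  # not yet surveyed by 31 Dec of the cohort year
    if death_date is not None and death_date < start:
        return None  # not alive on 1 Jan of the cohort year
    # Follow-up starts on 1 Jan, or on the survey date if surveyed during the year.
    entry = max(start, survey_date)
    if death_date is not None and death_date <= end:
        return (death_date - entry).days, 1  # died during the cohort year
    return (end - entry).days, 0  # censored on 31 Dec


# Hypothetical individuals evaluated for the 1992 cohort.
surveyed_earlier = yearly_cohort_record(1992, date(1990, 6, 15))
surveyed_in_year = yearly_cohort_record(1992, date(1992, 7, 1), date(1992, 7, 31))
```

Applying the function for every year from 1992 through 2015 would place the same individual in each yearly cohort from their survey year until their year of death or the end of follow-up.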

Complex PH regressions were performed for all-cause and cardiopulmonary mortality for each of the 24 temporally-decomposed cohorts. In each cohort, individuals were assigned a two-year (cohort year and previous year) and five-year (cohort year and four previous years) average of ambient PM2.5 using their census tract of residence at time of survey. In addition, age was adjusted to age in cohort year. Other covariates were not updated between cohorts. Meta-analytic fixed-effect estimates of the HR associated with a 10 μg/m3 increase in mean ambient PM2.5 were calculated for all-cause and cardiopulmonary mortality using estimates generated by the 24 yearly cohorts (Comprehensive Meta-Analysis, version 3; Biostat, Englewood, NJ).
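
A fixed-effect pooled estimate of this kind is conventionally an inverse-variance weighted mean of the yearly log hazard ratios. The sketch below shows that generic calculation; it is not the implementation in the commercial software cited above, it assumes standard errors can be recovered from symmetric 95% CIs on the log scale, and the input values are invented for illustration rather than taken from the study.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def fixed_effect_pool(estimates):
    """Pool (hr, ci_low, ci_high) tuples by inverse-variance weighting on the log scale."""
    weighted_sum = 0.0
    total_weight = 0.0
    for hr, ci_low, ci_high in estimates:
        log_hr = math.log(hr)
        # Standard error recovered from the width of the 95% CI on the log scale.
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * log_hr
        total_weight += weight
    pooled_log_hr = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log_hr),
        math.exp(pooled_log_hr - Z_95 * pooled_se),
        math.exp(pooled_log_hr + Z_95 * pooled_se),
    )


# Invented yearly estimates, each individually non-significant (CI includes 1.00).
yearly = [(1.10, 1.00, 1.21), (1.10, 1.00, 1.21)]
pooled_hr, pooled_low, pooled_high = fixed_effect_pool(yearly)
```

Pooling narrows the confidence interval even when every yearly estimate is imprecise. The calculation treats the yearly estimates as independent, which overlapping yearly cohorts may not satisfy.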

Results

Table 1 presents summary statistics for the NHIS cohort. Table 2 provides summary statistics (mean, standard deviation, and interquartile range [IQR]) for the 17-year (1999–2015) averages of the six criteria pollutants (PM2.5, PM2.5–10, SO2, NO2, O3, and CO) and correlation coefficients between pollutants, within the NHIS cohort. Criteria pollutants were generally positively correlated, with the exception of PM2.5–10 and SO2 (see Table 2). Figure 1 presents heat maps for the six criteria pollutants across census tracts in the contiguous U.S.

Table 1 Baseline unweighted characteristics of the NHIS cohort
Table 2 Correlations (Pearson’s r) and summary statistics of criteria pollutants (1999–2015) in the NHIS cohort
Fig. 1

Average concentrations of criteria pollutants by 2010 Census tracts in the continental U.S., 1999–2015. PM2.5, fine particulate matter (particles < 2.5 μm in aerodynamic diameter) in μg/m3; PM2.5–10, coarse fraction particulate matter (particles 2.5–10 μm in aerodynamic diameter) in μg/m3; SO2, sulfur dioxide in ppb; NO2, nitrogen dioxide in ppb; O3, ozone in ppb, mean for May–September of daily max of eight-hour moving average; CO, carbon monoxide in ppm

Figure 2 illustrates the HRs (and 95% CIs) estimated in regression models with the six criteria pollutants, using one-, two-, and six-pollutant models. HRs and CIs in Fig. 2 are presented relative to each pollutant’s IQR. Exposure to PM2.5 was consistently associated with increased risk of all-cause and cardiopulmonary mortality, and the PM2.5-mortality associations were statistically significant and insensitive to controlling for other pollutants. Exposures to PM2.5–10 and SO2 were also associated with increased mortality risk, including in six-pollutant models, but the associations were less robust. NO2, O3, and CO were not consistently linked with excess mortality risk. In models that controlled for PM2.5, exposures to NO2 were associated with reduced mortality risk. Furthermore, O3 was not associated with excess risk of all-cause mortality in six-pollutant models, and O3-mortality associations were marginally significant in six-pollutant cardiopulmonary regression models. Estimated HRs were not sensitive to using the complex PH regression model.

Fig. 2

Illustration of regression results using six criteria pollutants, examining all-cause (left panel) and cardiopulmonary (right panel) mortality. Hazard ratios (and 95% CIs) were estimated using models that adjusted for age, sex, race-ethnicity, marital status, inflation-adjusted household income, education, smoking status, BMI, U.S. Census region, urban versus rural designation, and survey year. Hazard ratios are represented with circles when estimated using basic proportional hazards (PH) regressions, and with squares when estimated using complex PH regressions. Data used to generate plot are listed in Additional file 1: Table S1.

Because the IQR of PM2.5–10 (5.42 μg/m3) is larger than the IQR of PM2.5 (3.12 μg/m3), the pollution-mortality HRs associated with these two pollutants appear more similar in Fig. 2 than when scaled by 10 μg/m3. In the two-pollutant basic PH model with PM2.5 and PM2.5–10, the all-cause mortality HR associated with a 10 μg/m3 increase in PM2.5 is 1.12 (95% CI: 1.09, 1.15), whereas the HR associated with a 10 μg/m3 increase in PM2.5–10 is 1.02 (1.00, 1.04). Thus, when considered per 10 μg/m3, exposure to PM2.5 is associated with about six times greater excess risk than PM2.5–10.
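
The effect of the scaling choice can be reproduced with a short calculation. The sketch below assumes a log-linear exposure-response relationship, consistent with pollutants entering the regressions as continuous variables, so that an HR for one increment can be rescaled to another by scaling its logarithm. The function names are hypothetical, and the per-IQR values it yields are derived here from the per-10 μg/m3 estimates quoted above rather than read from Fig. 2.

```python
import math


def rescale_hr(hr, from_increment, to_increment):
    """Rescale a hazard ratio from one exposure increment to another (log-linear model)."""
    return math.exp(math.log(hr) * to_increment / from_increment)


def excess_risk_ratio(hr_a, hr_b):
    """Ratio of excess risks, (HR_a - 1) / (HR_b - 1)."""
    return (hr_a - 1.0) / (hr_b - 1.0)


# Two-pollutant basic PH model, all-cause mortality, per 10 ug/m3 (values from the text).
hr_pm25_per_10 = 1.12
hr_coarse_per_10 = 1.02

# Rescaled to each pollutant's IQR (3.12 and 5.42 ug/m3, respectively).
hr_pm25_per_iqr = rescale_hr(hr_pm25_per_10, 10.0, 3.12)      # about 1.036
hr_coarse_per_iqr = rescale_hr(hr_coarse_per_10, 10.0, 5.42)  # about 1.011

ratio_per_10 = excess_risk_ratio(hr_pm25_per_10, hr_coarse_per_10)     # about 6
ratio_per_iqr = excess_risk_ratio(hr_pm25_per_iqr, hr_coarse_per_iqr)  # smaller than 6
```

Under these assumptions, the roughly sixfold difference in excess risk per 10 μg/m3 shrinks when each HR is expressed per IQR, because the IQR of PM2.5–10 is the larger of the two.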

Table 3 provides summary statistics and correlations for 16-year (2000–2015) averages of spatial decompositions of PM2.5 (local PM2.5, < 1 km; neighborhood PM2.5, 1–10 km; mid-range PM2.5, 10–100 km; regional PM2.5, > 100 km), within the NHIS cohort. Although local, neighborhood, and mid-range PM2.5 are somewhat correlated, regional PM2.5 is mostly uncorrelated with local PM2.5 and negatively correlated with neighborhood and mid-range PM2.5 (see Table 3). Table 3 reports large differences in the means and IQRs of the spatial decompositions of PM2.5.

Table 3 Correlations (Pearson’s r) and summary statistics for spatial decompositions of PM2.5 (2000–2015) in the NHIS cohort

Figure 3 presents estimated HRs for all-cause and cardiopulmonary mortality from models including spatially-decomposed PM2.5. In the top panel, HRs are presented per 10 μg/m3 to assess the toxicity of spatial components of particulate matter. The same results are also presented as scaled by IQR (bottom panel) to account for differences in exposure variability across spatial decompositions of PM2.5. Regression results from models that included individual spatial decompositions were comparable to results from models that included all four spatial decompositions. Both types of models provide some evidence that local PM2.5 and neighborhood PM2.5 may be more toxic than mid-range and regional PM2.5.

Fig. 3

Illustration of spatially-decomposed analyses, presented per 10 μg/m3 (top panel) and per IQR (bottom panel). Hazard ratios (and 95% CIs) were estimated using the basic proportional hazards regression model, which adjusted for age, sex, race-ethnicity, marital status, inflation-adjusted household income, education, smoking status, BMI, U.S. Census region, urban versus rural designation, and survey year. Local PM2.5, PM2.5 generated within 1 km of residence; neighborhood PM2.5, PM2.5 generated 1–10 km from residence; mid-range PM2.5, PM2.5 generated 10–100 km from residence; regional PM2.5, PM2.5 generated over 100 km from residence; IQR, interquartile range. Data used to generate plot are listed in Additional file 1: Table S2.

Figure 4 presents results from the temporally-decomposed analysis. HRs for all-cause and cardiopulmonary mortality associated with a 10 μg/m3 increase in two-year mean PM2.5 are presented from regressions performed on the 24 temporally-decomposed cohorts. These PM2.5-mortality associations were consistent across follow-up years. Although PM2.5-mortality associations were generally not statistically significant for individual cohort years, meta-analytic estimates of pooled results were statistically significant. HRs from fixed-effect meta-analyses of HRs from the 24 cohorts are also presented for two- and five-year mean PM2.5 and for all-cause and cardiopulmonary mortality. HRs associated with two-year and five-year mean PM2.5 were nearly identical. Also presented are HRs from time-independent analyses that used the entire NHIS cohort and 17-year (1999–2015) or 28-year (1988–2015) mean PM2.5. HRs from meta-analyses of temporally-decomposed regressions were greater than HRs associated with 28-year mean PM2.5 but less than HRs associated with 17-year mean PM2.5.

Fig. 4

Illustration of temporally-decomposed and related analyses. Hazard ratios (and 95% CIs) for temporally-decomposed cohort analyses were estimated using the complex proportional hazards regression model, adjusting for age, sex, race-ethnicity, marital status, inflation-adjusted household income, education, smoking status, BMI, U.S. Census region, urban versus rural designation, and survey year. Cardiopulmonary mortality is based on ICD-10 codes and includes: cardiovascular disease (I00-I09, I11, I13, I20-I51), cerebrovascular disease (I60-I69), chronic lower respiratory disease (J40-J47), and influenza and pneumonia (J09-J18). Data used to generate plot are listed in Additional file 1: Table S3.

(a) Time-independent estimate using 17-yr (1999–2015) mean PM2.5 in the complex proportional hazards regression model [25].

(b) Time-independent estimate using 28-yr (1988–2015) mean PM2.5 in the basic proportional hazards regression model, with back-casted PM2.5 data for 1988 through 1998 [25].

(c) Cohort results using two- and five-year lagged PM2.5 were pooled using fixed-effect (FE) meta-analysis.

Discussion

This study advances our understanding of mortality risk associated with long-term exposure to PM2.5 in several ways. First, it illustrates that the PM2.5-mortality association within a large cohort is not highly sensitive to controlling for other air pollutants. Second, results from multiple-pollutant models indicate that, while mortality risk associated with PM2.5 exposure was the most prominent and robust result, exposures to elevated levels of SO2 and PM2.5–10 were also consistently linked to excess mortality risk. Third, regressions using spatially-decomposed PM2.5 suggest that more spatially variable components (< 10 km) of PM2.5 exposures may be more toxic. Fourth, mortality risk was significantly associated with all spatial decompositions of PM2.5, indicating that the PM2.5-mortality association within the U.S. is likely not the result of exclusively regional or local confounders. Fifth, the temporally-decomposed analysis indicates that PM2.5-mortality associations were largely consistent over time within the NHIS cohort, but provides incomplete evidence regarding the most relevant window of pollution exposure.

The robustness of the PM2.5-mortality association has been reported by various studies, including studies using two- or three-pollutant models [4, 5, 13, 20]. Our results regarding risks associated with other air pollutants, however, were less congruent with existing literature. For example, this study found a relatively stable association between PM2.5–10 and mortality, which contrasts with the lack of consistent associations in similar cohort studies [31]. Similarly, previous studies examining the effect of long-term O3 exposures reported results that remained significant when controlling for PM2.5 and NO2 [4, 13, 20], while this study found that the association was stable except in six-pollutant models. The mortality association with NO2 was extremely sensitive to the inclusion of other pollutants, especially PM2.5. Ultimately, the clearest signals emerging from multiple-pollutant regressions were that the PM2.5-mortality association was the most robust among these pollutants and that the mortality associations of other pollutants require further investigation.

The spatially-decomposed analyses are interesting because they provide insight into different components of PM2.5. PM2.5 is composed largely of regional and mid-range components, which are presumably dominated by secondary material (sulfates, nitrates, and secondary organic aerosol). The neighborhood and local components contribute a relatively small fraction of the PM2.5 mass (6% and 17%, respectively) but are presumably more influenced by local emissions and therefore composed of combustion emissions (black carbon and primary organic aerosol) and other local sources (industrial and road dust). As illustrated in Fig. 3, these results provide some evidence that local PM2.5 and neighborhood PM2.5 may be more strongly associated with mortality risk than regional PM2.5. Near-source PM2.5 was also more strongly associated with mortality risk than regional PM2.5 in another large U.S. cohort [32]. An implication of these results is that reliance on PM2.5-mortality associations that are driven largely by regional differences in pollution may underestimate the health effects of exposure to local sources of pollution.

Strengths of the NHIS cohort, described previously [25], include the availability of detailed documentation, precise geographic information, large sample size, representativeness of U.S. adults, and individual-level controls for age, race-ethnicity, sex, smoking status, education, BMI, marital status, and income. Other strengths of this study include (a) the robustness of the PM2.5-mortality association in multiple-pollutant models that included modeled air pollution estimates for six criteria pollutants; (b) the ability to examine the stability of other pollutant-mortality associations in multiple-pollutant models; (c) the use of spatially-decomposed PM2.5 data to investigate whether the toxicity of PM2.5 depended on proximity to source; and (d) temporally-decomposed analyses, which allowed exposures and mortality effects to vary between years and facilitated comparisons of different windows of exposure.

This study also has important limitations. Like all observational studies, it was hindered by a lack of random exposure assignment, meaning it was susceptible to confounders that were unobserved or inadequately controlled for. Another limitation was the lack of follow-up for most individual-level data, including residential census tract, smoking status, marital status, and income. In multiple-pollutant analyses, correlations among pollutants limit the ability to estimate independent associations between mortality risk and specific pollutants. For example, the correlation between PM2.5 and NO2 likely contributed to instability in the estimated effect of NO2 exposures; in models that controlled for PM2.5, NO2 was linked with decreased mortality risk. Similarly, in the temporally-decomposed analyses, the correlation of PM2.5 exposures over time made it difficult to determine the most relevant exposure window. In addition, the lack of variation in PM2.5-mortality associations between years may reflect a lack of independence between yearly cohorts, in which case the standard errors from fixed-effect meta-analytic estimates may be underestimated.

Conclusions

Associations between long-term exposure to PM2.5 air pollution and mortality risk were robust to controlling for co-pollutants, observed across different spatial decompositions of PM2.5, and consistent over temporal decompositions of PM2.5. There was some evidence of increased toxicity for PM2.5 exposures that occurred closer to pollution sources. Exposures to SO2 and PM2.5–10 were also linked to mortality risk, even when controlling for other air pollutants.