Introduction

Exposure to ambient particulate matter (PM) air pollution has been consistently associated with increased mortality and morbidity in epidemiological studies [1–7]. Most recent epidemiological studies focus on the mass concentration of particles of a specific size distribution, for example, PM less than 2.5 μm (PM2.5) in aerodynamic diameter; however, PM is a spatially and temporally varying mixture of chemical constituents including elemental carbon (EC) and organic carbon (OC) species, ions such as sulfate and nitrate, and elements such as silicon and nickel [8]. The spatial and temporal variation in PM composition may be driving previously observed spatial and temporal differences in estimated health effects of total PM mass [1, 4, 9]. Since studies of both short-term and long-term exposures to PM constituents have demonstrated that associations with adverse health outcomes vary by constituent [10–13], determining which chemical constituents of PM are most harmful is necessary in order to develop better strategies to protect human health.

Large-scale epidemiological studies of PM constituents and health frequently use ambient, fixed-location monitors to assign exposure and then estimate associations with adverse health outcomes using single pollutant regression models, which is the same approach developed for and applied in epidemiological studies of total PM mass and other major ambient air pollutants. While this approach provides a general framework for estimating associations between air pollution and health, studies of PM constituents and health are faced with particular challenges. First, ambient monitors measuring PM constituents are particularly sparse across space and typically do not measure concentrations daily [14]. This presents problems for population-based epidemiological studies, which aim to assign exposures to populations spread over large areas (e.g., counties or cities), larger than reasonably represented by available monitoring data given the spatial heterogeneity of many PM constituents [15]; lack of daily monitoring data also limits investigation of multiday, lagged effects of PM constituent exposures. Second, PM constituents generated by the same source are often correlated with each other, and PM constituents, which are by definition fractions of total PM, may also be correlated with total PM mass [8]. Thus, it can be difficult to disentangle the health effects of one PM constituent from the health effects of other PM constituents or total PM mass in epidemiological studies. Last, other sources of measurement error, such as error related to instruments or sampling, may be greater for measurements of PM constituents than for total PM mass concentrations. As an example, many PM constituents (such as metals and specific organic species) contribute minimally to total PM by mass [8], and low concentrations of these constituents often cannot be confidently differentiated from zero when they fall below their method detection limits (MDL) [16–18]. Two previous papers outlined some of the challenges present in epidemiological studies of PM constituents [19, 20]. A review of current methods and challenges is needed as the number of studies of PM constituents and health has increased in recent years.

In this review, we summarize a literature search of how recent large, population-based epidemiological studies address these challenges and we highlight where further development of statistical methods is most needed. While our focus is large-scale epidemiological studies, smaller studies such as panel studies have a different set of challenges that would be useful to review in future work. We do not review the substantive findings of these epidemiological studies of PM constituents and health, which have been previously summarized elsewhere [21, 22].

Literature Search

We systematically reviewed large, epidemiological studies of the associations between short-term and long-term exposures to PM constituents and adverse health outcomes. In our main search, we identified 1872 citations in PubMed through 31 March 2015 that contained at least one term from each of four categories: PM, constituents, health, and study design (Table 1). After reviewing 1872 citations, we identified 747 relevant, peer-reviewed, epidemiological studies of PM constituents and health. Studies were primarily excluded because they did not analyze PM constituents or health. We limited our review to only large (>500 individuals), population-based studies of multiple PM constituents. We did not review panel studies because they frequently are able to better characterize pollution exposure and they have a different set of challenges than those facing larger epidemiological studies. The process of our literature review is depicted in Fig. 1. Our review identified 15 cohort studies of long-term exposure to PM constituents, 77 time series and case-crossover studies of short-term exposure to PM constituents, and 11 birth cohort studies. From all these studies, we determined the statistical methods commonly used to address major challenges including assigning exposure in the presence of spatial and temporal heterogeneity, disentangling health effects of individual constituents, and instrument-related measurement error. Because of the large number of relevant studies identified (n = 103), we summarize recent studies that exemplify unique approaches to address these challenges in Table 2.

Table 1 Terms used to search PubMed for epidemiological studies of particulate matter constituents and adverse health outcomes
Fig. 1 Depiction of literature search conducted on PubMed for epidemiological studies of particulate matter constituents and adverse health outcomes through 31 March 2015

Table 2 Summary of current methods used to handle different challenges in studies of particulate matter (PM) constituents and adverse health outcomes

Assigning Exposure in the Presence of Spatial and Temporal Heterogeneity

Cohort, time series, and case-crossover studies rely on spatial and temporal variability in pollutant concentrations to estimate associations with adverse health outcomes. Cohort studies frequently estimate long-term exposure to pollution (e.g., over years) and compare the spatial distributions between pollutant exposures and adverse health outcomes across an area. In time series studies, temporal (e.g., day-to-day) variability is compared between the pollutant and adverse health outcome for a given area. Time-stratified case-crossover studies also compare temporal variability in the pollutant and outcome, but exposure is assigned to each individual and compared between the case day, when the outcome occurred, and control days, when the outcome did not occur. In some cases, cohort studies, and in particular birth cohort studies, use both spatial and temporal variability in the pollutant and outcome to estimate associations.

Fixed-location networks of ambient pollution monitors provide information about the spatial and temporal variability of pollutant concentrations and these data are commonly used to assign exposure in epidemiological studies; however, as noted above, ambient monitors that measure PM constituent concentrations are frequently spatially sparse. For example, the EPA Chemical Speciation Network (CSN) is a network of approximately 250 sites across the entire USA with only a small number (generally ≤3) of monitors in each urban area, compared with over 2000 monitors across the USA for PM2.5 [11, 14]. In addition, ambient monitoring of PM constituents generally is only conducted every third or sixth day [10, 14]. It is challenging to estimate the true spatial and temporal variation of PM constituent concentrations using the available monitoring data. In this section, we discuss the various ways epidemiological studies have assigned exposure to ambient PM constituents, which are frequently temporally and spatially heterogeneous due to their chemical and physical properties.

Cohort Studies

Because cohort and birth cohort studies rely primarily on spatial variability to estimate associations with health, predicting the spatial surface of local pollutant concentrations is important. Incorrect prediction of the spatial surface can lead to biases in estimated health effects. The spatial surface is commonly estimated using the observed pollutant concentrations at ambient monitors. Frequently, these locations do not match the locations where we observe health outcomes (e.g., residential addresses), and this difference in spatial locations between pollution data and health outcome data is referred to as spatial misalignment in the spatial statistical literature. One method for assigning exposure to PM constituents and accounting for spatial misalignment in cohort studies is to use the long-term average concentration from the closest ambient monitor within a certain distance of an individual’s residential address [13, 23–25]. Other studies of PM constituents have estimated long-term exposure using unweighted or distance-weighted average concentration of several neighboring monitors [26–29]. A comparison of these three approaches (the closest ambient monitor, the unweighted mean of closest monitors, and the weighted mean of closest monitors) found estimated health effects were mostly consistent across approaches [30]. These approaches to predict the exposure surface are both easy to understand and apply, but they often will fail to fully characterize the true concentrations of highly variable PM constituents at residential addresses.
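
The three monitor-based assignment rules described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure of any cited study: the monitor coordinates and concentrations are hypothetical, locations are treated as planar coordinates in kilometres, and inverse-distance weights are only one possible choice of distance weighting.

```python
import math

# Hypothetical monitors: (x_km, y_km, long-term mean concentration in ug/m^3).
MONITORS = [(0.0, 0.0, 1.2), (10.0, 0.0, 0.8), (0.0, 20.0, 0.5)]

def _within(home, monitors, max_km):
    """Return (distance_km, concentration) pairs for monitors within max_km of home."""
    near = []
    for x, y, conc in monitors:
        d = math.hypot(x - home[0], y - home[1])
        if d <= max_km:
            near.append((d, conc))
    return near

def nearest_monitor(home, monitors, max_km):
    """Concentration at the closest monitor within max_km, or None if there is none."""
    near = _within(home, monitors, max_km)
    return min(near)[1] if near else None

def unweighted_mean(home, monitors, max_km):
    """Simple average over all monitors within max_km, or None if there is none."""
    near = _within(home, monitors, max_km)
    return sum(c for _, c in near) / len(near) if near else None

def distance_weighted_mean(home, monitors, max_km, power=1.0):
    """Inverse-distance weighted average (assumed weighting: 1 / distance**power)."""
    near = _within(home, monitors, max_km)
    if not near:
        return None
    for d, c in near:
        if d == 0.0:  # residence coincides with a monitor
            return c
    weights = [1.0 / d ** power for d, _ in near]
    return sum(w * c for w, (_, c) in zip(weights, near)) / sum(weights)
```

All three functions return None when no monitor lies within the chosen radius, mirroring the exclusion of residences that are too far from any monitor.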

Some cohort studies apply spatial models to ambient monitoring data to assign exposure to PM constituents. To obtain exposure at locations without monitoring data, some simple spatial models predict the full exposure surface by modeling the dependence between monitoring locations as a function of distance between monitors, possibly incorporating spatially varying covariates. Kriging methods are commonly applied spatial models that use a Gaussian process to model the correlation between monitoring locations. Instead of modeling the dependence between monitors directly, land use regression models use spatially varying covariates to predict the exposure surface at locations where covariate information is available but constituent concentrations are not. The predictors are derived using Geographic Information Systems (GIS) data and may include characteristics such as traffic intensity, population density, and land use. Many cohort studies have used spatial or land use regression models to predict exposure to PM constituents [31–38]. To characterize a pollutant’s spatial variation, data from many ambient monitors with good spatial coverage over the area of interest are usually necessary, though such data are not always available in smaller studies. Hence, spatial models and land use models cannot generally be used in single community studies unless supplemental data are collected.

While national monitoring networks are frequently used in cohort studies, some cohort studies have conducted additional measurement campaigns for ambient PM constituents. The European Study of Cohorts for Air Pollution Effects (ESCAPE) study sampled ambient pollution in 20 major European cities at 20 sites each [33, 34, 37, 38]. In the USA, the Multi-Ethnic Study of Atherosclerosis (MESA) included fixed-site ambient monitors placed in densely populated areas underrepresented by other monitoring networks as well as rotating monitors placed outside a sample of subjects’ homes [32, 35, 36]. The large number of ambient monitors allowed these studies to fit complex spatio-temporal models for PM constituents; however, these campaigns are expensive and time consuming.

Several birth cohort studies in our review used the Community Multiscale Air Quality (CMAQ) model, which simulates spatially resolved PM2.5 constituent concentrations using meteorology, state-of-the-art knowledge on atmospheric chemistry, and emission information [39–41]. One of these studies applied a spatio-temporal model using CMAQ data to identify critical exposure windows in the associations between PM2.5 constituents and congenital anomalies [39]. In a study of low birth weight, a chemical transport model was applied to estimate spatially resolved exposure to PM constituents [42]. While using modeled PM constituent concentrations provides more spatially resolved exposure information than using available ambient monitoring data alone, these simulated concentrations can be biased and need to be calibrated using ambient monitoring data within the study area [41, 43, 44].

Time Series and Case-Crossover Studies

Associations between short-term exposure to PM constituents and adverse health outcomes are frequently estimated using time series or time-stratified case-crossover methods. In these epidemiological designs, the selected exposure time frame generally ranges from same-day exposure to exposure several days preceding the outcome. Most studies in our review estimated associations with adverse health outcomes for a single day of exposure (e.g., same-day exposure), often because they did not have daily PM constituent concentrations. Previous studies of total PM mass have shown that the estimated effects on adverse health outcomes can span multiple exposure days [1, 45, 46]. Estimating PM constituent health effects corresponding to multiple days of exposure using non-daily data requires temporal imputation, which has not been previously attempted and is challenging because observed PM constituent concentrations may be substantially different than unobserved concentrations on subsequent and previous days. A non-daily sampling schedule can impact health effect estimation, possibly driven both by decreased power and by random variability across the sampled subsets of health and pollution data [47•]. When daily data are available, studies have averaged exposure over multiple days preceding the outcome of interest [48, 49]. Epidemiological studies of PM constituents with access to daily data have also estimated associations using unconstrained and constrained distributed lag models, which simultaneously estimate associations for multiple exposure days and allow estimation of cumulative effects [50–56].
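
One way to write the distributed lag models mentioned above is sketched below for a log-linear time series model. The notation is introduced here for illustration and is not taken from the cited studies: Y_t is the outcome count on day t, x_{t-l} is the constituent concentration l days earlier, z_t collects confounders, L is the maximum lag, and b_j are assumed basis functions of the lag.

```latex
% Unconstrained distributed lag model and cumulative effect
\log \mathrm{E}[Y_t] = \alpha + \sum_{l=0}^{L} \beta_l \, x_{t-l} + \mathbf{z}_t^{\top} \boldsymbol{\gamma},
\qquad
\beta_{\mathrm{cum}} = \sum_{l=0}^{L} \beta_l

% Constrained version: lag coefficients follow a smooth function of lag
\beta_l = \sum_{j=1}^{J} \theta_j \, b_j(l), \qquad J < L + 1
```

Fitting either form requires the concentrations x_t, ..., x_{t-L} on consecutive days, which is why these models have been limited to studies with daily constituent data.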

Spatial variability of the exposure in time series studies is also important to consider because pollution is often measured at fixed-site ambient monitors while adverse health outcomes are aggregated across administrative regions such as cities or counties, a problem of spatial misalignment. To account for spatial misalignment between concentrations and outcomes in time series designs, most studies used either one ambient monitor or an average of ambient monitors to represent concentrations for an entire administrative region (e.g., [11, 47•, 57–59]). Other studies have used population-weighted averages that utilize spatially interpolated PM2.5 constituents at a finer spatial resolution within the administrative region [48, 60, 61]. A study of pediatric asthma emergency department visits in Atlanta found that estimated health effects of PM2.5 constituents did not vary substantially when the ambient average was estimated in three ways: using one designated central monitor, the unweighted average of ambient monitors, or a population-weighted average [48].
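
A population-weighted average of the kind described above can be sketched as follows. The grid cells, populations, and interpolated concentrations are hypothetical, and the function assumes the spatial interpolation step has already been carried out.

```python
def population_weighted_mean(cells):
    """Population-weighted regional average.

    cells: sequence of (population, concentration) pairs, one per grid cell,
    where the concentration is an interpolated value for that cell.
    """
    cells = list(cells)
    total_pop = sum(pop for pop, _ in cells)
    if total_pop <= 0:
        raise ValueError("total population must be positive")
    return sum(pop * conc for pop, conc in cells) / total_pop

# Hypothetical region with two grid cells (population, ug/m^3).
example_cells = [(1000, 2.0), (3000, 1.0)]
regional_average = population_weighted_mean(example_cells)
```

In this toy example the weighted average is pulled toward the concentration in the more populous cell, whereas an unweighted average would treat both cells equally.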

These approaches, while straightforward, will not provide a good estimate of the true ambient average for an administrative region if concentrations of the PM constituent are spatially heterogeneous and only a small number of monitors are available. Spatial heterogeneity exacerbates the spatial misalignment problem for time series studies because the true ambient average pollution concentration frequently has less temporal variability than the observed concentrations of a pollutant at a single ambient monitor. Therefore, even for spatially heterogeneous constituents with high temporal correlations across space, spatial misalignment can lead to error in the estimated ambient average concentration, which is needed when linking to aggregated health outcomes. An investigation of various sources of exposure error in time series studies found that attenuation in estimated health effects driven by spatial error was greater for EC, a spatially heterogeneous constituent, than for sulfate, a spatially homogeneous constituent [62•].

When data are available from many monitors, spatio-temporal models can be used to account for some spatial misalignment. In a study of PM2.5 constituents and hospital admissions in 20 US counties, spatial models were used to estimate county-wide ambient concentrations and the corresponding spatial misalignment error variance for each PM2.5 constituent [63]. Another national-level US study represented constituents as time-varying proportions of PM2.5 and fitted a spatial model to PM2.5 mass to account for spatial misalignment [64]. Because these models rely solely on monitoring data, which provide limited spatial information and are not randomly placed throughout an administrative region, they may not fully capture a pollutant’s spatial variability.

In case-crossover studies, exposure to pollution is assigned to each individual. The most common approach to account for spatial misalignment between the individual’s residential address and the ambient constituent monitor is to use only residential addresses located within a certain distance of an ambient monitor [65•, 66], which will be most effective for spatially homogeneous pollutants such as sulfate. A study of myocardial infarction in New Jersey assigned exposure to PM2.5 constituents using monitor-calibrated CMAQ concentrations based on the nearest CMAQ grid-cell containing an ambient monitor [44]. CMAQ yields PM constituent concentrations with more complete spatio-temporal coverage than ambient monitors alone provide; however, proper calibration of CMAQ outputs requires data from more ambient monitors than are generally available for PM constituents.

Disentangling Health Effects of Individual Constituents

Disentangling health effects corresponding to PM constituents from those corresponding to total PM mass within epidemiological studies is challenging. Previous studies have consistently demonstrated that total PM mass is associated with adverse health outcomes [1, 2, 5], and epidemiological studies of PM constituents may estimate significant health effects solely because PM constituents are correlated with total PM mass. Many studies used standard covariate adjustment for total PM mass in models of PM constituents and adverse health outcomes (e.g., [36, 38, 41, 67, 68]), though this approach can lead to large standard errors for constituents highly correlated with total PM. Instead of adjusting for total PM2.5 directly, several studies estimated associations between PM2.5 constituents and health while controlling for the leftover PM2.5 mass (total PM2.5 mass − PM2.5 constituent) [51, 53]. This approach minimizes multicollinearity between the constituent and total PM2.5, but the interpretation of health effects can be challenging since increases in the PM2.5 constituent correspond to decreases in the leftover PM2.5 fraction. Another alternative is to first isolate the PM constituent from total PM mass by regressing the constituent on total PM mass, which yields residuals representing the portion of the PM constituent uncorrelated with total PM. These residuals can then be used in health effect regression models [65•, 69], though the resulting estimated health effects do not represent the total effect of a PM constituent on adverse health outcomes [65•]. Studies have also estimated health effects for total PM2.5 mass and determined whether the estimated health effect was modified by each PM2.5 constituent [44, 70, 71]. This approach is appealing because the results indicate whether PM2.5 toxicity increases with a higher proportion of a particular constituent, but it does not allow direct estimation of constituent health effects. Mostofsky et al. [65•] discussed the relative advantages and disadvantages of approaches for controlling for total PM mass in epidemiological studies of PM constituents.
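
The leftover-mass and residual approaches described above differ only in how the exposure variables are constructed before the health model is fitted. The sketch below builds both sets of variables; the daily concentrations are hypothetical, and the residual step assumes a simple linear regression of the constituent on total PM2.5 mass.

```python
from statistics import fmean

def ols_simple(x, y):
    """Return (intercept, slope) from a least-squares regression of y on x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def leftover_mass(total_pm, constituent):
    """Total PM2.5 mass minus the constituent, day by day."""
    return [t - c for t, c in zip(total_pm, constituent)]

def constituent_residuals(total_pm, constituent):
    """Part of the constituent not linearly explained by total PM2.5 mass."""
    intercept, slope = ols_simple(total_pm, constituent)
    return [c - (intercept + slope * t) for t, c in zip(total_pm, constituent)]

# Hypothetical daily concentrations (ug/m^3), for illustration only.
pm25 = [8.0, 12.0, 15.0, 20.0, 25.0]
ec = [0.5, 0.9, 0.8, 1.4, 1.5]

leftover = leftover_mass(pm25, ec)         # enters the health model alongside ec
residual = constituent_residuals(pm25, ec)  # enters the health model in place of ec
```

By construction the residuals are uncorrelated with total PM2.5 mass in the sample, which is why a health effect estimated from them reflects only the variation in the constituent that is independent of total mass.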

Other epidemiological studies have used Bayesian models to account for the relationships between PM constituents and total PM mass. One approach is to estimate community-level health effects of total PM and then apply a Bayesian hierarchical model to determine whether community- and season-specific fractions of PM2.5 constituents explain variability in estimated health effects of PM [72]. Similarly, fully Bayesian hierarchical models can be developed to allow long-term average PM2.5 constituent concentrations to impact the associations between PM2.5 and adverse health outcomes [73]. Constituent health effects can also be modeled simultaneously by representing each as a time-varying proportion of total PM2.5 mass in a Bayesian hierarchical model [64]. Though these Bayesian models require a considerable number of ambient monitors, these models leverage the spatially and temporally resolved PM2.5 ambient monitoring data in their analysis of PM2.5 constituents.

Similar to the issues in accounting for PM2.5 mass, many studies have simultaneously included multiple constituents in health effect regression models in an attempt to disentangle the health effect of one PM constituent from other constituents (e.g., [11, 26, 30, 37, 53, 60]). However, there is substantial correlation between PM constituents, driven by shared sources and meteorology, and simultaneously estimating their health effects can lead to multicollinearity and unstable estimates of association. Because of this correlation, health effects may be observed for a non-toxic PM constituent in single pollutant models when another unobserved, but correlated, constituent is truly toxic. A study of 119 US counties compared the estimated health effect magnitudes between PM2.5 constituents by comparing the posterior probabilities that one constituent’s effect exceeded another using a multivariate Bayesian hierarchical model [74]. While this approach helps to disentangle the health effects of PM2.5 constituents, it does not directly address multicollinearity because the approach still fits multivariable regression models to each county. Other studies used variable selection procedures to determine the PM constituents most associated with health. Forward selection in a multipollutant model has been applied to determine the PM2.5 constituents that were most associated with mortality [13]. Spatial variable selection models can be used to simultaneously estimate health effects of multiple PM2.5 constituents while imposing some regularity constraints to account for multicollinearity [75•]. In variable selection models, only those constituents that best predict the outcome are included in the final model. In this strategy, the selection of constituents may be driven by those that are best measured.

Instead of estimating health effects of individual PM constituents, some studies estimated health effects of groups of PM constituents. To estimate joint effects of PM constituents, two studies first estimated individual effects in a multivariate regression model and then summed the resulting estimated coefficients [61] or applied a second-stage regression model [76]. Many studies estimated associations between exposure to PM sources, which represent groups of constituents, and adverse health outcomes (e.g., [25, 29, 42, 50, 59, 66, 77, 78]). In most cases, sources are not directly measured and must be inferred from PM chemical constituent concentrations using source apportionment models [25, 29, 77, 78]. A simpler alternative to source apportionment modeling is to recreate sources of PM (e.g., soil dust) using a linear function of constituents associated with that source [79]. Other studies have used k-means clustering [80] and Bayesian modeling [81] to cluster study days based on properties of PM, including chemical composition, and determined whether the association between PM2.5 and health varied by cluster. Estimating health effects for groups of PM constituents reduces multicollinearity at the expense of not differentiating the effects of one constituent from other constituents.
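
As a sketch of the coefficient-summing approach to joint effects, suppose a multipollutant model yields estimated coefficients for K constituents. The notation below is assumed for illustration: the increments delta_k are the concentration changes of interest (for example, interquartile ranges), and setting every delta_k to 1 gives the plain sum of coefficients.

```latex
\hat{\beta}_{\mathrm{joint}} = \sum_{k=1}^{K} \delta_k \, \hat{\beta}_k,
\qquad
\widehat{\mathrm{Var}}\bigl(\hat{\beta}_{\mathrm{joint}}\bigr)
  = \sum_{j=1}^{K} \sum_{k=1}^{K} \delta_j \, \delta_k \,
    \widehat{\mathrm{Cov}}\bigl(\hat{\beta}_j, \hat{\beta}_k\bigr)
```

Because the variance includes the covariances between coefficient estimates, the joint effect can be estimated more precisely than the individual effects when correlated constituents produce negatively correlated coefficient estimates.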

Instrument-Related Measurement Error

PM can consist of over 50 chemical constituents and many of these constituents, such as transition metals, contribute minimally to PM2.5 by mass, but may be associated with adverse health outcomes [8, 23, 26]. These constituents may be more prone to instrument-related measurement error than major ambient pollutants in part because they frequently have low concentrations in ambient air, and depending on the sensitivity of the specific measurement methods used, their concentrations can often be below the MDL [16–18]. In our literature review, most studies did not mention how data below the MDL were treated. In several studies, concentrations below the MDL were replaced by half the MDL, a common practice in environmental studies [69, 75•, 82]. A study of hourly PM2.5 constituent concentrations in Seoul, South Korea found that estimated associations between PM2.5 constituents and mortality did not vary substantially between using the raw data below the MDL, substituting 1/2 MDL, and omitting values below the MDL, though the percentage below the MDL varied across constituents from 0.0 % (OC) to 43.4 % (sodium) [82]. In studies of constituents with a larger percentage of concentrations below the MDL or studies of daily concentrations, the estimated health effects may be more sensitive to the method used to adjust data below the MDL. A large number of studies included only those constituents with few observations below the MDL [23, 25, 34, 53, 68–71, 76, 80, 83–85], though the proportion allowed varied between studies. Development of imputation approaches [86] or incorporating modeled estimates from dispersion or emissions models may aid health effect estimation for constituents with concentrations below the MDL.
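
The three treatments of values below the MDL that were compared above (raw values, substitution of half the MDL, and omission) can be sketched as follows. The function names, the example concentrations, and the MDL are hypothetical; omitted values are returned as None so that the series stays aligned with the health outcome data.

```python
def handle_below_mdl(values, mdl, method="half"):
    """Treat concentrations below the method detection limit (MDL).

    method="raw":  keep reported values unchanged
    method="half": replace values below the MDL with MDL / 2
    method="omit": replace values below the MDL with None (treated as missing)
    """
    if method == "raw":
        return list(values)
    if method == "half":
        return [v if v >= mdl else mdl / 2.0 for v in values]
    if method == "omit":
        return [v if v >= mdl else None for v in values]
    raise ValueError(f"unknown method: {method!r}")

def fraction_below_mdl(values, mdl):
    """Proportion of observations falling below the MDL."""
    return sum(v < mdl for v in values) / len(values)

# Hypothetical concentrations (ug/m^3) and MDL, for illustration only.
nickel = [0.02, 0.10, -0.01, 0.30]
NICKEL_MDL = 0.05
```

Computing the fraction below the MDL first gives a simple screen for deciding whether a constituent should be analyzed at all, as in the studies that restricted analyses to constituents with few observations below the MDL.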

Measurement error in PM2.5 constituent data from ambient monitors can also be driven by the sampling method, the amount of humidity in the air during measurements, loss of volatile particles, electrostatic charge, deposition of additional particles on the filter before or after sampling, or other artifacts [87]. Studies of PM2.5 constituents have minimized measurement error by using only concentration quantiles [44, 88]. Air quality models including CMAQ and chemical transport models do not suffer from the same sources of measurement error as monitoring data because they simulate pollutant concentrations instead of measuring them directly in the air [40–42, 44]. However, these models still rely on monitoring data for bias calibration, and thus accurate and precise monitoring data remain a key part of exposure assessment.

Discussion

In this review, we identified three challenges present in epidemiological studies of PM constituents and adverse health outcomes, including assigning exposure in the presence of spatial and temporal heterogeneity, disentangling health effects of individual constituents, and overcoming instrument-related measurement error. One additional consideration is the development and application of statistical methods that address several of these issues simultaneously. As an example, some PM constituents are emitted by the same, locally generated source (e.g., EC and OC from traffic emissions) and this can drive both the spatio-temporal heterogeneity and the correlation between the PM constituents. To address spatial heterogeneity for multiple PM2.5 constituents, spatial models can be combined with a Bayesian hierarchical model that allows PM2.5 constituents to impact the associations between total PM2.5 mass and mortality [73]. Using a spatial model to estimate PM constituents can introduce measurement error in health effect models and a bootstrap-based measurement error correction model can be used to correct some of this error [31]. If the exposure lag most associated with a health outcome varies between PM constituents, which are not measured daily, it is unclear how to estimate associations in multipollutant models [29]. Additional statistical methods are still needed to address other combinations of these challenges.

When comparing estimated health effects from single pollutant models across constituents, differential measurement error between constituents (driven by differences in spatial heterogeneity, sampling method errors, etc.) can lead to observed associations only for the pollutant measured with smaller error [62•]. A common approach in studies of multiple PM constituents is to estimate their respective health effects in multipollutant models. However, in the case of differential measurement error, the health effects can be transferred from the truly toxic pollutant to one that is correlated with the toxic pollutant, but measured with lower error [21, 89]. Few epidemiological studies of multiple PM constituents and health have incorporated methods for handling measurement error, though methods have been developed for and applied to ambient pollutants [90–92].
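
The transfer of effects under differential measurement error can be illustrated with a small simulation. This is a simplified sketch under assumed settings, not a reanalysis of any cited study: the outcome is continuous and linear rather than a count, only the first constituent truly affects the outcome, the two true exposures have correlation rho, and the error standard deviations are hypothetical.

```python
import random

def fit_two_predictor_ols(a, b, y):
    """Least-squares slopes for y regressed on a and b (with intercept)."""
    n = len(y)
    ma, mb, my = sum(a) / n, sum(b) / n, sum(y) / n
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    say = sum((ai - ma) * (yi - my) for ai, yi in zip(a, y))
    sby = sum((bi - mb) * (yi - my) for bi, yi in zip(b, y))
    det = saa * sbb - sab ** 2
    return (sbb * say - sab * sby) / det, (saa * sby - sab * say) / det

def simulate_effect_transfer(n=20000, rho=0.8, sd_err_toxic=1.0,
                             sd_err_other=0.1, seed=1):
    """Return estimated slopes (toxic, other) from error-prone measurements."""
    rng = random.Random(seed)
    w_toxic, w_other, y = [], [], []
    for _ in range(n):
        x_toxic = rng.gauss(0, 1)
        # Second constituent is correlated with the first but has no effect on y.
        x_other = rho * x_toxic + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        y.append(1.0 * x_toxic + rng.gauss(0, 1))
        # Observed values: the toxic constituent is measured with larger error.
        w_toxic.append(x_toxic + rng.gauss(0, sd_err_toxic))
        w_other.append(x_other + rng.gauss(0, sd_err_other))
    return fit_two_predictor_ols(w_toxic, w_other, y)
```

Under these settings the coefficient for the non-toxic but well-measured constituent is larger than the coefficient for the truly toxic, poorly measured one, even though both are included in the same model.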

We focused on statistical challenges specific to epidemiological studies of PM constituents and health, though previous studies of ambient pollutants have discussed exposure estimation [62•, 93–95], confounding [96], measurement error [91, 92, 97, 98], and spatio-temporal modeling [99–101]. Epidemiological studies of total PM mass are often not as sensitive as constituent-based studies to the issues we outline above. While PM2.5 can also be spatially and temporally heterogeneous [8], there are more total PM2.5 mass monitors than constituent monitors in most communities and PM2.5 monitors often collect daily data [14]. PM monitoring data with sufficient spatio-temporal coverage can be used to calibrate air quality models like CMAQ for use in subsequent health effect studies [101, 102]. Epidemiological studies of total PM often control for other pollutants such as ozone, sulfur dioxide, and nitrogen dioxide [4, 103], but the correlation between PM and these pollutants may not be as large as correlations found in studies of PM constituents. Last, because total PM concentrations are by definition larger than concentrations of PM constituents, total PM may be less prone to some sources of measurement error, such as concentrations below the MDL.

Time series, case-crossover and cohort studies of PM constituents and health are necessary for estimating population-level health effects that can guide regulation. This review discusses important considerations for such studies, but we acknowledge that other types of studies are needed to fully understand the pathway from pollution exposure to mortality and morbidity. Panel studies generally follow a relatively small number of individuals over time, and while less generalizable than large population-based studies, these studies often conduct better exposure measurement and provide an opportunity to assess air pollution health associations under specific contexts [104–107]. Occupational studies measure workplace pollutants and therefore have better exposure measurement for occupational exposures than many population-based studies have for ambient exposures [108, 109]. However, depending on the occupation, occupational exposures to specific PM constituents can exceed those experienced by the general population and these studies are primarily focused on indoor and not ambient pollution. Natural experiments, such as the reduction in ambient pollution levels in Beijing, China during the 2008 Olympics, can provide a unique opportunity to characterize health effects associated with a specific reduction in pollution, but may not be as useful to characterize short-term health effects associated with typical day-to-day variability [105]. Other experimental study designs, such as controlled exposure studies, toxicological studies, and animal studies, are also necessary to determine biological mechanisms and infer causal associations. Epidemiological studies are critical to fully characterize ambient air pollution health effects in the general population and should continue to be used, along with other study designs, to better understand how PM constituents impact human health.

Conclusions

Recent epidemiological studies of PM constituents and health have used a variety of statistical methods to assign exposure in the presence of spatial and temporal heterogeneity, disentangle health effects of individual constituents, and address instrument-related measurement error. Spatial and temporal heterogeneity in PM constituent concentrations has been estimated by spatio-temporal models and more recently by monitor-calibrated air quality model simulations. Commonly, multipollutant models of PM constituents include adjustment for total PM mass or other PM constituents, but more advanced statistical techniques, such as source apportionment modeling, have also been applied to estimate health effects corresponding to multiple correlated constituents. Instrument-related and model-based measurement errors have not been extensively investigated in epidemiological studies of PM constituents and health, and additional methods may need to be developed. Methods that simultaneously address several issues presented by PM constituent data are critical for future epidemiological studies of PM constituents and health and remain an active area of statistical and epidemiological research.