Abstract
Background
Exposure measurement error in copollutant epidemiologic models has the potential to introduce bias in relative risk (RR) estimates. A simulation study was conducted using empirical data to quantify the impact of correlated measurement errors in timeseries analyses of air pollution and health.
Methods
ZIPcode level estimates of exposure for six pollutants (CO, NO_{x}, EC, PM_{2.5}, SO_{4}, O_{3}) from 1999 to 2002 in the Atlanta metropolitan area were used to calculate spatial, population (i.e. ambient versus personal), and total exposure measurement error.
Empirically determined covariance of pollutant concentration pairs and the associated measurement errors were used to simulate true exposure (exposure without error) from observed exposure. Daily emergency department visits for respiratory diseases were simulated using a Poisson timeseries model with a main pollutant RR = 1.05 per interquartile range, and a null association for the copollutant (RR = 1). Monte Carlo experiments were used to evaluate the impacts of correlated exposure errors of different copollutant pairs.
Results
Substantial attenuation of RRs due to exposure error was evident in nearly all copollutant pairs studied, ranging from 10 to 40% attenuation for spatial error, 3–85% for population error, and 31–85% for total error. When CO, NO_{x} or EC is the main pollutant, we demonstrated the possibility of false positives, specifically identifying significant, positive associations for copollutants based on the estimated type I error rate.
Conclusions
The impact of exposure error must be considered when interpreting results of copollutant epidemiologic models, due to the possibility of attenuation of main pollutant RRs and the increased probability of false positives when measurement error is present.
Background
Many epidemiologic studies of the health effects associated with ambient air pollution exposure have focused on evaluating the effects associated with a single pollutant. In reality, humans are exposed to a complex mixture of pollutants that can vary spatially and temporally. The challenges of examining the relationship between multipollutant exposures and health effects include high correlations between pollutants preventing the use of standard statistical models, correlation of measurement errors across pollutants, varying degrees of exposure error across pollutants, the potential for interaction between pollutants, and pollutant mixtures varying by location [1–5]. Further, these epidemiologic studies often use data from centrally located measurement sites as the exposure estimate, which may not accurately capture the exposure variability due to the spatial variability of pollutant concentrations. As an example, analysis of pollutants with primarily local sources (CO, NO_{x}, and EC) which exhibit a large degree of spatial heterogeneity [6–8] may be more error prone when centralsite (CS) monitor measurements are used to estimate exposure, compared to pollutants primarily originating from regional sources (PM_{2.5}, SO_{4}, O_{3}) which may exhibit spatially homogeneous patterns and thus have a lower degree of measurement error when CS monitor estimates are used. Fixedlocation ambient monitors also do not account for human exposure factors such as where people move in time and space (e.g. invehicle concentrations impacting exposure), and concentrations in different microenvironments (e.g. infiltrationrelated factors). There is therefore the potential for exposure measurement error when CS measurements are used as a surrogate for exposure in epidemiologic studies. The presence of exposure measurement error can lead to effect attenuation and reduced statistical power in the resultant health risk estimate [9].
Billionnet et al. summarized the literature on statistical methods to study the effect of multiple pollutants [4], and Oakes et al. published a review of multipollutant exposure metrics [10]. Both identified a comprehensive understanding of exposure measurement error in the context of multipollutant studies as a key issue to address and investigate further. However while these papers detail work on multipollutant exposure metrics and statistical methods for analyzing multipollutant exposures, previous work to quantify the impact of exposure error on health risk estimates has focused on singlepollutant timeseries models [6–8, 11–14]. While there are inherent difficulties in examining multipollutant exposures, we still do not have a clear understanding of the relationship of exposure measurement error in a more simplistic two pollutant model [15–17].
To our knowledge this is the first work which uses empirical pollutant relationships and health data to quantify the impact of correlated exposure measurement error in copollutant timeseries models on resultant health risk estimates in a simulation study. We consider Poisson regression and allow for additive, multiplicative, and correlated measurement errors, which are typically not considered in the classical/Berkson error framework [16]. We use previously described, daily exposure metrics ranging from CS measurements to more complex modeling approaches [18] to calculate empirical relationships between pollutants (PM_{2.5}, SO_{4}, O_{3}, CO, NO_{x}, EC). Relationships between exposure metrics were previously used to calculate estimates of exposure measurement error due to spatial variability of pollutant concentrations, and human exposure factors at the ZIP code level [17], but did not include the use of empirical health data nor analysis of an epidemiological model. Our previous work showed the potential for bias in model coefficients for copollutant models, motivating the current work to examine the degree of attenuation of relative risks (RRs) empirically. In the current paper, the previous work is extended in a simulation study by considering a Poisson timeseries analysis of emergency department (ED) visits for respiratory disease in the Atlanta, GA metropolitan area to obtain empirical estimates of the attenuation in health risk estimates due to various sources of exposure measurement error.
Methods
Estimates of exposure and exposure measurement error
Three estimates of daily exposure to ambient PM_{2.5}, EC, SO_{4}, CO, NO_{x}, and O_{3} were derived for 193 ZIP codes in the 20county Atlanta metropolitan area. Pollutantspecific measured concentrations, modeled exposure estimates, and summary statistics have been described previously in detail [17, 18]. Briefly the three exposure assessment approaches include: (1) CS measurements (CS), (2) ambient air quality (AQ) estimates obtained by combining simulations from an air quality model for regional background and a dispersion model for the local contribution, and (3) estimates from a stochastic population exposure model (PE). For PE, the contribution from indoor sources was not included because of the desire to examine the association between the health outcomes and exposure to ambient pollution, and for comparability with the CS measurements and AQ estimates. All three approaches estimate exposures to ambient pollution at each ZIP code centroid in the study area. Daily estimates (8h maximum for O_{3}, 24h average for other pollutants) for 1999–2002 were generated for each approach.
For each ZIP code, three types of exposure error (δ_{spatial}, δ_{population}, δ_{total}) were calculated as the difference between two exposure metrics. Detailed summary statistics including the magnitude and variance of δ, and betweenpollutant correlations of δ are presented in [17]. Briefly, the ZIP codespecific exposure error due to a lack of spatial refinement in the exposure estimate is represented by δ_{spatial} = AQ  CS, assuming that the difference between the more (i.e. AQ) and less (i.e. CS) spatially refined metrics gives an estimate of the amount of spatial measurement error that is present in the CS metric. This assumption is made because our air quality models add spatial variability to the AQ metric compared with CS measurements, which lack spatial variability because the same CS measurement was used to represent exposure in each ZIP code. Exposure error introduced when human exposure factors are not included in an exposure estimate is represented by δ_{population} = PE – AQ. Our PE metric includes variability due to human exposure factors such as timelocationactivity patterns of individuals, commuting patterns, and infiltration of ambient pollutants to indoor environments. A third type of exposure error, δ_{total} = PE – CS, represents the combined exposure error when both spatial variability and human exposure factors are not accounted for. δ_{total} does not represent all potential sources of exposure error that may be present in a study; instead it represents the total exposure error that we were able to assess in this analysis.
Epidemiologic model
To conduct the simulation, health data from a previously described epidemiologic study were used [19]. Briefly, individuallevel data on ED visits for respiratory outcomes from 41 hospitals in the 20county Atlanta area were aggregated to daily counts for each ZIP code. The same Poisson timeseries model in S. Sarnat et al. [19] was used in this simulation study to examine the association between the above described exposure metrics and daily counts of ED visits. The basic form of the model was:
where Y _{ kt } is the count of ED visits for respiratory outcomes in ZIP code k on day t. For each pollutant (pollution _{ 1 } and pollution _{ 2 }), daily averages (8h maximum for O_{3}, 24h average for all other pollutants) of sameday concentrations were used. To control for spatial autocorrelation in the baseline ED visits across the ZIP codes, an indicator variable for the k^{th} ZIP code (ZIP _{ k }) was used to represent the areas from which ED counts were spatially aggregated. Dummy variables for day of week and holidays (DOW, indexed by m) and for hospital (hospital, indexed by n) were used; the latter accounted for the differing durations of time each hospital was included in the study. Longterm trends and seasonality of health outcomes (time) were controlled for with parametric cubic splines with monthly knots (g(γ _{1}, …, γ _{ N }; x)). Meteorology was controlled for using maximum temperature (IO _{ temp }) with indicators for each degree Celsius, a cubic function for dew point (dewpt), a cubic function for minimum temperature (temp), and dummy variables for seasons. When simulating the health effect we assume zero lag, but expect results to be generalizable to single day lags.
Simulation of true exposures and health outcomes
True exposure was defined as an exposure estimate without spatial and/or population measurement error. To simulate the true daily exposure for each pollutant, linear models for each type of daily exposure were fitted separately for each ZIP code and for each pollutant; for example, for spatial measurement error, the model is given by AQ _{ kt } = θ _{ k,1} + θ _{ k,2} * CS _{ kt } + ε _{ k,t } for ZIP code k and day t, such that both additive (θ _{1}) and multiplicative bias (θ _{2}) between the two exposure metrics were accounted for. See Additional file 1: Table S1 for the mean and standard deviation across ZIP codes of estimates of θ _{k,1} and θ _{k,2}. For each Monte Carlo iteration, we first calculated a ‘true’ exposure mean for each day by using the unrefined measurement as the predictor (i.e., CS for spatial or total measurement error, AQ for population measurement error) and a realization of the measurement error coefficients (θ _{1} and θ _{2}) drawn from their asymptotic bivariate normal distribution. AcrossZIP code spatial heterogeneity in the measurement error coefficients was allowed, however for each ZIP code we assumed that additive and multiplicative bias remains constant across days. Finally, the residual of the true exposure (ε _{ t }) was subsequently drawn from a bivariate normal distribution with covariance equal to the ZIP code specific empirical measurement error covariance (see Additional file 1: Table S2 for a summary of the median across ZIP codes of the correlation of measurement error covariance) for the two pollutants of interest, and then combined with the true exposure mean to obtain the full true exposure value (true_exp). We chose to simulate true exposure from the less refined exposure because our measurement error framework assumes that the true exposure is more variable than the errorprone (unrefined) exposure. The resultant measurement error between the true and errorprone exposures, derived here from empirical data, do not strictly follow the standard classical or Berkson measurement error models (Zeger et al. [16]).
Using observed health data from the study described in S. Sarnat et al. [19], a design matrix \( X \) (without the two pollutants) was constructed for the Poisson model in Equation (1). The Poisson mean \( \mu \) for each day and each ZIP code was calculated as:
where B is a vector of the estimated regression coefficients from Equation (1), subscripts 1 and 2 denote pollutants 1 and 2, RR _{1} and RR _{2} are the hypothetical RRs per interquartile range (IQR) increase to allow for comparison across pollutants, true_exp _{1} and true_exp _{2} are the simulated true exposures, and IQR _{1} and IQR _{2} are the interquartile ranges for the corresponding true exposure metrics. We assumed RR _{1} = 1.05 and RR _{2} = 1, and in subsequent text refer to pollutant 1 as the ‘main pollutant’ (i.e., the pollutant with an assumed health effect) and to pollutant 2 as the ‘copollutant’ (i.e., the pollutant assumed to have no effect). The assumed RR _{1} = 1.05 was chosen based on previously established singlepollutant models for the same data set, specifically the O_{3}respiratory analysis [19]. Poisson counts were drawn from a Poisson distribution of the simulated Poisson mean μ. Two versions of the Poisson timeseries model (Equation (1)) were then fitted: one using the simulated Poisson counts with the true exposure, and another using the same simulated Poisson counts, together with the unrefined exposure values (standardized by their corresponding IQRs). This resulted in two sets of estimated log RR: \( {\widehat{\beta}}_{1,\ true} \) and \( {\widehat{\beta}}_{2,\ true} \) for the ‘true exposure’ scenario, and \( {\widehat{\beta}}_{1,\ noisy} \) and \( {\widehat{\beta}}_{2,\ noisy} \) for the ‘noisy exposure’ scenario (i.e., using the unrefined exposure metric).
The above simulation was run 1000 times for each pollutant pair and measurement error scenario. The mean over the 1000 estimates for each RR was calculated and used as the overall pointestimate associated with each pollutant pair and measurement error scenario. As an example, the RR for the main pollutant, true exposure scenario, was calculated as:
where n indexes the simulation iteration and N = 1000. Standard error of the RR was calculated as the standard error of the 1000 βs. A student’s twosided ttest was used to determine significant differences between \( {\overline{RR}}_{noisy,a} \) and \( {\overline{RR}}_{noisy,b} \) for two different copollutants a and b, when paired with the same main pollutant, across Monte Carlo simulations.
Statistical analyses
Mean RR estimates from using the true and noisy exposures were compared to the true RR in order to determine if the presence of measurement error induces bias and a loss of power. Various summary statistics are presented, including percent attenuation, root meansquare error (RMSE), and power/type I error.
Percent attenuation of the RR for the main pollutant was calculated to determine the impact of inclusion of measurement error on attenuation of the RR. RMSE was calculated to assess both bias and loss of precision in the estimates. Lastly, statistical power/type I error was calculated to assess false positive associations. As an example, for pollutant 1, noisy scenario, we define:
where RR _{1} = 1.05, RR = 1, N = 1000, β and s _{ β } (standard error of β) are obtained from the Poisson model (equation 1), and \( {\widehat{RR}}_{1,\ noisy} \) is defined as in equation 3. Quantities were similarly calculated for each pollutant pair and measurement error scenario. RMSE ratio was defined as RMSE _{ noisy }/RMSE _{ true } for pollutant 1 or 2.
All statistical analyses and model simulations were completed in R, version 3.0.2 (R Foundation for Statistical Computing; http://www.rproject.org/).
Results
Mean RRs for the main pollutant over 1000 runs of the health model for both the noisy and true exposure metrics are shown in Fig. 1. For all types of measurement error, resultant RR of the main pollutants (pollutant 1) when the health model used the true exposure metric (i.e., the simulated exposure metric without measurement error) is close to 1.05. Similarly, the resultant RR of the copollutants (pollutant 2) using the true exposure metric is close to 1, as expected due to the assumed RR _{1} = 1.05 and RR _{2} = 1. In contrast the presence of measurement error in the noisy exposure metric results in considerable attenuation for all types of measurement error (Fig. 1, Table 1). For spatial measurement error, we see greater attenuation for \( {\overline{RR}}_{1, noisy} \) when a local pollutant (CO, NO_{x}, EC) is the main pollutant (29–40%) compared to a regional pollutant (PM_{2.5}, SO_{4}, O_{3}; 10–15%) (ranges represent the attenuation across all copollutants models for a specific main pollutant). For population measurement error, there is little attenuation when CO is the main pollutant (3–4%), and substantial attenuation (82–85%) of \( {\overline{RR}}_{1, noisy} \) when NO_{x} is the main pollutant. Similarly for total measurement error, the highest level of attenuation of \( {\overline{RR}}_{1, noisy} \) is seen when NO_{x} is the main pollutant (85%), and the least attenuation for CO (31–32%). For all other pollutants, for both population and total measurement error, attenuation of \( {\overline{RR}}_{1, noisy} \) is moderatehigh (Table 1). For all pollutants and all types of measurement error, attenuation of \( {\overline{RR}}_{1, noisy} \) mirrors the patterns of attenuation hypothesized in previous analyses (see figure four of Dionisio et al., 2014) [17] . The scenario with an assumed RR _{1} = 1.05 and RR _{2} = 1.05 was run as a sensitivity analysis. Resultant RR from the sensitivity analysis are presented in Additional file 1: Figure S1. Twosided ttests were used to compare differences in attenuation of the main pollutant RR by copollutant; there were minimal differences seen across copollutants for the case of RR _{1} = 1.05 and RR _{2} = 1, with more differences seen for the sensitivity analysis (see Additional file 1: Supplemental Text, Table S3, and Table S4). The coverage of the 95% confidence intervals was calculated for the main pollutant and copollutant for each of the different scenarios analyzed, and with one exception was in the range of 0.93–0.98.
The RMSE ratios comparing the RMSE for the noisy and true model estimates are presented in Table 2, reflecting both bias and loss of precision. The RMSE ratio for the copollutants is typically small (≤2.0), while the RMSE ratio for the main pollutants ranges from 1.3 to 26.1. While the main pollutant RMSE ratio for regional pollutants (PM_{2.5}, SO_{4}, O_{3}) ranges from 1.8 to 13.7, there is little difference seen across copollutants and measurement error type for the same main pollutant. In contrast, the copollutant can have a substantial impact on the main pollutant RMSE ratio when local pollutants (CO, NO_{x}, EC) are the main pollutant. For example, for δ_{spatial} for CO, NO_{x}, and EC as the main pollutant, the RMSE ratio of the main pollutant ranges from 6.6–11.5, 6.0–10.4, and 6.6 – 12.1 respectively, dependent upon the copollutant (Table 2).
Power estimates for all main pollutants were equal to 1, indicating we will always detect the health effect association for the main pollutant (results not shown) given the simulation setup. Copollutant type I error for the true scenarios (i.e., no measurement error) were approximately 0.05 (results not shown). We present the type I error for the copollutant, noisy scenario (i.e., with measurement error), when the copollutant has no effect (RR_{2} = 1) (Fig. 2). Figure 2 shows that for CO, NO_{x}, or EC as the main pollutant there is a substantial increase in the type I error of the copollutant when spatial or total exposure measurement error is present in both the main pollutant and the copollutant, increasing the likelihood of detecting false positive associations for the copollutants when CO, NO_{x} or EC are the main pollutant in these instances. For copollutants paired with NO_{x} under the δ_{population} scenario we see larger type I error values, particularly for NO_{x} paired with EC. For PM_{2.5}, SO_{4}, and O_{3} as the main pollutant for δ_{spatial} and δ_{total} and for all pollutants except NO_{x} for δ_{population}, there is less likelihood of detecting false positive associations.
Discussion
While previous work has examined the impact of exposure measurement error in single pollutant models [13] and studies have proposed statistical methods for handling multipollutant effects [15, 20–22], little work has been done to quantify the impact of measurement error on health risk estimates in multipollutant models, including copollutant models. This study builds on previous work examining the potential attenuation of model coefficients in copollutant epidemiologic models based on empirical covariance structures [17]. While the previous work built the foundation for the current analysis through development of exposure metrics and calculation of empirical relationships between pollutants, the current manuscript extends the work with the addition of empirical health data applied in an epidemiological model with the previously developed exposure metrics. Using the empirical measurement error covariance structures, estimates of additive and multiplicative bias between exposures, and betweenpollutant relationships, we extended the previous work by conducting a simulation study to assess the impact of measurement error on health risk estimates obtained from a copollutant timeseries model. We show the substantial attenuation of RRs resulting from the presence of exposure measurement error in exposure estimates for most pollutants and measurement error types examined, an empirical result which agrees with the predictions in previous work [17]. Lastly, we have shown evidence for the detection of false positive associations between the copollutant and the health outcome, particularly when CO, NO_{x}, or EC are the main pollutant.
The RMSE ratio comparing the RMSE of the risk estimates from the noisy and true exposures reflects both the bias and standard error of the estimates. When we compare the RMSE ratio for a main pollutantmeasurement error type, we see that for regional pollutants as the main pollutant, RMSE ratio changes very little across copollutants. However the RMSE ratio of the local pollutants as main pollutants is impacted substantially depending upon whether regional or local pollutants are the copollutants. For all local pollutant (as main pollutant) measurement error pairs except for COpopulation measurement error, we see a substantial increase in RMSE ratio for the main pollutant when the copollutant is a regional pollutant, compared to when the copollutant is a local pollutant. Thus from examination of the RMSEs (Table 2) we conclude that the inclusion of a regional copollutant in a model with a local main pollutant substantially increases the degree of bias and standard error of the main pollutant estimates, compared to having a local copollutant. However Fig. 1 demonstrates that the copollutant (whether regional or local) has little effect on the degree of bias in the main pollutant’s effect estimate. The two results taken together allow us to conclude that the differences in RMSE across copollutant type are due to differences in standard error, rather than bias, with the spatial structure of the copollutant influencing the standard error but not the bias of the main pollutant’s effect estimate. We also see differences in the RMSE ratio of local pollutants as main pollutants when the copollutant is varied, even when the copollutants have an assumed null effect (RR = 1). This indicates that relationship of these copollutants and their error with the main pollutant does have an impact on the bias and standard error of main pollutant RR estimates.
Type I error estimates in this study show the likelihood of detecting a false positive effect for a copollutant with an assumed null effect (i.e., RR = 1) is often reduced when measurement error is not present, especially when CO, NO_{x}, and EC are the main pollutants. It is likely that this impact is more evident for CO, NO_{x}, and EC than it is for PM_{2.5}, SO_{4}, and O_{3} because the former are dominated by local sources, thus tend to be more spatially variable [17] and have higher degrees of correlated exposure measurement error. This can be important when trying to understand the independent effect of a pollutant that is a part of a complex mixture. For example, the EPA’s most recent Integrated Science Assessments (ISAs) for CO [23] and NO_{x} [24] both state that while there is evidence of consistent positive associations between shortterm exposure to CO and NO_{2} and effects on the respiratory system in epidemiologic studies, the challenge remains to disentangle the independent effect of CO or NO_{2} related health effects from the larger air pollution mixture. There is the possibility that CO and/or NO_{x} are serving as indicators of combustionrelated emissions, particularly from traffic, for some health outcomes. The moderate to high correlations between CO, NO_{x}, and other pollutants generated from combustion processes (e.g., EC, PM_{2.5}) further complicates interpretation of epidemiologic studies of these pollutants. This provides motivation for the use of exposure metrics which minimize measurement error in epidemiologic analyses.
While the simulation presented here has many strengths, including accounting for multiplicative bias and correlated measurement errors, and the use of empirical data, there are inherent limitations. First, we assume that the refined exposure metric in each of our three constructed measurement error types provides an accurate representation of measurement error. That is, we assume that the AQ exposure metric accurately and completely accounts for spatial variability ignored in the CS metric, and we assume the PE exposure metric appropriately accounts for population variability ignored in the AQ metric. The authors acknowledge that the AQ metric and PE metrics themselves include some error, but in the absence of a ‘true’ spatially refined metric (e.g., a finescale measurement/monitor network in the study area) and a set of populationbased personal exposure measurements, we operate under the above assumptions. Any error present in the refined metrics will remain an influence on the calculated RRs. Further, though it is common to use an individual’s exposure to ambientgenerated pollution in an epidemiologic study, the inclusion of exposure from indoor sources would allow for analysis of the impact of exclusion of this exposure source on commonly calculated RRs. Though we expect results presented here to have general applicability regarding the impact on bias of RRs in a copollutant timeseries model, we acknowledge that results presented here are dependent on empirical data from the Atlanta metropolitan area. Multipollutant relationships may be different in other cities depending on the cityspecific source profile and the magnitude of measurement errors may depend on the location of central monitor and population characteristics.
Simulation studies have examined the impact of measurement error in hypothetical multipollutant epidemiologic studies [16, 25]. Exposure measurement error is often categorized into two classes: Berkson error and classical error, which are often assumed to be independent of the true exposure [16]. In reality, exposure measurement error is a combination of the two. Adding to the complexity, the consequences of each error type are different [5, 26]. In most timeseries studies, Berkson error will not bias the effect estimates, but will tend to increase the variance of the regression coefficients (increasing the width of the CIs and decreasing power), while classical measurement error tends to bias the true effect towards the null, with the magnitude of the effect attenuation depending on the error variance of the exposure estimate relative to the variance of the true exposure [16]. Empirical analyses have shown impacts of both types of error in single pollutant models [14, 27–29]. When two pollutants are measured with error, the correlation between the pollutants, and the correlation between their errors, predicts the magnitude of the bias, while the sign of the correlation between the pollutants predicts the direction of the bias [16, 20]. In this analysis, we do not follow standard classical or Berkson measurement error models, but utilize the empirical relationships between different exposure metrics from a realworld epidemiologic study. By assuming that the more refined exposure metric is the true exposure while treating the less refined metric as the errorprone exposure, we observe effect attenuation and bias away from the null, depending on the copollutant pair considered. In addition, we note that timeseries analyses typically rely on a small number of monitors to derive populationaveraged exposure metrics. Recently there has been increasing interest in conducting spatiotemporal modeling of air pollutants via landuse regression models [30] and data fusion methods [31] to capture spatial heterogeneity within the study region. However, measurement errors are also associated with spatial predictions due to spatial smoothing and estimation uncertainty [21, 32].
Conclusions
To the best knowledge of the authors, this study is the first to quantify the impact of measurement error on health risk estimates in copollutant models using empirical data. In addition, this study directly addresses a question often considered when examining the collective body of evidence with respect to copollutant models: the impact of varying degrees of exposure measurement error on resultant risk estimates. Based on results presented here, the impact of measurement error in future studies of the health effects of exposure to air pollution must be considered, so that the true health impact of air pollution exposure is not underestimated, and to avoid false conclusions. In addition, future studies may investigate additional techniques and statistical methods for measurement error correction.
Abbreviations
 AQ:

Ambient air quality estimates
 CS:

Centralsite measurements
 ED:

Emergency department
 ISA:

Integrated science assessment
 PE:

Population exposure model estimates
 RMSE:

Root meansquare error
 RR:

Relative risk
References
 1.
Dominici F, Peng RD, Barr CD, Bell ML. Protecting human health from Air pollution: shifting from a singlepollutant to a multipollutant approach. Epidemiology. 2010;21:187–94.
 2.
Mauderly JL, Burnett RT, Castillejos M, Özkaynak H, Samet JM, Stieb DM, Vedal S, Wyzga RE. Is the air pollution health research community prepared to support a multipollutant air quality management framework? Inhal Toxicol. 2010;22:1–19.
 3.
Vedal S, Kaufman JD. What does multipollutant air pollution research mean? Am J Respir Crit Care Med. 2011;183:4–6.
 4.
Billionnet C, Sherrill D, AnnesiMaesano I. Study ObotG: estimating the health effects of exposure to multipollutant mixture. Ann Epidemiol. 2012;22:126–41.
 5.
Bateson TF, Coull BA, Hubbell B, Ito K, Jerrett M, Lumley T, Thomas D, Vedal S, Ross M. Panel discussion review: session three  issues involved in interpretation of epidemiologic analyses  statistical modeling. J Expo Sci Environ Epidemiol. 2007;17:S90–6.
 6.
Goldman GT, Mulholland JA, Russell AG, Srivastava A, Strickland MJ, Klein M, Waller LA, Tolbert PE, Edgerton ES. Ambient Air pollutant measurement error: characterization and impacts in a timeseries epidemiologic study in Atlanta. Environ Sci Technol. 2010;44:7692–8.
 7.
Sarnat SE, Klein M, Sarnat JA, Flanders WD, Waller LA, Mulholland JA, Russell AG, Tolbert PE. An examination of exposure measurement error from air pollutant spatial variability in timeseries studies. J Expo Sci Environ Epidemiol. 2010;20:135–46.
 8.
Strickland MJ, Gass KM, Goldman GT, Mulholland JA. Effects of ambient air pollution measurement error on health effect estimates in timeseries studies: a simulationbased analysis. J Expo Sci Environ Epidemiol. 2015;25:160–6.
 9.
Shy CM, Kleinbaum DG, Morgenstern H. The effect of misclassification of exposure status in epidemiological studies of air pollution health effects. Bull NY Acad Med. 1978;54:1155–65.
 10.
Oakes M, Baxter L, Long T. Evaluating the application of multipollutant exposure metrics in air pollution health studies. Environ Int. 2014;69:90–9.
 11.
Setton E, Marshall JD, Brauer M, Lundquist KR, Hystad P, Keller P, CloutierFisher D. The impact of daily mobility on exposure to trafficrelated air pollution and health effect estimates. J Expo Sci Environ Epidemiol. 2011;21:42–8.
 12.
Goldman GT, Mulholland JA, Russell AG, Strickland MJ, Klein M, Waller LA, Tolbert PE. Impact of exposure measurement error in air pollution epidemiology: effect of error type in timeseries studies. Environ Health. 2011;10:1–11.
 13.
Dominici F, Zeger SL, Samet JM. A measurement error model for timeseries studies of air pollution and mortality. Biostatistics. 2000;1:157–75.
 14.
Hart JE, Liao X, Hong B, Puett RC, Yanosky JD, Suh H, Kioumourtzoglou MA, Spiegelman D, Laden F. The association of longterm exposure to PM_{2.5} on allcause mortality in the Nurses’ Health Study and the impact of measurementerror correction. Environ Health. 2015;14:1–9.
 15.
Chang HH, Peng RD, Dominici F. Estimating the acute health effects of coarse particulate matter accounting for exposure measurement error. Biostatistics. 2011;12:637–52.
 16.
Zeger SL, Thomas D, Dominici F, Samet JM, Schwartz J, Dockery D, Cohen A. Exposure measurement error in timeseries studies of air pollution: concepts and consequences. Environ Health Perspect. 2000;108:419–26.
 17.
Dionisio KL, Baxter LK, Chang HH. An empirical assessment of exposure measurement error and effect attenuation in bipollutant epidemiologic models. Environ Health Perspect. 2014;122:1216–24.
 18.
Dionisio KL, Isakov V, Baxter L, Sarnat JA, Sarnat SE, Burke J, Graham SE, Mulholland J, Özkaynak H. Development and evaluation of alternative approaches for exposure assessment of multiple air pollutants in Atlanta, Georgia. J Expo Sci Environ Epidemiol. 2013;23:581–92.
 19.
Sarnat SE, Sarnat JA, Mulholland J, Isakov V, Özkaynak H, Chang H, Klein M, Tolbert PE. Application of alternative spatiotemporal metrics of ambient air pollution exposure in a timeseries epidemiological study in Atlanta. J Expo Sci Environ Epidemiol. 2013;23:593–605.
 20.
Zeka A, Schwartz J. Estimating the independent effects of multiple pollutants in the presence of measurement error: an application of a measurementerrorresistant technique. Environ Health Perspect. 2004;112:1686–90.
 21.
Bergen S, Sheppard L, Kaufman JD, Szpiro AA. Multipollutant measurement error in air pollution epidemiology studies arising from predicting exposures with penalized regression splines. J R Stat Soc: Ser C: Appl Stat. 2016;65(5):731–53.
 22.
Schwartz J, Coull BA. Control for confounding in the presence of measurement error in hierarchical models. Biostatistics. 2003;4:539–53.
 23.
U.S. EPA. Integrated science assessment for carbon monoxide. vol. EPA/600/R09/019F. Research Triangle Park: National Center for Environmental AssessmentRTP Division, Office of Research and Development, U.S. Environmental Protection Agency; 2010.
 24.
EPA US. Integrated science assessment for oxides of nitrogenhealth criteria. vol. EPA/600/R08/071. Research Triangle Park: National Center for Environmental AssessmentRTP Division, Office of Research and Development, U.S. Environmental Protection Agency; 2008.
 25.
Zidek JV, Wong H, Le ND, Burnett R. Causality, measurement error and multicollinearity in epidemiology. Environmetrics. 1996;7:441–51.
 26.
Thomas D, Stram D, Dwyer J. Exposure measurement error: influence on exposuredisease relationships and methods of correction. Annu Rev Public Health. 1993;14:69–93.
 27.
Hart JE, Spiegelman D, Beelen R, Hoek G, Brunekreef B, Schouten LJ, Brandt Pvd. LongTerm Ambient Residential TrafficRelated Exposures and Measurement ErrorAdjusted Risk of Incident Lung Cancer in the Netherlands Cohort Study on Diet and Cancer. Environ Health Perspect. 2015;123(9):860–66.
 28.
Van Roosbroeck S, Li R, Hoek G, Lebret E, Brunekreef B, Spiegelman D. Trafficrelated outdoor air pollution and respiratory symptoms in children: the impact of adjustment for exposure measurement error. Epidemiology. 2008;19:409–16.
 29.
Li R, Weller E, Dockery DW, Neas LM, Spiegelman D. Association of indoor nitrogen dioxide with respiratory symptoms in children: application of measurement error correction techniques to utilize data from multiple surrogates. J Expo Sci Environ Epidemiol. 2006;16:342–50.
 30.
Hu X, Waller LA, Lyapustin A, Wang Y, AlHamdan MZ, Crosson WL, Estes Jr MG, Estes SM, Quattrochi DA, Puttaswamy SJ, Liu Y. Estimating groundlevel PM2.5 concentrations in the Southeastern United States using MAIAC AOD retrievals and a twostage model. Remote Sens Environ. 2014;140:220–32.
 31.
Friberg M, Zhai X, Holmes H, Chang HH, Strickland MJ, Sarnat SE, Tolbert PE, Russell AG, Mulholland JA. Method for fusing observational data and chemical transport model simulations to estimate spatiotemporally resolved ambient air pollution. Environ Sci Technol. 2016;50:3695–705.
 32.
Szpiro AA, Paciorek CJ. Measurement error in twostage analyses, with application to air pollution epidemiology. Environmetrics. 2013;24:501–17.
Acknowledgments
The authors would like to acknowledge Haluk Ozkaynak (U.S. EPA, retired), Vlad Isakov (U.S. EPA), Stefanie Sarnat (Emory University), Jeremy Sarnat (Emory University), and Jim Mulholland (Georgia Tech) for their contributions to the project. Although this work was reviewed by EPA and approved for publication, it may not necessarily reflect official Agency policy. EPA does not endorse the purchase of any commercial products or services mentioned in this publication.
Funding
This publication was made possible in part by US EPA grant R834799 and by grant R21ES02279502 from the National Institutes of Health. Funding bodies did not play a role in the study design, data analysis, interpretation of results, or manuscript writing.
Availability of data and materials
After passing through internal clearance, the dataset supporting the conclusions of this article will be available on the publicly accessible Environmental Dataset Gateway (EDG) website of the U.S. EPA (https://edg.epa.gov/metadata/).
Authors’ contributions
All authors contributed to study design, provided critical input in manuscript development, and reviewed and approved the final manuscript. HHC provided guidance on statistical analyses and interpretation. KLD led data analysis and completed the first draft of the manuscript. LKB contributed to the introduction and discussion.
Competing interests
The authors declare that they have no competing interests.
Consent for publication
Not applicable.
Ethics approval and consent to participate
Not applicable.
Author information
Affiliations
Corresponding author
Additional file
Additional file 1:
Supplemental materials including supplemental text, figures, and tables, are available as an additional file. (PDF 392 kb)
Rights and permissions
COPYRIGHT NOTICE. The article is a work of the United States Government; Title 17 U.S.C 105 provides that copyright protection is not available for any work of the United States government in the United States. Additionally, this is an open access article distributed under the terms of the Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0), which permits worldwide unrestricted use, distribution, and reproduction in any medium for any lawful purpose.
About this article
Cite this article
Dionisio, K.L., Chang, H.H. & Baxter, L.K. A simulation study to quantify the impacts of exposure measurement error on air pollution health risk estimates in copollutant timeseries models. Environ Health 15, 114 (2016). https://doi.org/10.1186/s1294001601860
Received:
Accepted:
Published:
Keywords
 Exposure modeling
 Exposure measurement error
 Exposure assessment
 Bias
 Copollutant