Introduction

Preterm birth is a key determinant of infant mortality and morbidity, and of health status in childhood and even adulthood [1,2,3]. Numerous studies conducted worldwide have examined the association between air pollution and preterm birth [4,5,6,7,8,9,10]. Many studies have found that air pollution exposure increases the risk of preterm birth, and it has been estimated that 23%, or 3.4 million, of preterm births globally were attributable to PM2.5 in 2010 [1]. However, there has been some inconsistency in findings, including in Canada, where in some instances we observed significant associations [11], while in others we did not [12, 13]. Most studies have employed cohort or case-control designs, characterizing exposure over the entire pregnancy, trimester or birth month [8], while a smaller number have examined short-term exposure, employing time-series [14,15,16,17,18,19,20], case-crossover [21] or time-to-event analysis [22,23,24,25,26]. It has been hypothesized that the association between air pollution and preterm birth may be more difficult to detect than associations with other outcomes such as term low birth weight or small for gestational age because of the different duration of exposure over the entire pregnancy or third trimester in preterm vs. term births, and the existence of seasonal cycles in incidence of preterm birth [15, 21, 27, 28]. To address these issues and to examine the influence of short-term exposure, here we employ a time-to-event analysis, using Cox models examining exposures in the week prior to birth.

Methods

We employed data from the Canadian births database. Live birth events are reported to Statistics Canada by the provincial and territorial Vital Statistics Registries in Canada. For this study, singleton live births between 1999 and 2008 in 24 cities with daily air pollution data were eligible. Data include more than one birth to the same mother, but these could not be identified due to data limitations. Preterm births were those occurring at less than 37 weeks gestation, which were further categorized as 20–27, 28–31, 32–33 and 34–36 weeks gestation [29]. Information on maternal behaviours including smoking and alcohol consumption, and individual-level data on socioeconomic status (SES) and ethno-cultural origins were not available in this dataset. Area-level socioeconomic status characteristics were assigned to singleton births by geocoding birth records using the six-character maternal postal code from the births database and the Postal Code Conversion File Plus (PCCF+) version 5K in order to obtain Statistics Canada standard geographic identifiers [30]. Using geocoded birth records, neighbourhood-level SES variables were calculated at the Dissemination Area (DA) level using census data, including the proportion of individuals aged 15 years and over who were unemployed, or in the lowest income quintile, and the proportion of females aged 25 years and over with post-secondary education [31, 32]. Proportion of individuals in a DA who were classified as visible minority was also calculated. Visible minority groups are defined by the Canadian Employment Equity Act and classification of individuals is based on response to census questions pertaining to self-identified population group [33]. Neighbourhood-level variables were calculated based on the census year closest to the date of birth (2001 or 2006). There were 52,993 and 54,626 DAs in the 2001 and 2006 censuses respectively. Based on the 2006 census, the median and 70th percentile of DA population were 513 and 598 respectively, and the median and 70th percentile of DA land area were 0.26 km2 and 1.27 km2 respectively.

Daily air pollution data were obtained from the National Air Pollution Surveillance (NAPS) monitoring network for particulate matter of median aerodynamic diameter less than 2.5 μm (PM2.5) as well as carbon monoxide (CO), nitrogen dioxide (NO2), ozone (O3), and sulfur dioxide (SO2). Daily temperature data were obtained from Environment and Climate Change Canada’s meteorology data archive. Where data were available from multiple monitors, they were averaged.

Statistical analysis was conducted in two stages. In the first stage, data were analyzed in each city employing Cox proportional hazards models using gestational age in days as the time scale, obtaining city-specific hazard ratios (HRs) with their 95% confidence intervals (CIs) expressed per interquartile range (IQR) of each air pollutant. We tested the proportional hazards assumption using the cox.zph function in R, which evaluates the significance of the interaction between the scaled Schoenfeld residuals for the air pollution term(s) and time, and found no evidence of violation of the proportional hazards assumption. Effects of air pollution were examined using distributed lag functions [34, 35] for lags of 0–6 days prior to delivery, as well as cumulative lags from two to six days. Specification of the lag structure for air pollution and temperature was based on natural spline functions employing three to five degrees of freedom, optimality of which was evaluated based on model Akaike Information Criterion (AIC). We evaluated the optimal lag response specification for O3 and temperature in three cities representing diverse climates: Toronto (central Canada), Edmonton (north) and Vancouver (coastal). Three degrees of freedom in the natural spline of both O3 and temperature exhibited the lowest AIC for all three cities. We therefore employed this lag structure specification in all 24 cities. Potential non-linearity in associations with air pollution was assessed by specifying air pollution as a natural spline with 3 degrees of freedom. We accounted for the potential nonlinear effect of daily mean ambient temperature using a cubic B-spline with 3 internal knots, placement of which was evaluated based on model AIC and guided by recent literature [36]. We compared cubic B-splines with 3 internal knots placed at the 10th, 50th and 90th vs. 10th, 75th, and 90th percentiles of city-specific temperature distributions, and found that the AIC was lowest for the latter. 
Infant sex, maternal age (19 years or less, 20–29, 30–39, 40+ years), maternal marital status (single, married, separated, divorced, widowed), maternal country of birth (Canada, elsewhere), tertile of neighbourhood percent unemployed, low income, visible minority, and with post-secondary education, indicator variables for year and season of birth and a natural spline function of day of year with 3 degrees of freedom were included as covariates in each city-specific model. Subgroup analyses were conducted by infant sex, gestational age category (20–27, 28–31, 32–33, 34–36 weeks), tertile of neighbourhood percent low income and season. In the second stage, we pooled the estimated city-specific hazard ratios using a random effects model. Associations with p-values < 0.05 were considered statistically significant. Analyses were performed with R version 3.4, using the dlnm package, version 2.3.2, and metafor, version 2.0.
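The lag window and the per-IQR scaling described above can be illustrated with a minimal Python sketch. It is not the authors' code (the analysis was run in R with dlnm); the function names `lagged_exposures` and `hr_per_iqr`, the toy ozone series and the example coefficient are all hypothetical. Only the 13.3 ppb interquartile range for O3 is taken from the article.

```python
import math
from datetime import date, timedelta


def lagged_exposures(daily, day, max_lag=6):
    """Return [x(day), x(day-1), ..., x(day-max_lag)] from a date -> value dict.

    Days with no measurement come back as None so the caller decides how to
    handle gaps.
    """
    return [daily.get(day - timedelta(days=lag)) for lag in range(max_lag + 1)]


def hr_per_iqr(beta, se, iqr):
    """Scale a log-hazard coefficient (per 1 unit) to an HR and 95% CI per IQR."""
    z = 1.96  # normal quantile for a two-sided 95% interval
    return (
        math.exp(beta * iqr),
        math.exp((beta - z * se) * iqr),
        math.exp((beta + z * se) * iqr),
    )


# Toy daily O3 series (ppb); the values are made up for illustration only.
start = date(2005, 7, 1)
o3 = {start + timedelta(days=i): 20.0 + i for i in range(10)}

# lags[0] is the same-day value, lags[6] the value six days earlier.
lags = lagged_exposures(o3, date(2005, 7, 10))

# Hypothetical coefficient: log-HR of 0.003 per ppb with standard error 0.001.
hr, lo, hi = hr_per_iqr(0.003, 0.001, iqr=13.3)
```

In a distributed lag model the seven lagged values enter the model together, with their coefficients constrained by a spline over the lag dimension; this sketch only shows how the exposure window for one calendar day is assembled and how a coefficient is re-expressed per IQR.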

Results

During the study period there were 1,248,240 singleton births in the 24 cities. Frequency and prevalence of preterm and term birth by maternal and infant characteristics, city, season and year are shown in Table 1. Maternal age 19 years and under or 40 years and over, and maternal marital status of single, divorced and separated were associated with a higher prevalence of preterm birth. St. John’s, Winnipeg, Calgary and Edmonton had the highest prevalence of preterm birth. There was no apparent trend by year or season. After exclusion of births with missing covariate data, 1,001,700 births were included in the analysis including 63,400 preterm births, resulting in an overall prevalence of preterm birth of 6.34%. (Note that in accordance with Statistics Canada disclosure rules, all frequencies were randomly rounded to base five. Statistical analyses employed unrounded data.)

Table 1 Preterm and term births by maternal and infant characteristics, city, season and year

The combined population of the 24 cities was 11,522,776 in 2006. A descriptive summary of air pollution and temperature data is shown in Table 2. Mean PM2.5 concentrations were highest in Montreal, Hamilton and Windsor in relation to traffic and industrial activity, while maxima were highest in Kamloops and Kelowna due to wildfire smoke. Mean NO2 concentrations, an indicator of traffic pollution, were greatest in Vancouver, Calgary and Toronto, while mean and maximum ozone concentrations were generally highest in the southwestern Ontario cities of Brampton, Hamilton, St. Catharines and Kitchener, consistent with the most common location of summer regional smog events. Mean SO2 concentrations were highest in Saint John, Montreal, Hamilton and Windsor, reflecting local industrial activity, and CO concentrations were uniformly low. Mean temperatures were generally mildest and exhibited the narrowest ranges in the coastal British Columbia cities of Richmond, Vancouver, Victoria and Nanaimo.

Table 2 Summary of population, air pollution and temperature data by city

Pooled estimates of associations with O3 by lag day from distributed lag models are shown in Fig. 1. The lag 0, 1 and 6 day hazard ratios (HRs) were positive and significant, while lags 3 and 4 days were negative and significant. I2, Cochran’s Q and p-values of Q are shown in Table 3. There was significant heterogeneity between cities only for lag 2 days. The cumulative lag HRs for 0–1, 0–2 and 0–3 days were significant. Results for individual cities at lag 0 days are shown in Additional file 1: Figure S1. Significant positive associations were observed in Toronto (HR 1.038 95% CI 1.009, 1.067), Mississauga (HR 1.057 95% CI 1.005, 1.111), Quebec City (HR 1.075 95% CI 1.004, 1.151), Edmonton (HR 1.096 95% CI 1.040, 1.156) and Windsor (HR 1.131 95% CI 1.035, 1.236) (all are expressed per 13.3 ppb O3). As a sensitivity analysis, we specified O3 as a natural spline function with 3 degrees of freedom in four cities of varying sizes and climates (Vancouver, Edmonton, Winnipeg and Toronto) and found that in all cases models employing a linear O3 term had a lower AIC, indicating better fit.
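To make the second-stage pooling and the heterogeneity measures concrete, the sketch below pools the five city-specific lag 0 estimates quoted above on the log scale. It is an illustration in Python, not the authors' code: the article does not state which between-city variance estimator was used in metafor, and the DerSimonian-Laird estimator is assumed here only because it has a closed form. The function name `dersimonian_laird` is hypothetical. Because only the five significant cities are quoted in the text, the result is not comparable to the 24-city pooled estimate.

```python
import math

# City-specific lag 0 HRs and 95% CIs per 13.3 ppb O3, as quoted in the text.
# These are the five cities with significant associations, so pooling this
# subset does NOT reproduce the 24-city estimate in the article.
cities = {
    "Toronto": (1.038, 1.009, 1.067),
    "Mississauga": (1.057, 1.005, 1.111),
    "Quebec City": (1.075, 1.004, 1.151),
    "Edmonton": (1.096, 1.040, 1.156),
    "Windsor": (1.131, 1.035, 1.236),
}


def dersimonian_laird(estimates):
    """Pool (HR, lower, upper) triples on the log scale with random effects."""
    y = [math.log(hr) for hr, _, _ in estimates]
    # Variance recovered from the width of the 95% CI on the log scale.
    v = [((math.log(u) - math.log(l)) / (2 * 1.96)) ** 2 for _, l, u in estimates]
    w = [1.0 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(y) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-city variance (assumed DL estimator)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # I2 in percent
    w_re = [1.0 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "hr": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }


result = dersimonian_laird(list(cities.values()))
```

When the between-city variance `tau2` is zero the random effects weights reduce to the fixed effects weights; a positive `tau2` widens the pooled confidence interval.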

Fig. 1 Pooled hazard ratios, 95% confidence intervals by single day lag, distributed lag models. Expressed per 13.3 ppb O3 (interquartile range)

Table 3 Pooled hazard ratios, 95% confidence intervals and heterogeneity measures from distributed lag models

Analyses by subgroups revealed similar results by lag day for male and female infants (Fig. 2). Significant positive associations were observed of O3 with preterm birth at lags 0 and 1 days in the 1st tertile, lag 0 days in the 2nd tertile and lag 6 days in the 3rd tertile of neighbourhood percent low income (Fig. 3). Significant positive associations at lag 0 days were observed for births at 34–36 weeks, while no significant positive associations were observed for births at 20–27, 28–31 or 32–33 weeks (Fig. 4). Significant positive associations were observed at multiple lags in spring, summer and fall, and only at lag 0 in winter (Fig. 5).

Fig. 2 Pooled hazard ratios, 95% confidence intervals by infant sex, single day lag, distributed lag models. Expressed per 13.3 ppb O3 (interquartile range)

Fig. 3 Pooled hazard ratios, 95% confidence intervals by tertile neighbourhood percent low income, single day lag, distributed lag models. Expressed per 13.3 ppb O3 (interquartile range)

Fig. 4 Pooled hazard ratios, 95% confidence intervals by gestational age, single day lag, distributed lag models. Expressed per 13.3 ppb O3 (interquartile range)

Fig. 5 Pooled hazard ratios, 95% confidence intervals by season, single day lag, distributed lag models. Expressed per 13.3 ppb O3 (interquartile range)

Associations with other pollutants were mixed (Additional file 1: Figures S2-S5). CO and NO2 exhibited significant negative associations with preterm birth at lag 0, 1, 5 and 6 days and 0, 1 and 6 days respectively, PM2.5 exhibited no significant associations, and SO2 exhibited significant negative associations at lag 0 and 1 day.

Discussion

We observed associations between daily O3 in the week prior to delivery and preterm birth in an analysis of approximately 1 million births in 24 Canadian cities between 1999 and 2008. Our findings for PM2.5 and NO2 were similar to our earlier analysis where we found null or negative associations of preterm birth with PM2.5 or NO2 averaged over gestational month, trimester or the entire pregnancy [12, 13]. Associations were similar for male and female infants but differed by gestational age and season. Our observation of significant associations only at longer gestational ages may result from greater statistical power afforded by the larger number of pregnancies in these categories of gestational age. Greater time spent outdoors and/or increased indoor penetration of outdoor pollutants in spring, summer and autumn could explain our observation of significant positive associations over multiple lags during these seasons, but only for a single lag in winter. Associations with other pollutants were inconsistent.

Our analysis is one of a limited number which have examined these short term associations employing Cox proportional hazards models to account for the different exposure durations of preterm vs. term births (in contrast to studies based on exposure during the entire pregnancy or third trimester). O3 exposure in particular has received relatively little attention in previous studies. In their analysis of 13 birth cohorts comprising 71,493 births from the European Study of Cohorts for Air Pollution Effects (ESCAPE), Giorgis-Allemand et al. found no association of preterm birth with NO2, nitrogen oxides (NOx), PM2.5 and PM10 exposures over durations ranging from one week to the entire pregnancy [22]. In an analysis of 78,633 births in Rome and 27,255 in Barcelona, Schifano et al. found that PM10 and NO2 in the week prior to delivery were positively associated with preterm birth in Barcelona and negatively associated with preterm birth in Rome, while ozone was positively associated with preterm birth in both cities [23]. The hazard ratios for O3 were comparable in magnitude to what we observed: 1.010 (95% CI 1.001, 1.020) per 9.2 ppb in Barcelona and 1.025 (95% CI 1.009, 1.042) per 15.3 ppb in Rome [23]. In contrast to our findings, they observed larger associations at shorter pregnancy durations [23]. An earlier study by the same authors examining preterm birth in Rome using time-series methods found that PM10, O3 and NO2 lagged 0–2 days were not associated with preterm birth in the warm or cold season; only PM10 lagged 12–22 days in the warm season was significantly associated with preterm birth [19]. 
In a study of nearly 500,000 births in Guangzhou, significant associations were observed between preterm birth and PM10, NO2 and O3, with the peak magnitude of effect at 25 weeks (HR = 1.048, 95% CI 1.034–1.062 per IQR, 37.0 μg/m3), 26 weeks (HR = 1.060, 95% CI 1.028–1.094 per IQR, 15.4 ppb) and 28 weeks (HR = 1.063, 95% CI 1.046–1.081 per IQR, 45.8 ppb) gestation respectively [26]. We recently reported that PM2.5 on the day of delivery was associated with preterm birth only among women assigned to the highest quartile of PM2.5 glutathione-related oxidative potential based on approximately 200,000 births in 31 cities in the province of Ontario, Canada [25]. Johnson et al. found no association between cumulative third trimester PM2.5 or NO2 and preterm birth in a discrete time survival analysis of 258,294 births in New York City [24]. Sagiv et al. conducted a time-series analysis of 187,997 births in Pennsylvania and found that preterm birth was associated with PM10 2 days and 5 days before birth (relative risk (RR) = 1.10; 95% CI, 1.00–1.21 per 50 μg/m3 and RR = 1.07; 95% CI, 0.98–1.18 per 50 μg/m3 respectively) [14]. Associations with O3 were not reported. In another time series analysis of 476,489 births in Atlanta, Darrow et al. observed mostly null associations with air pollution (including O3), but reported that preterm birth was associated with PM2.5 sulfate and PM2.5 water-soluble metal concentrations in the week preceding delivery [15]. Rappazzo et al. also reported that PM2.5 lagged 0–2 weeks before birth was associated with an increased risk of preterm birth in an analysis of nearly 1.8 million births in Ohio, Pennsylvania, and New Jersey [17]. O3 was included as a covariate but associations of preterm birth with O3 were not reported. A time series study in Ahvaz, Iran found no association between O3 in the two weeks prior to birth and preterm birth, although significant associations with CO, NO2 and PM10 were observed [20]. Lee et al. reported no associations of O3, PM10 or meteorological variables with preterm birth in a time series analysis in London examining exposures in the week prior to birth [16]. Arroyo et al. found an association between O3 in the twelfth week of gestation and preterm birth in a time-series analysis in Madrid [18]. Finally, employing a novel hierarchical spatiotemporal model, Warren et al. found that O3 in weeks 1–5 and PM2.5 in weeks 4–22 were associated with increased risk of preterm birth in a study in eastern Texas [37]. In their analysis of air pollution attributable preterm births worldwide, Malley et al. [1] employed an odds ratio of 1.13 (95% CI 1.03, 1.24) per 10 μg/m3 PM2.5 based on the meta-analysis by Sun et al. [9], considerably larger than what Sagiv et al. [14] or Schifano et al. [23] observed. It should be noted that there may be substantial differences in other factors that could contribute to preterm birth among these studies, including prenatal care, employment rights of pregnant women, and obstetrical decision-making (e.g. decision to induce labour).
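The O3 hazard ratios quoted above from Barcelona, Rome and Guangzhou are expressed per different concentration increments, so comparing them with the estimates in this study (per 13.3 ppb) is easier after rescaling to a common increment. The Python sketch below does this under the assumption that each association is log-linear in concentration, which is an assumption of the illustration rather than something reported by the cited studies; the function name `rescale_hr` is hypothetical.

```python
import math


def rescale_hr(hr, from_increment, to_increment):
    """Re-express an HR per `from_increment` as an HR per `to_increment`.

    Assumes the log-hazard is linear in concentration, so the log-HR scales
    in proportion to the size of the increment.
    """
    return math.exp(math.log(hr) * to_increment / from_increment)


TARGET_PPB = 13.3  # O3 interquartile range used in this study

# (HR, increment in ppb) as quoted in the text for O3.
quoted = {
    "Barcelona [23]": (1.010, 9.2),
    "Rome [23]": (1.025, 15.3),
    "Guangzhou [26]": (1.063, 45.8),
}

rescaled = {
    study: rescale_hr(hr, inc, TARGET_PPB) for study, (hr, inc) in quoted.items()
}

for study, hr in rescaled.items():
    print(f"{study}: HR per {TARGET_PPB} ppb = {hr:.3f}")
```

The same scaling applies to confidence limits, since each limit is also an exponentiated multiple of the increment. Differences in lag definition and averaging time between studies are not addressed by this rescaling.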

Mechanisms through which exposures in the days prior to delivery might trigger preterm birth are unknown, but may include non-specific processes such as inflammation or oxidative stress, which are known to be associated with both preterm birth [38,39,40,41] and air pollution exposure [42]. PM2.5 could also trigger preterm birth through cardiovascular mechanisms or effects on endothelial function [42].

Strengths of our study include the large sample size distributed over multiple cities and utilization of Cox models which account for the differing length of exposure in preterm and term births, and distributed lag models which parsimoniously evaluate effects over multiple lags. We also assessed the shape of the exposure-response relationship, examined effect modification by infant, maternal and other factors, and considered the effects of other pollutants. Limitations of our study include the lack of data on maternal behavioural risk factors and possible exposure measurement error owing to the limited number of monitors within each city. Since our analysis deals by design with short-term temporal variability in air pollution exposure, observed associations are unlikely to be confounded by risk factors that do not vary over the short term, such as smoking. In the only other study employing the same design which included data on maternal smoking, Giorgis-Allemand et al. found that results were not sensitive to inclusion of smoking and other individual characteristics as covariates [22]. Exposure measurement error would be expected to be non-differential, biasing observed associations towards the null [43], and as a secondary pollutant, O3 concentrations would be expected to be relatively homogeneous over larger areas compared to pollutants such as NO2. Of four other studies with the same design, two with the same limitations with respect to relatively sparse fixed site monitoring data found consistent associations with O3 and inconsistent associations with NO2 and PM10 [23] and consistent associations with PM10, NO2 and O3 [26], while two others employing temporally adjusted land use regression models for NO2, NOx, PM2.5 and PM10 [22] and NO2 and PM2.5 [24], potentially reducing exposure measurement error, found no significant associations with preterm birth [22, 24]. Data were missing for at least one covariate for approximately 20% of births in our study. 
Marital status was the most common missing covariate, and births for which this was missing had a higher prevalence of preterm birth. This suggests that these births differed from those where all covariate data were available which could have biased our results. Results from individual cities were pooled using a random effects model, which treats estimates from individual cities as originating from separate underlying distributions rather than a single common distribution [44]. Differences between cities may stem from differences in the exposure mix, impact of confounders such as weather, or population characteristics. The random effects model is conservative relative to a fixed effects model in that it assigns greater variance to the overall (pooled) measure of effect by incorporating both within and between study variance [44].

Conclusions

In this study, one of a small number employing time-to-event analysis, we observed significant associations between O3 in the week prior to delivery and preterm birth, based on an analysis of approximately 1 million births over a ten-year period. Given the mixed findings in other studies of this kind, additional studies are needed to determine whether the weight of evidence supports the existence of a causal association between preterm birth and air pollution exposure in the days preceding delivery.