Introduction

Despite numerous research programs and public health initiatives, preterm birth (PTB) in the United States (US) remains a major public health challenge. In 2018, 10% of births in the US were preterm [1] and 66% of infant deaths occurred in those born preterm [2], statistics which were unchanged from the previous year [3, 4]. PTB is also associated with long-term effects. Long-term health outcomes associated with PTB include death in early adulthood (18–36 years) [5], neurodevelopmental impairment [6], use of psychiatric medications in early adulthood [7], and potentially cardiovascular disease [8]. Additionally, long-term social and economic outcomes associated with PTB include lower educational attainment [9,10,11], attention deficit/hyperactivity disorder [12], and collecting disability support [9, 11]. Despite decades of research, the contributing factors of PTB remain poorly understood. Gestational parent (GP) diet and environmental exposures are two potential contributors of growing interest, and may act both independently and as modifiers of adverse birth outcomes.

Among environmental exposures, ambient pollutants resulting from fossil fuel combustion are of particular interest. Many studies report positive associations between air pollutants and birth outcomes, including PTB; however, other studies report null or inverse associations between air pollutant exposures and PTB. Uncertainties remain across this research, including why some individuals may be more or less susceptible to the impacts of air pollution than others [13,14,15]. Air pollutants potentially act through mechanisms of inflammation and oxidative stress to increase the risk of PTB [16]. Ambient pollutants from fossil fuel combustion include, but are not limited to, particulate matter less than 2.5 microns in diameter (PM2.5), ozone (O3), and nitrogen dioxide (NO2), criteria air pollutants regulated under the Clean Air Act. PM2.5 and O3 levels are influenced and exacerbated by climate change due to changes in pollutant movement and reaction rates in the atmosphere [17]. Exploring factors which modify the relationship between air pollutants and birth outcomes is essential to understanding the present uncertainties in the literature and to better inform policy decisions that protect the most vulnerable.

Although GP diet has been associated with PTB, data remain inconsistent; this is in part because of differing methods of dietary analysis. Traditionally, dietary consumption has been conceptualized in research as individual food items (e.g., fish) [18]. However, individual foods are not consumed in isolation and may therefore exert differential influences on health outcomes dependent on their context. As a result, dietary consumption is increasingly conceptualized as dietary patterns (i.e., the overall diet, composed of individual food items) as opposed to consumption of individual foods; the most recent dietary guidelines published by the US federal government focused on dietary patterns [19]. GP dietary patterns, both empirically derived (e.g., exploratory factor analysis) and defined a priori (e.g., Western, Mediterranean, Prudent, Dietary Approaches to Stop Hypertension [DASH] diets), have been investigated in relation to PTB both prior to conception and during pregnancy. This previous literature suggests that certain dietary patterns, such as those characterized by red meat, fried foods, desserts, and white bread, are associated with increased risk of PTB [20, 21], while others, such as the DASH diet, Mediterranean diet, and vegetable-fruit-rice diet, are associated with decreased risk of PTB [20, 21]. The literature, however, is inconsistent [20] and faces some comparability challenges resulting from common terms being applied across various empirically derived dietary patterns [21]. Another approach to incorporating GP diet that may help to address these issues is the use of specific nutrients as dietary indicators.

Despite the importance of understanding the interplay between ambient pollutants and diet characteristics in relation to birth outcomes, few studies focus on this topic. The four related studies focusing on ambient air pollutants (PM, NO2) and dietary factors (folate, fish consumption, methyl donor nutrients) in relation to various birth outcomes (PTB, livebirth, low birthweight, birth defects) [22,23,24,25] suggest associations between specific diet characteristics and ambient pollutants in relation to birth outcomes. These studies are described in more detail in the discussion. With only four studies addressing this complex topic, there is a marked paucity of research in this area.

In toxicologic research, a study in rats showed that the effect of maternal tobacco smoke on adverse reproductive and birth outcomes is potentially modified by the protein content of the maternal diet [26]. Additionally, other toxicological studies of rodent models that center on overall maternal diet characteristics, such as high fat and high energy intake, provide evidence of pollutant-diet influence on later health outcomes, which may indicate the potential for prenatal interactions as well [27, 28]. There has been speculation that ambient pollution and diet characteristics may together exert an influence on birth outcomes. These theories include nutrient deficiencies which hobble compensation mechanisms normally responsible for moderating physiological responses [29], potentially leading to systemic alterations including inflammation and oxidative stress [16]. Humans following diets characterized by higher levels of saturated fat have higher levels of inflammation biomarkers, lending support to these theories [30].

Despite toxicological suggestions of a diet-modified effect of air pollution on birth and offspring health outcomes and tangential epidemiological work investigating diet-modified effects of air pollution on birth outcomes, few epidemiologic studies have explicitly interrogated this interplay. This study investigated the association of ambient pollution with PTB and effect measure modification of this association by caloric intake, percent caloric intake from fats, and percent caloric intake from saturated fats. We hypothesized that diet characteristics, particularly saturated fat intake, would enhance any relationship between pollution and PTB.

Methods

Study design and population

This study utilized data from the Newborn Epigenetics Study (NEST), a prospective birth cohort in central North Carolina (NC). Pregnant individuals between 6 and 42 weeks of pregnancy (median: 15.6 weeks; interquartile range [IQR]: 11.6, 22.7 weeks) were recruited between 2009 and 2011 from prenatal clinics associated with Duke University Hospital and Durham Regional Hospital Obstetrics. Participants were required to be at least 18 years of age, to communicate in English or Spanish, and to intend delivery at one of the aforementioned hospitals. Exclusion criteria were HIV positivity, no intention to retain custody of the infant, and active plans to move residence before the infant’s first birthday.

Data on participants of the NEST cohort were gathered through interviews (English or Spanish) and medical record abstraction. Data gathered during interviews via questionnaire included sociodemographic information, occupation, medical history, lifestyle characteristics, pre-pregnancy anthropometrics, and a food frequency questionnaire (FFQ). Any questionnaire modules not completed during the interviews were sent home with participants for self-administration. During interviews, participants also contributed anthropometric measurements and biospecimens. Information abstracted from medical records included offspring information at the time of delivery including clinical estimate of gestational age, and sex. This study was approved by Institutional Review Boards at both the University of North Carolina at Chapel Hill and Duke University. In this manuscript we are choosing to use the term gestational parent instead of maternal when referring to humans in order to provide substantive specificity and to include those for which “maternal” does not apply.

Exposure assessment

Air pollution

We leveraged two well-accepted daily ambient air pollution models for estimated concentrations of O3, PM2.5, and NO2: the EPA’s Fused Community Multiscale Air Quality (CMAQ) model [fCMAQ] [31, 32] (available at https://www.epa.gov/hesc/rsig-related-downloadable-data-files) and an ensemble model created by researchers from Harvard University [33]. The ensemble model estimates pollutant concentrations from a group of machine learning algorithms which include a neural network, random forest models, and gradient boosting models. These algorithms take satellite-derived data (e.g., aerosol optical depth), pollutant monitoring data, meteorological data (e.g., ambient temperature, barometric pressure, wind speed), land use data (e.g., normalized difference vegetation index), elevation data, and chemical transport model predictions as inputs, and then output PM2.5 and NO2 concentrations on a 1 km2 grid. In the fCMAQ model, outputs from the Models-3/Community Multiscale Air Quality and data from the national and state level monitoring systems are combined using a bivariate Bayesian space-time downscaler approach to produce “fused” concentration estimates at the census tract level for PM2.5 and O3 [31, 32]. Ambient pollutants are represented in the metrics by which they are currently regulated: O3 exposure is represented as 8-hour maxima in parts per billion (ppb), PM2.5 as 24-hour averages in micrograms per meter cubed (µg/m3), and NO2 as 1-hour maxima in ppb.

Addresses at enrollment were geocoded by the NEST team then linked to census tract for fCMAQ output and nearest grid for ensemble model output. Concentrations for each pollutant available from each source (PM2.5 and O3 for fCMAQ and PM2.5 and NO2 for ensemble model) were then assigned for each day of pregnancy and averaged across trimester periods (T1, T2, T3) to produce trimester-specific estimates of air pollution exposure. PM2.5 daily estimates from the ensemble model were used in the primary analyses. As all air pollutant concentrations are model estimations, there is no spatial or temporal missingness in air pollution exposure assignment.
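The trimester averaging described above can be illustrated with a minimal sketch. This is not the NEST processing code: the function name, the dictionary input, and the first-trimester boundary (day 91) are assumptions made for illustration, while the day-189 start of the third trimester follows the threshold used in our sensitivity analysis.

```python
from datetime import date, timedelta
from statistics import mean

# Trimester boundaries in days since the start of pregnancy.
T1_END = 91    # ASSUMPTION: first trimester covers days 0-90
T2_END = 189   # third trimester begins on day 189 (threshold used in this paper)


def trimester_means(start, delivery, daily_conc):
    """Average modeled daily concentrations within each trimester.

    start, delivery: datetime.date bounds of the pregnancy (delivery day excluded).
    daily_conc: dict mapping each date to the modeled concentration at the
    residence (hypothetical input format).
    Returns {"T1": ..., "T2": ..., "T3": ...}; a trimester the pregnancy
    never reached is reported as None.
    """
    buckets = {"T1": [], "T2": [], "T3": []}
    for offset in range((delivery - start).days):
        day = start + timedelta(days=offset)
        if offset < T1_END:
            key = "T1"
        elif offset < T2_END:
            key = "T2"
        else:
            key = "T3"
        buckets[key].append(daily_conc[day])
    return {k: (mean(v) if v else None) for k, v in buckets.items()}
```

A birth before day 189 yields `None` for T3, which mirrors the two births in the analysis sample that contributed no third-trimester exposure.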

Diet

At enrollment, all participants were requested to complete self-administered Block FFQs for the time-period up to 6 months before pregnancy (median completion date was 134 (IQR: 119) days before delivery). The FFQ was modified to reflect dietary patterns prevalent in NC (University of Texas Anderson Cancer Center Nutrition and Lifestyle Core Questionnaire 2008v.2). This FFQ addressed frequencies and portions of intake for over 150 food items and supplements. Responses to the FFQ were analyzed by Nutrition Quest, resulting in estimates of grams per day intake of specific nutrients as well as overall daily caloric intake. Caloric intake, percent of caloric intake from saturated fat, and percent of caloric intake from total fat were used due to toxicological evidence, saturated fat being associated with inflammatory markers, and missingness in specific diet items in the FFQ. Dietary values for caloric intake, percent of caloric intake from saturated fat, and percent of caloric intake from total fat were dichotomized at the 75th percentile to designate “high” intake, using an empirical definition of “high” intake to preserve a sufficient number of records to allow models to converge. The dichotomization values for kilocalories (kcal), percent total fat, and percent saturated fat (sfat) were 2844 kcals, 35.265%, and 11.775% respectively. The dichotomization value for saturated fat, 11.775%, is marginally greater than the recommended 10% or lower percent intake from saturated fat [19]. Caloric needs change with GP characteristics and gestational period.
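As a hedged illustration of the dichotomization step, the sketch below flags values above the sample 75th percentile as "high". The function name, the use of a strict inequality, and the quantile interpolation method are assumptions; the analysis itself was run in SAS and R, whose percentile definitions may differ slightly.

```python
from statistics import quantiles


def dichotomize_at_p75(values):
    """Return (cutoff, flags) where flags[i] is 1 if values[i] is "high".

    ASSUMPTIONS: "high" means strictly greater than the cutoff, and the
    cutoff is the inclusive-method third quartile of the sample.
    """
    # quantiles(..., n=4) returns the three quartile cut points; index 2 is Q3.
    cutoff = quantiles(values, n=4, method="inclusive")[2]
    flags = [1 if v > cutoff else 0 for v in values]
    return cutoff, flags
```

The same function would apply to daily kilocalories, percent of calories from total fat, and percent of calories from saturated fat, each dichotomized separately.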

Outcome assessment

We define PTB as birth before 37 weeks of completed gestation based on clinical estimate of gestational age at birth abstracted from medical records. Among those for whom clinical estimate of gestational age at birth was not available (n = 184), we recovered 23 by leveraging last menstrual period month and delivery date, applying a mean imputation with random variation. Those missing both gestational age and last menstrual period were excluded from analysis (n = 161).

Potential confounders

Covariates considered potential confounders were determined by a priori assumptions encoded in a directed acyclic graph (see Supplementary eFigureS1) [34]. Categorical covariates included season of conception (Spring [March 21st - June 19th]; Summer [June 20th - September 21st]; Fall [September 22nd - December 20th]; Winter [December 21st - March 20th]) estimated using date of birth and gestational age, self-classified race/ethnicity (Black, non-Hispanic white, other), education (less than high school, high school or equivalent/some college, any higher degree), and annual household income while pregnant (< $10,000; $10,000 - $49,999; ≥ $50,000). Race/ethnicity was collapsed from its original categories (Black, non-Hispanic white, Hispanic, Asian/Pacific Islander, Native American, biracial, other) in order to allow our models to converge given the modest number of PTBs among the analysis sample. This variable is conceptualized as a proxy for experiences of racism, understanding that our “other” race/ethnicity category represents diverse experiences and may not always provide useful insights. Continuous variables considered were age (years) at delivery and pre-pregnancy body mass index (BMI) (kg/m2) calculated from self-reported pre-pregnancy height and weight. The form of each continuous covariate was determined by a functional form analysis (linear, quadratic, restricted cubic splines, and categorical), guided by Akaike Information Criterion values and biological plausibility. For age and BMI variables, we collapsed the extreme 2.5% tails of the distributions (at 19 and 39 years, and 18.27 and 44.31 kg/m2 respectively). We used linear age and quadratic BMI in analysis models.
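The season-of-conception categories defined above can be expressed as a short sketch. The function names are hypothetical, and the conception date is approximated here by simply subtracting gestational age in days from the date of birth, which is an assumption about how the estimate was formed.

```python
from datetime import date, timedelta


def estimated_conception_date(birth, gestational_age_days):
    # ASSUMPTION: conception is approximated as birth date minus gestational age.
    return birth - timedelta(days=gestational_age_days)


def season_of_conception(d):
    """Classify a date into the four seasons using the cut dates in the text."""
    md = (d.month, d.day)
    if (3, 21) <= md <= (6, 19):
        return "Spring"
    if (6, 20) <= md <= (9, 21):
        return "Summer"
    if (9, 22) <= md <= (12, 20):
        return "Fall"
    return "Winter"  # December 21st through March 20th
```

For example, a birth on 1 October 2010 at 280 days of gestation maps to an estimated conception date of 25 December 2009, which falls in the Winter category.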

For sensitivity analyses, we considered dichotomous variables for ever-smoking (y/n) and for having at least 30 min of outdoor exercise (jogging, walking, playing with children while walking, gardening, lawn work) per day (y/n).

Statistical analysis

We used log binomial regression to estimate risk ratios (RRs (95%CI)) for PTB per IQR increase of each ambient air pollutant. We ran three models: model 1 included age, race, and education; model 2 included everything in model 1 as well as pre-pregnancy BMI and household income; model 3 (fully adjusted model) included everything in model 2 as well as conception season. To assess potential effect measure modification, we included interaction terms between pollutants and dietary characteristics in all models. The presence of interaction on the additive scale was determined through interaction contrast ratios (ICRs) [35, 36], with confidence intervals calculated using the delta method [37]. Estimates for ICRs from the fully adjusted single pollutant models are considered of note if the estimate is at least 0.2 in magnitude with reasonable precision (confidence interval width of less than 5). We addressed missing covariate data with multiple imputation using chained equations fully conditional on all other variables included in the analysis models (all exposures, outcome, all covariates, and exposure-diet characteristic interactions) with 17 iterations [38]. We aggregated analysis results using Rubin’s rules. Demographic distributions for individuals with missing gestational age were compared to the full NEST cohort.
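To make the ICR calculation concrete, the following is a minimal sketch of the standard formula, ICR = RR11 − RR10 − RR01 + 1, with a delta-method confidence interval. It is an illustration rather than the code used in this analysis: the function name and argument layout are hypothetical, and it assumes the three log-RR coefficients (pollutant per IQR increase, high diet intake, and their product term) and their covariance matrix have already been extracted from a fitted log binomial model.

```python
import math


def icr_with_ci(beta, cov, z=1.959964):
    """Interaction contrast ratio and Wald-type CI via the delta method.

    beta: (b_pollutant, b_diet, b_interaction) on the log-RR scale.
    cov:  3x3 covariance matrix (nested lists) of those coefficients.
    """
    b1, b2, b3 = beta
    rr10 = math.exp(b1)            # pollutant increase only
    rr01 = math.exp(b2)            # high diet intake only
    rr11 = math.exp(b1 + b2 + b3)  # both together
    icr = rr11 - rr10 - rr01 + 1.0
    # Gradient of the ICR with respect to (b1, b2, b3).
    grad = (rr11 - rr10, rr11 - rr01, rr11)
    var = sum(grad[i] * cov[i][j] * grad[j]
              for i in range(3) for j in range(3))
    se = math.sqrt(var)
    return icr, (icr - z * se, icr + z * se)
```

An ICR of 0 indicates no departure from additivity of the two exposures; positive values suggest a joint association greater than additive and negative values one that is less than additive.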

We conducted sensitivity analyses to assess the influence of measurement error among PM2.5 measures by repeating analyses utilizing the fCMAQ PM2.5 estimates in place of the ensemble model, retaining the IQR derived from the ensemble model for RR estimates. All data processing and analysis was completed using SAS v9.4 (Cary, North Carolina) and R (Vienna, Austria) [39].

Comparing those missing gestational age (11%) to the whole cohort where non-missing covariates permit, we observed that race/ethnicity, education, and household income during pregnancy are reasonably balanced. Pre-pregnancy BMI is slightly less balanced, but this may be due to a higher proportion of missingness among those missing gestational age. The comparison between the whole NEST cohort (n = 1505) and those missing gestational age (n = 161) gives us no reason to believe that those who are missing gestational age are missing because of their gestational age: that is, we have no reason to believe that the outcome is missing not at random. This supports our use of multiple imputation to address covariate missingness.
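For readers less familiar with how results from multiply imputed datasets are combined, the pooling step under Rubin's rules can be sketched as follows. The function name and inputs are hypothetical; the sketch pools a single coefficient (for example, a log RR) from its per-imputation estimates and variances, and is not the SAS or R routine used in this analysis.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed datasets using Rubin's rules.

    estimates: per-imputation point estimates (e.g., log RRs).
    variances: per-imputation squared standard errors.
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    pooled = mean(estimates)        # average of the point estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total
```

Pooling is done on the log scale for ratio measures, and the pooled estimate and its confidence limits are exponentiated afterwards.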

As a sensitivity analysis, we included the dichotomous variable for ever-smoking in the fully adjusted model. As a second sensitivity analysis, we included in the fully adjusted model the dichotomous variable for outdoor exercise at least 30 min a day. In addition, we performed analyses excluding any births that did not reach the third trimester (gestational age < 189 days) to assess influence of early PTBs.

Results

The NEST cohort comprises GPs who were largely between the ages of 25 and 29 years, mostly high school (HS) graduates, and mostly under/normal weight by pre-pregnancy BMI; Black and Hispanic/other individuals are overrepresented relative to the contributing population of NC and the surrounding region (Table 1). Imbalances in demographics between GPs who experienced PTB and those who did not are seen in race/ethnicity (51% Black preterm vs. 39% term), educational attainment (76% HS diploma/some college preterm vs. 68% term), household income during pregnancy (27% less than $10,000 annually preterm vs. 21% term), and pre-pregnancy BMI (27% under/normal weight preterm vs. 42% term).

Table 1 Study population (n = 1505) characteristics at enrollment (overall and by preterm birth)

Comparing those missing diet data (50.56%) to the whole cohort (Supplementary eTable S1), relative missingness in covariates is higher among those missing diet data (with absolute numbers missing similar). GP age at delivery, pre-pregnancy BMI, and season of conception are reasonably well balanced considering the difference in relative proportion of missingness between the groups. Those missing diet data had higher proportions of Black individuals, higher proportions of those with a high school diploma or equivalent, and lower proportions of those with an annual household income of at least $50,000; the maximum difference in proportions between the whole cohort and those missing diet data was not greater than 8.19 percentage points. This indicates that there may be some selection bias in the analysis sample.

Among the analysis sample, median (IQR) gestational age at enrollment was 14.4 (11.6, 21.4) weeks and at birth was 39 (38.4, 40.1) weeks. From an enrollment total of 1505, we excluded 761 without complete FFQ data, and 129 who could not be geocoded (75 missing both). We also excluded those with implausible values of daily caloric intake: those with a value of 0 kcal (n = 9) and those in the upper (5090 kcals) and lower (832 kcals) 2.5% of the kcal distribution (n = 124 and n = 136 respectively). The analysis sample was thus reduced to 684 pairs for trimester 1 and 2, and 682 for trimester 3 (pair reduction due to 2 births which occurred during trimester 2, and thus did not experience air pollutant exposures during trimester 3). Those missing gestational age, where data availability allows comparison, are similar to the whole cohort (Table 1). See Supplementary eTable S2 for information about the analysis sample.

Ambient pollutants had relatively constant means and IQRs across trimesters, with the largest difference in means shown by NO2 at 1.11 ppb and the largest variation in IQRs seen in fCMAQ PM2.5 (Table 2). IQRs used in this analysis to place association estimates in context are as follows: 6 ppb for NO2 1-hour daily maxima; 14 ppb for O3 8-hour maxima; and 2 µg/m3 for PM2.5 24-hour average. Correlations between different pollutant-trimester combinations ranged from not correlated (0.00) to highly correlated (0.97), and 46 of the 132 correlations had a magnitude of over 0.5 (Supplementary eTable S3). Of note, the different PM2.5 measures were not strongly correlated during all trimesters.

Table 2 Ambient pollutant exposure distributions among analysis sample

Dietary characteristics between GPs who experienced PTB and those who did not are reasonably balanced as pertains to daily energetic percent of total fat and daily energetic percent of saturated fat (Table 3). Daily caloric intake was slightly different between preterm and term, with a difference between means of 195 kcals. After dichotomizing at the 75th percentile, we again see this imbalance in caloric intake between preterm and term individuals (20% high caloric intake preterm vs. 12% term) (Table 1).

Table 3 Distribution of gestational parent diet characteristics (n = 684)

The precision of pollutant estimates from the fully adjusted models is overall acceptable (max confidence limit ratio [CLR] for all models across trimesters, diet characteristics, and pollutants is 5.83). It should be noted that PM2.5 (both values from the ensemble model and fCMAQ) tended to be unstable, especially in second trimester models including kcal. The precision of ICR estimates in the fully adjusted model was somewhat more variable than that of the pollutant estimates, with a mean CI width of 3.71 across all models but again with PM2.5 models (with values from both the ensemble model and fCMAQ) offering much wider confidence intervals (up to 20.24 in the third trimester with high overall fat intake).
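The two precision summaries used above are simple functions of the confidence limits. The sketch below shows them with hypothetical function names, taking the confidence limit ratio as the upper limit divided by the lower limit and the interval width as their difference.

```python
def confidence_limit_ratio(lower, upper):
    """CLR for a ratio measure: upper confidence limit / lower confidence limit."""
    if lower <= 0:
        raise ValueError("CLR requires a positive lower confidence limit")
    return upper / lower


def interval_width(lower, upper):
    """Width of a confidence interval for a difference-scale measure such as the ICR."""
    return upper - lower
```

For instance, an RR with 95% CI (0.48, 1.53) has a CLR of about 3.19, and an ICR with 95% CI (-2.01, 1.31) has a width of 3.32.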

Pollutant estimates of note from the fully adjusted single pollutant models (all including interaction terms) (Table 4) are as follows: for the first trimester, PM2.5 seems inversely associated in models including total fat (RR (95%CI): 0.86 (0.48, 1.53)); for the second trimester, NO2 is harmful in models including sfat (RR (95%CI): 1.10 (0.75, 1.61)), O3 is inversely associated when considered with all diet characteristics (kcal RR (95%CI): 0.77 (0.39, 1.49); fat RR (95%CI): 0.80 (0.40, 1.64); sfat RR (95%CI): 0.79 (0.43, 1.47)), and PM2.5 seems to be inversely associated when considered with total fat and saturated fat (fat RR (95%CI): 0.72 (0.40, 1.30); sfat RR (95%CI): 0.77 (0.44, 1.36)); for the third trimester, NO2 seems inversely associated when considered with kcal and total fat (kcal RR (95%CI): 0.87 (0.57, 1.31); fat RR (95%CI): 0.83 (0.55, 1.26)), and O3 is harmful when considered with all diet characteristics (kcal RR (95%CI): 1.51 (0.62, 3.64); fat RR (95%CI): 1.43 (0.60, 3.38); sfat RR (95%CI): 1.36 (0.58, 3.17)).

Table 4 Adjusted risk ratios and 95% confidence intervals for preterm birth by interquartile range change of NO2, O3, PM2.5 modified by diet

Notable ICR results (Table 5) are only seen in the first trimester and only with sfat in models with O3 (ICR (95%CI): -0.35 (-2.01, 1.31)) and PM2.5 (ICR (95%CI): -0.29 (-2.70, 2.11)).
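The screening rule for an ICR "of note" (magnitude of at least 0.2 and a confidence interval width of less than 5) can be written as a small check. The function name and default arguments are hypothetical; the thresholds are those stated in the statistical analysis section.

```python
def icr_of_note(icr, lower, upper, min_magnitude=0.2, max_width=5.0):
    """Apply the screening rule: |ICR| >= 0.2 and CI width < 5."""
    return abs(icr) >= min_magnitude and (upper - lower) < max_width


# First-trimester saturated fat estimates reported in the text.
reported = {
    "O3": (-0.35, -2.01, 1.31),
    "PM2.5 (ensemble)": (-0.29, -2.70, 2.11),
}
flags = {name: icr_of_note(*vals) for name, vals in reported.items()}
```

Both reported estimates satisfy the rule, with interval widths of 3.32 and 4.81 respectively; an estimate of the same magnitude with a width of 5 or more would not be flagged.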

Table 5 Effect measure modification assessment of diet characteristics on fully adjusted risk ratios and 95% confidence intervals for preterm birth by interquartile range change of NO2, O3, PM2.5 modified by diet

Sensitivity analyses including ever-smoking and including exercising outdoors at least 30 min per day largely failed to converge; therefore, we did not include results from these analyses.

Risk ratio estimates for PM2.5 from the ensemble model and those from the fCMAQ were similar across all trimesters; the only discrepancies are in those involving fat characteristics in trimesters one (ensemble model PM RR (95%CI): 0.86 (0.48, 1.53) vs. fCMAQ PM RR (95%CI): 0.98 (0.56, 1.71)) and three (ensemble model PM RR (95%CI): 1.07 (0.65, 1.76) vs. fCMAQ PM RR (95%CI): 1.21 (0.70, 2.08)). These discrepancies are not in the direction of the estimate, but rather where the weight of the 95% CI lies on the log scale. That is, one estimate is essentially null and the other lends more support for a non-null estimate. ICR estimates from the fully adjusted models have the same signs, overall similar magnitudes, and similar widths of confidence intervals between the different PM2.5 values for each trimester and with each diet characteristic. The model of the first trimester PM2.5 exposure from fCMAQ and sfat also shows an ICR of note (ICR (95%CI): -0.31 (-2.38, 1.77)), similar to the ensemble model PM2.5.

Results for sensitivity analyses excluding the 2 births occurring in the second trimester showed generally minor differences in effect estimates that did not impact estimate interpretation (results not shown).

Discussion

With this analysis we investigated the association of ambient pollution with PTB and effect measure modification by dietary characteristics. We observed complex relationships between pollutants, diet, and PTB. Specifically, point estimates suggested that NO2 exposure was harmful for PTB in trimester 2 and inversely associated with PTB in trimester 3; that O3 exposure was inversely associated with PTB in trimester 2 and harmful in trimester 3; and that PM2.5 exposure was inversely associated with PTB in trimesters 1 and 2. All of these point estimates have 95% CIs that span the null. We also observed suggestion of interaction on the additive scale in ICR values, though with variable precision; point estimates of interaction should therefore be considered with caution. More statistical power is needed to adequately investigate these complex relationships.

While our interpretations must be limited due to cohort size with complete dietary information, we did observe some evidence of effect measure modification of ambient air pollutant-PTB associations by GP dietary characteristics. The proposed mechanisms underlying this interaction include established air pollution mediated pathways of inflammation, particularly placental inflammation, systemic oxidative stress [16], and increased susceptibility to infection during pregnancy. Inflammation may impair placental function and thus may lead to fetal growth restriction, abnormal response to infection [40,41,42], and subsequent PTB, and certain nutrients (e.g., saturated fat) may exacerbate an inflammatory state more than others [16, 30, 43]. Oxidative stress may lead to DNA damage and subsequent cellular dysfunction or may reduce placental response to growth factors, but certain nutrients (such as methyl donor nutrients) have been associated with anti-oxidant qualities and thus may mitigate or reduce this damage [16, 25]. Of note, both inflammation and oxidative stress have been implicated in metabolic outcomes such as obesity, which has traditionally been associated with diet characteristics and is increasingly investigated in relation to ambient air pollutant exposures. There is also evidence of linkages between parent cardiometabolic conditions and PTB, and between PTB and later adverse cardiometabolic outcomes [44,45,46], suggesting the potential contribution of common inflammatory/oxidation pathways.

Though there have been plausible mechanisms proposed for dietary and air pollution interactions with birth outcomes, only a handful of studies examine these interactions, and few examine the same dietary factors or birth outcomes. Due to the paucity of studies evaluating associations between both ambient air pollution and diet with PTB, we are unable to directly compare the results of this study to epidemiologic studies on the same topic.

The closest related study examines folic acid supplementation before conception and PM in relation to PTB, finding interactions with all sizes of PM – those who initiated early supplemental folic acid showed a reduced impact of PM on PTB compared to those who did not initiate early folic acid supplementation [24]. Another examined folate intake and exposure to NO2, O3, PM2.5, and black carbon three months preconception in relation to livebirth [22]. Low folate intake modified the NO2 – livebirth association such that higher supplemental folic acid intake reduced the negative effects of NO2, but none of the other pollutant associations were strongly modified. Previous work also includes a study concerning PM2.5 and the frequency of fish consumption in relation to low birth weight, finding that a higher frequency of fish consumption may reduce the negative impact that high PM2.5 has on birth weight [23]. Finally, Stingone et al. [25] examined NO2 and methyl donor nutrient intake (including folate) specifically from food in relation to congenital heart defects, finding strong evidence of effect measure modification when considering NO2 in relation to perimembranous ventricular septal defect; however, the study was unable to elucidate the nuances of the complex pollutant-diet-PTB relationship due to a fairly limited analysis sample.

The results of this study taken together with the small number of related previous studies and mechanistic plausibility strengthen the case for studying ambient air pollutants and diet characteristics in relation to birth outcomes.

This question warrants further research, as it concerns both immediately modifiable factors and those which may take years to address through systematic action and technological advances [47]. It may also serve to illuminate another mechanism by which we can work to reduce the known disparities in PTB, as systemic racism makes certain populations less likely to have access to non-processed, high quality food items [48]. While the existing epidemiologic body of literature on this topic remains limited, there is a great deal of potential for connections with health care professionals and educators not only on the hazards of air pollution but also on how individual actions might change those hazards, in particular for pregnant individuals living in areas with higher ambient pollution concentrations that are out of their control. Better information on and understanding of interactions between diet and environmental exposures could improve practitioner understanding and give individuals more tools to reduce environmental impacts on their health and well-being. In addition, this work adds to the body of evidence informing policy decisions around air pollutants, including decisions affecting those who may be more susceptible or vulnerable to the effects of air pollution.

As with nearly all studies involving live births and gestational exposure, there is potential for live birth bias. That is, fetuses exposed to stressors during gestation may be miscarried and therefore would not have been able to contribute a preterm birth outcome. This bias can lead researchers to incorrectly conclude that certain stressors result in decreased risk of preterm birth because those who survive gestation may be more resilient [49]. Previous work concerning live birth bias suggests that this may not have substantially altered results [50]. There were some noted losses in the initial NEST cohort [51]; however, these individuals were removed from the cohort population by design, and available data do not allow us to effectively assess the potential for bias due to fetal loss and miscarriage.

This analysis assumes residential stasis over gestation. This is not an unreasonable assumption, as previous work has shown low residential mobility during gestation [52,53,54,55,56]. Additionally, exposure measures assume a relatively equal proportion of time indoors and outdoors for each participant, as well as relative geographic stasis when outside. This may lead to exposure mismeasurement. Diet data were not available for over 50% of the whole cohort, and comparisons of demographics between the whole cohort and those missing diet data show there may be limited selection bias. Additionally, the dietary assessment assumes a stasis in dietary status over gestation. These static representations of potentially dynamic variables may make it more difficult to isolate an association if one exists. The dietary variables also provide an extremely simple presentation of diet which, while still informative and important, is not able to reflect the nuances of the human diet and thus may obscure any influence those nuances exert. Caloric requirements vary greatly by individual and over the course of pregnancy, and these individual needs may not be ideally reflected in the available information. This analysis focuses on fat related variables because there is limited toxicological evidence supporting pollutant-diet influence on offspring health, saturated fat may contribute to inflammatory mechanisms, and specific diet items contained missingness. While this analysis was intended to be preliminary, other macronutrients are of interest beyond fat related alone. Instability in models including PM2.5 indicates a complex relationship which we were unable to interrogate in this analysis due to limited sample size with complete information. As such we were unable to fully assess models which included other lifestyle factors, such as smoking and physical activity, for potential influence of unmeasured confounding. 
While this analysis was limited in some aspects, it nevertheless has notable strengths which allow it to contribute meaningfully to the literature.

Strengths of this study include the temporal and spatial granularity in ambient pollution exposure estimates and the leveraging of detailed dietary information in conjunction with residential information, which allowed us to examine effect measure modification of ambient air pollutant-PTB associations by dietary characteristics.

Conclusions

This study should be used as a substantive contribution to the scientific literature, as well as a call to action. This understudied topic is extremely important, but the data necessary to interrogate this question are not available in larger study populations. More illuminating analyses are necessary and possible only through building larger cohorts on whom these data are collected. Additionally, future work on this topic should include consideration of exposure mixtures (of multiple exposures, and exposures at different time periods), as our participants were not exposed to only one pollutant but likely many or all pollutants simultaneously.