Introduction

The global incidence of female breast cancer is rising [1, 2], and in 2020, among women, breast cancer surpassed lung cancer in the number of incident cases reported worldwide [1, 3]. In Canada, breast cancer is the most commonly diagnosed cancer among women, and the second most commonly diagnosed cancer across both sexes [4, 5]. Despite the lengthy list of established and potential breast cancer risk factors [6,7,8,9,10,11,12,13,14], primarily behavioral, reproductive, or genetic in nature, these factors cannot fully explain many breast cancer cases, and some women who develop breast cancer will possess few of these known risk factors [15, 16]. Furthermore, the nature of the association between breast cancer risk factors and female breast cancer varies substantially by menopausal status and tumor subtype [10]. As a result, the etiology of breast cancer warrants further understanding, particularly with regard to long-term environmental exposures, which have garnered increasing attention due to suggestive animal and epidemiologic evidence [17, 18].

Ambient polycyclic aromatic hydrocarbons (PAHs), formed during the incomplete combustion of organic materials, represent a class of important ubiquitous pollutants recognized as animal and human carcinogens, mutagens, and teratogens [19,20,21]. In Canada, major anthropogenic sources of ambient PAH emissions include residential firewood combustion, vehicular/transportation-related emissions, and industrial plants, whereas virtually the entirety of natural Canadian PAH emissions are produced from forest fires [22]. While anthropogenic sources of ambient PAHs dominate human exposure in urban areas, the increasing frequency and severity of Canadian forest fires, in part due to the effects of climate change, may increasingly contribute to urban exposures, especially given the established long-range atmospheric transport capabilities of ambient PAHs [23, 24].

In terms of biological mechanisms of action, inhaled PAHs are able to enter the bloodstream through interstitial spaces between alveoli and subsequently associate with adipose-dense tissues due to their highly lipophilic properties [25]. Once in these tissues, reactive metabolites have the potential to generate reactive oxygen species in the cellular environment and DNA damage through the formation of DNA adducts [26, 27]. Additionally, PAHs are established endocrine-disrupting chemicals and have, more recently, been implicated as xenoestrogens (i.e., substances that mimic estrogens) [28, 29]. Rodent and mechanistic studies have supported the assumption that PAHs can contribute to the formation of mammary tumors [30].

Epidemiological research examining air pollutant exposures and cancer incidence has historically focused on commonly-monitored criteria air pollutants (e.g., fine particulate matter, PM2.5; nitrogen dioxide, NO2) and respiratory cancers [31,32,33]. More recently, characterizing the relationships between common non-respiratory cancer sites (breast, prostate, colorectal, etc.) and air pollutants, including less-frequently studied constituents, has garnered increasing interest as support for their associations appears to grow [34]. A recently published meta-analysis examining the relationship between NO2, a ubiquitous pollutant closely tied to PAHs and a marker for traffic-related air pollution, and breast cancer risk yielded a significant association (pooled relative risk (RR) per 10 µg/m3 = 1.015; 95% CI: 1.003, 1.028) [35].

In 2022, a systematic review and meta-analysis of the existing non-ecological research examining the relationship between [ambient and non-ambient] PAH exposures and breast cancer risk was published [36]. A summary relative risk estimate from five studies that specifically examined outdoor ambient PAH exposures, all assessing vehicular and traffic-related exposures, was not statistically significant [36].

Given the limited existing evidence on the relationship between breast cancer risk and outdoor ambient PAH exposure, and the suggestive evidence for criteria pollutant exposures, further research that incorporates improved PAH exposure characterization methods is needed. Likewise, additional research is required to determine how the relationship between air pollutants and breast cancer risk differs according to menopausal status, given inconsistent findings [37,38,39]. The current study evaluates associations between long-term residential exposure to ambient PAHs and breast cancer risk in the Canadian setting. We also examine how the relationship between ambient PAH exposures and breast cancer risk differs according to menopausal status. This study adds to the current literature by relating modeled long-term PAH exposure to both pre- and postmenopausal breast cancer risk in Canada.

Methods

Case–control study design

The current study draws its study population from the Canadian National Enhanced Cancer Surveillance System (NECSS), a collaborative effort between Health Canada and the Canadian Provincial Cancer Registries, with data collection starting in 1994 [40]. The NECSS, conducted in eight of the 10 Canadian provinces (all except Quebec and New Brunswick), contains rich and comprehensive individual-level risk factor data for a large population-based Canadian case–control study of 18 different cancer sites and includes ~ 5,000 population controls [40, 41]. Importantly, the NECSS collected individual-level data regarding lifetime residential histories. All participants within the NECSS provided informed consent prior to being included. Due to additional covariate information collected by the province of Ontario, along with the existence of a finer-resolution component of the PAH exposure surface available, the current study reports main analyses based on both the national (i.e., all eight participating provinces) and Ontario-only samples.

Incident breast cancer cases were identified starting in 1994 through the respective provincial cancer registries by randomly sampling one in four eligible participants with newly diagnosed histologically-confirmed invasive primary breast cancer, as defined by the International Classification of Diseases [42]. Provincial registries identified patients within one to four months from their diagnosis through the National Cancer Incidence Reporting System. The provincial registries ensured physician consent was given before approaching breast cancer cases. Sampling was performed for each year until a population-based quota was met, resulting in a total study period spanning 1994–1997. All cases were women aged 25–74 at the time of cancer diagnosis. Premenopausal women with breast cancer were over-sampled to ensure adequate power when exploring relationships with risk factors across menopausal strata [40]. Initially, 3,310 female breast cancer cases were ascertained across the participating provinces. Due to a host of factors, namely physician refusals and case deaths, a total of 3,023 questionnaires were mailed to cases, and of these, 2,340 were successfully completed and returned, yielding a case participation rate of ~ 77.4%.

NECSS population controls were identified in 1996 by each participating provincial cancer registry via frequency-matching on age and sex to the overall distribution of cases across all 18 cancer sites (i.e., types of primary cancer) included within NECSS [40]. The specific random sampling methods used to obtain population controls differed across participating provinces according to the accessibility and availability of data; details can be found elsewhere [40, 41, 43, 44]. Information from the controls was ascertained using the same protocol as for the NECSS cancer cases.

Questionnaires were successfully mailed out to 3,550 potential controls and 2,531 completed questionnaires were returned (71.3%). Both case and control participation rates were similar in the Ontario sample.
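The participation rates quoted above follow directly from the mailed and returned questionnaire counts. The short check below reproduces that arithmetic; the function name is illustrative and not part of the NECSS protocol.

```python
def participation_rate(returned: int, mailed: int) -> float:
    """Percentage of mailed questionnaires that were completed and returned."""
    return round(100 * returned / mailed, 1)


# Counts taken from the text: cases (2,340 of 3,023) and controls (2,531 of 3,550).
case_rate = participation_rate(2340, 3023)
control_rate = participation_rate(2531, 3550)
print(case_rate, control_rate)
```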

Data collection

Information regarding participant risk factors and residential histories was collected via mailed questionnaires. These questionnaires covered a broad set of demographic, lifestyle, and environmental factors, including family income, education, marital status, employment history, residential history, reproductive-related factors, body mass index (BMI), smoking history, alcohol consumption, dietary history, physical activity, and vitamin and mineral supplements (among others) [40, 41]. The dietary history component of the questionnaires was based on previously validated instruments [43, 45, 46]. The questionnaire also included specific questions regarding established and potential risk factors for breast cancer.

The NECSS questionnaire delivered in Ontario included additional information for a number of breast cancer risk factors (e.g., oral contraceptive use, hormone replacement therapy, family history of cancer, benign breast disease) not collected in the other provinces. Additionally, a few factors (namely physical activity) were assessed and measured via different methods when comparing the Ontario NECSS to the other versions of the NECSS, thereby requiring harmonization of these measures for national analyses. For additional information regarding the NECSS design and data collection procedures, refer to Johnson et al. [40].

The sole potential adjustment factor that we sourced from outside the NECSS was a quintile index of neighborhood deprivation (for the year 1996) [47, 48]. This measure was linked to each participant's longest residence (at any time) and was sought out for two reasons: (1) individual-level measures of socio-economic status (SES), such as income and education, may not fully capture all aspects of SES (resulting in residual confounding), and (2) levels of air pollution are typically greater in areas of lower SES [49], and women residing in areas of high SES may be more at risk for developing breast cancer [50], so that neighborhood deprivation qualifies as a potential confounder.

Residential histories included complete addresses and corresponding six-character Canadian postal codes. The postal code centroid was used to represent home for all respondents. All valid six-character postal codes were geocoded to the geographic center of postal codes as of 1996 [51]. The home postal code subsequently represents the spatial basis for residential PAH exposure assignment. Of note, six-character postal codes in densely-populated urban areas often represent quite small domains, whereas in rural areas the domain covered by postal codes may be much larger. Based on postal code classifications (i.e., second digit 0 means rural), rural areas within the eight sampled provinces had a median size of 0.018 km2 (mean = 7.9 km2), while urban areas had a median size of 0.008 km2 (mean = 0.14 km2). On average, PAH concentrations are more homogeneous and orders of magnitude lower in rural versus urban areas. As a result, the measurement error introduced by the larger size of rural postal codes is unlikely to introduce substantial misclassification of PAH exposure.
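The rural/urban classification rule mentioned above (second character of the postal code equal to 0) can be sketched as follows. This is a minimal illustration rather than the study's geocoding pipeline; the function name and the example postal codes are hypothetical.

```python
def is_rural_postal_code(postal_code: str) -> bool:
    """Return True when the second character of a six-character postal code is '0'."""
    code = postal_code.replace(" ", "").upper()
    if len(code) != 6:
        raise ValueError(f"expected a six-character postal code, got {postal_code!r}")
    return code[1] == "0"


# Hypothetical examples, with and without the customary space.
print(is_rural_postal_code("K0A 1A0"))  # second character is 0
print(is_rural_postal_code("M5V2T6"))   # second character is 5
```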

Assessment of exposure to ambient concentrations of PAHs

The exposure period was a 20-year window ending 2 years prior to diagnosis (cases) or recruitment (controls) [52]. The GEM-MACH-PAH (Global Environmental Multiscale model – Modeling Air quality and Chemistry – Polycyclic Aromatic Hydrocarbons) model generated PAH surfaces using emissions data from the year 2000 (earliest year available) paired with meteorologic data from the year 1994. Meteorologic conditions in 1994 were found to best represent the average meteorology of the preceding decade (1990–2000). Each participant's residential history was linked to the PAH surfaces on a year-by-year basis using the centroid of the six-character postal code. Inclusion required that participants provided at least 16 years (i.e., 80% of the exposure window) of residential history, and exposure was averaged over the available years of residence within this window. These inclusion criteria reduced the sample size to 1,233 (514 cases; 719 controls) for Ontario analyses and 3,773 (1,818 cases; 1,955 controls) for national analyses.
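The window definition and the 80% completeness rule can be sketched in a few lines. This is an illustration under stated assumptions, not the study's code: the window is assumed to include its end year, the per-year concentrations are assumed to be already linked to postal codes, and all names are hypothetical.

```python
from statistics import mean

WINDOW_YEARS = 20   # length of the exposure window
LAG_YEARS = 2       # window ends 2 years before diagnosis or recruitment
MIN_YEARS = 16      # 80% of the window must have residential history


def exposure_window(index_year: int) -> range:
    """Calendar years in the window, assuming the end year is included."""
    end = index_year - LAG_YEARS
    start = end - WINDOW_YEARS + 1
    return range(start, end + 1)


def average_exposure(index_year: int, yearly_exposure: dict):
    """Mean concentration over available window years, or None if fewer than 16."""
    values = [yearly_exposure[y] for y in exposure_window(index_year)
              if y in yearly_exposure]
    if len(values) < MIN_YEARS:
        return None  # participant would not meet the inclusion criterion
    return mean(values)
```

Under these assumptions, a 1994 index year gives a window of 1973 to 1992 and a 1997 index year gives 1976 to 1995.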

GEM-MACH-PAH model

The PAH exposure surface (Fig. 1) was generated by the GEM-MACH-PAH chemical transport model [53, 54]. The previously validated GEM-MACH-PAH model simulates airborne PAH concentrations from estimated emissions that are transported and transformed by modeled meteorology and atmospheric processes (e.g., oxidation, deposition, etc.). GEM-MACH-PAH was initially run for a 10 km × 10 km horizontal grid square domain over continental Canada/United States that subsequently drove the boundary conditions for a smaller 2.5 km × 2.5 km domain (“Pan Am” domain) centered on the eastern Laurentian Great Lakes. This finer resolution domain formed the basis for Ontario-specific main analyses. Gridded pollutant emissions were generated for the year 2000 (the earliest year possible), and these were paired with meteorology from the year 1994 which was found to best represent average temperatures and precipitation for the two preceding decades. Fluoranthene was used as a representative PAH due to its high modeled accuracy compared to measurements, its prevalence in urban PAH air pollution, and its presence in both the gas and particle phases of ambient air [53]. Output from validated simulations demonstrates a high degree of spatial correlation among individual PAH compounds (Online Resource 1; Table S2) thereby allowing a single compound to represent PAHs as a class [53, 54].

Fig. 1 Ambient fluoranthene concentration estimates and spatial distribution (Year 2000) generated from GEM-MACH-PAH chemical transport model, Canadian domain (bottom; 10 km × 10 km model resolution) and nested “Pan Am” domain (top; 2.5 km × 2.5 km model resolution)

Statistical analyses

All analyses, except for the generation of spline curves, were conducted using SAS, version 9.4 (SAS Institute, Inc., North Carolina). Restricted cubic spline curves were created using R Statistical Software (v4.3.0; R Core Team 2023).

We used unconditional logistic regression to estimate the odds ratios (ORs) and 95% confidence intervals (CIs) for breast cancer incidence associated with mean fluoranthene exposure levels across the study exposure period. We conducted main analyses using three principal stratifications: all women grouped together, premenopausal women only, and postmenopausal women only. For the current study, premenopausal status was assigned to participants if, at the time of completing the questionnaire, they were (1) still menstruating, (2) less than 50 years old with an unreported menstruation status, or (3) not currently menstruating but reported their last menstruation within the preceding year. Otherwise, postmenopausal status was assigned.
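The three-part rule for assigning menopausal status can be written as a small decision function. The sketch below is illustrative only: the argument names and the encoding of an unreported menstruation status as None are assumptions, as is treating "within the preceding year" as 12 months or fewer.

```python
def menopausal_status(still_menstruating, age, months_since_last_period=None):
    """Classify a participant as 'premenopausal' or 'postmenopausal'.

    still_menstruating: True, False, or None when not reported (assumed encoding).
    """
    if still_menstruating is True:
        return "premenopausal"                      # criterion (1)
    if still_menstruating is None and age < 50:
        return "premenopausal"                      # criterion (2)
    if (still_menstruating is False
            and months_since_last_period is not None
            and months_since_last_period <= 12):
        return "premenopausal"                      # criterion (3)
    return "postmenopausal"
```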

Exposure–response patterns for continuous covariates were explored using Box-Tidwell tests and qualitative examination of age-adjusted effect estimates based on equidistantly-spaced categorical representations [55]. Where relationships with the logit were non-linear, or where linearity was called into question, categorical representations were considered for main analyses. Categorical cut-points were determined using the following prioritized criteria: (1) if previously established and common cut-points existed, these were used (e.g., BMI); (2) if the same covariate [in categorical form] was presented in prior studies utilizing the breast cancer component of the NECSS, those cut-points were used; (3) otherwise, quartile cut-points were used. For the main exposure of interest (fluoranthene), age-adjusted restricted cubic splines confirmed non-linearity of the logit (for all three stratifications), and thus a categorical representation was warranted. We then determined cut-points for fluoranthene by creating equidistant levels of exposure based on the log-transformed variable (due to a highly right-skewed distribution), while ensuring that each level of exposure comprised at least 10% of the control distribution in both the Ontario and national analyses, including within the pre- and postmenopausal strata. Accordingly, all crude and fully-adjusted ORs for breast cancer incidence were reported with the lowest exposure level as the referent category.
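The cut-point strategy for fluoranthene (levels equally spaced on the log scale, each holding at least 10% of controls) can be sketched as below. The number of levels is not stated in this section, so `n_levels` is a free parameter here, and the function names are hypothetical.

```python
import math


def log_equidistant_cutpoints(values, n_levels):
    """Cut-points equally spaced between log(min) and log(max); values must be > 0."""
    lo, hi = math.log(min(values)), math.log(max(values))
    step = (hi - lo) / n_levels
    return [math.exp(lo + i * step) for i in range(1, n_levels)]


def assign_level(value, cutpoints):
    """Exposure level index; 0 is the lowest (referent) level."""
    return sum(value >= c for c in cutpoints)


def control_shares(control_values, cutpoints):
    """Proportion of controls falling in each exposure level."""
    counts = [0] * (len(cutpoints) + 1)
    for v in control_values:
        counts[assign_level(v, cutpoints)] += 1
    return [c / len(control_values) for c in counts]


def meets_minimum_share(control_values, cutpoints, minimum=0.10):
    """True when every level holds at least `minimum` of the controls."""
    return all(s >= minimum for s in control_shares(control_values, cutpoints))
```

In practice the check would be repeated for each sample and menopausal stratum, and the number of levels reduced until it passes everywhere.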

Covariates considered for model adjustment were those with established or plausible relationships with breast cancer risk and meeting criteria for appropriateness as a confounder [14]. Since the relationships between risk factors for premenopausal and postmenopausal breast cancer were assumed a priori to be different [10], main analyses employed backwards elimination, with a p-value criterion of 0.2, to determine individual covariate sets for all three stratifications (all women, premenopausal only, postmenopausal only). This was done separately for both Ontario and national samples given the considerable difference in sample size and sample populations.

Due to the presence of missing data for multiple covariates (Table 1 and Online Resource 1; Table S1), and assuming the missingness mechanism was Missing at Random (MAR), we conducted single stochastic regression imputation [56, 57]. All analyses presented hereafter use the imputed covariates.
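Single stochastic regression imputation replaces each missing value with a regression prediction plus a random draw from the residual distribution. The sketch below shows the idea for one continuous covariate and one predictor. It is a simplified illustration, since the predictors and software procedure actually used are not described here, and all names are hypothetical.

```python
import random
from statistics import mean


def stochastic_regression_impute(x, y, seed=0):
    """Fill None entries of y using a simple linear regression on x plus noise."""
    rng = random.Random(seed)
    complete = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    xbar = mean(xi for xi, _ in complete)
    ybar = mean(yi for _, yi in complete)
    sxx = sum((xi - xbar) ** 2 for xi, _ in complete)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in complete) / sxx
    intercept = ybar - slope * xbar
    # Residual standard deviation estimated from complete cases (n - 2 df).
    residuals = [yi - (intercept + slope * xi) for xi, yi in complete]
    resid_sd = (sum(r ** 2 for r in residuals) / (len(complete) - 2)) ** 0.5
    # Observed values are kept; missing values get prediction + random residual.
    return [yi if yi is not None
            else intercept + slope * xi + rng.gauss(0.0, resid_sd)
            for xi, yi in zip(x, y)]
```

Adding the random residual preserves the variability of the imputed covariate, which a purely deterministic regression prediction would understate.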

Table 1 Distribution of breast cancer risk factors by case–control and menopausal status, Ontario sample, (n cases = 514, n controls = 719)

Nitrogen dioxide

In addition to main analyses, we performed a number of exploratory analyses involving NO2. Namely, we were interested in (1) the degree to which concentrations of NO2 are spatially correlated (for the year 2000) with GEM-MACH-PAH generated estimates of fluoranthene, and (2) the extent to which the individual associations of NO2 and fluoranthene with breast cancer risk differ from those observed when the two pollutants are modeled together (i.e., controlling for each other).

To facilitate these analyses, we sourced national pollutant estimates for the year 2000 from the Canadian Urban Environmental Health Research Consortium (CANUE: www.canue.ca) data repositories [58]. Briefly, estimates for NO2 were generated from a national land-use regression (LUR) model using National Air Pollution Surveillance (NAPS) monitoring data [59,60,61]. Estimates for NO2 (µg/m3) were linked to corresponding annual postal code files by CANUE. For the correlation analyses, we paired national ambient fluoranthene estimates (10 km model resolution) with NO2 estimates at the same postal codes, drawing on the ~ 8,100 individual postal codes used to derive average exposures across NECSS participants' residential histories. Spearman correlation was estimated at the level of individual postal codes, across Canada, for which we were able to obtain estimates for both fluoranthene and NO2 (7,300 postal codes).
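Spearman correlation is the Pearson correlation of the ranks of the two pollutant estimates across postal codes, with tied values sharing their average rank. The self-contained sketch below shows that computation. It is not the study's SAS code, and the function names are illustrative.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    abar, bbar = sum(a) / n, sum(b) / n
    cov = sum((x - abar) * (y - bbar) for x, y in zip(a, b))
    var_a = sum((x - abar) ** 2 for x in a)
    var_b = sum((y - bbar) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman rank correlation between paired estimates (e.g., per postal code)."""
    return pearson(average_ranks(a), average_ranks(b))
```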

To explore associations between NO2 and breast cancer risk (including combined NO2 and fluoranthene models), we created quintile-based exposure categories based on the control distribution (i.e., 20% of controls in each level of exposure) for the Ontario sample and ran logistic models with adjustment for the same covariates controlled for within the main Ontario analysis. This analysis was restricted to participants with valid average exposure measures for both NO2 and fluoranthene, reducing the sample to 494 cases and 681 controls.
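Quintile categories defined on the control distribution can be sketched as follows. The quantile interpolation rule used in the original analysis is not stated, so the default method of Python's `statistics.quantiles` is an assumption, and the function names are hypothetical.

```python
from bisect import bisect_right
from statistics import quantiles


def control_quintile_cutpoints(control_values):
    """Four cut-points splitting the control distribution into fifths."""
    return quantiles(control_values, n=5)


def exposure_category(value, cutpoints):
    """Category 1 (lowest) to 5 (highest) for a case or control exposure value."""
    return bisect_right(cutpoints, value) + 1
```

Cases are then assigned to categories using the control-derived cut-points, so the categories need not hold equal numbers of cases.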

Results

The original breast cancer component of the NECSS dataset contained 2,340 cases and 2,531 controls. For Ontario (2.5 km × 2.5 km exposure grid), 514 cases and 719 controls met inclusion criteria based on completion of an Ontario version of the NECSS questionnaire, and completeness of residential history and menopausal status. For the national sample (10 km × 10 km exposure grid), a total of 1,818 cases and 1,955 controls met inclusion criteria based on completeness of residential history and menopausal status.

Description of cases and controls

Table 1 displays the covariate (risk factor) distributions and characteristics for all cases and controls within the Ontario sample. Table S1 (Online Resource 1) presents the same information with respect to the national sample.

In the Ontario sample, the mean ages of pre- and postmenopausal cases were 45.3 and 62.3 years, respectively, compared with 45.3 and 63.5 years for controls (not shown in Table 1). In premenopausal women, cases tended to have somewhat higher household income, more smoking pack-years, slightly less physical activity, higher alcohol consumption, and higher proportions of benign breast disease, oral contraceptive use, and immediate relatives diagnosed with cancer, when compared to controls. In postmenopausal women, cases had higher BMI, more years of education, slightly less physical activity, fewer children, older age at first full-term pregnancy (or never pregnant), more years menstruated, younger age at menarche, and higher proportions of benign breast disease and immediate relatives diagnosed with cancer, when compared to controls. Similar distributions and relationships were observed in the national sample (Online Resource 1; Table S1).

Exposure surfaces

Figure 1 illustrates the spatial surfaces of GEM-MACH-PAH derived fluoranthene estimates applied to all participant residential histories. This figure includes exposure surfaces for both the 2.5 km × 2.5 km resolution “Pan Am” model (above), applied for Ontario-specific analyses, as well as the 10 km × 10 km resolution model (below) applied for national analyses. The spatial bounds (borders) of the nested “Pan Am” model are not displayed in this figure, but can be found elsewhere [49].

Table 2 summarizes study participant average fluoranthene exposures for both the Ontario (2.5 km model resolution) and national (10 km model resolution) samples. In general, average fluoranthene exposures were somewhat higher among participants in the Ontario-only sample, and the range of exposures was also greater in this sample. The greater range of exposure is largely attributable to the finer resolution (2.5 km × 2.5 km) exposure surface, which more accurately depicts areas of high or low concentration (i.e., “hot-spots” or “cold spots”), whereas the coarser resolution (10 km × 10 km) model may ‘smooth’ some of these areas in with surrounding areas of comparatively lower (or higher) concentration. This smoothing effect is analogous to a low-pass filter.
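The smoothing effect of a coarser grid can be illustrated by block-averaging a fine grid: each 10 km cell covers a 4 × 4 block of 2.5 km cells, so an isolated hot-spot is diluted across the block. This is only a toy illustration with made-up values. The 10 km GEM-MACH-PAH surface is a separate model run, not an average of the 2.5 km surface.

```python
def coarsen(grid, factor=4):
    """Average non-overlapping factor x factor blocks of a 2-D grid (list of rows)."""
    rows, cols = len(grid), len(grid[0])
    return [
        [
            sum(grid[r + i][c + j] for i in range(factor) for j in range(factor))
            / factor ** 2
            for c in range(0, cols, factor)
        ]
        for r in range(0, rows, factor)
    ]


# Hypothetical fine grid (4 x 8 cells) with a single hot-spot of 16 units.
fine = [[0.0] * 8 for _ in range(4)]
fine[1][2] = 16.0
coarse = coarsen(fine)
print(max(map(max, fine)), max(map(max, coarse)))
```

The maximum falls from 16 on the fine grid to 1 on the coarse grid, which mirrors the narrower exposure range seen in the national sample.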

Table 2 Distribution of average residential ambient fluoranthene exposure (µg/m3) across Ontario (n = 1,233) and national (n = 3,773) samples and by case–control status

Ontario analysis: Breast cancer risk and fluoranthene exposure

Table 3 presents the crude and adjusted odds ratios and corresponding 95% confidence intervals for the relationship between categories of fluoranthene exposure and breast cancer risk in the Ontario sample. We observed positive associations between long-term fluoranthene exposure and breast cancer incidence for premenopausal, but not postmenopausal, women. In premenopausal women, adjusted ORs of 2.48 (95% CI: 1.29, 4.77) and 1.97 (95% CI: 0.99, 3.90) were found when comparing the second-highest and highest levels of exposure, respectively, to the lowest level of exposure. Among premenopausal women, fully adjusted models yielded higher risk estimates than crude (age-adjusted) models.
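Odds ratios from logistic regression are exponentiated coefficients, and Wald 95% confidence limits are symmetric on the log scale. The sketch below recovers the implied log-OR and standard error from a reported estimate and rebuilds the interval. It assumes Wald-type intervals, which this section does not state explicitly, and the function names are illustrative.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_or_and_se(or_estimate, ci_lower, ci_upper, z=Z_95):
    """Log odds ratio and the standard error implied by a Wald-type interval."""
    log_or = math.log(or_estimate)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)
    return log_or, se


def wald_ci(log_or, se, z=Z_95):
    """Confidence limits on the odds-ratio scale."""
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Reported premenopausal estimate: OR 2.48 (95% CI: 1.29, 4.77).
log_or, se = log_or_and_se(2.48, 1.29, 4.77)
lower, upper = wald_ci(log_or, se)
print(round(lower, 2), round(upper, 2))
```

The rebuilt limits agree with the reported interval to two decimal places, which is consistent with an interval that is symmetric on the log scale.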

Table 3 Odds ratios for the incidence of breast cancer associated with ambient fluoranthene exposure, by menopausal status, Ontario sample (n cases = 514, n controls = 719) at 2.5 km model resolution

Increasing the completeness of residential history [over the exposure window] required for study inclusion (18 years; 90% of exposure window) reduced the Ontario sample size by less than 100 participants and yielded only small changes in resulting adjusted-ORs for breast cancer risk.

Figure 2 displays the adjusted exposure–response splines for the Ontario sample, including relevant stratifications for menopausal status. Splines were created using restricted cubic functions with 4 degrees of freedom.

Fig. 2 Association between the incidence of pre- and postmenopausal breast cancer and concentrations of fluoranthene using restricted cubic splines with 4 knots, Ontario sample at 2.5 km model resolution. The maximum likelihood estimate is shown as the solid line, and the broken lines represent the upper and lower pointwise 95% confidence limits. Individual spline functions were adjusted for the same covariate sets as the main regression analyses: (All Women) Age-group, years of menstruation, age at first full-term pregnancy, physical activity, body mass index, smoking pack-years, alcohol consumption, history of benign breast disease, immediate relative diagnosed with cancer, hormone replacement therapy; (Postmenopausal) Age-group, years of menstruation, age at first full-term pregnancy, physical activity, body mass index, smoking pack-years, alcohol consumption, history of benign breast disease, immediate relative diagnosed with cancer, meat consumption, years of education, total household income, oral contraceptive use, hormone replacement therapy; (Premenopausal) Age-group, age at menarche, body mass index, history of benign breast disease, oral contraceptive use, immediate relative diagnosed with cancer

National analysis: Breast cancer risk and fluoranthene exposure

Table 4 presents the crude and adjusted odds ratios and corresponding 95% confidence intervals for the relationship between categories of fluoranthene exposure and breast cancer risk in the national sample. We observed small, but suggestive positive associations between long-term fluoranthene exposure and breast cancer incidence for both pre- and post-menopausal women.

Table 4 Odds ratios for the incidence of breast cancer associated with ambient fluoranthene exposure, by menopausal status, national sample (n cases = 1,818, n controls = 1,955) at 10 km model resolution

In premenopausal women, an adjusted OR of 1.59 (95% CI: 1.11, 2.29) was found when comparing the second-highest level of exposure to the lowest level of exposure. In postmenopausal women, a corresponding (second-highest level vs. lowest level of exposure) adjusted OR of 1.33 (95% CI: 1.02, 1.73) was found. In general, adjustment for covariates resulted in small changes in risk estimates compared to crude models.

For both national and Ontario samples, we observed a decrease in breast cancer risk associated with the highest level of exposure when compared to the second- and third-highest levels of exposure. The exception was postmenopausal women within the Ontario sample, for whom the exposure–response trend was closer to linear. These patterns were confirmed by restricted cubic spline curves for the Ontario sample (Fig. 2).

Nitrogen dioxide

Estimates of NO2 exposure were available for a subset of participants based on linkage to CANUE-derived estimates of NO2 (year 2000) for the Ontario sample (n cases = 494, n controls = 681). The spatial correlation (Spearman) between fluoranthene and NO2 at individual postal codes was found to be rs = 0.717. Exploratory analysis examined the risk for fluoranthene and NO2 in a model containing both exposures (Online Resource 1; Table S3). Effect estimates for premenopausal women were strongest for fluoranthene, and estimates for postmenopausal women were strongest for NO2.

Discussion

Overall, our findings suggest an increased risk for incident breast cancer among premenopausal women exposed to higher concentrations of ambient PAHs. In comparison, findings among postmenopausal women were more mixed. Linear dose–response patterns were largely absent across both menopausal strata where positive associations were present.

The current study marks the first time that a national surface for PAH exposure has been applied to a population-based cancer study in Canada, and represents the first use of this specific exposure surface in an epidemiological context.

Despite the larger study population and domain associated with the national sample, Ontario analyses have two major advantages: (1) application of a much finer resolution exposure surface (i.e., less exposure misclassification, greater variability), and (2) consideration of four additional important breast cancer risk factors for adjustment (i.e., oral contraceptive use, hormone replacement therapy, benign breast disease status, immediate family history of cancer). Effect estimates for premenopausal breast cancer were stronger in the Ontario sample, where these methodological advantages applied.

NO2 is a routinely measured air pollutant and has been the subject of meta-analyses in relation to breast cancer risk [35]. However, NO2 is generally considered a marker for traffic-related exposures, including PAHs, rather than a causal agent. This study examined fluoranthene as a more proximal marker of potential carcinogenic agents, and thus it was of interest to contrast the effects observed with those for NO2 exposure (Online Resource 1; Table S3).

Our findings, particularly concerning the contrast in association across menopausal strata, align with recently published case–control and cohort studies examining breast cancer risk with respect to residential air pollutant exposures, including Hystad et al. [44], Villeneuve et al. [38], Mordukhovich et al. [63], Goldberg et al. [64], and Nie et al. [65]. Yet, several studies have yielded positive findings for postmenopausal breast cancer [39], or have even found significant effects for postmenopausal, but not premenopausal, breast cancer [37, 66]. Additionally, a few recent studies do not report any statistically significant association (for either pre- or postmenopausal breast cancer) [67,68,69], though these represent the minority of the published literature, especially with regard to existing work with PAHs [36, 70]. These differences are consistent with pre- and postmenopausal breast cancers being somewhat different diseases with varying risk factors. The pre-existing study with the most methodologically similar design, that of Amadou et al. [70], yielded an OR of 1.15 (95% CI: 1.04, 1.27) for an interquartile range (IQR) increase in benzo[a]pyrene exposure. Interestingly, they found that significant associations remained only for women who underwent menopausal transition (i.e., premenopausal women at recruitment who became postmenopausal at cancer diagnosis), and also noted that linear dose–response patterns were largely absent.

There are a number of methodological limitations that must be taken into account when considering our study findings. First, the retrospective (i.e., case–control) nature and provincial-based design of the current study present some inherent potential for bias (e.g., selection bias in the recruitment of controls). That said, as noted by Hystad et al., who also utilized the breast cancer component of the NECSS in their study of NO2 [44], although case and control response rates were somewhat low, risk estimates for established risk factors obtained from the NECSS data are generally similar to those published in the existing literature, suggesting that the potential for selection bias in the form of participation bias is relatively low.

Second, we recognize that our exposure assignment only accounts for residential exposures, which make up only part of total PAH exposure. That said, the average Canadian spends a substantial proportion of their day within and in close proximity to their place of residence [71], and research with NO2 has shown a moderate degree of correlation between residential and total personal exposure [72]. Third, despite the comprehensive set of breast cancer risk factors considered for model adjustment, there remains the potential for residual or unmeasured confounding. For example, our study lacked information on genetic history or predisposition for breast cancer (e.g., BRCA1/BRCA2), though this would likely not be related to air pollution exposure, and therefore may not be a true confounder.

Fourth, we generated PAH exposure surfaces using the best available information to represent the years 1973–1995 (i.e., the possible range of exposure years) and to derive estimates of average long-term exposure. The surfaces were based on gridded emissions estimates from the earliest available year (2000), combined with meteorological data from 1994, which were found to best represent average conditions over most of the exposure period [73]. In comparison to more commonly studied constituents of air pollution (e.g., PM2.5, NO2), there is a general lack of historical fixed-site monitoring data and exposure-assignment methods with which ambient PAH exposures can be assigned. This makes it challenging to generate exposure estimates within and/or prior to the NECSS study period. While we recognize that absolute concentrations of ambient PAHs have decreased over time across the Canadian domain [74], ambient observation records rarely pre-date 2000 [75]. Despite this, the limited available information is consistent with a spatial contrast in ambient concentration that has largely been maintained, as has been shown for NO2 (albeit in Europe and for a shorter time period) [76]. In addition to the aforementioned limitation, there is likely further non-differential exposure misclassification associated with the use of geocoded addresses (based on postal codes), as well as with the use of an ecologically derived measure of exposure as a proxy for personal exposure. As a result, our results likely underestimate the true risk [72].
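The expected direction of bias from non-differential misclassification can be illustrated with a simplified sketch. It assumes a binary exposure (unlike the exposure metric used in this study), classification errors that are identical in cases and controls, and sensitivity plus specificity greater than 1. All numeric values and the function name `observed_odds_ratio` are hypothetical and are not drawn from the NECSS data.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)


def observed_odds_ratio(p_cases, p_controls, sensitivity, specificity):
    """Expected OR after applying the same classification error to both groups.

    p_cases / p_controls are the true exposure prevalences among cases and
    controls. Using the same sensitivity and specificity in both groups is
    what makes the misclassification non-differential.
    """
    def observed_prevalence(p):
        # True positives plus false positives among the truly unexposed.
        return sensitivity * p + (1 - specificity) * (1 - p)

    return odds(observed_prevalence(p_cases)) / odds(observed_prevalence(p_controls))


# Hypothetical prevalences: 30% of cases and 25% of controls truly exposed.
true_or = observed_odds_ratio(0.30, 0.25, sensitivity=1.0, specificity=1.0)
biased_or = observed_odds_ratio(0.30, 0.25, sensitivity=0.8, specificity=0.9)
```

Under these assumptions the misclassified odds ratio lies between 1 and the true odds ratio, i.e., it is attenuated toward the null. For exposures with more than two categories, attenuation is not guaranteed in every category, so the sketch should be read as intuition and not as a quantification of bias in this study.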

Our findings should not be interpreted as definitive evidence that PAHs are causal agents with respect to breast cancer risk. PAHs may be a proxy for the complex mixture of ambient by-products derived from various combustion sources (including NO2) [39, 77], though the proposed biological mechanisms (i.e., endocrine-disrupting activity, DNA adduct formation), previous epidemiological studies of various cancer sites, and the presence of both particle- and gas-phase states make PAHs a particularly plausible and proximal agent for the carcinogenesis-related effects imparted through air pollutant exposures. Additionally, it is possible that there exist certain critical periods of exposure throughout the lifetime (e.g., early age) during which PAH (and other ambient pollutant) exposures may be especially relevant to cancer development, including that of the breast, which we were unable to account for in the present study [39].

In spite of these limitations, this study adds to the limited existing literature on ambient PAH exposures and breast cancer risk and has a number of substantial strengths, including: (1) the first epidemiological application of a newly developed PAH exposure surface, broadening evidence beyond criteria air pollutants; (2) detailed lifetime residential histories, reducing the potential for exposure misclassification when compared to truncated histories; (3) a high number and quality of available covariates (with specific relation to breast cancer); and (4) oversampling of premenopausal breast cancer cases, allowing for additional power when examining associations by menopausal status.

We found an association between exposure to ambient PAHs (represented by fluoranthene) and the incidence of premenopausal breast cancer among Canadian women. Associations among postmenopausal women were inconclusive given inconsistent findings across national and Ontario-specific analyses. Future research should continue to examine the nature of the relationship between ambient PAH exposures and breast cancer, along with other non-respiratory cancer sites, given expanding evidence for an association. Subsequent work in this area may also benefit from larger, prospective studies of breast cancer (premenopausal in particular), improved and modern exposure assessment methods for unsubstituted (‘parent’ compounds consisting of only carbon and hydrogen) and substituted (‘parent’ compounds with additional functional groups) PAHs, multi-pollutant analyses, and investigation into potential critical periods of exposure.