Background

Breast cancer is a clinically heterogeneous disease, and differences in tumor behavior can influence treatment recommendations and clinical outcomes, including recurrence and survival, among breast cancer patients [1, 2]. In addition to influencing breast cancer incidence, risk factors such as elevated body mass index (BMI), nulliparity, lack of breastfeeding, and older age at first full-term pregnancy may also influence clinical outcomes following breast cancer diagnosis [3,4,5,6,7,8]. However, it is unclear whether these factors affect combinatorial biomarkers of breast cancer prognosis, which help guide treatment selection and patient management.

In general, most previous studies examining the relationships between breast cancer risk factors and tumor behavior have focused on individual tumor characteristics rather than their constellation. Higher BMI, for instance, is reportedly associated with higher-grade or larger tumors; nulliparity and menopausal hormone therapy (MHT) use with highly proliferating and lobular carcinomas, respectively; and higher parity with p53-expressing tumors [9,10,11,12]. The impact of these factors on prognostic biomarkers that combine several tumor characteristics to infer tumor aggressiveness and aid treatment recommendation is, however, less well studied.

To aid prognostication in breast cancer, several tumor characteristics including histologic grade, tumor size, lymph nodal involvement, estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2) and KI67, a marker of proliferation, have been combined into clinically useful prognostic algorithms [13,14,15,16]. The Nottingham prognostic index (NPI) is popular for combining information on tumor size, histologic grade, and lymph nodal involvement into a single quantitative measure [13]. The IHC4 score, on the other hand, combines information on ER, PR, HER2, and KI67 into a prognostic algorithm [16].

Despite their prognostic relevance, it remains unclear whether breast cancer risk factors influence these combinatorial prognostic markers. Our main aim in this study was, therefore, to investigate the associations between breast cancer risk factors and breast tumor behavior defined by NPI and IHC4 score among a cohort of Chinese breast cancer patients. Key findings from the primary analysis were then re-evaluated in an independent cohort of Polish patients.

Methods and materials

Study population

The main study population comprised Chinese breast cancer patients from a hospital-based case series who had histologically confirmed invasive breast cancer and were diagnosed and treated at the Cancer Hospital, Chinese Academy of Medical Sciences (CHCAMS), Beijing, China. A total of 8616 patients, aged 29–97 years at diagnosis, were recruited from CHCAMS between 2011 and 2016. Of these, 437 patients did not have complete data on hormone receptor status, i.e. ER and/or PR, and hence were excluded from the analysis. This study received ethical approval from the CHCAMS Ethics Committee. Because this analysis did not involve interaction with human subjects or the use of personally identifying information, it was granted exemption from review by the Office of Human Research Protections at the National Institutes of Health, NIH (exempt number 11751).

The Polish Breast Cancer Study (PBCS) is a population-based study in Poland that enrolled women aged 20–74 years with histologically or cytologically confirmed breast cancer (n = 2386) at five participating hospitals in Warsaw and Lodz over a three-year period between 2000 and 2003 [17]. For the current analysis, we identified 972 patients from the PBCS with complete information to allow the generation of the IHC4 score and/or NPI. Ethical approvals for PBCS were obtained from local ethics committees, and all participants provided written informed consent as required by local institutional and National Cancer Institute/NIH review boards.

Data on tumor clinicopathological features and breast cancer risk factors

Data on tumor clinicopathological features, including morphology, histologic grade, tumor size, lymph nodal involvement, ER, PR, HER2, and KI67 were obtained from pathology records. Data on breast cancer risk factors, including age, BMI, family history of breast cancer (FHBC) in a first degree relative, as well as reproductive factors, including age at menarche, parity and number of children, and breastfeeding were extracted from patients’ medical records. These data were collected from patients and entered into the medical records as part of patients’ medical workup. Anthropometric measures, including height and weight, were obtained by trained members of staff during clinical workup.

Immunohistochemical staining

Details of IHC staining and scoring procedures have been previously described [18]. In brief, all IHC markers were stained using standard laboratory procedures. ER, PR, and HER2 were stained using Roche rabbit monoclonal antibodies, SP1, 1E2, and 4B5 clones, at 1:1000, 1:1000, and 1:66 dilutions, respectively. KI67 was stained using mouse monoclonal antibody MIB1 at the manufacturer-optimized concentration. Staining was performed using the Roche Ventana XT autostainer for all markers. Scoring was performed by pathologists with expertise in breast cancer. Based on international conventions [2], ER and PR positivity were defined as >1% positively staining cells, while for HER2, 3+ on IHC or amplification on fluorescence in situ hybridization (FISH) was considered positive. In keeping with results from a previous meta-analysis showing that a KI67 score of 25% provides the best survival discrimination [19], we dichotomized KI67 at this cutoff, with scores of >25% designating high KI67 expression. Binary categories of KI67 were used in combination with standard clinical categories of ER, PR, and HER2 to define breast cancer subtypes (luminal A-like, luminal B-like, HER2-enriched, and triple negative breast cancer [TNBC]) according to internationally recommended guidelines [20]. Continuous measures of ER, PR, and KI67 were used in combination with HER2 for the calculation of the IHC4 score in keeping with the validated algorithm [16].
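The positivity rules stated above can be written out as a short Python sketch. This is an illustration only, not the scoring code used in the study: the function and field names are hypothetical, HR positivity is assumed to mean ER+ and/or PR+, and the mapping from these binary markers to the subtypes in [20] is deliberately left out because the text does not spell it out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MarkerStatus:
    er_positive: bool
    pr_positive: bool
    her2_positive: bool
    ki67_high: bool
    hr_positive: bool  # assumption: HR+ means ER+ and/or PR+


def dichotomize_markers(
    er_pct: float,
    pr_pct: float,
    her2_ihc: int,
    ki67_pct: float,
    her2_fish_amplified: Optional[bool] = None,
) -> MarkerStatus:
    """Apply the cutoffs stated in the text to raw marker values.

    er_pct, pr_pct, ki67_pct: percentage of positively staining cells (0-100).
    her2_ihc: IHC intensity score (0, 1, 2, or 3).
    her2_fish_amplified: FISH result if available, otherwise None.
    """
    er_pos = er_pct > 1.0          # ER positive: >1% staining cells
    pr_pos = pr_pct > 1.0          # PR positive: >1% staining cells
    # HER2 positive: 3+ on IHC, or amplification on FISH
    her2_pos = her2_ihc == 3 or her2_fish_amplified is True
    ki67_high = ki67_pct > 25.0    # high KI67: >25%
    return MarkerStatus(er_pos, pr_pos, her2_pos, ki67_high, er_pos or pr_pos)
```

For example, a tumor with 80% ER staining, no PR staining, HER2 2+ with FISH amplification, and 30% KI67 would be classified as ER+, PR−, HER2+, and high KI67 under these rules.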

Computation of the NPI and IHC4 score

We calculated the NPI based on the published equation combining tumor size, grade, and nodal involvement [13]:

  • NPI = 0.2 × tumor size (cm) + grade + nodal status (pN0 = 1, pN1-3 = 2, pN ≥ 4 = 3)
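As a minimal sketch, the NPI equation above can be computed as follows. The function names are hypothetical, and the nodal categories are read here as the number of positive lymph nodes (0, 1–3, or ≥4), which is an interpretation of the notation in the equation.

```python
def nodal_score(positive_nodes: int) -> int:
    """Map the number of positive lymph nodes to the NPI nodal score (1-3)."""
    if positive_nodes < 0:
        raise ValueError("positive_nodes must be non-negative")
    if positive_nodes == 0:
        return 1
    if positive_nodes <= 3:
        return 2
    return 3


def nottingham_prognostic_index(
    tumor_size_cm: float, grade: int, positive_nodes: int
) -> float:
    """NPI = 0.2 x tumor size (cm) + histologic grade + nodal score."""
    if tumor_size_cm < 0:
        raise ValueError("tumor_size_cm must be non-negative")
    if grade not in (1, 2, 3):
        raise ValueError("grade must be 1, 2, or 3")
    return 0.2 * tumor_size_cm + grade + nodal_score(positive_nodes)
```

A 2.5 cm, grade 2 tumor with two positive nodes gives 0.2 × 2.5 + 2 + 2 = 4.5.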

We calculated the IHC4 score based on the published equation combining ER, PR, HER2 and KI67 [16]:

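  • IHC4 score = 94.7 × {(−0.100 × ER10) + (−0.079 × PR10) + (0.586 × HER2) + [0.240 × ln(1 + 10 × KI67)]}

Here, ER10 and PR10 denote the ER and PR scores rescaled to a 0–10 range, and HER2 is coded as 1 if positive and 0 if negative, as defined in the published algorithm [16].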
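The IHC4 equation can likewise be sketched in Python. The function name and argument names are hypothetical. The inputs are assumed to be already on the scales required by the published algorithm [16]: ER and PR on a 0–10 scale, HER2 as a binary indicator, and KI67 on whatever scale [16] specifies. The sketch does not perform that rescaling, so the KI67 unit in particular should be checked against [16] before use.

```python
import math


def ihc4_score(er10: float, pr10: float, her2_positive: bool, ki67: float) -> float:
    """IHC4 = 94.7 x (-0.100*ER10 - 0.079*PR10 + 0.586*HER2 + 0.240*ln(1 + 10*KI67)).

    er10, pr10: ER and PR scores on a 0-10 scale (assumed pre-scaled).
    her2_positive: True if HER2 positive, otherwise False.
    ki67: KI67 value on the scale required by the published algorithm (assumed).
    """
    if not 0 <= er10 <= 10 or not 0 <= pr10 <= 10:
        raise ValueError("er10 and pr10 must lie between 0 and 10")
    if ki67 < 0:
        raise ValueError("ki67 must be non-negative")
    her2 = 1.0 if her2_positive else 0.0
    linear = (
        -0.100 * er10
        - 0.079 * pr10
        + 0.586 * her2
        + 0.240 * math.log(1.0 + 10.0 * ki67)
    )
    return 94.7 * linear
```

The signs make the direction of the score explicit: higher ER and PR lower the score, whereas HER2 positivity and higher KI67 raise it.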

Statistical analysis

The following breast cancer risk factors were examined: age: <35 (reference), 35–45, 45–55, >55 years; parity: nulliparous (reference), 1, 2, ≥3 children; age at menarche: ≤12 (reference), 13, 14, ≥15 years; BMI, in kg/m²: underweight (<18.5), normal (18.5–25; reference), overweight (25–30), and obese (>30); and FHBC in a first-degree relative: yes (if present) or no (if absent; reference). NPI and IHC4 score were categorized into quartiles as follows: Q1 (<25th percentile); Q2 (25th–50th percentile); Q3 (50th–75th percentile); and Q4 (>75th percentile). Associations between breast cancer risk factors and tumor-related prognostic indicators were assessed in polytomous logistic regression models with quartiles of NPI and IHC4 score as outcomes (Q1 = baseline category) and breast cancer risk factors as predictors. Analyses were performed overall and following stratification by age (≤50 and >50 years), as a surrogate for menopausal status, and by tumor hormone receptor (HR) expression status (i.e. HR+ and HR−). In sensitivity analysis, we used PREDICT [14], another combinatorial prognostic marker that contains tumor size, grade, and lymph nodal involvement, in place of the NPI and assessed relationships with breast cancer risk factors. To assess which, if any, of the individual clinicopathologic (tumor size, histologic grade, nodal involvement) and IHC (ER, PR, HER2, and KI67) factors was driving associations between risk factors and these combinatorial prognostic factors, we modeled relevant breast cancer risk factors as outcome variables and mutually adjusted for the individual clinicopathological and IHC features as predictors. The contribution of individual factors to model prediction was determined by assessing the change in likelihood ratio chi-square (LRχ2) when the factor was removed from the fully adjusted model.

For those factors that were significantly related to NPI and/or IHC4 in the Chinese population, we repeated the analysis in an independent sample of 972 breast cancer cases from PBCS [17]. Missing values on risk factor covariates were handled by listwise deletion. In additional sensitivity analyses, we created indicators for missing values on the covariates that were included in multivariable models. All statistical tests were two-sided, and analyses were performed using Stata statistical software version 16.1.
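The categorization steps described in this section can be illustrated with a short Python sketch. It is not the Stata code used for the analysis. The function names are hypothetical, the handling of values falling exactly on a category boundary (for example, a BMI of exactly 25) is an assumption, and the percentile method is one of several reasonable choices. The polytomous regression itself is not shown.

```python
import statistics
from typing import List, Sequence


def bmi_category(weight_kg: float, height_m: float) -> str:
    """Assign the BMI categories used in the analysis (boundary handling assumed)."""
    bmi = weight_kg / height_m ** 2
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "normal"       # reference category
    if bmi <= 30:
        return "overweight"
    return "obese"            # >30, as stated in the text


def quartile_labels(values: Sequence[float]) -> List[str]:
    """Label each value Q1-Q4 using the 25th, 50th, and 75th percentiles."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    p25, p50, p75 = statistics.quantiles(values, n=4, method="inclusive")

    def label(v: float) -> str:
        if v < p25:
            return "Q1"       # baseline outcome category
        if v < p50:
            return "Q2"
        if v < p75:
            return "Q3"
        return "Q4"

    return [label(v) for v in values]
```

The resulting quartile labels would serve as the outcome of the polytomous model, with Q1 as the baseline category and the categorized risk factors as predictors.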

Results

Distribution of baseline clinicopathological factors and breast cancer risk factors among study participants

Table 1 describes the distribution of clinicopathologic as well as lifestyle and reproductive factors in this dataset of 8,179 breast cancer patients, overall and stratified by HR status. The mean age at diagnosis was 51.8 years and did not differ significantly by HR status. The majority of these patients had intermediate- or high-grade tumors (91%). Similarly, node-negative (53%), ER+ (77%), PR+ (76%), HER2− (80%), and low-KI67 (55%) tumors predominated. Notably, the frequencies of all clinicopathologic characteristics differed significantly by HR status, with HR− tumors having higher frequencies of aggressive tumor characteristics than HR+ tumors. In terms of breast cancer risk factors, the majority of the patients had at least one child (97%), had breastfed (87%), had normal BMI (54%), and had no FHBC in a first-degree relative (92%), none of which varied by HR status (Table 1). In general, patients from the PBCS were older, experienced menarche earlier, and were more frequently nulliparous or obese than patients from CHCAMS (Supplementary Table 1).

Table 1 Distributions of clinicopathological and breast cancer risk factors, overall and by tumor hormone receptor expression status, among Chinese breast cancer patients

Associations between breast cancer risk factors and levels of Nottingham prognostic index (NPI)

Table 2 shows the associations between breast cancer risk factors and the NPI. We found increasing age to be statistically significantly inversely associated with higher NPI (ORtrend (95% CI) = 0.75 (0.66–0.84), Ptrend < 0.001). Conversely, higher parity was significantly associated with higher NPI (ORtrend (95% CI) = 1.20 (1.05–1.37), Ptrend = 0.007). Furthermore, overweight (OR (95% CI) = 1.60 (1.29–1.98)) and obese (OR (95% CI) = 2.12 (1.43–3.14)) women were substantially more likely than normal weight women to have high (Q4 vs Q1) NPI values. There was a significant trend of increasing NPI with increasing BMI (ORtrend (95% CI) = 1.53 (1.30–1.79), Ptrend < 0.001). A positive FHBC in a first-degree relative was inversely associated with higher NPI values (OR (95% CI) Q4 vs Q1 = 0.66 (0.45–0.95); P = 0.027). Other factors, including breastfeeding and age at menarche, were not associated with the NPI. In sensitivity analysis using PREDICT, the associations of age, parity, BMI, and family history with PREDICT were strikingly similar to those with the NPI (Supplementary Table 2). The associations of parity, BMI, and FHBC with NPI were similar in women with HR+ and HR− breast cancer (Table 3), in younger and older women (Supplementary Table 3), and in analysis accounting for missing values on covariates (Supplementary Table 4).

Table 2 Odds ratios (ORs) and 95% confidence intervals (CIs) for the associations between breast cancer risk factors and levels of the Nottingham prognostic index (NPI) among Chinese breast cancer patients with complete data (n = 3668)
Table 3 Odds ratios (ORs) and 95% confidence intervals (CIs) for associations between parity, body mass index (BMI), and family history and levels of the Nottingham prognostic index (NPI) in stratified analyses by hormone receptor-expression status among Chinese breast cancer patients

Associations between breast cancer risk factors and levels of immunohistochemical 4 (IHC4) score

Of the risk factors assessed, only BMI demonstrated significant associations with IHC4 score. In contrast to the positive associations that we found between elevated BMI and high NPI, elevated BMI was inversely associated with the IHC4 score. In comparison to normal weight women, overweight (OR (95% CI) = 0.82 (0.66–1.02)) and obese (OR (95% CI) = 0.52 (0.36–0.76)) women were less likely to have tumors with high IHC4 score (Table 4). In stratified analysis by age, we found the inverse association between increasing BMI and the IHC4 to be statistically significantly stronger among older than younger women (P value for heterogeneity [P-het] = 0.006) but estimates were in the same direction (Supplementary Table 5). We did not find evidence for associations between age at menarche, parity, and breastfeeding with the IHC4. Overall, the results were similar in analysis accounting for missing values on covariates (Supplementary Table 6).

Table 4 Odds ratios (ORs) and 95% confidence intervals (CIs) for the associations between breast cancer risk factors and levels of the immunohistochemical 4 (IHC4) score among Chinese breast cancer patients with complete data (n = 3475)

Age at onset, parity, BMI and FHBC in relation to NPI and IHC4 score among Polish breast cancer patients

To check whether the associations we found between age at onset, parity, BMI, and family history and clinical prognostic markers were seen in other populations, we used data from a subset of 972 Polish breast cancer patients. Similar to the findings among Chinese patients, increasing age and a positive FHBC in a first-degree relative were associated with lower NPI values, even though the estimates did not attain statistical significance. In addition, overweight (OR (95% CI) = 1.34 (0.82–2.19)) and obese (OR (95% CI) = 2.27 (1.32–3.89)) women were more likely to have high NPI when compared to normal weight women (Table 5), which is similar to what we found among the Chinese women. None of the factors evaluated was statistically significantly associated with the IHC4 score among Polish women overall. However, consistent with the findings among Chinese women, the inverse association between elevated BMI and levels of IHC4 score was stronger among older than younger women (P-het = 0.04; Supplementary Table 7).

Table 5 Odds ratios (ORs) and 95% confidence intervals (CIs) for the associations between age, parity, body mass index (BMI), family history, and levels of the Nottingham prognostic index (NPI) and immunohistochemical 4 (IHC4) score among Polish breast cancer patients (n = 972)

Associations between parity, BMI, and family history with individual clinicopathologic characteristics

To determine the contributions of individual clinicopathologic factors to predictive models for BMI, parity, and family history in relation to clinical prognostic markers, we modeled each of these risk factors as an outcome variable and clinicopathological and IHC factors as explanatory variables. In stepwise analyses, we removed each clinicopathologic factor from the full model and assessed the change in likelihood ratio chi-square (ΔLRχ2), comparing the full model with the nested model using LR tests. In general, age was most relevant for the estimation of models for BMI, parity, and FHBC. This was followed by tumor size, ER, HER2, and grade for obesity; node status, grade, tumor size, and ER for parity; and node status, HER2, and grade for FHBC (Table 6).
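The ΔLRχ2 comparison can be sketched as follows, assuming the log-likelihoods of the full and nested models are already available from the fitted regressions. Because both LRχ2 values are computed against the same null model, their difference equals twice the difference in log-likelihoods. The function names are hypothetical, and the chi-square tail probability is computed with a closed-form expression for integer degrees of freedom rather than a statistics library.

```python
import math
from typing import Tuple


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series.
    total, term = 0.0, math.sqrt(x)
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


def delta_lr_chi2(
    loglik_full: float, loglik_nested: float, df_removed: int
) -> Tuple[float, float]:
    """Return (delta LR chi-square, p value) for dropping one factor.

    df_removed is the number of parameters the removed factor contributed.
    """
    stat = 2.0 * (loglik_full - loglik_nested)
    return stat, chi2_sf(stat, df_removed)
```

For instance, if removing a three-level factor (two parameters) lowers the log-likelihood from −1000 to −1003, the change in LRχ2 is 6.0 on 2 degrees of freedom, giving a p value of exp(−3), or about 0.05.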

Table 6 Change in likelihood ratio Chi-square (ΔLRχ2) and corresponding p values testing the contributions of individual clinicopathological characteristics to predictive models for body mass index (BMI), parity, and family history among Chinese breast cancer patients

Discussion

In this large-scale analysis of over 8,000 Chinese breast cancer patients, we assessed relationships between breast cancer risk factors and clinically important combinatorial prognostic biomarkers in breast cancer. Specifically, we used routinely available clinicopathological (tumor size, histologic grade, and nodal involvement) and IHC (ER, PR, HER2, and KI67) parameters to compute the NPI and IHC4 score and investigated relationships with breast cancer risk factors. We found younger age, higher parity, being overweight or obese, and not having a FHBC to be associated with more clinically advanced breast cancer, represented by high NPI. Unlike what we observed in relation to the NPI, however, being overweight or obese was associated with lower IHC4 score, indicating a less aggressive breast cancer phenotype. Nevertheless, accounting for HR expression, women with elevated BMI were more likely to have clinically advanced HR+ and HR− breast cancers. Taken together, these results suggest that breast cancer risk factors may influence the biology and clinical presentation of breast tumors, with implications for improved surveillance and the development of prognostic tools incorporating risk factors.

In addition to tumor-related factors, age at breast cancer diagnosis has long been recognized to be associated with clinical outcomes [21,22,23,24,25]. Our age-related findings are in keeping with those from previous studies demonstrating associations between younger age at onset and more aggressive tumor characteristics, even among HR+ tumors [21], and suggest that previously reported associations may partly be mediated by tumor characteristics, particularly those contained in the NPI, i.e. size, grade, and lymph nodal involvement. Nevertheless, further studies incorporating mediation analysis will be required to conclusively determine whether previously reported associations between age at onset and recurrence or survival following breast cancer are mediated by the influence of age on specific tumor characteristics.

Our finding of an association between increasing parity and clinically advanced breast cancer is in line with previously reported associations between parity and aggressive breast cancer phenotypes [26,27,28]. Given that IHC markers and other tumor characteristics are highly correlated, it is unclear whether previously reported associations between parity and HR− tumors are driven entirely by HR expression or by other clinicopathological characteristics. By accounting for many of these markers in the current analysis, we demonstrated that tumor characteristics contained in the NPI may partly explain previously observed associations between parity and HR− breast cancer. The mechanism through which parity predisposes to aggressive phenotypes of breast cancer is largely unknown, but it is thought to be related, at least in part, to changes in the breast microenvironment that can result from aberrant post-partum lobular involution [29, 30]. Previous reports supported an attenuating effect of prolonged breastfeeding on the association between parity and TNBC, an aggressive subtype of breast cancer [27, 31, 32]. In contrast to these reports, however, we did not find evidence to support associations between breastfeeding and any of the prognostic parameters.

Results from epidemiological studies indicate that elevated BMI more strongly predisposes to HR− than HR+ breast cancers among premenopausal women, with the converse predominating among postmenopausal women [33, 34]. Given the superior survival indices for HR+ over HR− breast cancer, some researchers have suggested that elevated BMI more strongly predisposes to less aggressive tumor subtypes, particularly in postmenopausal women [35]. Perhaps paradoxically, results from several other studies, including a large-scale pooling analysis, have supported higher frequencies of aggressive tumor characteristics, particularly histologic grade, among overweight and/or obese than normal weight women [9, 10, 36]. Our findings that elevated BMI was associated with low IHC4 on the one hand, and high NPI on the other, might explain previous reports, since low IHC4 corresponds to high HR expression whereas high NPI can connote larger tumor size, higher histologic grade, and/or positive lymph nodal status, all markers of poor prognosis. Notably, the associations between BMI and NPI were independent of HR expression, suggesting that overweight and/or obese women were more likely to develop aggressive tumor subtypes regardless of HR expression status. Our observation that several tumor characteristics were statistically significantly predictive of overweight and obese BMI following LR tests indicates that the somewhat paradoxical relationship between BMI and tumor aggressiveness might be due to varying roles of estrogen metabolism, adiponectin, insulin-like growth factors, chronic inflammation, and/or delayed detection in obesity-related breast carcinogenesis [29, 37,38,39].

The literature on the relationship between FHBC and clinical outcomes in breast cancer is not consistent [40,41,42,43,44]. In the current study, we observed a positive FHBC to be associated with lower NPI values, suggesting better prognosis. In analysis assessing the contributions of the individual tumor characteristics to a predictive model for FHBC, we found lymph nodal status to provide more predictive information than other tumor characteristics. This might suggest that women with a positive FHBC were less likely to have clinically advanced disease than those without a FHBC. A possible explanation for this finding is that women with a positive FHBC may be more likely than those without to seek and/or undergo screening or other forms of surveillance, which can lead to the detection of early-stage tumors. This may be especially true in countries where large-scale, organized breast cancer screening is not available and where selective screening is offered to high-risk women (partly defined by having a positive FHBC). Accordingly, further studies with detailed screening histories will be required to conclusively determine relationships between FHBC, prognostic biomarkers, and clinical outcomes in breast cancer.

We assessed the external generalizability of findings from the Chinese population by investigating the key results in an independent population of Polish breast cancer patients. With the exception of parity, the associations between risk factors and NPI were in the same direction in the Polish and Chinese populations. In particular, the associations between BMI and NPI were significant in both populations. Unlike Chinese patients, for whom increasing parity was statistically significantly associated with higher NPI, parity was not associated with higher NPI among the Polish patients. Given the documented associations between pregnancy-associated breast cancer, which could occur up to 10 years after pregnancy, and aggressive phenotypes of breast cancer [45], it is possible that differences in results between the study populations reflect differences in time since last childbirth. We were unable to specifically evaluate this question due to lack of information on the time between last birth and breast cancer diagnosis in both study populations. Nonetheless, Polish participants were older and more likely to be postmenopausal than the Chinese participants, which might suggest a longer time since last birth for the Polish patients. Future studies are warranted to investigate the impact of time since last birth on clinically relevant prognostic biomarkers in breast cancer.

An important consideration for the development of targeted prevention strategies in breast cancer is the identification of women at high risk of developing the most aggressive subtypes of breast cancer, for whom interventions to prevent invasive disease can be instituted. Based on our findings, higher parity as well as being overweight or obese were associated with poor prognostic indices, suggesting a potential benefit of improved surveillance among such women. Relatedly, these findings may point to the importance of including breast cancer risk factors in breast cancer prognostic models. To date, only PREDICT has included breast cancer risk factors, i.e. age at diagnosis and mode of detection, in its calculations and has been shown to outperform the IHC4 score, IHC subtyping, and C-score in terms of prognostic power [46]. It is probable that future models incorporating BMI may provide further survival discrimination in breast cancer patients beyond what is contained in existing models.

The assembly of several clinicopathological and epidemiological factors for >8000 patients is an important strength of this study. In terms of limitations, however, this population is largely unscreened; as such, some of the reported associations may be subject to detection bias. For example, smaller tumors may be less likely to be palpated in obese than normal weight women. Accordingly, overweight and/or obese women may be more likely to have larger tumors on the basis of detection alone. Nevertheless, other tumor characteristics were independently associated with higher BMI in this study, suggesting some biological underpinning for our findings. Moreover, detection bias is itself a clinically important problem that needs to be addressed. Also, we did not have clinical outcome data such as recurrence or survival for these patients, which precluded mediation analyses and direct examination of associations between breast cancer risk factors and clinical outcomes. Nonetheless, NPI and IHC4 score strongly predict clinical outcomes in breast cancer. Together, they reflect the spectrum of clinicopathological characteristics that are used in routine clinical practice to inform breast cancer management.

Conclusions

In conclusion, our results support associations between breast cancer risk factors and two clinically useful prognostic factors in breast cancer, i.e. NPI and IHC4. Specifically, we found younger age at onset, higher parity, being overweight/obese, and absent FHBC to be associated with poor prognostic indices, indicated by higher NPI score, irrespective of HR expression. Although elevated BMI was associated with low IHC4 score, connoting clinically less aggressive HR+ tumors, NPI-related findings were supportive of poor prognosis among overweight/obese patients irrespective of tumor HR status. These findings highlight important relationships between breast cancer risk factors and clinically relevant prognostic factors, with potential implications for the development of surveillance, prognostication, and counseling strategies that take host factors into account.