Background

Breast cancer (BC) is the most commonly diagnosed cancer and the fifth leading cause of cancer death among women in China [1]. The 5-year survival of early stage breast cancer (EBC) patients in China is about 58–78%, which is lower than that in the United States and varies across geographic areas of China [2]. Established prognostic factors for EBC survival include tumor size, lymph node involvement, tumor grade, and hormone receptor (HR) status. However, inherited host characteristics, such as single nucleotide polymorphisms (SNPs), have also been shown to play an important role [3].

Recently, genome-wide association studies (GWAS) have been widely applied to search for associations between genetic variations and disease. Notably, some susceptibility genes or polymorphisms identified by GWAS have been shown not only to be associated with predisposition to malignant tumors but also to influence their clinical outcome [4,5,6]. Only one study and one meta-analysis have examined the relationship between GWAS-identified BC risk polymorphisms and BC outcome, both of which focused on Caucasian populations [6, 7]. However, rs6504950 and rs3803662 had different effects on the survival of BC patients in those two studies. The differences might be due to different sample sizes and differences in the BC cases enrolled. Still, those studies demonstrated possible associations between BC risk loci and BC survival.

Similarly, several BC-risk GWAS focusing on East Asian women have identified BC risk variants, most of which differ from those identified in other ethnic populations [8, 9]. However, the relation between these polymorphisms and the survival of Asian EBC patients has never been established. In the present study, we analyzed the association between 21 GWAS-identified SNPs and the survival of patients with EBC in Southeastern China.

Methods

Study populations

This is a hospital-based study including 1177 early breast cancer cases treated at Fujian Medical University Union Hospital between July 2000 and October 2014. All participants had histopathologically confirmed invasive breast cancer and were subsequently treated with curative surgical resection and systemic therapy. Clinicopathological and demographic data were collected from hospital records, and survival data were obtained from the follow-up database, which was updated annually. The patients were staged according to the 7th edition of the American Joint Committee on Cancer (AJCC) tumor-node-metastasis (TNM) staging system [10]. Estrogen receptor (ER)/progesterone receptor (PR) positivity was determined by immunohistochemical (IHC) analysis of the number of positively stained nuclei (≥ 10%), and hormone receptor (HR) positivity was defined as ER+ and/or PR+. Tumors were considered human epidermal growth factor receptor 2 (HER2) positive when cells exhibited strong membrane staining (3+). Tumors scored 2+ required further in situ hybridization testing for HER2 gene amplification, while scores of 0 or 1+ were regarded as negative. The subtypes were categorized as follows [11]: luminal A (ER+, PR+ > 20%, HER2−, and Ki67 < 14%, or grade I when Ki67 was unavailable); luminal B (HR+, HER2−, and Ki67 > 14%, or grade II/III when Ki67 was unavailable; or HR+, HER2+); HER2 enriched (HR−, HER2+); and triple negative (HR− and HER2−). The study was approved by the Institutional Ethics Committee, and all participants consented to genetic testing and to the use of their data at the time of their participation.
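The receptor and subtype rules described above can be expressed as a small decision function. The following is an illustrative sketch only, not the code used in this study. The function and parameter names are hypothetical, and two choices are assumptions that the text does not specify: HR+/HER2− tumors that do not meet every luminal A criterion are assigned to luminal B, and a missing Ki67 together with a missing grade raises an error.

```python
from typing import Optional


def hr_positive(er_pct: float, pr_pct: float, cutoff: float = 10.0) -> bool:
    """HR+ when ER and/or PR nuclear staining reaches the cutoff (>= 10%)."""
    return er_pct >= cutoff or pr_pct >= cutoff


def her2_positive(ihc_score: int, ish_amplified: Optional[bool] = None) -> bool:
    """IHC 3+ is positive, 2+ is decided by in situ hybridization, 0/1+ is negative."""
    if ihc_score == 3:
        return True
    if ihc_score == 2:
        if ish_amplified is None:
            raise ValueError("IHC 2+ needs an in situ hybridization result")
        return ish_amplified
    return False


def subtype(er_pct: float, pr_pct: float, her2_pos: bool,
            ki67: Optional[float] = None, grade: Optional[int] = None) -> str:
    """Assign one of the four subtypes used in the text."""
    if not hr_positive(er_pct, pr_pct):
        return "HER2 enriched" if her2_pos else "triple negative"
    if her2_pos:
        return "luminal B"  # HR+, HER2+
    if ki67 is None and grade is None:
        raise ValueError("need Ki67 or grade for HR+/HER2- tumors")
    # Grade substitutes for Ki67 only when Ki67 is unavailable.
    low_proliferation = ki67 < 14 if ki67 is not None else grade == 1
    if er_pct >= 10 and pr_pct > 20 and low_proliferation:
        return "luminal A"
    # ASSUMPTION: every remaining HR+/HER2- case is treated as luminal B.
    return "luminal B"
```

For example, `subtype(90, 50, False, ki67=10)` returns "luminal A", while the same tumor with `ki67=30` is classified as "luminal B".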

SNPs selection

We selected polymorphisms associated with breast cancer susceptibility from the US National Human Genome Research Institute (NHGRI) Catalog of Published Genome-Wide Association Studies. The inclusion criteria were as follows: (i) a genome-wide association significance level of P ≤ 1 × 10⁻⁹; (ii) a minor allele frequency (MAF) of at least 10% in the HapMap CHB data of the public SNP database (http://www.ncbi.nlm.nih.gov/SNP); and (iii) pairwise linkage disequilibrium (LD) between eligible SNPs, calculated with Haploview 4.1 software, of r² < 0.8. In total, 21 polymorphisms were included in this study; they are listed in Additional file 1: Table S1.
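The three inclusion criteria amount to a filter followed by LD pruning. The sketch below is illustrative only. The SNP identifiers and values are made-up placeholders, and the greedy rule of keeping the more significant SNP of a high-LD pair is an assumption, since the text does not state how such pairs were resolved.

```python
def select_snps(candidates, r2, p_max=1e-9, maf_min=0.10, r2_max=0.8):
    """Apply criteria (i)-(iii) to a list of candidate SNPs.

    candidates: list of dicts with keys 'id', 'p' (GWAS P value), 'maf'.
    r2: dict mapping frozenset({id1, id2}) to pairwise r^2; pairs that are
        absent are treated as unlinked (r^2 = 0).
    """
    # (i) genome-wide significance and (ii) minor allele frequency
    eligible = [s for s in candidates if s["p"] <= p_max and s["maf"] >= maf_min]
    # (iii) ASSUMPTION: greedy pruning, most significant SNP first
    eligible.sort(key=lambda s: s["p"])
    kept = []
    for snp in eligible:
        if all(r2.get(frozenset((snp["id"], k["id"])), 0.0) < r2_max for k in kept):
            kept.append(snp)
    return [s["id"] for s in kept]


# Hypothetical input, for illustration only.
example = [
    {"id": "snpA", "p": 1e-12, "maf": 0.30},
    {"id": "snpB", "p": 1e-10, "maf": 0.25},  # in high LD with snpA
    {"id": "snpC", "p": 1e-7, "maf": 0.40},   # fails the P threshold
    {"id": "snpD", "p": 1e-11, "maf": 0.05},  # fails the MAF threshold
    {"id": "snpE", "p": 1e-9, "maf": 0.10},   # exactly at both thresholds
]
example_r2 = {frozenset(("snpA", "snpB")): 0.9}
selected = select_snps(example, example_r2)
```

With this placeholder input, `selected` contains snpA and snpE: snpB is pruned because of its LD with snpA, and snpC and snpD fail criteria (i) and (ii), respectively.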

DNA extraction and SNPs genotyping

Blood samples were collected in EDTA anticoagulant tubes and stored at − 80 °C until DNA extraction. Genomic DNA was extracted using the Whole-Blood DNA Extraction Kit (Bioteke, Beijing, China), according to the manufacturer’s protocol. Genotyping was performed with SNPscan, a high-throughput SNP genotyping technology (Genesky Biotechnologies Inc., Shanghai, China). The raw data were then analyzed with GeneMapper 4.0 Software (Applied Biosystems, Foster City, CA). Five percent of samples were randomly selected as blinded duplicates for quality assessment, and 100% concordance was obtained.

Statistical analyses

Overall survival (OS) and breast cancer specific survival (BCSS) were our primary endpoints, defined as the time from the date of cancer diagnosis to the date of death from any cause and from breast cancer, respectively. Disease free survival (DFS) and distant disease free survival (DDFS) were our secondary endpoints, calculated as the time from the date of diagnosis to the date of any recurrence and of distant recurrence, respectively, or to the last patient contact [12]. Survival data up to the end of follow-up (31 December 2016) were analyzed using the Kaplan–Meier method with the log-rank test and multivariate stepwise Cox regression analysis. Analyses were adjusted for age at diagnosis, tumor size, lymph node involvement, histological grade, ER status, and HER-2/neu expression. The hazard ratios (HRs) and 95% confidence intervals (CIs) for each factor in multivariate analyses were calculated from the Cox regression model. The chi-square-based Q test was used to examine heterogeneity between subgroups. Possible gene-environment interactions were also evaluated with Cox proportional hazards regression models. All tests were 2-sided, and P values < 0.05 were considered statistically significant. SAS 9.4 (SAS Institute Inc., Cary, NC) was used for all statistical analyses.
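For readers unfamiliar with the Kaplan–Meier method named above, the product-limit estimate is S(t) = Π (1 − d_i / n_i) over event times t_i ≤ t, where d_i is the number of events at t_i and n_i the number of patients still at risk. The sketch below is a minimal pure-Python illustration with made-up follow-up times. It is not the SAS procedure used for the analyses in this study, and the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times: follow-up time for each patient (e.g., months from diagnosis).
    events: 1 if the endpoint was observed, 0 if the patient was censored
            (e.g., event-free at the last contact).
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = censored = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            censored += 1 - data[i][1]
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t leave the risk set.
        n_at_risk -= deaths + censored
    return curve


# Hypothetical data: five patients, two of them censored.
curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
```

Here the estimated survival drops to about 0.8, 0.6 and 0.3 at months 5, 8 and 12. Censored patients contribute to the risk set until their last contact but never cause a drop in the curve.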

Results

Patient characteristics and clinical features

Patients’ clinical characteristics and survival are summarized in Table 1. All 1177 patients in the early breast cancer cohort were female, and their mean age at breast cancer diagnosis was 47.0 ± 10.3 years. During a median follow-up time of 174 months, 446 cases experienced recurrence (142 locoregional and 410 distant) and 343 died (333 of BC and 10 of other diseases).

Table 1 Patients’ clinicopathological characteristics and clinical outcome

No significant difference in DDFS, BCSS, or OS was observed between age-at-diagnosis subgroups (P = 0.087, 0.420, and 0.402, respectively). However, patients with a tumor size > 2 cm, positive lymph nodes, grade III tumors, clinical stage II + III disease, or HER2-positive tumors had significantly shorter survival times, whereas ER or HR positivity was associated with significantly better survival (log-rank P < 0.05, Table 1). Furthermore, the intrinsic molecular subtypes (luminal A, luminal B, HER2-enriched, and triple negative) were also associated with significantly different survival (log-rank P < 0.05, Table 1).

Effects of each polymorphism on survival of EBC

Among the 21 SNPs, 6 (rs13281615, rs4415084, rs4784227, rs889312, rs10474352 and rs10816625) had a log-rank P < 0.05 in at least one genetic model and for at least one outcome indicator (Table 2). However, after adjustment for age at breast cancer diagnosis, tumor size, lymph node involvement, grade, hormone receptor status, and HER2 status, only rs889312 and rs2046210 remained significantly associated with improved survival of EBC patients. In a recessive model, rs889312 was significantly associated with better iDFS and DDFS (iDFS: adjusted HR (aHR): 0.761, 95% CI 0.583–0.994; DDFS: aHR: 0.631, 95% CI 0.470–0.848; Table 3). Similarly, compared with the GG + AA genotypes, the GA genotype of rs2046210 was also associated with improved survival of EBC patients (iDFS: aHR: 0.812, 95% CI 0.673–0.980; DDFS: aHR: 0.771, 95% CI 0.635–0.938; BCSS: aHR: 0.790, 95% CI 0.636–0.981; and OS: aHR: 0.786, 95% CI 0.635–0.934; Table 3).
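The genetic models referred to above differ only in how a genotype is coded as a binary covariate before it enters the survival model. The sketch below illustrates the conventional coding and is not taken from the study's analysis code. The function name is hypothetical, and the choice of "A" as the minor allele in the example is an assumption made purely for illustration.

```python
def code_genotype(genotype: str, minor: str, model: str) -> int:
    """Code a biallelic genotype such as 'GA' as a 0/1 covariate.

    dominant:     1 for carriers of at least one minor allele
    recessive:    1 for minor-allele homozygotes only
    overdominant: 1 for heterozygotes (e.g., GA versus GG + AA)
    """
    n_minor = genotype.count(minor)
    if model == "dominant":
        return int(n_minor >= 1)
    if model == "recessive":
        return int(n_minor == 2)
    if model == "overdominant":
        return int(n_minor == 1)
    raise ValueError(f"unknown genetic model: {model}")


# ASSUMPTION: 'A' is treated as the minor allele here for illustration only.
coded = {g: code_genotype(g, "A", "overdominant") for g in ("GG", "GA", "AA")}
```

Under the overdominant coding, only the GA genotype is coded 1, which corresponds to the GA versus GG + AA comparison reported for rs2046210.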

Table 2 Genotyping results and EBC survival
Table 3 Association between SNP genotypes and EBC survival (multivariate Cox proportional hazards model)

Prognostic implication of risk variants in molecular subtypes

Because a large number of patients were enrolled in this study, we were able to analyze the association between the selected SNPs and survival within the different molecular subtypes of EBC. As shown in Table 3, rs9485372 and rs4415084 remained associated with a worse outcome in luminal A and triple negative EBC patients, respectively, after adjustment (for rs9485372 under the recessive model: iDFS: aHR: 2.465, 95% CI 1.133–5.360; DDFS: aHR: 2.671, 95% CI 1.214–5.875; BCSS and OS: aHR: 3.522, 95% CI 1.464–8.473; for rs4415084 under the dominant model: iDFS: aHR: 1.674, 95% CI 1.043–2.687; DDFS: aHR: 1.804, 95% CI 1.084–3.002; and OS: aHR: 1.674, 95% CI 1.000–2.803). Furthermore, in the luminal B subtype, rs4951011 (under the dominant model) and rs889312 (under the recessive model) were associated with significantly better iDFS, DDFS, BCSS and OS, while rs9485372 (under the dominant model) was associated with a worse outcome (iDFS: aHR = 0.719, 95% CI 0.557–0.928, DDFS: aHR = 0.734, 95% CI 0.561–0.960, BCSS: aHR = 0.721, 95% CI 0.528–0.984 and OS: aHR = 0.690, 95% CI 0.510–0.934 for rs4951011; iDFS: aHR = 0.558, 95% CI 0.381–0.817, DDFS: aHR = 0.419, 95% CI 0.269–0.653, BCSS: aHR = 0.498, 95% CI 0.304–0.815 and OS: aHR = 0.465, 95% CI 0.285–0.761 for rs889312; and iDFS: aHR = 1.482, 95% CI 0.124–1.954, DDFS: aHR = 1.557, 95% CI 0.161–2.088, BCSS: aHR = 1.504, 95% CI 1.071–2.112 and OS: aHR = 1.538, 95% CI 1.104–2.142 for rs9485372; Table 3). However, no significant effect was observed in the HER2-enriched subtype for any of the 21 polymorphisms under any model.

Combined analysis of three risk SNPs on survival of luminal B EBC

To assess the combined effects on the risk of recurrence and death from luminal B EBC, we combined the risk genotypes of rs4951011, rs889312 and rs9485372. In univariate survival analysis, iDFS, DDFS, BCSS and OS all differed significantly among groups defined by the number of combined risk genotypes (log-rank P < 0.01) (Fig. 1). As shown in Table 4, compared with subjects with one or no unfavorable genotype, subjects carrying more unfavorable loci had shorter survival times and a 1.534–1.645-fold increased risk of recurrence and/or death, even after adjustment (iDFS: aHR = 1.534, 95% CI 1.288–1.827, DDFS: aHR = 1.632, 95% CI 1.356–1.964, BCSS: aHR = 1.570, 95% CI 1.267–1.944 and OS: aHR = 1.645, 95% CI 1.334–2.029, respectively, for trend).
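The grouping used in this combined analysis can be illustrated by counting, for each patient, how many of the three loci carry an unfavorable genotype. The sketch below is illustrative only and its names are hypothetical. The unfavorable genotypes for rs4951011 and rs9485372 follow the dominant-model comparisons reported in this paper (GA + GG versus AA, and GA + AA versus GG). The genotype set shown for rs889312 is a placeholder assumption, because the text states only that a recessive model was used.

```python
UNFAVORABLE = {
    "rs4951011": {"AA"},        # GA + GG reported as favorable (dominant model)
    "rs9485372": {"AG", "AA"},  # GA + AA reported as unfavorable versus GG
    "rs889312": {"AA", "AC"},   # ASSUMPTION: placeholder alleles, recessive model
}


def normalize(genotype: str) -> str:
    """Sort alleles so that 'GA' and 'AG' are treated as the same genotype."""
    return "".join(sorted(genotype))


def count_unfavorable(genotypes: dict, unfavorable: dict = UNFAVORABLE) -> int:
    """Number of loci at which the patient carries an unfavorable genotype."""
    return sum(normalize(genotypes[snp]) in bad for snp, bad in unfavorable.items())


def risk_group(n_unfavorable: int) -> str:
    """Reference group is 0-1 unfavorable genotypes, as in the comparison above."""
    return "0-1" if n_unfavorable <= 1 else str(n_unfavorable)


# Hypothetical patient carrying two unfavorable genotypes.
patient = {"rs4951011": "AA", "rs9485372": "GA", "rs889312": "CC"}
group = risk_group(count_unfavorable(patient))
```

The resulting group label can then be used as the stratifying variable for Kaplan–Meier curves or entered as an ordinal covariate to test for trend.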

Fig. 1 Kaplan–Meier plots of survival for the combined effect of the three SNPs on luminal B EBC survival

Table 4 Cumulative effect of unfavorable genotypes in luminal B subtype breast cancer

Stratification and interaction analysis

The associations between breast cancer risk loci genotypes and EBC survival were then evaluated by stratified analysis according to age at diagnosis, tumor size, lymph node involvement, grade, hormone-receptor status and HER2 status. As shown in Table 5, rs4415084 was associated with shorter survival in younger patients (age at diagnosis ≤ 35 years: iDFS: aHR = 1.792, 95% CI 1.161–2.915, DDFS: aHR = 2.172, 95% CI 1.310–3.602, BCSS: aHR = 2.250, 95% CI 1.278–3.959 and OS: aHR = 1.871, 95% CI 0.988–3.544), and rs2981582 was associated with shorter survival in patients with higher grade tumors (grade III: iDFS: aHR = 1.666, 95% CI 1.051–2.639, DDFS: aHR = 1.682, 95% CI 1.049–2.698, BCSS: aHR = 1.783, 95% CI 1.080–2.944 and OS: aHR = 1.732, 95% CI 1.050–2.855). In contrast, rs2046210 had a beneficial effect on survival in patients with larger tumors (tumor size > 2 cm: iDFS: aHR = 0.757, 95% CI 0.606–0.944, DDFS: aHR = 0.732, 95% CI 0.582–0.919, BCSS: aHR = 0.713, 95% CI 0.533–0.920 and OS: aHR = 0.694, 95% CI 0.540–0.992), as did rs3803662 in patients with higher grade tumors (grade III: iDFS: aHR = 0.588, 95% CI 0.414–0.834, DDFS: aHR = 0.586, 95% CI 0.407–0.845, BCSS: aHR = 0.479, 95% CI 0.319–0.717 and OS: aHR = 0.484, 95% CI 0.324–0.722). The other SNPs were not found to affect survival in any subgroup defined by these tumor characteristics.

Table 5 Stratification analysis of polymorphism genotypes associated with EBC survival

An interaction analysis was performed (Table 6), and statistically significant multiplicative interactions on EBC survival were found both between rs4415084 genotypes and age at diagnosis (adjusted Pint: iDFS 0.045, DDFS 0.013, BCSS 0.025 and OS 0.018) and between rs3803662 genotypes and tumor grade (adjusted Pint: iDFS 0.011, DDFS 0.001, BCSS 4.7 × 10⁻⁴ and OS 9.9 × 10⁻⁴).

Table 6 The interaction analysis between risk variants and clinicopathological parameters

Discussion

In this study, we evaluated the possible relation between 21 GWAS-identified BC susceptibility germline variations and EBC clinical outcome in a large Chinese cohort of 1177 EBC cases. To the best of our knowledge, this is the first study to report the association between GWAS-identified BC susceptibility loci and clinical outcomes in a Chinese population, and its results differ from the findings of two American studies [6, 7]. The most significant and novel result of this study is that the influence of BC risk polymorphisms on the outcome of EBC depends on the intrinsic molecular subtype and is most evident in luminal B breast cancer.

More recently, Zhang and colleagues demonstrated that some GWAS-identified SNPs are associated with molecular subtypes of EBC in Chinese women [13]. It is widely accepted that breast cancer is a complex disease consisting of several intrinsic subtypes with different etiologies and prognoses [14]. It is increasingly recognized that putative SNPs, by altering the expression and/or function of related genes in key signaling pathways, may act in a subtype-dependent manner on both the risk and the clinical outcome of EBC [15,16,17].

Loci rs889312, rs4951011, and rs9485372 played significant and independent roles in the survival of luminal B breast cancer patients, both individually and jointly, across all four outcome indicators (iDFS, DDFS, BCSS and OS). MAP3K1 rs889312 has been identified as a low-penetrance risk factor for both ER+ and ER− breast cancer [18]. It was also shown to be an independent risk factor for poor survival in diffuse-type gastric cancer under an overdominant model [19]. However, two similar investigations failed to show that this variant was associated with BC clinical outcome [6, 7], although neither of them carried out survival analysis by BC intrinsic subtype. In the most recent available data, rs889312 (C/C) was significantly associated with poor DFS, DDFS and OS among HR-positive breast cancer patients [20], which is similar to our results. The MAP3K1 gene encodes a key member of the MAPK signaling pathway, which activates the transcription of essential cancer genes [21]. However, the exact mechanism by which rs889312 may alter MAP3K1 protein structure and/or function remains unknown.

The SNP rs4951011, located in intron 2 of the zinc finger CCCH domain-containing protein 11A (ZC3H11A) gene and the 5′-UTR of the ZBED6 gene, was first identified as a BC susceptibility locus in East Asians [8]. In another study, it was associated only with triple negative breast cancer and not with other BC subtypes [22]. For rs4951011 in the dominant model, we found that the GA + GG genotypes were significantly associated with better DFS, DDFS, BCSS and OS (aHR = 0.690–0.734). However, there is no evidence of a relation between this variant and the clinical outcome of other malignant tumors. ENCODE data from human mammary epithelial cells (HMEC) suggest that rs4951011 may be located in a strong enhancer region marked by peaks of several active histone modifications (H3K4me1, H3K4me3, H3K9ac, and H3K27ac) [23]. Furthermore, in colorectal cancer cell lines, repressing transcription of ZBED6 was found to modulate the expression of 10 genes, including PTBN1, WWC1, and WWTR1, that are linked to important signaling pathways and tumor development, depending on the genetic background of the tumor cells and the transcriptional state of its target genes [24]. Thus, rs4951011 may regulate the expression of some important metastasis-related genes and thereby influence the course of breast cancer.

The SNP rs9485372 was also found to play a significant role in the clinical outcome of luminal A and luminal B breast cancer patients. For luminal A BC, rs9485372 in the recessive model was associated with worse iDFS, DDFS, BCSS, and OS (aHR 2.465–3.522). For luminal B BC, the GA + AA genotypes were associated with worse iDFS, DDFS, BCSS and OS (aHR = 1.482–1.557) compared with the GG genotype. This variant is located in TAB2 (TGF-β activated kinase 1/MAP3K7 binding protein 2), which plays a pivotal role in the TGF-β pathway and contributes to the development of cancer [25]. TAB2 is near the ESR1 gene and was found to be co-expressed with ESR1 in hepatocellular carcinoma [26]. TAB2 was also found to be a mediator of resistance to endocrine therapy, which is a poor prognostic indicator for HR+ breast cancer patients, and it is a potential new target to reverse pharmacological resistance and potentiate anti-estrogen action [27]. Therefore, it is possible that the association between rs9485372 and the survival of luminal A and B BC patients is mediated by regulation of estrogen signaling and the TGF-β pathway.

Two GWAS-identified BC risk loci, rs1219648 and rs13387042, were found to affect overall survival of EBC in Tunisians [28]. In contrast, we could not confirm this result in our Chinese population. We attribute this difference to the following reasons. First, the two studies focused on different ethnic groups with different genetic backgrounds. Second, we had a much larger sample size and longer follow-up than the other study, which makes our results more reliable. Finally, although both studies were retrospective, we used the multivariate Cox proportional hazards model to evaluate the independent effect of each SNP on the survival of EBC patients, whereas the other study used only Kaplan–Meier curves and the log-rank test.

Some potential limitations of our study should be taken into consideration. First, as all patients were of Chinese origin, it is unclear whether our findings are specific to the Chinese Han population or also apply to other populations. Second, the biological mechanisms of the significant SNPs in breast cancer are still unclear. Therefore, more studies in diverse ethnic backgrounds and functional characterization of the SNPs are warranted. Nevertheless, this is the first study with integrated clinicopathological data and sufficiently long follow-up to investigate the association between genetic breast cancer risk polymorphisms and the survival of Asian breast cancer patients according to intrinsic molecular subtype.

Conclusions

Our findings indicate that breast cancer risk variants are not, in general, strongly associated with clinical outcome. However, within specific molecular subtypes, some BC risk polymorphisms may be novel predictors of EBC outcome in Chinese patients. Larger, well-designed investigations in a variety of populations, as well as functional assessments, are needed to verify and extend our findings.