Introduction

Breast cancer is one of the most common malignancies worldwide, with an estimated 255,180 new cases and 41,070 deaths in the United States in 2017 [1]. Notably, the incidence and mortality of breast cancer have risen sharply in developing countries, including China, where breast cancer accounts for 15% of all new cancer cases in women [2]. As with other solid tumors, the development of breast cancer is a chronic, multistep process involving the accumulation of genetic and epigenetic alterations [3,4]. Risk factors for breast cancer include obesity, lack of physical exercise, alcohol abuse, hormone replacement therapy during menopause, ionizing radiation, early age at first menstruation, having children late or not at all, older age, and family history [3,4,5,6,7,8,9]. Triple-negative breast cancer (TNBC) is a type of breast cancer that does not express estrogen receptor (ER), progesterone receptor (PR), or HER2/neu [10,11,12,13,14]. TNBC patients usually have relatively poor outcomes because of the intrinsically aggressive behavior of the disease, and the loss of target receptors means that combination therapies are required instead of common chemotherapies. Thus, more effective and sensitive prognostic markers are urgently needed to guide the clinical management of TNBC.

Many genetic variants have been reported to contribute to breast cancer risk since the discovery of BRCA1 and BRCA2 in the 1990s [15,16,17,18,19,20,21,22]. These heritable genetic variants are associated with familial breast cancer. Identifying new genetic variants is therefore of great significance for breast cancer prediction and treatment. Recent genome-wide analyses based on large consortia help to avoid false-positive identification of candidate genes [23]. Two genetic variants at 18q11.2 (CHST9 rs1436904 and AQP4 rs527616) were identified as novel breast cancer susceptibility loci in a genome-wide association study (GWAS) [24]. A case-control study of Chinese women, nested within the Singapore Chinese Health Study, was also performed to verify the roles of these single nucleotide polymorphisms (SNPs). It demonstrated that adding genetic variants to conventional risk factors improved breast cancer risk prediction in Chinese women [25], but it remains unclear whether CHST9 rs1436904 and AQP4 rs527616 affect the prognosis of TNBC. To test this, we conducted a hospital-based cohort study of early-stage TNBC to further clarify the role of these two genetic variants in breast cancer progression. We found that the CHST9 rs1436904 polymorphism might be a potential prognostic biomarker for early-stage TNBC, especially in patients with larger tumors (tumor size > 2 cm), without lymph-node metastasis, who were premenopausal at diagnosis, or with vascular invasion.

Materials and Methods

Study subjects

A total of 381 TNBC patients were recruited between January 2008 and December 2015 at the Cancer Hospital, Chinese Academy of Medical Sciences (Beijing, China). Patients were followed until May 6, 2016 to collect data on clinicopathological characteristics, treatments, and outcomes such as recurrence and death. Disease-free survival (DFS) was defined as the time from the date of diagnosis to the date of the first locoregional recurrence, first distant metastasis, or death from any cause. Patients known to be alive with no evidence of disease progression were censored at the last follow-up date or on May 6, 2016, whichever came first. All subjects were ethnic Han Chinese. Histopathological data showed that the great majority of the recruited TNBC patients had early-stage disease. Informed consent was obtained from each subject at recruitment. This study was approved by the Institutional Review Boards of the Cancer Hospital, Chinese Academy of Medical Sciences and Shandong Cancer Hospital affiliated to Shandong University.
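The DFS definition and censoring rule above can be made concrete with a short sketch. This is an illustration only, not the authors' procedure: the function name `dfs_record`, its arguments, the average month length of 30.4375 days, and the choice to treat events dated after the cutoff as censored are all assumptions. Only the cutoff date of May 6, 2016 comes from the text.

```python
from datetime import date

# Administrative end of follow-up, as stated in the text.
STUDY_CUTOFF = date(2016, 5, 6)

# Assumption: convert days to months with the average Gregorian month length.
DAYS_PER_MONTH = 30.4375


def dfs_record(diagnosis, first_event=None, last_follow_up=None):
    """Return (dfs_months, event_observed) for one hypothetical patient.

    first_event: date of the first locoregional recurrence, distant
        metastasis, or death from any cause (None if none was observed).
    last_follow_up: last contact date for a patient without an event.
    """
    if first_event is not None and first_event <= STUDY_CUTOFF:
        end, observed = first_event, True
    else:
        # Censor at the last follow-up or the cutoff, whichever came first.
        candidates = [STUDY_CUTOFF]
        if last_follow_up is not None:
            candidates.append(last_follow_up)
        end, observed = min(candidates), False
    months = (end - diagnosis).days / DAYS_PER_MONTH
    return round(months, 1), observed


# Hypothetical patients (dates are made up for illustration).
relapsed = dfs_record(date(2010, 1, 1), first_event=date(2012, 1, 1))
censored = dfs_record(date(2015, 5, 6), last_follow_up=date(2017, 1, 1))
```

The pair returned for each patient (time, event indicator) is the input format that Kaplan-Meier and Cox analyses generally expect.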

Immunohistochemistry (IHC) of formalin-fixed, paraffin-embedded breast cancer tissue samples from the patients was used to evaluate ER and PR status with anti-ER and anti-PR antibodies. A positive ER or PR status was defined as nuclear staining of more than 1%, based on the 2010 guidelines of the American Society of Clinical Oncology (ASCO) and the College of American Pathologists (CAP). HER2 status was determined by IHC or by assessing gene amplification with fluorescence in situ hybridization (FISH). Tumors negative for ER, PR, and HER2 were defined as TNBCs.

Genotyping

A 1 mL blood sample was collected from each patient upon recruitment, and genomic DNA was extracted from the blood. The CHST9 rs1436904 and AQP4 rs527616 polymorphisms were analyzed with the MassArray system (Sequenom Inc., San Diego, California, USA) as described previously [26,27,28,29]. The CHST9 rs1436904 PCR primers were 5′-ACGTTGGATGCTTCCCTGCAAGACTATGTG-3′ (forward) and 5′-ACGTTGGATGGCAAGACAGGAGACAGATTC-3′ (reverse). The CHST9 rs1436904 UEP_SEQ primer was 5′-CCCCCTTGTGTCTCATTCCTCA-3′, the EXT1_SEQ primer was 5′-CCCCCTTGTGTCTCATTCCTCAG-3′, and the EXT2_SEQ primer was 5′-CCCCCTTGTGTCTCATTCCTCAT-3′. The AQP4 rs527616 PCR primers were 5′-ACGTTGGATGTTACACGAGACTGAGCCAAC-3′ (forward) and 5′-ACGTTGGATGGAAATGCCCCTTAGGACAAG-3′ (reverse). The AQP4 rs527616 UEP_SEQ primer was 5′-GAGCTCCAGTGCTATTT-3′, the EXT1_SEQ primer was 5′-GAGCTCCAGTGCTATTTC-3′, and the EXT2_SEQ primer was 5′-GAGCTCCAGTGCTATTTG-3′. A blinded, random 20% sample of study subjects was genotyped in duplicate, and the reproducibility was 100%.
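The primer sequences listed above follow a simple pattern: each extension product (EXT1_SEQ, EXT2_SEQ) is the unextended primer (UEP_SEQ) plus one allele-specific base at the 3′ end. The sketch below checks that pattern and reads off the two bases distinguished by each assay. The dictionary layout and the helper name `extension_alleles` are hypothetical; the sequences are copied from the text. Note that the extended base is read on the primer's strand, which may or may not be the strand used to name the SNP alleles.

```python
# Sequences copied from the Genotyping section; the data layout is an assumption.
ASSAYS = {
    "CHST9 rs1436904": {
        "uep": "CCCCCTTGTGTCTCATTCCTCA",
        "ext": ["CCCCCTTGTGTCTCATTCCTCAG", "CCCCCTTGTGTCTCATTCCTCAT"],
    },
    "AQP4 rs527616": {
        "uep": "GAGCTCCAGTGCTATTT",
        "ext": ["GAGCTCCAGTGCTATTTC", "GAGCTCCAGTGCTATTTG"],
    },
}


def extension_alleles(uep, ext_products):
    """Return the single base added to the UEP in each extension product.

    Raises ValueError if a product is not exactly the UEP plus one base.
    """
    alleles = []
    for product in ext_products:
        if len(product) != len(uep) + 1 or not product.startswith(uep):
            raise ValueError(f"{product} is not a one-base extension of {uep}")
        alleles.append(product[-1])
    return alleles


# Map each assay name to the two bases its extension products distinguish.
called = {
    name: extension_alleles(assay["uep"], assay["ext"])
    for name, assay in ASSAYS.items()
}
```

A check like this is a cheap way to catch transcription errors in primer tables before ordering oligonucleotides or setting up an assay definition.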

Statistics

Differences in patient clinical characteristics were assessed using Student’s t test or the χ2 test. DFS was calculated as the time from the date of diagnosis to progression or to death without progression. Survival distributions were estimated with the Kaplan-Meier method and compared using the log-rank test. A multivariate Cox proportional hazards model was applied to estimate the effects of prognostic factors on DFS, adjusting where appropriate for established clinical factors: age of onset, body mass index (BMI), tumor size, lymph-node metastasis, histological type, histological grade, menopausal status, vascular invasion, breast or ovarian cancer history, surgical method, taxane/anthracycline-based chemotherapy, and radiotherapy. The reference categories for the multivariate analyses were: no family history of breast or ovarian cancer; postmenopausal at diagnosis; modified radical mastectomy as the surgical method; histological grade I; no vascular invasion; tumor size ≤ 2 cm; no lymph-node involvement; no taxane/anthracycline-based chemotherapy; and no radiotherapy. A P value of less than 0.05 was used as the criterion for statistical significance. All statistical procedures were conducted using SPSS software (version 16.0).
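To make the survival methods named above more tangible, the following sketch implements the Kaplan-Meier estimator and a two-group log-rank test in plain Python. It is not the authors' analysis, which was run in SPSS: the function names and the toy data are assumptions made for illustration, and the Cox model is left out because it needs an iterative fit.

```python
import math


def kaplan_meier(times, events):
    """Return [(time, S(t))] at each distinct time with at least one event."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing this time (events and censorings).
        while i < len(data) and data[i][0] == t:
            n_events += int(data[i][1])
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square, P value) with 1 df."""
    pooled = list(zip(times_a, events_a)) + list(zip(times_b, events_b))
    event_times = sorted({t for t, e in pooled if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Toy DFS data in months (made up; 1 = progression, 0 = censored).
tt_times, tt_events = [12, 30, 48, 60, 72], [1, 0, 1, 0, 0]
gg_times, gg_events = [6, 14, 24, 36, 50], [1, 1, 1, 0, 1]

tt_curve = kaplan_meier(tt_times, tt_events)
chi2, p_value = logrank_test(tt_times, tt_events, gg_times, gg_events)
```

Censored patients leave the risk set without lowering the survival estimate, which is why the toy TT curve only steps down at months 12 and 48. The log-rank function assumes at least one informative event time; with none, the variance is zero and the statistic is undefined.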

Results

TNBC patients’ characteristics and clinical outcomes

A total of 381 TNBC patients were enrolled in this study. All individuals were female ethnic Han Chinese. The distribution of demographic and clinical characteristics of the patients is shown in Supplementary Table 1. At the time of the final analysis (May 2016), the median follow-up time was 45.5 months. One hundred and eighty-two patients (49.2%) had disease progression, and the median DFS time was 36.0 months (range, 0–204 months).

Comparison of survival according to baseline characteristics of TNBC patients

To test whether clinical characteristics contribute to DFS, patients were grouped according to age of onset, BMI, tumor size, lymph-node metastasis, histological type, histological grade, menopausal status, vascular invasion, breast or ovarian cancer history, surgical method, taxane/anthracycline-based chemotherapy, and radiotherapy, and DFS was compared between (or among) the subgroups of each factor. As shown in Supplementary Table 2, BMI, histological grade, vascular invasion, lymph-node metastasis, and radiotherapy were each significantly associated with DFS (P < 0.05), whereas the other baseline characteristics were not (P > 0.05). After adjustment for the other clinicopathological factors, only BMI and vascular invasion had statistically significant effects on patient prognosis (Supplementary Table 2).

Effects of CHST9 rs1436904 and AQP4 rs527616 polymorphisms on TNBC DFS

CHST9 rs1436904 and AQP4 rs527616 have been reported as breast cancer susceptibility single nucleotide polymorphisms (SNPs) [24,25]. However, the role of these two genetic variants in the outcome of TNBC patients has not been examined. Genotype frequencies of the CHST9 rs1436904 and AQP4 rs527616 SNPs among patients are summarized in Table 1. Of the two, only the CHST9 rs1436904 polymorphism was significantly associated with DFS of TNBC patients. The mean DFS of TNBC patients with the CHST9 rs1436904 GG genotype (46.8 months) or the GT genotype (50.1 months) was significantly shorter than that of the TT group (55.3 months). Moreover, both univariate and multivariate Cox proportional hazards models indicated that the CHST9 rs1436904 genetic variant was significantly associated with disease progression in TNBC patients (Table 1 and Fig. 1). After adjustment for multiple clinical factors, the CHST9 rs1436904 GG genotype remained significantly associated with disease progression compared with the TT genotype (HR = 1.70, 95% CI = 1.03–2.81, P = 0.038). Similarly, the risk of early recurrence for TNBC patients carrying the CHST9 rs1436904 G allele (GT and GG genotypes) was about 1.51-fold that of TT genotype patients (95% CI = 1.03–2.22, P = 0.033).
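The adjusted hazard ratios reported above can be sanity-checked from their confidence intervals, because a Wald-type 95% interval is symmetric around log(HR). The sketch below back-calculates the standard error, z statistic, and two-sided P value from an HR and its interval. The function name `check_hazard_ratio` is hypothetical, and the check assumes the intervals were Wald-type, which the text does not state.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def check_hazard_ratio(hr, ci_low, ci_high):
    """Back-calculate SE, z, and P from an HR and its 95% CI.

    Assumes a Wald interval: log(HR) +/- 1.96 * SE.
    """
    log_low, log_high = math.log(ci_low), math.log(ci_high)
    se = (log_high - log_low) / (2 * Z_95)
    z = math.log(hr) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    # On the log scale, the point estimate should sit at the interval midpoint.
    midpoint_hr = math.exp((log_low + log_high) / 2)
    return {"se": se, "z": z, "p": p_value, "midpoint_hr": midpoint_hr}


# Values copied from the text: GG vs. TT, and G-allele carriers vs. TT.
gg_vs_tt = check_hazard_ratio(1.70, 1.03, 2.81)
g_carriers_vs_tt = check_hazard_ratio(1.51, 1.03, 2.22)
```

For both estimates, the reported HR sits close to the geometric midpoint of its interval, and the back-calculated P values land near the reported ones. Small gaps are expected because the published figures are rounded to two decimals.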

Table 1 Genotype frequencies of the rs1436904 and rs527616 polymorphisms among TNBC patients and their association with DFS.
Figure 1

Kaplan-Meier curves of DFS for TNBC patients with different genotypes of the CHST9 rs1436904 genetic variant. Survival curves for the different genotypes were estimated by the Kaplan-Meier method and compared with the log-rank test (P = 0.042).

Stratified analyses of the effects of CHST9 rs1436904 on DFS of TNBC patients

The association between CHST9 rs1436904 polymorphism and DFS of TNBC patients was further examined by stratifying for age of onset, BMI, tumor size, lymph-node metastasis, histological type, histological grade, menopausal status, vascular invasion, breast or ovarian cancer history, surgical method, taxane/anthracycline-based chemotherapy and radiotherapy, respectively (Table 2 and Supplementary Table 37).

Table 2 DFS of TNBC associated with CHST9 rs1436904 genotypes by tumor size, lymph-node involvement, menopausal status as well as vascular invasion

In the subgroup of TNBC patients with large tumors (tumor size > 2 cm), the CHST9 rs1436904 G allele (GT and GG genotypes) was associated with a significantly increased risk of disease progression (HR = 1.88, 95% CI = 1.06–3.35; P = 0.032) compared with the TT genotype. The mean DFS of the G allele carriers was significantly shorter than that of patients with the TT genotype (46.4 months vs. 55.9 months; P = 0.015) (Table 2 and Fig. 2). However, no such difference was observed in patients with small tumors (tumor size ≤ 2 cm), suggesting that the CHST9 rs1436904 polymorphism is an independent prognostic marker in TNBC cases with large tumors.

Figure 2

Stratified analyses of DFS for TNBC patients with different genotypes of CHST9 rs1436904 genetic variations according to tumor size. (a,b) DFS for the patients harboring small tumors (≤2 cm; P = 0.551 or 0.850 for different classifications); (c,d) DFS for the patients harboring large tumors (>2 cm; P = 0.033 or 0.015 for different classifications).

Among the TNBC patients without lymph-node metastasis, the mean DFS of patients with the CHST9 rs1436904 GG genotype was significantly shorter than that of patients with the TT genotype (48.1 months vs. 62.6 months; P = 0.033) (Table 2 and Fig. 3). However, there was no such association between the polymorphism and DFS in patients with lymph-node metastasis (mean DFS of the TT, GT and GG genotypes: 48 months, 72 months and 60 months, respectively; P = 0.620). HRs calculated from the multivariate Cox proportional hazards model showed that, among patients without lymph-node metastasis, those with the CHST9 rs1436904 GG or GT/GG genotype had a 2.27-fold or 2.01-fold increased risk of disease progression (P = 0.033 and 0.017, respectively) compared with the TT genotype patients (Table 2).

Figure 3

Stratified analyses of DFS for TNBC patients with different genotypes of the CHST9 rs1436904 genetic variant according to lymph-node metastasis. (a,b) DFS for the patients without lymph-node metastasis (P = 0.004 and 0.002 for the different grouping modes); (c,d) DFS for the patients with lymph-node metastasis (P = 0.620 and 0.423 for the different grouping modes).

Among the patients who were premenopausal at diagnosis, the median DFS of the rs1436904 GT or GG genotype patients (48.0 months or 36.0 months) was shorter than that of the rs1436904 TT genotype patients (48.0 months). However, there was no such association between the polymorphism and DFS in patients who were postmenopausal at diagnosis (Table 2 and Fig. 4). In the multivariate Cox proportional hazards model, premenopausal patients with the rs1436904 GG genotype had a 2.18-fold increased risk of disease progression (95% CI = 1.12–4.22, P = 0.021) compared with patients with the TT genotype (Table 2). Similar results were observed among premenopausal patients with the rs1436904 GT genotype (HR = 2.07, 95% CI = 1.18–3.65, P = 0.011) (Table 2).

Figure 4

Stratified analyses of DFS for TNBC patients with different genotypes of the CHST9 rs1436904 genetic variant according to menopausal status. (a,b) DFS for the patients who were premenopausal at diagnosis (P = 0.052 and 0.028 for the different grouping modes); (c,d) DFS for the patients who were postmenopausal at diagnosis (P = 0.785 and 0.666 for the different grouping modes).

In the subgroup of TNBC patients with vascular invasion, the CHST9 rs1436904 G allele was significantly associated with an increased risk of disease progression compared with the TT genotype (HR = 6.51, 95% CI = 1.89–22.36; P = 0.003). In particular, TNBC patients with the GG genotype had a 39.37-fold increased risk of disease progression (P < 0.001). However, no such differences were observed in patients without vascular invasion (Table 2 and Fig. 5).

Figure 5

Stratified analyses of DFS for TNBC patients with different genotypes of the CHST9 rs1436904 genetic variant according to vascular invasion. (a,b) DFS for the patients without vascular invasion (P = 0.337 and 0.1184 for the different grouping modes); (c,d) DFS for the patients with vascular invasion (P = 0.0024 and 0.0174 for the different grouping modes).

Discussion

The development of breast cancer is a multistep process resulting from combined genetic and epigenetic changes [3,4]. About five to ten percent of breast cancer cases are believed to be hereditary and associated with certain gene mutations [20,21,22]. Although multiple breast cancer susceptibility genes have been identified, more remain to be found. TNBC accounts for 12–24% of breast cancers and is associated with early recurrence and poor outcome. Additional efforts should be made to discover specific loci or genetic variants related to TNBC risk, which will expand our understanding of the etiology of this aggressive breast cancer and improve its prevention and clinical diagnosis. Genome-wide analyses using large consortia have been performed to discover and validate genetic variants associated with breast cancer risk [15,16,17,18], and multiple novel breast cancer susceptibility loci have been identified and validated with this approach [24]. Building on a breast cancer GWAS that identified the AQP4 rs527616 and CHST9 rs1436904 genetic variants, we explored their involvement in early-stage TNBC and concluded that the CHST9 rs1436904 SNP is an independent prognostic genetic variant in Chinese TNBC patients. These results may provide new targets for the prevention and diagnosis of TNBC.

In the study by Michailidou et al. [24], the G allele (minor allele), relative to the T allele (major allele), was “protective” against breast cancer (OR = 0.96, 95% CI = 0.94–0.98). In contrast, we observed a significant role of the same SNP in the prognosis of TNBC, with the G allele acting as the risk allele. Several explanations are possible. First, the purpose of our study was to identify prognostic markers, whereas the GWAS aimed to identify susceptibility SNPs, and different study purposes can yield different results. For example, ERCC1 C118T was associated with lung cancer risk: the OR was 0.90 (95% CI = 0.81–0.99, P = 0.043) in an additive genetic model (C allele vs. T allele) and 0.77 (95% CI = 0.63–0.95, P = 0.013) in a recessive genetic model (CC/CT vs. TT) [30]. However, ERCC1 C118T was shown to be a risk SNP for overall survival after platinum-based chemotherapy in Asian NSCLC patients (CT + TT versus CC: HR = 1.24, 95% CI = 1.01–1.53) [31].

The two SNPs examined in this study are located in the AQP4 and CHST9 genes. AQP4 belongs to the AQP family and functions in maintaining water and ion homeostasis [32]. It is found in the membrane and cytoplasmic fractions and is markedly decreased in tumor tissues compared with paired adjacent tissues, suggesting a pathogenic role during cancer development [33]. CHST9 belongs to the N-acetylgalactosamine 4 sulfotransferase (GalNAc4ST) family, which transfers sulfate to position 4 of nonreducing terminal GalNAc residues [34]. Sulfate groups on carbohydrates play important roles in conferring highly specific functions on glycoproteins, glycolipids, and proteoglycans [35]. CHST9 has been implicated in hematologic malignancies, because CHST9 copy number variants (CNVs) are associated with acute myelogenous leukemia (AML) [36]. Additionally, CHST9 CNVs and amplification have been found in the brains of schizophrenia patients and in gastric cancer patients with metastatic lymph nodes [37,38]. Nevertheless, the role of CHST9 and its genetic variants in breast cancer, especially TNBC, has not been determined. Our results show that the CHST9 rs1436904 SNP contributes significantly to the risk of early-stage TNBC progression. Notably, they also show that the CHST9 rs1436904 G allele is a “risk” genetic variant for the outcome of TNBC patients, significantly associated with shortened DFS in TNBC patients with large tumors (>2 cm), without lymph-node metastasis, who were premenopausal at diagnosis, or with vascular invasion.

In conclusion, to the best of our knowledge, this study is the first to identify an inherited variant in CHST9 that is significantly associated with DFS of TNBC patients, especially in patients with large tumors, without lymph-node metastasis, who were premenopausal at diagnosis, or with vascular invasion. Our findings may have clinical implications for the precision treatment of TNBC and could eventually affect therapeutic efficacy.

Ethical approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.