Background

Breast cancer is the most frequently diagnosed cancer and the leading cause of cancer death among females worldwide [1]. While incidence rates are generally higher in more developed areas, such as North America and Australia, the incidence of breast cancer in developing countries has been increasing in recent years. In China, breast cancer has become the most common cancer in females and the leading cause of cancer-related death in younger women, especially in highly urbanized regions, possibly because of changes in lifestyle and reproductive behavior [2, 3]. In breast cancer, it is not the primary tumor but metastasis that is responsible for the death of over 90% of patients [4, 5]. Some breast cancer patients already have distant metastases at initial presentation and resection and are diagnosed with late-stage disease that is nearly incurable. It is possible that the seeds of metastasis are sown at a very early stage of primary tumor development in the breast [5,6,7,8]. Other patients, who have no detectable metastases at the time of diagnosis, ultimately develop metastatic lesions, often months or years after the initial diagnosis [9, 10]. Therefore, the identification of metastasis-related factors warrants further investigation.

Enormous efforts have been made to identify metastasis-related factors that can be used as prognostic markers to predict the transition from primary to systemic disease [11,12,13,14,15]. Established prognostic factors that have been confirmed to be involved in breast cancer metastasis include tumor size, axillary lymph node status, and histological grade/subtype. New potential prognostic biomarkers of breast cancer metastasis are continuously being uncovered, including uPA/PAI1, ER, PR, HER2/ErbB2, circulating tumor cells, the presence of epithelial cells in the bone marrow [12, 16], E-cadherin [17] and, more recently, nucleobindin-2 [18]. Unfortunately, each of these markers has limited prognostic value and applies only to certain subgroups of patients with breast cancer. Moreover, metastasis to the lymph nodes, primarily the axillary nodes, is the earliest sign of the metastatic spread of breast cancer [19], and this process occurs at a higher rate than metastasis to any single distant organ [20]. Apart from the well-known CXCL12/CXCR4 axis, which directs the migration of breast cancer cells through the lymphatics [21, 22], very few studies have been conducted to identify biomarkers associated with the lymph node metastasis of breast cancer.

Profiling the tumor tissue proteome provides important information for biomarker discovery. This potentially useful strategy, however, is limited by the sensitivity of the currently available methods [16]. Isobaric tags for relative and absolute quantitation (iTRAQ) have been widely employed in quantitative proteomic studies of complex biological systems [23, 24] and have been successful in the characterization of protein bioindicators of diverse effects [25]. Recently, the combination of iTRAQ isobaric labeling, multidimensional liquid chromatography and ultrahigh-resolution mass spectrometry has been used to identify tumor biomarkers in cancer, including breast cancer [26,27,28,29,30]. In this study, primary breast tumor tissues and paired lymph node metastases from breast cancer patients were analyzed in parallel by the quantitative iTRAQ proteomic method. Four differentially regulated proteins were validated by immunohistochemistry. Through further clinicopathological correlation and bioinformatic studies, we identified αB-crystallin as a potential prognostic biomarker to predict the occurrence of lymph node metastasis and the clinical outcome of breast cancer patients.

Methods

Human subjects

This study was approved by the Research Ethics Committee of Central South University, China, and informed consent was obtained from all of the patients. All patients were diagnosed with invasive breast cancer (invasive ductal carcinoma or invasive lobular carcinoma) by two senior pathologists, and none had received radiotherapy or chemotherapy before surgery.

Methods for collecting primary breast tumor masses

Cases with large lesions (> 1.5 cm × 1.5 cm × 1 cm) that were diagnosed as breast cancer on frozen section were selected. Tissue samples (> 0.5 cm × 0.5 cm × 0.5 cm) were cut from the tumors and preserved in liquid nitrogen. Whether a case was included in the study group was then decided according to the routine pathological diagnosis and the lymph node metastasis status.

Methods for collecting lymph node metastases

The lymph nodes with the largest diameter (> 1 cm) were selected, the adipose tissue around the lymph nodes was removed, the lymph nodes were cut along the largest diameter, and the color of the cut surface was observed with the naked eye. Each selected lymph node was divided into two parts: one half was stored in liquid nitrogen, and the other half was stained with H&E and observed under a microscope to determine whether breast cancer metastasis was really present in the lymph node. The criterion for inclusion was that metastatic cancer accounted for more than 90% of the lymph node. The collected breast cancer tissues and matched metastatic lymph nodes were preserved in liquid nitrogen.

iTRAQ proteomics

Twenty-three paired fresh primary tumors and metastatic axillary lymph nodes (LNs) were collected from Hunan Cancer Hospital between November 2013 and March 2014. Each collected tissue sample was divided into two parts; one part was used for routine pathological examination, and the other part was stored in liquid nitrogen for the proteomic study. To minimize the influence of residual lymphoid tissues on protein identification, only the axillary LNs with > 95% neoplastic cells according to H&E examination were used for the proteomic study. Relative quantitative proteomics was performed using the Fitgene iTRAQ Proteomics Platform (http://www.fitgene.com) according to the standard procedure [28, 30]. Briefly, the prepared lysates (200 μg) were treated with 4 μL of reducing reagent for 1 h at 60 °C and then blocked with 2 μL of cysteine blocking reagent for 10 min at room temperature. After centrifugation, the supernatant was collected and incubated with trypsin and TEAB overnight at 37 °C. The samples were then labeled with the iTRAQ reagents and subjected to two-dimensional LC-MS/MS analysis and a database search. An expression change greater than 1.5-fold was considered a difference between the primary tumor tissues and the paired metastatic LN tissues.

The raw data acquired from LC-MS/MS were processed with AB Sciex ProteinPilot 4.0 (AB Sciex, Concord, Ontario, Canada), and protein identification and quantification were achieved by searching the UniProt database (Release 2014.5.14). Proteomics profiling and database searching based on the TripleTOF® 5600+ System (AB Sciex) and ProteinPilot 4.0 (AB Sciex) were performed following the manufacturer’s recommendations. The parameters were set as follows: Unused ≥ 1.3; Credibility ≥ 95%; C.V. ≤ 0.5; AVG. ≥ 1.5 or ≤ 0.67; T.TEST < 0.05; Peptides (95%) ≥ 4. To ensure the reliability and stability of the reported data, we performed the following steps for data quality control. First, before database searching, we selected “Run False Discovery Rate Analysis” in AB Sciex ProteinPilot for FDR control. Second, we removed the results identified from the reverse database. Third, we removed proteins with extremely high or low ratios. Finally, we removed proteins with abnormal quantification between technical replicates and biological replicates.
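The threshold parameters listed above can be read as a conjunctive filter over the quantified proteins. The following Python sketch illustrates that reading only; it is not the ProteinPilot implementation. The `ProteinRecord` fields, the function names and the idea that all listed conditions must hold together are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProteinRecord:
    """Hypothetical per-protein summary; field names are illustrative."""
    accession: str
    unused: float       # "Unused" score
    confidence: float   # identification credibility, in percent
    cv: float           # coefficient of variation across replicates
    avg_ratio: float    # mean metastasis / primary ratio ("AVG.")
    p_value: float      # t-test p value ("T.TEST")
    peptides_95: int    # peptides identified at 95% confidence


def passes_thresholds(p: ProteinRecord) -> bool:
    """Apply the thresholds stated in the text, assuming all must hold."""
    return (
        p.unused >= 1.3
        and p.confidence >= 95.0
        and p.cv <= 0.5
        and (p.avg_ratio >= 1.5 or p.avg_ratio <= 0.67)
        and p.p_value < 0.05
        and p.peptides_95 >= 4
    )


def direction(p: ProteinRecord) -> str:
    """Label a protein that passed the filter as up- or downregulated."""
    return "up" if p.avg_ratio >= 1.5 else "down"


# Made-up records, used only to exercise the filter.
records = [
    ProteinRecord("HYPO_A", 2.0, 99.0, 0.20, 1.85, 0.010, 5),
    ProteinRecord("HYPO_B", 2.0, 99.0, 0.20, 1.20, 0.010, 5),
    ProteinRecord("HYPO_C", 2.0, 99.0, 0.20, 0.33, 0.010, 6),
]
kept = [(r.accession, direction(r)) for r in records if passes_thresholds(r)]
```

Keeping the filter as a single boolean expression makes each threshold easy to audit against the parameter list in the text.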

The coefficients of variation (CV) of the biological replicates were analyzed for data from the different groups of samples. In the experimental data, a coefficient of variation within 50% covered 60% of the identified proteins. Most of the data exceeding this coefficient of variation are attributable to individual differences between subjects, and these data were excluded from the subsequent analysis. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE [31] partner repository with the dataset identifier PXD013931.
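As a worked illustration of the coefficient of variation criterion, the sketch below computes the CV of replicate ratios and the fraction of proteins that fall within a given limit. It is a minimal example under stated assumptions: the function names are hypothetical, the sample standard deviation is assumed (the text does not say which estimator was used), and the input values are made up.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV = sample standard deviation / mean (needs at least two values)."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / m


def fraction_within(cvs, limit=0.5):
    """Fraction of proteins whose CV does not exceed the limit."""
    return sum(cv <= limit for cv in cvs) / len(cvs)


# Made-up replicate ratios for three hypothetical proteins.
replicates = {
    "HYPO_A": [1.8, 1.9, 2.0],
    "HYPO_B": [0.3, 0.9, 1.5],
    "HYPO_C": [1.0, 1.0, 1.0],
}
cvs = {name: coefficient_of_variation(v) for name, v in replicates.items()}
covered = fraction_within(list(cvs.values()), limit=0.5)
```

Proteins above the limit would then be dropped before any downstream comparison, in line with the exclusion described in the text.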

Immunohistochemical analysis

A total of 106 paired paraffin-embedded tissue samples with lymph node metastasis were obtained from female patients with breast disease who underwent surgery in Hunan Cancer Hospital between May 1996 and May 2008. None of the patients underwent preoperative chemotherapy or radiotherapy. The tissue samples were fixed with 10% formaldehyde in PBS, embedded in paraffin and cut into consecutive 4-μm sections. Breast cancer was graded according to the Nottingham modification of the Bloom-Richardson scoring system.

For immunohistochemistry, a two-step polymer-based detection method (EnVision™) was used according to our recently published protocol [18]. The primary antibodies (all diluted 1:200) were rabbit monoclonal antibodies obtained from Abcam (Cambridge, MA, USA) (EpCAM [ab124825], FADD [ab108601], αB-crystallin [ab76467]) or CST (Danvers, MA, USA) (NDRG1 [#9485]). The staining was examined by two senior pathologists, and the total immunostaining score (TIS) was calculated as described.

Clinicopathological correlation study

A total of 190 breast cancer patients admitted to Hunan Cancer Hospital between May 1996 and March 2005 were followed up for over 10 years, and the clinicopathological parameters, including age at diagnosis, tumor size, axillary node status, clinical stage, histological type/grade, ER/PR/HER2 status, and menstruation history, were recorded. These parameters were correlated with the expression levels of the four metastasis-associated proteins.

GEO analysis

The difference in the expression levels of αB-crystallin between normal breast tissues and breast cancers was analyzed online in Gene Expression Omnibus (GEO) profiles (https://www.ncbi.nlm.nih.gov/geo/) using the search terms “invasive breast cancer” and “CRYAB”.

Statistical analysis

The statistical analysis was performed using SPSS 2.0 Software. A Wilcoxon signed-rank test was used to compare the immunohistochemical expression of the metastasis-associated proteins between the paired primary tumors and metastatic lesions of breast cancer. A chi-square (χ2) test was used to evaluate the association between the metastasis-associated proteins and the clinicopathological parameters. Survival analysis was performed using the Kaplan-Meier method. The Student’s t test was used to compare the mRNA expression of FADD and αB-crystallin between normal breast and breast cancer tissues from the GEO profile. A p value of less than 0.05 was considered statistically significant.
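For readers unfamiliar with the Kaplan-Meier method named above, the following Python sketch shows the product-limit estimate in its simplest form. It is an illustration only and not the SPSS procedure used in this study; the function name `kaplan_meier` and the follow-up values are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time of each patient (for example, in months)
    events : True if death was observed, False if the patient was censored
    Returns a list of (time, survival probability) at each observed death time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    order = sorted(zip(times, events))
    n_at_risk = len(order)
    survival = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = order[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(order) and order[i][0] == t:
            if order[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= leaving
    return curve


# Made-up follow-up data: four patients, one censored at 8 months.
curve = kaplan_meier([5, 8, 8, 12], [True, False, True, True])
```

Patients censored at the same time as a death are counted as at risk at that time, which is the usual convention. Comparing two such curves (for example, marker-positive versus marker-negative patients) additionally requires a test such as the log-rank test, which is not shown here.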

Results

Identification of lymph metastasis-associated proteins in breast cancer patients

To identify the proteins associated with lymph node metastasis of breast cancer, we first analyzed 23 paired primary tumors and axillary lymph node metastases from patients with metastatic breast cancer using iTRAQ-based proteomic analysis. The quantitative data are presented in Additional file 3: Table S1. A total of 637 differentially regulated proteins (397 upregulated and 240 downregulated) between the primary sites and the lymph node metastases of breast cancer were identified based on a 95% confidence interval and a ratio of ≥ 1.5 for upregulated proteins or ≤ 0.67 for downregulated proteins. The top 30 upregulated and downregulated proteins are presented in Additional file 4: Table S2 and Table S3, respectively.

To gain insights into the biological and molecular characteristics of these proteins, gene ontology (GO) analysis was performed on the differentially regulated proteins. An analysis of the biological process annotations of the 397 proteins that were upregulated in metastatic sites is shown in Additional file 1: Figure S1A. These proteins were predominantly involved in cellular nitrogen compound metabolism and biosynthesis, followed by signal transduction, small molecule metabolism, and stress responses. The GO enrichment analysis of cellular components indicated that these upregulated proteins were primarily distributed in the nucleus and the cytoplasm (Additional file 1: Figure S1B). In terms of molecular functions, the majority of these upregulated proteins were involved in binding activities, such as RNA binding and ion binding (Additional file 1: Figure S1C). The 240 proteins that were downregulated in lymph node metastases were primarily associated with signal transduction, anatomical structure development, stress response, and cell differentiation (Additional file 1: Figure S1D). For cellular distribution, the downregulated proteins were predominantly localized in the extracellular region, the organelles, and the cytoplasm (Additional file 1: Figure S1E). The most significant molecular function of these downregulated proteins was ion binding (Additional file 1: Figure S1F).

Validation of differentially regulated proteins

We selected four proteins (two upregulated and two downregulated) for further validation. These proteins were chosen based on the following criteria: 1) they had a fold-change of greater than 1.5 (for the upregulated proteins) or less than 0.67 (for the downregulated proteins); 2) they had a peptide number of greater than 3 in the iTRAQ identification; and 3) they are known to be related to cancer cell invasion/metastasis based on previous studies. These four proteins were EpCAM (epithelial cell adhesion molecule) [32], FADD (Fas-associated death domain protein) [33], NDRG1 (N-myc downstream-regulated gene 1) [34] and αB-crystallin (Alpha-crystallin B chain) [35], and their ratios of metastatic vs. primary tumor sites were 1.85, 1.51, 0.33, and 0.34, respectively. The mass-annotated product ion spectra of these four proteins were obtained (data not shown). The biological processes, cellular locations, and molecular functions of these four individual proteins (Additional file 4: Table S4) were analyzed using the UniProt knowledgebase (http://www.uniprot.org/), and the results were in agreement with the abovementioned GO analysis.

Next, we used immunohistochemistry to verify the expression of the four breast cancer lymph metastasis-associated proteins in 106 cases of paraffin-embedded paired primary tumors and lymph metastasis tissues obtained from metastatic breast cancer patients. The representative staining images are presented in Fig. 1, and the quantitatively analyzed results, which are presented as total immunostaining score (TIS), are summarized in Table 1. As shown in Fig. 1, most of the EpCAM was localized on the plasma membrane, which is in agreement with its known cellular localization. FADD was primarily localized in the cytoplasm and the nucleus. NDRG1 was located in the plasma membrane and the cytoplasm. The αB-crystallin protein was primarily expressed on the plasma membrane and in the cytoplasm. Consistent with the iTRAQ data, NDRG1 and αB-crystallin were downregulated at the metastatic sites compared with the primary tumors in terms of TIS (Table 1) (P = 0.0003 [NDRG1] or P = 0.046 [αB-crystallin]). However, the expression levels of EpCAM and FADD were also lower at the metastatic sites compared with the primary tumors (P = 0.0005).

Fig. 1

Immunohistochemical analysis of the expression of four breast cancer metastasis-associated proteins. The expression levels of EpCAM, FADD, NDRG1, and αB-crystallin were evaluated by the immunohistochemical staining of paraffin-embedded paired primary and metastatic tissue sections that were obtained from patients with metastatic breast cancer

Table 1 Summary of the expression of the four metastasis-associated proteins in the paired primary and metastatic tissues of breast cancer

Correlation of metastasis-associated proteins with the clinicopathological features of breast cancer patients

To clarify the clinical relevance of the lymph node metastasis-associated proteins identified by iTRAQ proteomics, we analyzed the relationship between these four proteins and the clinicopathological parameters of 190 breast cancer patients. EpCAM was not correlated with any of the clinicopathological parameters examined (Table 2). However, FADD expression was positively correlated with a younger age at diagnosis (P = 0.049) and lymph node metastasis (P = 0.003). NDRG1 expression was correlated with worse histological grade (P = 0.041) but not with lymph node metastasis (P = 0.655). αB-crystallin expression was inversely correlated with lymph node metastasis (P < 0.001), clinical stage (P = 0.001), histological grade (P = 0.037), ER (P < 0.001), and PR status (P = 0.007).

Table 2 The association between the four metastasis-associated proteins and the clinicopathological features of 190 breast cancer patients

Association of metastasis-associated proteins with overall survival of breast cancer patients

In addition, we followed up the 190 breast cancer patients for over 10 years and conducted a survival analysis according to the positivity of expression (EpCAM, FADD, and αB-crystallin) or the level of expression (NDRG1) in the primary tumor sites. The results revealed that the patients who had positive expression of EpCAM or FADD survived for a shorter time than those with negative expression (Fig. 2a-b). Those who had positive expression of αB-crystallin survived longer than those with negative expression (Fig. 2d). However, the expression level of NDRG1 had no prognostic value for breast cancer patients (Fig. 2c). Moreover, the prognostic value of EpCAM only applied to patients with lymph node metastasis (Fig. 3a-d). Univariable analysis linked overall survival with tumor diameter, TNM stage, and histological stage and type, but multivariable analysis assigned significance only to histological type (lobular carcinoma vs. ductal carcinoma) (Table 3).

Fig. 2

The association between four metastasis-associated proteins and the overall survival of breast cancer patients. Kaplan-Meier plots of the association between the expression of EpCAM (a), FADD (b), NDRG1 (c), and αB-crystallin (d) and the overall survival probability of breast cancer patients

Fig. 3

The association between four metastasis-associated proteins and the overall survival in breast cancer patients with metastasis. Kaplan-Meier plots of the association between the expression of EpCAM (a), FADD (b), NDRG1 (c), and αB-crystallin (d) and the overall survival probability in breast cancer patients with metastasis

Table 3 Univariate and Multivariate Analysis by a Cox Proportional Hazards Regression Model in Cohort

Downregulation of αB-crystallin mRNA expression in breast cancer

Finally, to examine whether αB-crystallin (gene name: CRYAB) is also involved in human breast cancer development, we reviewed the mRNA expression of CRYAB in normal breast and invasive breast cancer tissues in a public database, the Gene Expression Omnibus (GEO) (Expression Profile GDS3324). The results are presented in Additional file 2: Figure S2. The expression of CRYAB was significantly lower in breast cancer tissues than in normal breast tissues (P = 0.001). We further found that the level of expression of αB-crystallin was lower in breast cancer tissues than in benign breast lesions, with metastatic breast cancer having the lowest expression (Table 4). These findings support a tumor-suppressive role of αB-crystallin in the development of breast cancer.

Table 4 Summary of the expression of CRYAB in different stages of breast tissues

Discussion

Metastasis is one of the most important causes of death in patients with breast cancer. Detection of breast cancer metastasis at the earliest possible stage is critical for the successful management of breast cancer progression. Therefore, it is very important to search for effective biomarkers for breast cancer metastasis and prognosis. In comparative proteomic studies of breast cancer metastasis that use tumor tissue as the research object, the commonly used approach is to compare gene or protein expression between primary breast cancers with and without lymph node or other organ metastases. In this study, we used the iTRAQ proteomic technique to analyze the differentially regulated proteins between the primary tumor sites and their corresponding lymph node metastases in metastatic breast cancer patients; this paired comparison can more accurately capture the differences in protein expression between breast cancer cells with varying metastatic capacity. Four proteins (EpCAM, FADD, NDRG1, and αB-crystallin) were chosen for validation by immunohistochemistry. In particular, αB-crystallin could serve as a potential prognostic biomarker to predict the lymph node metastasis and clinical outcomes of breast cancer patients.

αB-crystallin, also called HspB5, is a member of the α-crystallin family of small heat shock proteins and is an important component of the vertebrate lens [36]. In non-lens tissues, αB-crystallin is an integral part of the cellular proteostasis system, which is associated with a broad spectrum of human diseases, including cancer [37]. αB-crystallin plays an important role in responses to stresses such as heat shock and radiation. As a molecular chaperone, αB-crystallin is expressed in human cells at a higher level under pathological conditions. The expression of αB-crystallin in human renal carcinogenesis, triple-negative (basal-like) breast cancer, hepatocellular carcinoma, and squamous cell carcinoma of the head and neck is related to poor prognosis [36, 37], suggesting an oncogenic role for αB-crystallin in promoting tumorigenesis. In breast cancer, αB-crystallin has been shown to be an oncoprotein that predicts poor prognosis [38,39,40,41] and resistance to neoadjuvant chemotherapy, especially in triple-negative breast cancer [40, 42]. However, a role for αB-crystallin as a tumor suppressor has also been reported [43]. These contradictory findings indicate that the role of αB-crystallin in carcinogenesis is complicated. The present study demonstrated that αB-crystallin was downregulated in the lymph node metastases compared with the primary breast tumors. This finding is inconsistent with the previous finding that αB-crystallin expression promotes the brain metastasis of breast cancer [38, 44]. Recently, the majority of lymphatic and distant metastases were shown to originate differently in human colorectal cancer [45]. This phenomenon is also true for breast cancer metastasis, in which approximately 1/3 of breast cancer patients without lymph node metastasis develop distant metastasis [46]. These observations suggest that the two routes of cancer spread may occur independently and may use different sets of molecular routers to drive the metastatic spread of cancer cells through either the lymphatics or the blood vessels. Reconciling our data with the previous reports, it is possible that αB-crystallin acts as a router that switches between lymphatic and hematogenous spread. The role of αB-crystallin in breast cancer progression therefore needs to be reevaluated. We speculate that αB-crystallin may function as a tumor promoter in hematogenous metastasis (to the brain, for example) but as a tumor suppressor in lymph node metastasis. However, this speculation should be validated experimentally through in vitro and in vivo studies. Our findings further support a tumor-suppressor role for αB-crystallin in breast cancer development.

Many studies have shown a close link between FADD and many cancers, such as non-small cell lung cancer [47], gastric cancer [48] and hepatocellular carcinoma (HCC) [49]. In the first two of these cancers, the expression of FADD was correlated with lymph node metastasis and poor patient prognosis, whereas in HCC the loss of FADD expression plays an important role in carcinogenesis. FADD expression is associated with T stage and perineural invasion [50]. An increase in FADD expression was shown to be associated with a higher incidence of lymph node metastasis at presentation and with a shorter distant metastasis-free interval (DMFI) when lymph node metastases are present [33]. These studies only compared cancer with the surrounding normal tissues, whereas we focused on the differences in FADD expression between primary tumors and metastases. According to the proteomic results, the expression of FADD was upregulated in metastases. The IHC results also revealed significant differences in FADD expression between the primary tumors and metastases, but the rate of FADD-positive tumors was lower in metastases, which is inconsistent with the proteomic results. A possible reason for this inconsistency is that proteomics analyzes the relative quantity of protein expression, whereas immunohistochemistry analyzes the positive rate of protein expression, and thus the results from these two methods are not always consistent. In addition, we investigated potential correlations between FADD expression and the clinicopathological characteristics of 190 patients with breast cancer. We performed a 120-month survival analysis and found that FADD expression was associated with lymph node metastasis. Furthermore, higher expression levels of FADD in patients with breast cancer were correlated with a shorter survival time. These findings suggest a close relationship between FADD expression and the lymph node metastasis and poor prognosis of breast cancer. Moreover, the regulatory mechanism of FADD in breast cancer metastasis warrants further investigation.

NDRG1 has been reported to function as a metastasis suppressor gene, and it is downregulated in gastric cancer [34], prostate cancer [51, 52], pancreatic cancer [53] and breast cancer [45]. However, compared with normal tissue, NDRG1 expression was shown to be upregulated in homologous hepatocellular carcinoma [54] and oral squamous cell carcinoma [55]. In this study, both the proteomics and IHC results revealed that NDRG1 expression was downregulated in metastases compared with the primary tumors. The expression of NDRG1 in various tissues may be affected by many factors, such as metal ions, oxygen, proto-oncogenes, tumor suppressor genes, hormones or vitamins. For example, NDRG1 expression in prostate cancer cells was shown to be affected by androgens, whereas NDRG1 expression in breast cancer cells is mainly associated with estradiol. Thus, the expression of NDRG1 is variable. In the clinicopathological and survival analyses, significant differences in NDRG1 expression were not detected in this study.

EpCAM is a transmembrane glycoprotein that appears to play a role in the tumorigenesis and metastasis of carcinomas [56]. EpCAM is frequently upregulated in carcinomas but is not expressed in cancers of non-epithelial origin. At present, an FDA-approved automated cell detection method uses EpCAM as a biomarker, and this method has been used to detect circulating tumor cells in patients with breast [57], prostate [32, 58] and esophageal cancer [59]. The expression of EpCAM was shown to be high in laryngeal carcinoma but low in bone marrow as a metastatic niche for disseminated cancer cells [60]. These findings are consistent with our IHC results. However, EpCAM expression was increased in the metastatic tissues compared with the primary tumors according to the iTRAQ proteomics analysis. Furthermore, the survival analysis showed that the survival rate was lower in the EpCAM-positive group. Therefore, the expression of EpCAM in breast cancer metastasis should be further clarified. Taken together, these data suggest that EpCAM plays a critical role in the metastatic process of breast cancer.

Conclusions

In summary, we discovered differentially regulated proteins between primary breast tumors and their lymph node metastatic sites using iTRAQ proteomics analysis. Through further immunohistochemical study, clinicopathological correlation analysis, and GEO profiling, we identified αB-crystallin as an independent biomarker to predict lymph node metastasis and the outcome of breast cancer patients. αB-crystallin plays a role in the metastasis of breast cancer cells to the lymph node, but its exact role in each step of breast cancer metastasis and the underlying signaling mechanism remain to be fully clarified. EpCAM, FADD and NDRG1 expression were shown to be associated with the progression of breast cancer, but the questions of how certain oncogenes may initiate dissemination before triggering aggressive proliferation and how tumor-suppressor pathways suppress metastasis in breast cancer warrant further investigation.