Background

Worldwide, breast cancer (BC) is the most prevalent cancer entity among women and is known as a heterogeneous disease in terms of tumor morphology and molecular structure [1,2,3,4]. Although many genes and proteins have been investigated as prognostic and predictive factors, only a few are decisive for treatment. This is reflected in the classical breast cancer stratification into receptor-defined subtypes, termed luminal A-like, luminal B-like, triple negative breast cancer (TNBC), and human epidermal growth factor receptor 2 (HER2)-positive, as common clinical practice [5, 6]. However, expanding protein profiling towards novel directions could provide new insights into molecular mechanisms associated with the observed heterogeneous clinical outcome. Moreover, analyzing these protein profiles harbors the potential for identification of prognostic markers and druggable targets off the beaten track.

Altered metabolism has long been known to characterize tumors ever since Otto Warburg reported his first observations of the metabolic changes that accompany malignancy [7]. Furthermore, deregulated cancer metabolism has regained attention and is regarded as a new hallmark of cancer [8]. Metabolic transformations have been intensively studied over recent years and as a result, the first strategies to target the altered metabolism of cancer cells are emerging [9].

Mutations in metabolic enzymes can drive tumorigenesis; more often, however, cancer metabolism is transformed by altered abundance and activity of metabolic enzymes [10]. Proliferative cells alter their metabolism to support biosynthetic reactions required for accumulation of biomass and the production of macromolecules [11]. Reprogrammed cellular metabolism involves increased glucose intake and glutamine addiction. Glutamine is the most abundant amino acid in serum and represents a fundamental source for nucleotide and amino acid synthesis. “Glutamine addiction”, which is characterized by poor cancer cell survival in the absence of glutamine, has been observed in several cancer entities [12]. Glutamine acts as a nitrogen donor for nucleotide and protein synthesis, and is converted via glutaminase to glutamate, which represents the main nitrogen donor for the synthesis of nonessential amino acids [13]. Furthermore, glutamine has been described as an essential activator of the mammalian target of rapamycin complex 1 (mTORC1), which regulates protein translation, cell growth and autophagy [14]. Glutamine is transported by several families of amino acid transporters, of which ASC amino-acid transporter 2 (ASCT2), also named solute carrier family 1 member 5 (SLC1A5), belongs to the most ubiquitously expressed glutamine transporters in human cancer cells [15]. Apart from glutamine metabolism, serine and glycine metabolism are also important mediators in cancer cell development. Serine and glycine are biosynthetically linked, and together provide essential precursors for the synthesis of proteins, nucleic acids, and lipids that are crucial to cancer cell growth. Serine hydroxymethyltransferase (SHMT) reversibly converts serine to glycine, connecting the serine and glycine pathways. Glycine is required to maintain the cellular redox balance and also sustains oxidative phosphorylation in the mitochondria [16]. 
It has been shown that glycine uptake and catabolism are able to promote tumorigenesis and malignancy, suggesting that serine and glycine metabolism could be a target for therapeutic intervention [17].

Nevertheless, the criteria used to evaluate tumor metabolism are still not well established and thus are not universally applied. Also, it is unclear mechanistically how metabolic characteristics of the tumor influence patient outcome and how they can be utilized in the clinical management of tumors. Therefore, it is necessary to obtain a better understanding of molecular mechanisms underlying the heterogeneity of breast cancer metabolism. Transcriptional profiling of genes associated with cancer metabolism has to some extent identified associations with different clinical features [18]. However, the breast cancer transcriptome does not directly translate into the proteome, and comprehensive analysis of messenger RNA (mRNA) expression does not reflect all layers of biological complexity [19, 20]. Thus, a systematic study of protein expression profiles related to major metabolic pathways may facilitate a more precise classification and exploration of prognostic markers in breast cancer.

During recent years, reverse phase protein array (RPPA) has emerged as a powerful high-throughput approach for targeted proteomics [21, 22]. RPPA allows the quantification of protein expression profiles in large sample sets while requiring very low amounts of biological sample. Therefore, the RPPA platform is ideally suited for the analysis of clinical materials and biomarker discovery purposes [23,24,25].

With respect to the current focus on precision medicine, the identification of novel therapeutic proteins and prognostic biomarkers is critical for future clinical drug discovery and patient stratification purposes. The objective of this study was to assess the relationship between protein profiles of major metabolic pathways and their prognostic value in patients with breast cancer treated in accordance with national guidelines. We applied RPPA-based functional proteomics to a large number of patient samples from a multicenter prospective cohort. We assessed clusters of breast cancer subgroups based on metabolism-associated protein expression. Furthermore, we aimed to identify new markers and prognostic factors associated with patient outcome.

Methods

Patients and tissue samples

Human primary breast cancer samples were collected at the Martin-Luther University, Halle-Wittenberg between 2009 and 2011 as part of the multicenter prospective PiA trial (NCT 01592825). Only fresh frozen tissue samples from female patients with operable non-metastasized breast cancer were included. The study was approved by the ethics committee of the Martin-Luther University Halle-Wittenberg and informed consent was obtained from each patient. A cohort of 801 primary tumor tissue samples was investigated using RPPA. Tumor specimens were fresh frozen after surgery and stored at −80 °C until further use. Tumor content was verified by histopathological assessment. Clinicopathological parameters were obtained for each patient and documented using SPSS 22 (SPSS Inc., Chicago, IL, USA). The TNM staging system was used [26]. Patient information was anonymized prior to analysis. Receptor-defined breast cancer subtypes were determined according to the St. Gallen classification [27]. Because Ki-67 values were missing, we used histopathological grading to assess cell proliferation [28]. The following stratification system was applied:

  • Luminal A-like: estrogen receptor (ER) positive and/or progesterone receptor (PgR) positive, HER2 negative, grade 1 or 2.

  • Luminal B-like (HER2 negative): ER positive and/or PgR positive, HER2 negative, grade 3.

  • Luminal B-like (HER2 positive): ER positive and/or PgR positive, HER2 positive, all grades.

  • HER2 positive (non-luminal-like): ER negative and PgR negative, HER2 positive, all grades.

  • Triple negative breast cancer (TNBC): ER negative, PgR negative, HER2 negative, all grades.
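
The stratification rules listed above can be written as a small decision function. The following Python sketch is illustrative only: the function name `receptor_subtype` and its boolean inputs are assumptions made for this example and are not part of the study's analysis code.

```python
def receptor_subtype(er_pos: bool, pgr_pos: bool, her2_pos: bool, grade: int) -> str:
    """Assign a receptor-defined subtype, using grade as the proliferation surrogate."""
    if grade not in (1, 2, 3):
        raise ValueError("grade must be 1, 2 or 3")
    # "ER positive and/or PgR positive" defines the luminal-like groups
    hormone_receptor_pos = er_pos or pgr_pos
    if hormone_receptor_pos:
        if her2_pos:
            return "Luminal B-like (HER2 positive)"  # all grades
        # HER2 negative: grade separates luminal A-like from luminal B-like
        return "Luminal A-like" if grade in (1, 2) else "Luminal B-like (HER2 negative)"
    # ER negative and PgR negative from here on, all grades
    return "HER2 positive (non-luminal-like)" if her2_pos else "TNBC"


print(receptor_subtype(er_pos=True, pgr_pos=False, her2_pos=False, grade=3))
```

Grade replaces Ki-67 here only because the text states that Ki-67 values were missing; with Ki-67 available, the luminal A-like versus luminal B-like split would be defined differently.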

The standardized definitions for efficacy endpoints (STEEP) criteria were used as endpoint definitions [29]. Additional information on patient and tumor characteristics is provided in Table 1.

Table 1 Patient and tumor characteristics

Reverse phase protein array profiling

Frozen tumor specimens were homogenized using a bead mill and tissue protein extraction reagent (50 mM Tris, pH 8.5, 138 mM NaCl, 2.7 mM KCl, 1% Triton X-100). Total protein concentration was determined by bicinchoninic acid protein assay (Thermo Scientific). Tumor lysates were adjusted to a total protein concentration of 2 μg/μl, mixed with 4 × SDS sample buffer (10% glycerol, 4% SDS, 10 mM DTT, 125 mM Tris–HCl, pH 6.8) and denatured at 95 °C for 5 min. Protein lysates and dilution series of tumor sample pools serving as controls were spotted as technical triplicates on nitrocellulose coated glass slides (Oncyte Avid, Grace-Biolabs) using an Aushon 2470 contact spotter (Aushon BioSystems). Post spotting, slides were incubated with blocking buffer (Rockland Immunochemicals) in TBS (50%, v/v) containing 5 mM NaF and 1 mM Na3VO4 for 2 h at room temperature. Incubation with target-specific primary antibodies was carried out at 4 °C overnight. Primary antibodies were selected to cover a range of metabolic pathways and to achieve a broad perspective on breast cancer metabolism (Additional file 7: Table S5). Antibody validation was carried out as previously described [30]. Primary antibodies were detected with Alexa Fluor 680 F(ab')2 fragments of goat anti-mouse IgG or anti-rabbit IgG (Life Technologies) in 1:12000 dilution. In addition, representative slides were stained for total protein quantification using Fast Green FCF protein dye as described before [31]. TIFF images of all slides were obtained at an excitation wavelength of 685 nm and at a resolution of 21 μm using an Odyssey Scanner (LI-COR, Biosciences). Signal intensities of individual spots were quantified using GenePixPro 7.0 (Molecular Services Inc.). Data preprocessing and quality control were performed using the RPPanalyzer R-package [32].

Immunohistochemical analysis

Immunohistochemical analyses (IHC) were performed on 4-μm tissue sections. Protein expression was assessed using the Bond Max Polymer Refine Immunohistochemistry protocol. Primary antibodies were diluted 1:250. Epitope retrieval was performed with Bond Epitope Retrieval Solution for 30 min at pH 6, followed by a peroxidase block. Primary antibody was incubated for 20 min and detected using Bond Polymer Refine Detection with 3,3'-diaminobenzidine (DAB) substrate. IHC staining was evaluated by a pathologist using a semi-quantitative visual score, based on the fraction of cytoplasmic staining above background.

Statistical and bioinformatic analyses

Hierarchical clustering

Hierarchical cluster analysis was performed on z scores of protein expression levels using Ward's minimum variance method and squared Euclidean distance. Patient samples and protein targets were clustered simultaneously and the resulting dendrograms were visualized with a heatmap depicting z score values. RPPanalyzer R-package was used for visualization, with adjustment of color bars according to the clinicopathological features of interest and exploiting the dendextend R-package for dendrogram color-coding [32, 33].
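
As an illustration of the clustering criterion named above, the following Python sketch standardizes values to z scores and then merges clusters greedily by Ward's minimum variance criterion, whose merge cost is a size-weighted squared Euclidean distance between cluster centroids. It is a minimal, naive sketch using only the standard library. The function names are hypothetical, the use of the sample standard deviation is an assumption, and it is not the RPPanalyzer implementation used in the study.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize one protein's expression values (sample SD is an assumption)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def ward_merge_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    na, nb = len(a), len(b)
    centroid_a = [mean(col) for col in zip(*a)]
    centroid_b = [mean(col) for col in zip(*b)]
    sq_dist = sum((x - y) ** 2 for x, y in zip(centroid_a, centroid_b))
    return na * nb / (na + nb) * sq_dist


def ward_clusters(rows, k):
    """Merge the cheapest pair of clusters repeatedly until k clusters remain."""
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_merge_cost([rows[m] for m in clusters[i]],
                                       [rows[m] for m in clusters[j]])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Toy data: five samples described by two standardized protein values each
toy = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [10.0, 0.0]]
print(ward_clusters(toy, 3))
```

Applying the same routine to the transposed matrix clusters protein targets instead of samples, which corresponds to the simultaneous clustering of rows and columns described above.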

Univariate analysis

The relationship between clinicopathological variables and the three patient clusters was evaluated using analysis of variance (ANOVA), the Kruskal-Wallis rank sum test, and Fisher's exact test, as appropriate. The relationship between the variables and the patient groups, stratified based on the median expression of a protein, was evaluated using the t test, Wilcoxon rank sum test, and Fisher's exact test, as appropriate.
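
The median-based stratification can be sketched as follows. The function name `median_split` is hypothetical, and the handling of values equal to the median is an assumption: the text does not state it, and assigning such values to the "low" group is simply one choice that yields 400 "high" and 401 "low" samples for a cohort of 801 distinct values.

```python
from statistics import median


def median_split(expression):
    """Label each sample 'high' if strictly above the cohort median, else 'low'."""
    cutoff = median(expression)
    # Assumption: samples exactly at the median are placed in the 'low' group
    return ["high" if value > cutoff else "low" for value in expression]


labels = median_split([0.2, 1.4, -0.3, 0.9, 0.5])
print(labels)
```

The resulting two groups are then the input for the two-sample tests named above (t test, Wilcoxon rank sum test or Fisher's exact test, depending on the variable type).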

Survival analysis

Kaplan-Meier analysis of overall survival (OS) and recurrence-free survival (RFS) was performed on patients stratified into groups (based on receptor-defined subtypes, median expression level or patient dendrogram clusters). The difference in Kaplan-Meier curves was tested using the log-rank test implemented in the survival R-package [34]. Univariate Cox proportional hazard regression models were applied to test individual protein target association with OS and RFS [35]. For each target the exponent of the estimated regression coefficient is reported as a hazard ratio (HR) with its 95% confidence intervals (CI). P values were adjusted for multiple testing resulting in false-discovery rate (FDR) values [36]. Univariate Cox proportional hazard regression models were further used to evaluate clinicopathological variables. Multivariate Cox analyses were then performed on selected non-correlated clinicopathological covariates for each of the proteins that was significant in the univariate Cox analysis.
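
The reporting steps described above, turning a Cox regression coefficient into a hazard ratio with a 95% confidence interval and adjusting p values for multiple testing, can be sketched in a few lines of Python. Two assumptions are made here: the confidence interval is the usual Wald interval based on the standard error of the coefficient, and the adjustment is the Benjamini-Hochberg procedure, which the text does not name explicitly. The function names are hypothetical and the example numbers are invented for illustration.

```python
import math


def hazard_ratio_ci(beta, se, z=1.959964):
    """Return (HR, lower, upper) from a Cox coefficient and its standard error.

    Assumes a Wald interval; z is the 97.5% quantile of the standard normal.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p values (FDR), returned in the input order."""
    n = len(p_values)
    # Walk from the largest to the smallest p value, keeping a running minimum
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for steps_from_top, idx in enumerate(order):
        rank = n - steps_from_top  # 1-based rank in ascending order
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Invented example values, not results from the study
hr, lower, upper = hazard_ratio_ci(beta=0.35, se=0.12)
print(round(hr, 2), round(lower, 2), round(upper, 2))
print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```

A hazard ratio above 1 with a confidence interval that excludes 1 indicates that higher expression is associated with a higher event rate under the proportional hazards assumption.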

STRING visualization

The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database (Version 10) of the STRING Consortium was used for visualization of protein interactions by choosing Gene IDs corresponding to proteins [37]. If not otherwise stated, the data were analyzed using the R statistical computing environment (version 3.0.2) [38]. A p value <0.05 was considered statistically significant. The complete RPPA data matrix with corresponding clinicopathological parameters is appended in Additional file 8: Table S6.

Results

Unsupervised clustering of protein expression profiles in patients with breast cancer

To investigate altered expression patterns of metabolism-related proteins in tumorigenesis of BC, we performed RPPA of 801 patient specimens. The clinicopathological features of the cohort are summarized in Table 1. The median follow up of the cohort was 55.44 months for overall survival (OS) and 54.46 months for recurrence-free survival (RFS). In a first step, the patient profiles of 37 metabolism-related proteins were assessed by unsupervised hierarchical clustering. As illustrated in Fig. 1, clustering divided the cohort into two patient subgroups (green, n = 440; violet, n = 361).

Fig. 1

Unsupervised clustering of protein profiles. The heatmap represents expression levels of 37 metabolism-related proteins after unsupervised hierarchical clustering. The dataset consists of 801 tumor specimens. The z scores of log2-transformed protein expression levels are color-coded on a low-to-high scale (green-black-magenta). Dendrogram branches divide the patient set into a green and violet cluster and protein targets into a “diffuse” and “compact” cluster. Annotation bars include receptor-defined subtypes (a), histological grade (b), histology (c), menopausal status (d), nodal status (e) and T stage (f). HER2, human epidermal growth factor receptor 2; TNBC, triple negative breast cancer

To elucidate the potential association with survival in the two subgroups, we performed Kaplan-Meier analysis (Additional file 1: Figure S1). We observed no significant association with OS or RFS. However, a distinct horizontal partition seemed to be a more dominant feature of the heatmap. A separation into two protein expression subgroups indicated functional differences throughout the whole patient cohort. Therefore, we divided the protein dendrogram into two protein subgroups, a “diffuse” cluster (blue, n = 19), characterized by a heterogeneous protein expression pattern, and a “compact” cluster (gold, n = 18) with a clear protein expression pattern. Notably, the impact of the compact protein cluster in driving the initial clustering and formation of the two patient clusters seemed to subdue the effects of the diffuse cluster. Therefore, we focused on re-investigating the diffuse protein cluster separately.

Diffuse protein signature revealed three patient clusters significantly associated with survival

Hierarchical clustering of the 19 protein targets representing the diffuse cluster resulted in three refined patient clusters based on the dendrogram arrangement (Fig. 2a).

Fig. 2

Unsupervised clustering and analyses based on “diffuse” cluster refinement (a). The heatmap represents metabolism-related protein expression levels of the diffuse target signature after unsupervised hierarchical clustering of 801 tumor specimens. The z scores of log2-transformed protein expression levels are color-coded on a low-to-high scale (green-black-magenta). Annotation bars include receptor-defined subtypes (a), histological grade (b), histology (c), menopausal status (d), nodal status (e) and T stage (f). Statistical analysis of the three patient clusters (blue, yellow, brown) is shown in Additional file 3: Table S1. Kaplan-Meier curves represent the proportion of overall survivors (b) and recurrence-free survivors (c), compared by log-rank test

The blue (n = 242), yellow (n = 89) and brown (n = 470) clusters were compared in survival analysis of OS and RFS. This revealed a significant difference among the clusters in both OS (p = 0.023, Fig. 2b) and RFS (p = 0.0071, Fig. 2c). The blue cluster showed the most favorable OS and RFS, whereas the yellow cluster showed the worst. Clinical parameters (age, tumor size, histology, T stage, grade, node status, menopausal status and receptor-defined status) were examined for differences in distribution between the patient clusters (Additional file 3: Table S1). Univariate comparison across the patient groups showed that all clinical parameters were significantly different between the three clusters (p ≤ 0.05). Furthermore, multivariate analysis was conducted based on selected clinicopathological covariates. The univariate associations between the clusters and OS/RFS were significant, but they were no longer significant in the multivariate analysis once additional clinical covariates were included.

The proteomic network of the diffuse and compact cluster

To show the biological context of the proteins representing the diffuse and compact clusters at a glance, we displayed them in two protein networks using the STRING database (Additional file 2: Figure S2). As part of the diffuse cluster, we observed all proteins related to glycine synthesis (SHMT2), lipid and fatty acid synthesis (FASN, STARD10, ACC and SREBF1), and glycolysis and lactate production (GLUT1, GAPDH, PKM2, LDHA and LDHB). The compact cluster, in comparison, is composed of all measured proteins associated with serine synthesis (PHGDH, PSAT1 and PSPH). Pyruvate kinase isozyme M1 (PKM1) was the only glycolysis protein represented in the compact cluster. Proteins related to the tricarboxylic acid (TCA) cycle, urea cycle and glutaminolysis were found in both the diffuse and the compact cluster.

Correlation between individual target expression and clinicopathological characteristics

In order to identify individual proteins responsible for the survival association of the diffuse cluster, and to evaluate their potential role as biomarkers, we next analyzed the expression of all probed proteins individually. The association between each protein expression level and OS and RFS was tested using univariate Cox proportional hazard regression models and protein expression was treated as a continuous variable (Additional file 4: Table S2). Out of 37 metabolism-related proteins tested, SHMT2 and ASCT2 were found to be significantly associated with OS (Table 2). Univariate Cox analysis of RFS identified 6 out of 37 proteins to be significantly associated with outcome (Table 3).

Table 2 Protein targets significantly associated with overall survival (OS)
Table 3 Protein targets significantly associated with recurrence-free survival (RFS)

Patients with breast cancer were further stratified into “low” and “high” protein expression groups to explore the relationship with clinicopathological variables. This was based on the median protein expression of SHMT2, ASCT2, GAPDH, FH, CAD and PKM2 (Additional file 5: Table S3). Univariate analysis showed that all six protein expression profiles were significantly associated with tumor size, T stage, grade, nodal status and receptor-defined subgroups. Except for CAD, all proteins showed significant association with histology results, whereas PKM2 protein expression was the only protein profile significantly correlated with age. No significant difference between the protein expression profiles and menopausal status was observed.

SHMT2 and ASCT2 protein expression as independent prognostic factors in patients with breast cancer

To further confirm our findings, multivariate Cox analyses for OS and RFS were conducted based on selected clinicopathological covariates and univariate significance. Proteins that were significant in the univariate Cox analysis were included (Additional file 6: Table S4).

To address whether SHMT2 and ASCT2 protein expression are independent prognosticators for OS and RFS, we analyzed the association between SHMT2/ASCT2 protein expression levels and clinical characteristics of BC, using multivariate Cox models. This revealed that high SHMT2 protein expression is an independent negative prognostic factor for OS (p = 0.011; Table 4) and both high SHMT2 and high ASCT2 protein expression levels are independent negative prognostic factors for RFS (SHMT2, p = 0.003; ASCT2, p = 0.042; Table 5) in patients with BC. Kaplan-Meier survival estimates, based on dichotomized protein expression data, subsequently confirmed that patients with BC with high SHMT2 and high ASCT2 protein expression had significantly shorter OS (SHMT2, p < 0.001; ASCT2, p = 0.0165) and RFS (SHMT2, p < 0.001; ASCT2, p < 0.001) (Fig. 3).

Table 4 Univariate and multivariate Cox regression analysis of overall survival
Table 5 Univariate and multivariate Cox regression analysis of recurrence-free survival
Fig. 3

Kaplan-Meier survival estimates and boxplot representation of key targets associated with survival. Kaplan-Meier plots of serine hydroxymethyltransferase 2 (SHMT2) and ASC amino-acid transporter 2 (ASCT2) for overall survival (OS) (a), and recurrence-free survival (RFS) (b). Differences in outcome between the high (n = 400) and low (n = 401) expression groups were compared by log-rank test. Boxplots represent the relative target protein expression per receptor-defined subtype, luminal A-like (n = 510), luminal B-like human epidermal growth factor receptor 2-negative (HER2-neg) (n = 104), luminal B-like HER2-positive (HER2-pos) (n = 74), HER2-pos (n = 36), triple negative breast cancer (TNBC) (n = 77). FDR, false discovery rate; pval, p value

We additionally explored the distribution of SHMT2 and ASCT2 protein expression across BC subtypes. This identified higher protein expression of both targets in the aggressive HER2-positive and TNBC subtypes, in comparison to the luminal subgroup (Fig. 3). Representative cases for SHMT2 and ASCT2 immunostaining were selected on the basis of RPPA protein expression analysis, and IHC revealed a pattern of cellular target protein expression consistent with the RPPA results. Cases with high target protein expression in RPPA also showed high cellular target protein expression in IHC and vice versa (Fig. 4). Taken together, these results illustrate the prognostic value of profiling proteome data and highlight the importance of including the proteomic level in biomarker research.

Fig. 4

Representative immunoexpression of serine hydroxymethyltransferase 2 (SHMT2) and ASC amino-acid transporter 2 (ASCT2). Cases were selected on the basis of reverse phase protein array (RPPA) protein expression results. Representative pictures of the highest or lowest 10% of cases based on the target expression over all cases. SHMT2 immunoexpression is elevated in Case M571 and low in Case M1084. ASCT2 immunoexpression is elevated in Case M1199 and low in Case M907. The scale bar indicates 200 μm (×20)

Discussion

Since breast tumors are heterogeneous at the molecular level and in outcome, future clinical management should include personalized tumor characterization, which leads to improved treatment decisions. So far, the metabolic state of tumors has not been studied widely and is insufficiently embodied by current molecular biomarkers that predict adverse clinical outcome. Moreover, large cohort studies have addressed metabolic biomarkers at the genomic level only. Even though genomic information does translate to a certain extent into phenotypic characteristics, genomic and transcriptomic screens of tumors in patients with BC have proven unsuccessful in predicting protein status [39, 40]. Therefore, a complementary study investigating the metabolic landscape of breast cancer at the proteomic level is beneficial in identifying proteome-based biomarkers with clinical impact.

In this study, we used RPPA to generate protein expression data from 801 breast tumor specimens, in order to identify prognostic markers and to gain insights into the metabolic heterogeneity of breast cancer. Clustering analysis of the whole dataset separated the metabolism associated proteins into a diffuse and a compact subgroup, indicating different metabolic profiles. STRING visualization of the protein distribution depicted a prominent role of glycolysis towards lactate production in the diffuse cluster. Also, SHMT2, primarily responsible for synthesizing glycine from serine, was a part of the diffuse cluster, whereas all proteins of the serine pathway (PHGDH, PSAT1 and PSPH) were distributed in the compact cluster. Interestingly, key enzymes of the TCA cycle, crucial for citrate production, like FH and SDHA, were present in the diffuse cluster, whereas IDH1 and IDH2, which mainly drive the TCA cycle towards α-ketoglutarate production, were part of the compact cluster. This observation was supported by the presence of PDH, STARD10 and FASN and hints towards distinct citrate production in order to fuel the lipid and fatty acid synthesis. Notably, the glutamine transporters ASCT2 (SLC1A5) and SLC7A5, and the glutamine producing enzyme GLUL, were also part of the diffuse cluster. Altogether, the protein composition of the diffuse cluster hints towards glucose consumption, glutamine addiction and glycine production and indicates a more active Warburg-like characteristic in comparison to the compact cluster [41].

Subsequent clustering analysis based on the diffuse-cluster protein levels identified three patient clusters, which are significantly associated with survival on univariate analysis. These three patient clusters do not fully reflect the receptor-defined subtypes and may thus provide additional information towards understanding breast cancer heterogeneity. Interestingly, further analysis showed that all proteins found to be significantly correlated with survival were part of the diffuse cluster. On univariate analysis, SHMT2, ASCT2, GAPDH, FH, CAD and PKM2 were associated with RFS, and SHMT2 and ASCT2 were associated with OS. Further, we explored the biomarker potential of these proteins in multivariate Cox analyses.

Multivariate analysis identified SHMT2 and ASCT2 protein expression levels as significantly associated with age, nodal status and T stage (RFS only). Moreover, high SHMT2 protein levels were significantly associated with poor RFS and OS. High protein expression of ASCT2 was significantly correlated with poor RFS. Patients with HER2-positive and hormone receptor-negative breast cancer had increased SHMT2 and ASCT2 levels compared to patients with luminal-like cancer. Notably, the significant correlation between ASCT2 or SHMT2 and nodal status, T stage and survival indicates a connection between higher metabolic activity, with its associated protein expression, and metastatic or more advanced tumors. These observations are in line with studies reporting that the metabolic demands of cancer cells are related to their cell size, progression and protein synthesis rates [42].

Glutamine metabolism is considered to be a therapeutic target, as some cancer cells exhibit high uptake of and addiction to this non-essential amino acid [41]. Recent studies have demonstrated that the primary glutamine transporter, ASCT2, can promote tumor cell survival, growth and cell cycle progression in neuroblastoma, colorectal cancer, prostate cancer, clear-cell renal cell carcinoma and non-small cell lung cancer [43,44,45,46,47]. The ubiquitous tissue expression, along with its ability to transport crucial amino acids, indicates the central role of ASCT2 in physiological processes including glutamine homeostasis, embryogenesis and retroviral infection [48, 49]. Glutamine is not only an important nutrient for cancer cell survival, but also a crucial mediator for immune cell functions. ASCT2 was shown to be involved in inflammatory T cell responses, which might exert key functions in tumor immunity [50]. Besides its significance in prognosis, ASCT2 has also gained more attention in cancer treatment during recent years. ASCT2 is considered to be a major regulator of glutamine metabolism and thus represents an important regulator of cancer development [51]. ASCT2 also regulates the cellular nutrient uptake and concentration [52, 53]. Several studies indicate that blocking glutamine uptake might be an attractive strategy for cancer therapy. We showed that high protein levels of ASCT2 are correlated with unfavorable prognosis. Blocking the glutamine uptake by utilizing ASCT2 as a potential therapeutic target and reducing its protein expression, could therefore be a promising approach.

Besides glutamine, serine and glycine metabolism is also crucial in cancer cell development. Serine and glycine are biosynthetically linked, and besides cancer growth also affect the cellular antioxidative capacity, thus supporting tumor homeostasis. SHMT2 has been implicated as an essential factor in serine and glycine metabolism in several cancer cell types, including breast cancer [17]. SHMT2 catalyzes the reversible reaction of serine and tetrahydrofolate to glycine and 5,10-methylene tetrahydrofolate. Genomic studies have shown that high levels of glycine are associated with poor prognosis in breast cancer, irrespective of the ER status [54]. We could demonstrate that high protein levels of SHMT2 are correlated with poor outcome. Inhibition of glycine synthesis by reducing SHMT2 protein expression could therefore represent a promising strategy to employ SHMT2 as a potential therapeutic target. Considering the metabolic heterogeneity of breast cancer, SHMT2 and ASCT2 might be useful as potential markers in risk stratification and targets for drug development. Notably, to date there are no SHMT2 and ASCT2 inhibitors commercially available for cancer therapy. To our knowledge this is the first study to report the prognostic value of SHMT2 and ASCT2 at the protein expression level in patients with breast cancer.

Although our study revealed the clinical significance of SHMT2 and ASCT2 in breast cancer, some limitations warrant further investigation. For instance, the molecular mechanisms and functional behavior of SHMT2 and ASCT2 in breast cancer merit further exploration. Furthermore, investigations in an independent external cohort are needed to validate our findings. In future studies, we seek to further investigate the mechanisms discussed here and will conduct long-term follow up of the patient cohort to monitor the prognostic power of our results.

Conclusions

In this newly generated breast cancer dataset, we identified metabolism-associated proteins linked to breast cancer progression. We found metabolic clusters of breast cancer, characterized by differences at the proteomic level. In particular, proteins mapping to the diffuse cluster were found to be associated with poor prognosis. Univariate and multivariate analyses supported the crucial role of SHMT2 and ASCT2 protein expression as independent prognostic factors in breast cancer. High protein expression of SHMT2 and ASCT2 was significantly associated with shorter RFS. Moreover, high SHMT2 protein expression was also a predictor for shorter OS. In summary, SHMT2 and ASCT2 protein expression were identified as novel potential prognostic biomarkers for patients with breast cancer, as their high protein expression is associated with poor outcome.