Introduction

Gastric cancer (GC), globally recognized as the most common malignancy affecting the digestive system, afflicts approximately one million individuals worldwide each year, as documented in reference [1]. The prognosis for GC patients remains bleak, with survival rates beyond five years from diagnosis hovering below 30% [2]. This dismal prognosis is primarily attributed to advanced cancer stages coupled with a hostile tumor microenvironment (TME) [3].

Glycolysis serves as the primary energy source for tumor cells. Notably, elevated glycolytic activity in these cells has been linked to poorer outcomes and drug resistance across various cancer types, GC being no exception [4]. Numerous investigations have explored the intricate interplay between tumor glycolysis and the TME [5,6,7]. Glycolysis enables tumor cells to produce lactic acid, leading to TME acidification. This acidic environment promotes tumor cell proliferation, invasion, and migration [8]. Furthermore, metabolic reprogramming has been implicated in fostering an immunosuppressive TME [9]. Extensive research has been conducted on the relationship between glycolysis and various tumor types, including liver cancer [10], glioblastoma [11], and endometrial cancer [12]. In this study, we delved into the role of glycolysis-related genes (GRGs) in prognostic prediction and their association with immune microenvironment traits in gastric cancer patients.

Materials and methods

Acquisition of the mRNA expression dataset

The Cancer Genome Atlas (TCGA) database (accessible at https://portal.gdc.cancer.gov/) [13] provided the transcriptome data, quantified as fragments per kilobase million (FPKM), alongside corresponding clinical details. This dataset encompassed expression profiles of 407 samples, comprising 32 normal tissues and 375 tumor tissues. Additionally, we sourced the GSE84437 [14] and GSE13763 [15] datasets from the Gene Expression Omnibus (GEO) platform (https://www.ncbi.nlm.nih.gov/geo/). These datasets, which included microarray data generated using the Illumina HumanHT-12 V4.0 expression bead chip, served as our validation cohort along with accompanying clinical information. Utilizing the appropriate annotation files, namely GPL6947 and GPL570, we converted probe IDs to gene symbols and computed the mean expression values for multiple probes targeting the same gene. The data underwent standardization and processing with R software (version 3.5.1) and its Bioconductor components. To assess the correlation between gene expression and immunotherapy response, we referenced data from the immunotherapy cohort PRJEB25780 (accessible at https://www.ebi.ac.uk/ena/browser/home) [16]. For gene expression analysis, we employed the limma and ggpubr packages in R. Furthermore, we used the pROC package in R to generate ROC curves, aiming to predict gastric cancer (GC) patients’ responsiveness to anti-PD-1 therapy.

Analysis of differentially expressed glycolysis-related genes (GRGs)

Utilizing the keyword “glycolysis,” we retrieved five glycolysis-related sets—GO_GLYCOLYTIC_PROCESS, KEGG_GLYCOLYSIS_GLUCONEOGENESIS, BIOCARTA_GLYCOLYSIS_PATHWAY, HALLMARK_GLYCOLYSIS, and REACTOME_GLYCOLYSIS—from the Molecular Signatures Database (MSigDB) accessible at https://www.gseamsigdb.org/gsea/msigdb/ind-ex.jsp [17]. These sets collectively encompassed genes deemed glycolysis-related genes (GRGs). Subsequently, within the TCGA-STAD cohort data, a total of 326 GRGs were pinpointed. The differentially expressed genes (DEGs) were determined using the “edgeR” R package [18], specifying a false discovery rate (FDR) threshold of < 0.05 and an absolute log fold-change (|logFC|) exceeding 1. To further explore the potential functions and associated pathways of these differentially expressed GRGs, gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were conducted utilizing the “ClusterProfiler” R package [19].

Consensus clustering analysis of GRGs

The R package “ConsensuClusterPlus” [20] was utilized to evaluate the expression patterns of 326 GRGs for consensus clustering. To guarantee classification consistency, a thousand permutations were performed. A consensus heatmap, alongside the cumulative distribution function (CDF), aided in determining the optimal k value.

For analyzing the biological functions among the subgroups, gene set variation analysis (GSVA) was executed by employing the “GSVA” R package. The enrichment outcomes were subsequently visualized through a heatmap generated by the “heatmap” R package. Utilizing the “limma” R package, statistically significant differences were established at an adjusted p-value of less than 0.05 and a false discovery rate (FDR) below 0.25. Both gene set enrichment analysis (GSEA) and gene set variation analysis (GSVA) used the Kyoto Encyclopedia of Genes and Genomes (KEGG) gene set collection (c2.cp.kegg.v7.1.symbols.gmt) as their input file.

Construction of a glycolysis-related risk model

To identify survival-associated GRGs, a univariate Cox regression analysis was conducted. Subsequently, the “glmnet” R package [21] was employed to dimensionality reduction, leveraging the least absolute shrinkage and selection operator (LASSO) method specifically for the survival-linked GRGs. A glycolysis-related risk model was then constructed using the formula: \(\:Risk\:score={\sum\:}_{i=1}^{n}Expi\:\beta\:i\), where Exp denotes the expression levels of the genes and β represents the regression coefficient [22]. Based on the median risk score, Kaplan-Meier curves were plotted to assess the prognostic outcomes of patients across different groups. Additionally, survival ROC curves were generated using the “Survival ROC” R package [23].

Development of a nomogram based on GRGs and clinical features

To evaluate the correlation between GRG expression and various clinical attributes, including age, gender, clinical stage, grade, tumor-node-metastasis (TNM) classification, and G grade, a univariate Cox regression analysis was executed. Subsequently, a multivariate Cox regression analysis was conducted to identify those prognostic variables that were independently significant.

Incorporating both GRG expression levels and clinical characteristics, a nomogram scoring framework was established. This nomogram was further validated through both univariate and multivariate Cox regression analyses, aimed at pinpointing prognostic indicators. Each factor within the nomogram was assigned a corresponding score, and these scores were then aggregated. Calibration curves were employed to assess the nomogram’s predictive accuracy, while the concordance index (C-index) served as a metric to quantify its discriminatory power.

Cell culture and treatment

The human gastric cancer cell line BGC-823 was acquired from the Shanghai Cell Bank, affiliated with the Chinese Academy of Sciences, located in Shanghai, China. These cells were propagated in RPMI-1640 medium (Gibco, Grand Island, New York, USA), enriched with 10% fetal bovine serum (FBS, Gibco, Grand Island, New York, USA) and a 1% solution of penicillin and streptomycin (Beyotime, Nantong, China). The cells were nurtured in a humidified atmosphere containing 5% CO2 at a temperature of 37 °C. Passage of the cells occurred at regular intervals of 2–3 days to maintain their viability and proliferative capacity. Lentiviruses encoding shRNAs for CXCR4 (VSVG-Lentai-hU6-shRNA-CXCR4-BSD-hEF1a-3xFlag) and negative control (NC) lentivirus (VSVG-Lentai-hU6-shRNA-NC-BSD-hEF1a-3xFlag) were purchased from Shanghai Taitool Bioscience Co. (Shanghai, China). The sequences of the shRNAs targeting CXCR4 are summarized in Table 1. Cells were infected with the lentivirus and selected by using blasticidin (BSD, Sigma) 72 h after infection.

Table 1 ShRNA sequences targeting CXCR4

Quantitative real-time PCR

Total RNA was extracted utilizing TRIzol reagent obtained from Thermo Fisher Scientific (Waltham, MA, USA). Spectrophotometric quantitation of the extracted RNA followed, where 1 µg of the total RNA was diluted to a final volume of 20 µl. This diluted RNA was then reverse transcribed using the PrimeScript RT Reagent Kit (TaKaRa, Shiga, Japan), strictly adhering to the manufacturer’s instructions. Equal aliquots of the resulting complementary DNA (cDNA), which corresponded to identical RNA amounts, were prepared for mRNA quantification through RT-PCR. This quantification was performed on the Light Cycler 96 Real-time Quantitative PCR Detection System (Roche, Indianapolis, IN, USA). The reaction mixture included the respective cDNA, forward and reverse primers, and SYBR Green PCR master mix (Roche). β-Actin served as an internal control for normalization. The sequences of the primers used in this study are comprehensively listed in Table 2.

Table 2 Primers used for real-time PCR

Glucose uptake assay

The glucose uptake assay kit (fluorometric, ab136956, Abcam) was utilized to quantify the uptake of 2-deoxyglucose (2-DG), strictly adhering to the manufacturer’s instructions. Initially, cells were seeded at a density of 2.5 × 104 cells per well in 96-well plates and incubated overnight. Subsequently, these cells were deprived of serum for an additional 24 h before being incubated in fresh complete media for 48 h. This incubation period allowed the cells to recover and resume their normal metabolic activities. Following this, the cells were exposed to a solution containing 0.5% bovine serum albumin (BSA) in phosphate-buffered saline (PBS), with or without the addition of 2-DG, for a duration of 1 h. The Fluoroskan microplate fluorometer (Thermo Fisher Scientific, San Jose, CA, USA) was then employed to measure the relative fluorescence units (RFU) at an excitation/emission wavelength of 535/587 nm. Finally, the uptake of 2-DG was accurately calculated using a standard curve generated with 2-DG-6-phosphate (2-DG6P) and the measured RFU values of the samples.

Lactate measurement assay

To assess lactate levels, an L-Lactate Assay Kit (Abcam, ab169557) was utilized. Cell extracts were oxidized in a specific manner, leading to the formation of an intermediate compound. This intermediate subsequently reacted with a colorless probe, generating fluorescence, which was quantified at an excitation wavelength of 530 nm and an emission wavelength of 590 nm using a Fluoroskan microplate fluorometer (Thermo Fisher Scientific, San Jose, CA, USA). The fluorescence intensity observed was directly proportional to the lactate concentration, expressed in nmol/ml.

Extracellular acidification rate (ECAR) assay

The extracellular acidification rate (ECAR), a key indicator of glycolysis, was measured using a dedicated assay kit (Abcam, ab197244). Cells were plated in a 96-well format at a density of 5 × 104 cells per well. Prior to the assay, 150 µL of respiration buffer was added to each well. As a control, carbonyl cyanide 4-(trifluoromethoxy) phenylhydrazone (FCCP) was added in a volume of 10 µL to either stimulate or suppress glycolytic activity. Subsequently, 10 µL of the reconstituted glycolysis assay reagent was introduced, and the ECAR signal was monitored at intervals of 1.5 min for a duration of at least 120 min. The Fluoroskan microplate fluorometer (Thermo Fisher Scientific, San Jose, CA, USA) was used for this measurement, employing excitation and emission wavelengths of 380 nm and 615 nm, respectively.

Wound healing assay

Cells were seeded into a six-well dish at a concentration of 2 × 105 cells per well. It was anticipated that the cells would reach 80–90% confluency. A 200 µL pipette tip was used to create consistent length lesions on cell monolayers, which were then cultured in basal medium. Following the removal of cellular debris with phosphate-buffered saline (PBS), the wounds were photographed at 0-, 12-, and 24-hours using inversion microscopy. Cell movement was assessed by calculating the changes in wound size using ImageJ software from the National Institutes of Health in Bethesda, MD, USA.

Transwell invasion assay

A diluted solution of Matrigel (1:8 ratio with a concentration of 50 mg/L) was applied to the top side of the lower membrane in the transwell chamber and left to dry at 4 °C.After being resuspended in serum-free DMEM, the cells were placed in the upper chamber. The bottom wells were filled with complete DMEM. After 24 h of incubation, the upper chamber cells were taken out. After passing through the Matrigel matrix membrane, the cells were treated with 4% paraformaldehyde and then stained using 0.1% crystal violet. Subsequently, representative pictures were captured using a microscope, and the cell count in each image was determined using ImageJ.

MTT

Cell proliferation was assessed with the MTT assay, which stands for 3-(4,5-dimethyl-2-yl)-2,5-diphenyl tetrazolium bromide. Cells were placed in 96-well dishes with 3,000 cells per well and exposed to GEM for 24 h. MTT was introduced into the solution on days 0, 1, 2, 3, 4, and 5 at a concentration of 5 mg/mL. Following a 4-hour incubation at 37 °C, the liquid was extracted, and 200 milliliters of dimethyl sulfoxide (DMSO, Sigma) were introduced to each well for dissolving. The microplate reader (Thermo Fisher Scientific, Waltham, MA, USA) was used to measure the absorbance of DMSO at 490 nm.

Subcutaneous xenograft nude mouse model

Female BALB/c thymic nude mice, aged four weeks, were housed in a pathogen-free environment at the animal laboratory of Hangzhou Ziyuan Experimental Animal Technology Co., Ltd. in Zhejiang, China. Tumor cells were injected into the left axilla of the nude mice. At the end of the experiment, the mice were euthanized using carbon dioxide, and the tumors were then excised. The mice were placed in a chamber filled with carbon dioxide gas at a flow rate of 10–30% volume displacement per minute and gradually increasing concentrations, which induces unconsciousness and ultimately leads to death. Tumor dimensions, including length (L) and width (W), were measured using calipers. Tumor volume (TV) was calculated using the formula TV = (L × W2)/2.

Immunohistochemical staining and associated analyses

Tumor tissues from 67 gastric cancer patients who received first-line immune checkpoint inhibitor therapy were collected from our institution for immunohistochemical staining. High and low expression were evaluated by staining intensity and the proportion of positive cells. Data are presented as the means ± SD, Unpaired t test for two group comparisons with Graphpad Prism 8.0.

Statistical analysis

R version 3.5.1 or GraphPad Prism 8.0 was used for all the statistical analyses. There were a minimum of three runs of each experiment. The results are shown as the average ± standard deviation (SD).

Results

Characteristics of GRGs in GC

The expression of 326 glycolysis-related genes (GRGs) in normal tissues and in the TCGA-STAD cohort was visualized via a heatmap (Fig. 1A) and a volcano map (Fig. 1B). Somatic copy number alterations were detected in a total of 73 differentially expressed genes (DEGs). A decrease in CNV (copy number variation) was detected in STMN2, NUP205, VCAN, HS2S11, DCN, and LDHAL6B, while CNV increased in VEGFA, ALDOC, EFNA3, COL5A1, NIP188, and NUP155 (Fig. 1C). The sites of CNV abnormalities in the DEGs are presented in the circle diagram (Fig. 1D). GO and KEGG pathway enrichment analyses were used to predict the potential biological effects of these DEGs. GO analysis revealed that the DEGs were clustered into metabolic GO categories such as nucleic acid transport, RNA transport, mRNA transportation, RNA localization, and nucleobase-containing molecule transport (Fig. 1E-F). KEGG analysis revealed that genes related to glycolysis, fructose and mannose metabolism, nucleocytoplasmic transport, and pyruvate metabolism were most significantly enriched in the HIF-1 signaling pathway (Fig. 1G-H).

Fig. 1
figure 1

Characteristics of GRGs in GC. (A) Heatmap showing DEGs between GC tissues and normal tissues. (B) The volcano map shows DEGs between GC and normal tissues. (C) Frequencies of CNV gain, CNV loss, and non-CNV among the GRGs. (D) Locations of CNV alterations in GRGs on chromosomes. (E-F) GO enrichment analysis of DEGs. (G-H) KEGG enrichment analysis of DEGs

Fig. 2
figure 2

Subtype identification based on GRGs. (A) The item-consensus plot shows the chosen optimal cluster number (k = 2) (cluster A, n = 291; cluster B n = 43). (B) Consensus values of the TCGA-STAD dataset. ranging from 0 to (1) (C) The corresponding relative changes in the areas under the cumulative distribution function (CDF) curves when the cluster number changed from k to k + 1. The range of k changed from 2 to 9, and the optimal k was (2) (D) Gene set variation analysis (GSVA) revealed the differentially enriched crucial pathways between clusters A and B. (E) Gene set enrichment analysis (GSEA) indicated the pathways enriched in cluster (A) (F) GSEA indicated the pathways enriched in cluster (B) (G) Survival curves of patients in cluster A and cluster B

Subtype identification based on the expression patterns of GRGs in GC

Based on the expression of 326 glycolysis-related genes (GRGs), a consensus cluster analysis was performed on 334 patients in the TCGA-STAD data set to determine whether the prognosis of GC patients was related to their expression. The heatmap of the consensus matrix showed a clear separation with minimal overlap at k = 2, suggesting that the samples could be reliably grouped based on the expression profiles of the GRGs (Fig. 2A-C). Two different groups, namely, cluster A (n = 291) and cluster B (n = 43), were obtained. The differences in survival between the two clusters were then assessed. Through the utilization of gene set variation analysis (GSVA) (Fig. 2D) and gene set enrichment analysis (GSEA) (Fig. 2E and F) on these two sets, we discovered that cluster A primarily had enriched metabolic pathways, whereas cluster B had enriched immune pathways. Therefore, cluster A was named the metabolic subtype, and cluster B was termed the immune subtype. Additionally, it was discovered that patients in cluster A, the metabolic subtype, had a more favorable prognosis compared to patients in cluster B, the immune subtype (P = 0.016, Fig. 2G).

Construction of a glycolysis-related risk model based on the GRG signature

To predict the prognosis of GC patients based on the expression of GRGs in the TCGA-STAD cohort, a signature model was developed due to the association between GRG expression and survival. Initially, OS-related GRGs were identified through univariate Cox analysis, resulting in the identification of ten genes (Fig. 3A). Subsequently, LASSO regression was utilized to prevent overfitting (Fig. 3B-C), leading to the selection of six genes: ME1, PLOD2, NUP50, CXCR4, SLC35A3, and SRD5A3. A predictive model was then established using the expression levels of these six genes, along with the calculation of their respective regression coefficients. The Risk Score was calculated using the formula: Risk Score = (-0.057 × expression of ME1) + (0.062 × expression of PLOD2) + (-0.088 × expression of NUP50) + (0.007 × expression of CXCR4) + (-0.050 × expression of SLC35A3) + (0.026 × expression of SRD5A3).

We then calculated the risk score for each patient in the TCGA-STAD cohort and stratified them into high- and low-risk groups based on the median risk score. A heatmap was utilized to visualize the expression patterns of the six genes across the two risk groups (Fig. 3D). Additionally, the risk scores of individual patients were plotted in Fig. 3E. Notably, patients in the high-risk group exhibited shorter OS (Fig. 3F). Principal component analysis (PCA) confirmed the distinct separation between the high- and low-risk groups (Fig. 3G). Kaplan‒Meier survival curves further corroborated the superior prognosis observed in the low-risk group (P < 0.05, Fig. 3H). To assess the model’s performance, we generated a receiver operating characteristic (ROC) curve (Fig. 3I), revealing AUC values of 0.712 (1-year), 0.689 (3-year), and 0.681 (5-year), indicating its satisfactory sensitivity and specificity.

To validate our risk model, we employed the GSE84437 dataset. Similar to the TCGA-STAD cohort, patients in the validation set were categorized into high- and low-risk groups based on the median risk score derived from the TCGA-STAD cohort. A heatmap was used to represent the expression profiles of the six genes in the validation set (Fig. 3J). Figure 3K and L present the risk scores and survival time distributions of individual patients, respectively. Consistent with the findings from the TCGA-STAD cohort, patients in the low-risk group of the validation set also exhibited a more favorable prognosis (Fig. 3M). The AUC values in the validation set were 0.742 (1-year), 0.738 (3-year), and 0.740 (5-year) (Fig. 3N).

In summary, our glycolysis-related risk model, based on the 6-GRG signature, holds promise as a reliable predictor of GC patient prognosis.

Fig. 3
figure 3

The construction of the GRG signature as a risk model. (A) univariate Cox regression analysis of prognostic GRGs in the TCGA-STAD cohort; (B) LASSO coefficient trends of 10 prognosis-related GRGs; (C) Elastic net regularization with partial likelihood deviance plot to determine optimal lambda; (D) Heatmap of GRG expression profiles in the prognostic signature of the TCGA-STAD cohort; (E) Risk score distribution in TCGA-STAD. (F) Survival time and status of patients in the TCGA-STAD cohort; (G) PCA analysis of patients based on prognostic signature; (H) Kaplan-Meier OS analysis of high- and low-risk patients in the TCGA-STAD cohort; (I) ROC curves for predicting the survival of patients in the TCGA-STAD cohort; (J) GRG expression heatmap in the GSE84437 dataset; (K) Risk score distribution in GSE84437; (L) Survival time and status of patients in the GSE84437 dataset.; (M) Kaplan-Meier OS analysis in GSE84437; (N) ROC curves for survival prediction in GSE84437

Establishment of a nomogram according to the glycolysis-related risk model and other risk factors

The independent prognostic factor for GC patients in the TCGA-STAD cohort was determined to be the 6-GRG signature risk score through univariate Cox analysis (HR = 1.263, P = 0.001; Fig. 4A). In the Multivariate Cox analysis, it was found that the risk score continued to be a significant predictor even when taking into account clinical characteristics (HR = 1.262, P = 0.001; Fig. 4B).Multivariate analysis of ROC curves was utilized to assess the predictive performance of the risk score in comparison to different clinical variables. In the GSE84437 cohort, univariate and multivariate COX analyses of the prognosis of GC patients showed that age, T, N stage, and risk score were independent prognostic predictors (SFigure1). Overall, the risk score was the most accurate prognostic indicator (AUC = 0.658, Fig. 4C). In line with worse outcomes in cluster B (immune subtype), individuals in cluster B were more commonly found in the high-risk category (Fig. 4D). Moreover, correlations were found between the risk score and other factors, including age, grade, and M stage (Fig. 4D). A nomogram (Fig. 4E) was ultimately created to predict 1-, 3-, and 5-year OS by considering multiple risk factors. The nomogram accurately predicted the 1-, 3-, and 5-year OS probabilities, which aligned well with the actual OS rates of patients in the TCGA-STAD cohort as demonstrated by the calibration curves in Fig. 4F.

Fig. 4
figure 4

A nomogram according to the 6-GRG signature risk score and other risk factors. (A) Univariate regression analysis of risk factors, including the 6-GRG signature risk score. (B) Multivariate regression analysis of risk factors. (C) Multi-index ROC curve of the 6-GRG signature risk score and other indicators. (D) The distributions of clinicopathological features were compared between the low- and high-risk groups. (E) Development of a nomogram to predict 1-, 3-, and 5-year OS. (F) Calibration plots assessed the agreement between nomogram-predicted and observed survival outcomes at 1-, 3-, and 5-year intervals in the TCGA-STAD cohort. The ideal performance is indicated by the 45° dashed line, whereas the actual model performances are represented by the lines. Different colored lines correspond to the results for the respective survival durations. *P < 0.05, **P < 0.01, ***P < 0.001

CXCR4 promoted the metastasis and growth of GC cells

To analyze the characteristics of the six genes included in the risk model, we investigated the mutation profiles of the six GRGs by using the “maftools” R package [24]. Figure 5 A demonstrates that mutations were infrequent in these GRGs. The mutation frequencies of PLOD2, SRD5A3, SLC35A3, NUP50, ME1 and CXCR4 were 3%, 1%, 1%, 1%, 0% and 0%, respectively. A visual representation, in the form of a heatmap, was created to demonstrate the correlation between the risk score and the expression levels of the six GRGs (Fig. 5B). In the high-risk group, ME1, PLOD2, and CXCR4 showed high expression levels, whereas NUP50 and SLC35A3 exhibited high expression levels in the low-risk group (Fig. 5C). K‒M analysis was then subsequently performed (Fig. 5D-I).High expression of CXCR4 (P = 0.002) and PLOD2 (P = 0.043) indicated a relatively poor prognosis (Fig. 5E and G), and high expression of NUP50 (P = 0.018) and SLC35A3 (P = 0.043) indicated a better prognosis (Fig. 5F and H).However, the expression of SRD5A3 and ME1 was not related to prognosis (Fig. 5D and I).

Since CXCR4 was the most significantly related gene to prognosis among these six GRGs, we further focused on the biological functions of CXCR4 in GC. Data from the TCGA-STAD cohort showed higher levels of CXCR4 expression in tumor tissues compared to normal tissues (Fig. 5J). Knockdown of CXCR4 in BGC-823 cells (Fig. 5K) was associated with reduced glucose consumption (Fig. 5L), lactate production (Fig. 5M) and ECAR (Fig. 5N-O). MTT tests indicated that reducing CXCR4 expression notably suppressed the growth of BGC-823 cells, as shown in Fig. 5P. Subcutaneous xenograft nude mouse models further confirmed that the growth of tumors was weakened by knocking down CXCR4 in vivo (Fig. 5Q and R). The wound healing assay showed that cell migration was repressed after CXCR4 was knocked down (Fig. 5S). Transwell invasion assays showed that the downregulation of CXCR4 weakened the invasion ability of BGC-823 cells.

Fig. 5
figure 5

Characteristics of the six genes recruited in the risk model. (A) The landscape of the mutation profiles of patients in the TCGA-STAD cohort. The waterfall plot shows mutation information for the GRGs. The corresponding colors indicate different mutation types. The upper barplot shows the TMB (tumor mutation burden) of each subject. The numbers on the right represent individual mutation frequencies. (B) Correlation matrix of the six prognostic GRGs and the 6-GRG signature risk score. (C) Differential expression of GRGs in the low- and high-risk groups of the TCGA-STAD cohort. (D-I) Kaplan‒Meier survival curves for the six prognostic GRGs in the TCGA-STAD cohort. (J) The expression of CXCR4 in GC and adjacent tissues in the TCGA-STAD cohort. (K) Verification of CXCR4 knockdown in BGC-823 cells. (L) Glucose consumption of BGC-823 cells after CXCR4 knockdown. (M) Lactate production in BGC-823 cells after CXCR4 knockdown. (N-O) ECAR of BGC-823 cells after knockdown of CXCR4. (P) Proliferation of BGC-823 cells after CXCR4 knockdown. (Q) Growth curve of transplanted tumors in nude mice. (R) Images and weights of nude mice bearing transplanted tumors after dissection. (S) Migration ability of BGC-823 cells after knockdown of CXCR4. (T) Invasive ability of BGC-823 cells by knockdown of CXCR4. ns, not significant, *p < 0.05, **p < 0.01, ***p < 0.001

The expression of ME1 and CXCR4 predicted immune cell infiltration and response to immunotherapy

Several studies have shown that metabolic reprogramming in cancer cells can result in the formation of an immunosuppressive environment within tumors. To explore the connection between gene regulatory genes (GRGs) and immune microenvironment modeling in gastric cancer (GC) progression, we conducted a detailed investigation. Using the “ESTIMATE” R package, we calculated and compared the ImmuneScore between different subtypes - metabolic subtype (cluster A) and immune subtype (cluster B), as well as between low-risk and high-risk groups (the 6-GRG signature risk model). Surprisingly, despite cluster B patients and those in the high-risk group having a poorer prognosis in the TCGA-STAD cohort, they exhibited significantly higher ImmuneScore levels (Fig. 6A and B), indicating a potentially beneficial immune microenvironment for individuals in the immune subtype (cluster B) and high-risk group.

To validate this hypothesis, we conducted a comparative analysis of immunological checkpoint molecule expression levels across clusters A and B. Notably, cluster B samples displayed higher expression levels of immunological checkpoint molecules, including CD274 (also known as PD-L1), compared to cluster A samples (Fig. 6C). Similarly, the expression of most immune checkpoint molecules, particularly PD-L1, was upregulated in the high-risk group compared to the low-risk group (Fig. 6D).

Subsequently, we employed the CIBERSORT algorithm [25] to analyze the distribution of twenty-two immune cell subsets. Our analysis revealed that cluster A samples contained higher proportions of M2 macrophages, activated mast cells, and eosinophils, whereas cluster B samples exhibited increased proportions of memory B cells, CD8+ T cells, regulatory T cells (Tregs), and naive B cells (Fig. 6E). Additionally, the low-risk group displayed enrichment of activated CD4+ memory T cells, follicular helper T cells, and M1 macrophages, while the high-risk group exhibited greater infiltration of Tregs, monocytes, M2 macrophages, and resting dendritic cells (Fig. 6F).

Cluster B and the high-risk group exhibited a favorable immune microenvironment and upregulated expression of immune checkpoints, including PD-L1, hinting at a potential role for GRGs in predicting immunotherapy response. To validate this notion, we accessed RNA-Seq data from the immunotherapy cohort PRJEB25780 archived at the European Nucleotide Archive (ENA) (https://www.ebi.ac.uk/ena/browser/home). This cohort comprised 45 patients with metastatic or recurrent gastric cancer treated with anti-PD-1 therapy [16]. The treatment responses were categorized as complete response (CR), partial response (PR), stable disease (SD), and progressive disease (PD). We stratified the patients into two groups: those with a confirmed response (CR/PR) and those without a clinical response (PD/SD). Subsequently, we compared the expression levels of PLOD2, SRD5A3, SLC35A3, NUP50, ME1, and CXCR4 between these two groups (Fig. 6G). Notably, ME1 expression was significantly elevated in the non-responder group, whereas CXCR4 expression was higher in the responder group (Fig. 6G).

To assess the prognostic performance of GRG expression, we employed the ROC curve analysis (Fig. 6H-O). The areas under the ROC curves for ME1 and CXCR4 were 0.727 and 0.687, respectively (Fig. 6J and M). No significant differences were observed in the expression of PLOD2 (Fig. 6L), SRD5A3 (Fig. 6N), SLC35A3 (Fig. 6I), NUP50 (Fig. 6K), or the 6-GRG signature risk score (Fig. 6H). Furthermore, we developed a scoring system based on CXCR4 and ME1 expression (score = 0.189 + 2.516×CXCR4-3.351×ME1) (Fig. 6O). This score demonstrated superior prognostic performance with an area under the ROC curve of 0.814, outperforming CXCR4 or ME1 alone (Fig. 6O).

Using the optimal threshold of CXCR4 and ME1 expression or the ME1/CXCR4 score, we dichotomized the patients into high- and low-risk subgroups (Fig. 6P-R). The expression levels of CXCR4 (Fig. 6P), ME1 (Fig. 6Q), and the ME1/CXCR4 score (Fig. 6R) significantly correlated with the immunotherapy response, further supporting the potential of GRGs in predicting immunotherapy outcomes.

To explore the underlying mechanisms of CXCR4 and ME1 expression in predicting immunotherapy response, we conducted an analysis of the immune microenvironment and immune cell infiltration based on their expression levels. Our findings revealed a positive association between CXCR4 expression and the ImmuneScore, whereas ME1 expression exhibited a negative correlation with the ImmuneScore (Fig. 6S). Correspondingly, a higher expression of CXCR4 was predictive of increased infiltration of CD8+ T cells, whereas ME1 expression demonstrated a negative relationship with CD8+ T cell infiltration (Fig. 6T). Consequently, the combination of high CXCR4 expression and low ME1 expression indicated a favorable immune microenvironment, supporting their potential role in immunotherapy prediction.

Fig. 6
figure 6

Evaluation of the immune microenvironment and response to immunotherapy in the GRG-related consensus clustering subtypes and risk groups. (A) ImmuneScore of subjectes in cluster A and B. (B) ImmuneScores of subjects in the low- and high-risk groups. (C) Immune checkpoint expression in cluster A and B. (D) Immune checkpoint expression in the low- and high-risk groups. (E) Infiltration of immune cells in cluster A and B. (F) Infiltration of immune cells in the low- and high-risk groups. (G) Expression of SLC35A3, ME1, NUP50, PLOD2, CXCR4 and SRD5A3 in groups that did or did not respond to treatment with anti-PD-1. (H-O) The ROC curve of the 6-GRG signature risk score; expression of SLC35A3, ME1, NUP50, PLOD2, CXCR4 and SRD5A3; and the ME1/CXCR4 score. (P) Relationship between the expression of ME1 and the response to treatment with anti-PD-1. (Q) Relationship between the expression of CXCR4 and the response to treatment with anti-PD-1. (R) Relationship between the ME1/CXCR4 score and response to treatment with anti-PD-1. (S) Relationship between the ImmuneScore and the expression of SLC35A3, ME1, NUP50, PLOD2, CXCR4 and SRD5A3. (T) Relationship between the infiltration of immune cells and the expression of SLC35A3, ME1, NUP50, PLOD2, CXCR4 and SRD5A3. *P < 0.05, **P < 0.01, ***P < 0.001

To verify the relationship between CXCR4 and ME1 expression levels and the immune microenvironment, we used the immunotherapy cohort of our institution to validate this result. In the immunotherapy cohort, a total of 67 GC patients treated with a PD-1 inhibitor as a first-line treatment were included in the final analysis. Figure 7A presents the representative images of CXCR4, ME1, and CD8+ T cells after immunohistochemical staining. In gastric cancer tissues with high expression of CXCR4 and low expression of ME1, the infiltration degree of CD8+ T cells was relatively higher. Conversely, in gastric cancer tissues with low expression of CXCR4 and high expression of ME1, the infiltration degree of CD8+ T cells was lower (Fig. 7B). High expression of CXCR4 was associated with abundant infiltration of CD8+ T cells (17.27 ± 5.33 vs. 37.31 ± 12.47, P < 0.001) (Fig. 7C). High expression of ME1 indicates less infiltration of CD8+ T cells (30.31 ± 15.68 vs. 22.30 ± 7.93, P = 0.0226) (Fig. 7D). Additionally, these GC patients showed significant differences in median progression-free survival (mPFS) among patients with different CXCR4 and ME1 expression levels after first-line immune checkpoint inhibitor treatment. The mPFS of gastric cancer patients with high expression of CXCR4 was superior to that of the low expression group (mPFS: 9.8 months vs. not reached, P = 0.0452) (Fig. 7E), and the mPFS of gastric cancer patients with low expression of ME1 was better than that of the high expression group (mPFS: 9.8 months vs. not reached, P = 0.0435) (Fig. 7F). These results also further illustrated that the expression levels of CXCR4 and ME1 could predict the response of gastric cancer patients to immunotherapy.

Fig. 7
figure 7

Evaluation of the relationship between different expression levels of CXCR4 and ME1 in gastric cancer tissues and the immune microenvironment and response to immunotherapy. (A) Representative images of CXCR4, ME1, and CD8+ T cells after immunohistochemical staining. (B) Representative immunohistochemical images of CD8+T cell infiltration in tumor tissues with high and low CXCR4 expression, and high and low ME1 expression. (C) Difference in the degree of CD8+T cell infiltration between the low and high CXCR4 expression groups. (D) Difference in the degree of CD8+T cell infiltration between the low and high ME1 expression groups. (E) The mPFS of patients with high expression of CXCR4 was superior to that of the low expression group (mPFS: 9.8 months vs. not reached, P = 0.0452, log-rank Mantel–Cox test). (F) The mPFS of patients with low expression of ME1 was better than that of the high expression group (mPFS: 9.8 months vs. not reached, P = 0.0435, log-rank Mantel–Cox)

Discussion

Metabolic reprogramming is one of the hallmarks of cancer [26]. To sustain continuous replication and high proliferation rates, metabolic reprogramming of cancer cells must switch their metabolic reprogramming to a ‘glycolysis-dominant’ metabolic profile. This shift promotes cell survival by meeting energy, synthesis, and redox demands, and it also renders the tumor microenvironment more conducive to cancer progression. [27, 28].

The Warburg effect, initially described by Otto Warburg in the 1920s, serves as a prime example of metabolic switching in cancer cells. This phenomenon is characterized by a shift in the metabolic machinery of cancer cells, favoring glycolysis and lactic acid fermentation over oxidative metabolism, even under normoxic conditions [29]. In physiological settings, cells in aerobic environments typically metabolize glucose through glycolysis, the tricarboxylic acid (TCA) cycle, and oxidative phosphorylation (OXPHOS). Normal cells preferentially rely on OXPHOS for oxidation as it generates a higher yield of adenosine triphosphate (ATP). However, the hypoxic tumor microenvironment, with its limited oxygen supply, prompts cancer cells to favor aerobic glycolysis [30]. Consequently, cancer cells exhibit an increased reliance on glycolysis, manifesting in elevated glucose uptake and fermentation to lactate, to fulfill the augmented anabolic demands of a malignant phenotype [31].

For nearly a century, the Warburg effect has been a subject of intense scrutiny, with numerous studies elucidating the mechanisms underlying the augmented glycolytic dependency of cancer cells. This aberrant mode of energy utilization in cancers not only serves as a source of energy but also significantly shapes the TME. By establishing a hypoxic, acidic, and nutrient-depleted environment enriched with immune-modulatory metabolites such as lactate and kynurenine, it fosters a conducive niche for tumor growth and hinders effective cancer treatment [30, 32]. Cancer cells tend to sustain the Warburg effect by mutating in order to utilize these metabolic byproducts of glycolysis. Various oncogenic proteins and pathways, such as hypoxia-inducible factor (HIF-1) and the PI3K/Akt/mTOR pathway, have been identified as playing a role in regulating the metabolic reprogramming specific to cancer cells [9].

Given the critical role of the Warburg effect in the progression of cancer, increasing research has been dedicated to revealing the mechanisms involved and identifying novel targets for clinical application. In the current study, we have characterized comprehensive changes in GRGs at both transcriptional and genetic levels in gastric cancer (GC). Utilizing 326 GRGs, we discovered two distinct molecular subtypes. Notably, patients belonging to the immune subtype, designated as cluster B, exhibited more aggressive clinicopathological features and shorter overall survival (OS) compared to those in the metabolic subtype, designated as cluster A. Additionally, both univariate and multivariate Cox regression analyses demonstrated the prognostic significance of a risk score derived from the expression levels of six specific GRGs: ME1, PLOD2, NUP50, CXCR4, SLC35A3, and SRD35A3. Survival analysis further corroborated that the high-risk group exhibited a poorer prognosis. These results suggested that the GRG signature could be helpful in guiding clinical treatment decisions.

The redirection of intermediate molecules from glycolysis and the TCA cycle towards the synthesis of nucleotides, lipids, and non-essential amino acids is crucial for generating the essential products that underlie cell proliferation and survival [30]. Furthermore, the elevated glycolysis observed in cancer cells contributes to chemo- and radio-resistance, implying that therapeutics aimed at suppressing glycolysis in these cells could serve as a promising approach to enhance the sensitivity of cancer cells to other conventional treatment strategies [9]. While numerous therapeutics targeting glycolysis in cancer cells have been developed and are currently undergoing preclinical and clinical investigations at various stages, there remains a pressing need for effective solutions specifically targeting cancer cell glycolysis [9].

In our current investigation, we established CXCR4 as a promising therapeutic target for mitigating hyperactive glycolysis in gastric cancer (GC). CXCR4, also recognized as CD184, is one of the most commonly encountered chemokine receptors. This receptor has been associated with the emergence and advancement of various malignancies, including breast cancer, melanoma, prostate cancer, and GC [33,34,35,36]. Notably, CXCR4 expression has been linked to lymph node metastasis in GC patients [37]. CXCR4 functions as a G protein-coupled receptor that binds to its ligand, CXC ligand 12 (CXCL12), also known as stromal cell-derived factor-1 (SDF-1). CXCL12, secreted by stromal cells and released into the TME, interacts with CXCR4 expressed on the tumor surface. In GC, cancer-associated fibroblasts modulate the CXCL12/CXCR4 axis [38]. Activation of this axis triggers the MAPK cascade, promoting chemotaxis and cellular proliferation [39]. Additionally, it stimulates the PLC/PKC and PI3K/Akt signaling pathways, thereby enhancing cell migration and survival [40].

Our study demonstrated that CXCR4 knockdown attenuated the migratory and invasive capabilities of GC cells. Furthermore, downregulation of CXCR4 suppressed the growth of GC cells both in vitro and in vivo, pointing to its potential as a drug target. Lactate, the terminal product of aerobic glycolysis, lowers the pH of the TME, facilitating cancer cell invasion and metastasis [31, 41]. Lactate is also metabolized in human tumors, fueling the TCA cycle and contributing to energy production [42]. Prior research has shown that CXCR4 is involved in the Warburg effect in acute myeloid leukemia (AML) cells via the CXCL12/CXCR4/mTOR pathway [43]. The mTOR inhibitor rapamycin effectively suppressed the upregulation of glucose transporters in the matrix, enhanced glucose influx, and decreased lactate production [43]. Our current investigation revealed that knockdown of CXCR4 in GC cells decreased glycolytic capacity, glucose consumption, and lactate production. Consequently, the anticancer effects of CXCR4 knockdown may be mediated through the suppression of the Warburg effect.

Interestingly, although both the immune subtype (cluster B) and the high-risk group identified in the present study by using the TCGA database were associated with an unfavorable prognosis, these two cohorts presented a favorable immune microenvironment and increased expression of immune checkpoints, suggesting that the application value of GRGs still needs further elucidation. The reason for this inconsistency may be that the data from the TCGA-STAD cohort were collected mainly from before the era of immunotherapy. Without immune checkpoint inhibitor treatment, increased expression of immune checkpoints suggests immune escape, resulting in unfavorable outcomes [44].

Immune cells in the TME undergo a metabolic shift towards glycolysis, creating a nutritional competition with cancer cells and tumor-infiltrating cells [9]. This metabolic dysregulation can hamper the infiltration of immune cells and suppress antitumor immunity by generating immunosuppressive byproducts. Additionally, lactate derived from the tumor plays a dual role in immunosuppression, facilitating the accumulation of Tregs and MDSCs, while also promoting the polarization of M2 tumor-associated macrophages [45, 46]. Elevated tumor glycolysis is associated with resistance to adoptive T cell therapy in melanoma, highlighting the potential benefits of targeting both cancer immunity and metabolism [5]. Moreover, metabolic reprogramming in cancer cells can modify the expression of cell surface markers, impeding immune surveillance [30]. Glycolytic activity can upregulate immune checkpoint expression, indicating that glycolysis might enhance immunotherapy response [47] and serve as a valuable predictor for immunotherapy efficacy.

Immune checkpoint inhibitors (ICIs) have transformed therapeutic strategies for numerous tumor types. The most extensively studied inhibitors include antibodies targeting cytotoxic T-lymphocyte antigen-4 (CTLA-4) and programmed cell death protein-1 (PD-1). The administration of anti-CTLA-4 and anti-PD-1 agents has significantly improved survival rates in patients suffering from diverse tumor types [30]. Notably, the phase 2 KEYNOTE-059 trial established the safety and efficacy of pembrolizumab as a third-line therapeutic option for locally advanced or metastatic gastric or gastroesophageal junction (G/GEJ) adenocarcinoma [48]. In this multicohort, single-arm study, the objective response rate (ORR) reached 11.6%, with complete responses (CRs) observed in 2.3% of patients [48]. Furthermore, the phase 3 CheckMate-649 trial demonstrated the tolerability and effectiveness of nivolumab as a first-line treatment for patients with untreated, unresectable, HER2-negative G/GEJ cancer or esophageal adenocarcinoma [49]. This trial exhibited significant improvements in overall survival (OS) and progression-free survival (PFS) among patients with a combined positive score (CPS) of ≥ 5 [49]. Given these advancements, immune checkpoint inhibitors have emerged as standard therapeutic agents for GC treatment.

The efficacy of immunotherapy lies in its capacity to “rewrite” the immune cycle, fostering enduring immune responses. Nevertheless, immunotherapy is not universally effective among patients [30]. Consequently, it is crucial to establish target biomarkers that can predict a patient’s response to immune checkpoint inhibitors (ICIs). Currently, PD-L1 immunohistochemistry (IHC), microsatellite instability (MSI)/mismatch repair (MMR), Epstein-Barr virus (EBV), and tumor mutational burden (TMB) are regarded as promising indicators for predicting immunotherapy response in GC patients [50]. However, no single biomarker alone can consistently predict immunotherapy outcomes [50]. Therefore, it is imperative to develop a comprehensive predictive model that takes into account the intricate interplay between host and TME, reflecting the heterogeneity inherent in GC.

In our present study, we showed that GRGs exhibit potential for predicting response to ICIs. The expression of CXCR4 and ME1 was correlated with the infiltration of CD8 + T cells and the response to treatment with an anti-PD-1 ICI. Moreover, the combination of CXCR4 and ME1 expression had a more effective prognostic prediction effect. Interestingly, although increased CXCR4 expression indicates a poor prognosis when ICIs are not used, CXCR4 could be a promising biomarker for the response to ICIs. Therefore, the prognostic value of specific subjects should consider different historical backgrounds. The roles of GRGs in the age of immunotherapy still require further investigation, and how to best utilize glycolytic targets to boost antitumor immunity and improve immunotherapies still needs further elucidation.

This work revealed that GRGs play a nonnegligible role in determining TME diversity and complexity. Evaluating the GRG modification pattern of individual tumors will contribute to enhancing our understanding of TME infiltration and guide more effective immunotherapy strategies.