1 Background

Globally, stomach and esophageal cancers rank as the fifth and seventh most prevalent malignancies, respectively, and are frequently associated with high mortality rates [1]. In recent years, immune checkpoint inhibitors (ICIs) have emerged as an effective therapeutic approach for a wide range of malignancies, including gastroesophageal cancers (GECs) [2]. Sustained responses to ICIs have been documented in advanced GECs that are resistant to chemotherapy; for example, the KEYNOTE-061 trial reported a median response duration of 18.0 months [3]. However, the majority of patients (~85%) show primary resistance to ICI monotherapy and do not experience a substantial improvement. Furthermore, the subset of patients who initially respond well to ICI therapy may develop acquired resistance. Thus, precisely identifying the GEC patient subgroups that are likely to derive substantial benefit from ICIs remains a formidable clinical challenge.

Currently, the most reliable biomarkers for predicting the effectiveness of ICIs are microsatellite instability (MSI) status and the level of programmed death-ligand 1 (PD-L1) expression [4]. However, although response rates in tumors with high MSI exceed 50%, these malignancies account for only approximately 4% of GECs [4]. As a result, researchers have focused their attention on PD-L1 expression. Notably, various antibodies are available for immunohistochemistry, and differences in their specificities and sensitivities may underlie the range of expression estimates, potentially affecting the precision of tumor expression assessment and, by extension, therapy eligibility [5]. PD-L1 expression has a robust negative predictive value for response [absent expression correlates with a response rate (RR) of 2–6% for ICI monotherapy]. However, its positive predictive value is comparatively weaker [a PD-L1 combined positive score (CPS) ≥ 1 is associated with an RR of 15–16%, and a PD-L1 CPS ≥ 10 is associated with an RR of 24–25%]. Tumor mutational burden (TMB) is another potential biomarker, which has been observed to correlate well with responses to ICIs in a recent study [6]. However, TMB has not been established as a reliable biomarker for GECs [7]. The immunogenic effect of mutations differs, with certain mutations, such as those in PBRM1, KEAP1, and STK11 [8,9,10], potentially affecting the efficacy of ICI treatment either positively or negatively. The predictive power of TMB scoring systems for ICIs therefore appears to be limited because they do not take into account the specific effects of these mutations. To address this issue, recent studies have proposed refining the TMB calculation algorithm or developing gene mutation-based signatures to enhance the accuracy of predicting ICI outcomes [11, 12].

Artificial intelligence (AI) has fundamentally transformed biomedical research over the past few years. Healthcare systems have increasingly leveraged AI-driven predictive analytics to enhance diagnostic accuracy, prognostic assessment, and therapeutic decision-making [13,14,15]. The present study, therefore, utilizes genomic mutation data to construct and validate a novel AI-based genomic mutation signature (GMS). The findings show significant potential to offer crucial insights to improve immunotherapy strategies and potentially improve the clinical outcomes for GECs.

2 Methods

2.1 Study design and data collection

The purpose of this study was to construct a GMS for predicting the efficacy of immunotherapy. A training cohort of 123 ICI-treated patients with GECs, assembled at the Memorial Sloan Kettering Cancer Center (MSK), was used to identify prognostically significant mutations and establish a prognostic signature [7]. Additionally, two independent validation cohorts of ICI-treated patients with GECs were obtained from public repositories: the combined Janjigian and Pender cohort included 42 patients [16, 17], while the PUCH cohort comprised 66 patients [18]. The patient selection criteria were as follows: (1) primary GECs; (2) accessible gene mutation profiles and clinical data with follow-up; and (3) confirmation of having undergone a minimum of one cycle of treatment with a PD-1/PD-L1 inhibitor, CTLA-4 inhibitor, or combination therapy. The clinical information of these three cohorts is shown in Supplementary Table 1. Additionally, somatic mutation data, mRNA expression profiles, and copy number variations (CNVs) for esophageal cancer (N = 184) and gastric cancer (N = 439) were obtained from The Cancer Genome Atlas (TCGA) database.

2.2 Mutation data analysis and assessment of clinical outcomes

MSK-IMPACT® sequencing was performed on tumor specimens from the Janjigian and MSK cohorts, utilizing panels consisting of 341, 410, or 468 genes. In addition, whole-genome sequencing (WGS) was applied to tumor tissues from the Pender cohort, whereas whole-exome sequencing (WES) was employed to assess the tumor tissues from the PUCH cohort. Our analysis was confined to non-silent mutations, encompassing categories such as nonsense, missense, frameshift, inframe, splice site, translation start site, and nonstop mutations. The overall survival (OS) constituted the principal survival endpoint.

2.3 AI network-based signature generation

A novel AI-based network integrating 297 algorithmic combinations was developed from 22 techniques spanning traditional regression, machine learning, and deep learning. The algorithms included stepwise Cox, RSF, SuperPC, obliqueRSF, GLMBoost, BlackBoost, Rpart, Survreg, Ranger, Ctree, LASSO, plsRcox, survival-SVM, Ridge, Enet, DeepHit, DeepSurv, CoxTime, XGBoost, Coxboost, CForest, and VSOLassoBag. The signature was constructed in the following steps: (1) The MSK cohort was analyzed using univariate Cox regression to identify prognostic genes. (2) Candidate signatures were generated using the AI network in the MSK cohort. (3) The resulting models were validated in two independent cohorts (the Janjigian and Pender cohort, and the PUCH cohort). (4) Harrell’s concordance index (C-index) was computed for each model in all cohorts. The model that achieved the highest average C-index in the test cohorts was considered optimal. The GMS was established by using XGBoost for both feature selection and model fitting: the XGBoost algorithm first identified the most critical genes, defined as those with an importance greater than 0.03, and the final XGBoost model was then developed on these genes using tenfold cross-validation and default parameters.
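
As an illustration of step (4) and of the model-selection rule, the following minimal Python sketch computes Harrell's C-index for right-censored data and picks the model with the highest mean C-index across test cohorts. It is not the implementation used in this study; the function names `harrell_c_index` and `pick_best_model` are hypothetical, and pairs with tied survival times are skipped as a simplifying assumption.

```python
from itertools import combinations
from statistics import mean


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair is comparable when the subject with the shorter follow-up time
    had an observed event. The pair is concordant when that subject also
    has the higher risk score; tied risk scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier subject was censored, so the pair is not comparable
        comparable += 1
        if risk_scores[first] > risk_scores[second]:
            concordant += 1.0
        elif risk_scores[first] == risk_scores[second]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def pick_best_model(c_index_by_model):
    """Return the model name with the highest mean C-index across test cohorts."""
    return max(c_index_by_model, key=lambda name: mean(c_index_by_model[name]))
```

Here `c_index_by_model` is assumed to map each candidate model name to a list of its C-index values, one per test cohort.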

2.4 Functional annotation of the GMS

The data on immune modulators were collected from a previous study [19]. To quantify immune cell infiltration, four distinct algorithms were employed: the quanTIseq algorithm [20], which analyzes 11 specific immune cell types; the EPIC (Estimating the Proportions of Immune and Cancer cells) algorithm [21], which calculates the proportions of 8 immune cell types; the MCPcounter (Microenvironment Cell Populations-counter) algorithm [22], which identifies 10 immune cell types; and the ESTIMATE (Estimation of STromal and Immune cells in MAlignant Tumors using Expression data) algorithm [23], which assesses both stromal and immune cell components in malignant tumors. Furthermore, 29 canonical immune signatures were obtained from the earlier study conducted by He et al. [24]. The cytolytic activity (CYT) score was calculated as the geometric mean of the expression levels of granzyme A (GZMA) and perforin (PRF1) [25]. Additionally, the GSVA R package, which implements the single-sample gene set enrichment analysis (ssGSEA) method, was used to assess the enrichment levels of these 29 immune signatures across all samples in the TCGA cohort.
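
The CYT definition above reduces to a one-line formula. The sketch below is a minimal Python illustration; the function name `cytolytic_activity` is hypothetical, and the expression unit and any offset applied before taking the geometric mean are not specified in the text, so none is assumed here.

```python
import math


def cytolytic_activity(gzma, prf1):
    """Geometric mean of GZMA and PRF1 expression for a single sample."""
    if gzma < 0 or prf1 < 0:
        raise ValueError("expression values must be non-negative")
    return math.sqrt(gzma * prf1)
```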

2.5 Immunogenomic indicator determination

The immunogenomic indicators were obtained from the pan-cancer immune landscape project [19]. The intratumoral heterogeneity (ITH) score is defined as the subclonal genome fraction, which represents the proportion of the tumor genome that does not belong to the dominant clone. This was determined using ABSOLUTE, a computational tool that models tumor copy number alterations and mutations as combinations of subclonal and clonal components with varying ploidy levels. The copy number burden was quantified using two scores: n_segs, which represents the total number of segments in the copy number profile of each sample, and frac_altered, which indicates the proportion of bases that differ from the baseline ploidy. The aneuploidy score was computed as the sum of the altered arms. In addition, the diversity scores for the TCR (T-cell receptor) and BCR (B-cell receptor) were inferred using Shannon entropy and richness measures, based on cancer RNA-sequencing data.
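
For readers unfamiliar with the two repertoire diversity measures, the following minimal Python sketch shows how richness and Shannon entropy could be computed from clonotype read counts. The function name `repertoire_diversity` and the use of the natural logarithm are assumptions made for illustration; the scores used in this study were taken precomputed from [19].

```python
import math


def repertoire_diversity(clonotype_counts):
    """Return (richness, Shannon entropy) for one sample's clonotype counts."""
    counts = [c for c in clonotype_counts if c > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one clonotype must have a positive count")
    richness = len(counts)  # number of distinct clonotypes observed
    # Shannon entropy in nats (natural logarithm is an assumption)
    entropy = -sum((c / total) * math.log(c / total) for c in counts)
    return richness, entropy
```

A sample dominated by a single clone yields an entropy near zero, whereas evenly distributed clonotypes yield the maximum entropy for a given richness.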

2.6 Unveiling genomic mutational signatures

The maftools R package was used to perform nonnegative matrix factorization (NMF) analysis on 96 trinucleotide context mutations in GEC samples obtained from The Cancer Genome Atlas (TCGA). Subsequently, the obtained mutational profile was assessed by comparing it to the Catalogue of Somatic Mutations in Cancer (COSMIC) using cosine similarity as the metric.
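
The comparison step can be illustrated with a short Python sketch of cosine similarity between an extracted signature and a set of reference signatures, each represented as a vector over the 96 trinucleotide contexts. This is an illustration only and does not reproduce the maftools implementation; the names `cosine_similarity` and `best_matching_signature` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


def best_matching_signature(profile, reference_signatures):
    """Return the name of the reference signature most similar to the profile."""
    return max(
        reference_signatures,
        key=lambda name: cosine_similarity(profile, reference_signatures[name]),
    )
```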

2.7 Drug sensitivity prediction

The Genomics of Drug Sensitivity in Cancer (GDSC) database was queried for data on the sensitivity of tumor cell lines to candidate drugs and the corresponding mutations. The half-maximal inhibitory concentration (IC50) of each drug was used to assess the sensitivity of the cell lines.
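
Because the Results report an inverse correlation between drug IC50 and the GMS score, one plausible way to screen drugs is to correlate the GMS score of each cell line with its IC50 for every drug and to rank drugs by that correlation. The sketch below assumes Pearson's coefficient (the correlation measure named in the statistical methods) and uses hypothetical function names; the exact screening procedure is not described in the text.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("inputs must not be constant")
    return sxy / math.sqrt(sxx * syy)


def most_inverse_drug(gms_scores, ic50_by_drug):
    """Return the drug whose IC50 is most negatively correlated with the GMS score."""
    return min(ic50_by_drug, key=lambda drug: pearson_r(gms_scores, ic50_by_drug[drug]))
```

A strongly negative coefficient means that cell lines with higher GMS scores tend to have lower IC50 values, that is, greater sensitivity to the drug.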

2.8 Cell line cultivation

The human gastric cancer cell lines AGS and MKN45 were obtained from the Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences. Correspondingly, the AGS cells were cultured in Ham’s F-12 medium, which was supplemented with 10% fetal bovine serum (FBS) and 1% penicillin–streptomycin, whereas the MKN45 cells were cultured in RPMI 1640 medium. The cell cultures were maintained inside an incubator set at 37 °C in a 5% CO2 atmosphere.

2.9 Colony formation assay

A total of 1000 cells were seeded into each well of a six-well plate and cultured for 2 weeks in the presence of trametinib or dimethyl sulfoxide (DMSO). Subsequently, colony formation was evaluated.

2.10 Statistical analysis

The chi-square test was used to assess categorical data, whereas the Wilcoxon test was employed for numerical variables. Correlations were measured using Pearson’s correlation coefficient. Kaplan–Meier survival plots were constructed using the survival and survminer packages in R. The prognostic independence of the GMS relative to clinical factors was evaluated using univariate and multivariate Cox regression analyses. In addition, receiver operating characteristic (ROC) curves were generated to assess the model’s sensitivity and specificity in predicting survival. A P value less than 0.05 was considered statistically significant unless stated otherwise. The statistical analyses were conducted using R (version 4.2.3).
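
To make the Kaplan–Meier estimate concrete, the following minimal Python sketch computes the product-limit survival curve from follow-up times and event indicators. It is a stand-in written for illustration, not the R survival package used in this study, and the function name `kaplan_meier` is hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    `events` holds 1 (or True) for an observed death and 0 for censoring.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather all subjects sharing this follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored subjects leave the risk set
    return curve
```

Censored subjects contribute to the risk set up to their last follow-up time but never produce a step in the curve.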

3 Results

3.1 Establishment and verification of the GMS

The schematic representation of the research process is depicted in Fig. 1. The training set consisted of 123 GEC patients from MSK who received ICIs. A univariate Cox analysis identified 23 prognostic genes. Subsequently, seed genes with a mutation frequency higher than 1% were selected and processed using an AI network that facilitated the construction of the GMS. The optimal model was achieved by combining XGBoost and XGBoost, resulting in the highest average C-index (0.72) out of all 297 algorithmic combinations (Fig. 2A). The XGBoost algorithm identified 12 genes and utilized them to create the most reliable GMS using the resulting XGBoost model (Fig. 2B). The GMS scores of each patient were computed and categorized as high-risk or low-risk based on the median GMS score (56.40) in the training set. As observed in Figs. 2C–E, the high-risk group consistently showed a significantly worse OS compared to the low-risk group across all cohorts (P < 0.05). Furthermore, the resulting time-dependent ROC curves consistently demonstrated the strong performance of the GMS in all cohorts. As evident from the above, the GMS exhibits significant promise for use as a predictor of immunotherapeutic outcomes in GECs.
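
The dichotomization described above can be sketched in a few lines of Python. The cutoff value is the training-set median reported in the text; the function name `assign_risk_group`, the reuse of the same cutoff for samples outside the training set, and the handling of a score exactly equal to the cutoff are assumptions made for illustration.

```python
TRAINING_MEDIAN_GMS = 56.40  # median GMS score reported for the training set


def assign_risk_group(gms_score, cutoff=TRAINING_MEDIAN_GMS):
    """Label a sample high-risk if its score exceeds the cutoff (assumed tie rule)."""
    return "high-risk" if gms_score > cutoff else "low-risk"
```

Under these assumptions, the AGS score of 62.32 reported later in the text falls in the high-risk group and the MKN45 score of 4.75 falls in the low-risk group.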

Fig. 1

An illustration of the general workflow adopted in this study

Fig. 2

Development and validation of an artificial intelligence network using 297 algorithm combinations. A Evaluation and C-index computation for 297 prediction models across all validation datasets. B Variable importance of the top 12 genes determined using the XGBoost algorithm. C–E Kaplan–Meier survival analysis (left) and receiver operating characteristic (ROC) (right) curves for overall survival (OS) in the MSK, Janjigian and Pender, and PUCH cohorts

3.2 The powerful prognostic capability of GMS

To determine whether the GMS could function as an independent predictor of OS in immunotherapy patients, both univariate and multivariate Cox regression analyses were conducted across the included cohorts. After adjustment for factors including age, gender, drug category, MSI, PD-L1, and TMB, the GMS remained a reliable prognostic indicator in multivariate analysis (Figs. 3A–C), underscoring its prognostic efficacy. When compared with typical clinical attributes and molecular features, the GMS demonstrated notably higher predictive accuracy than age, gender, drug type, TMB, MSI, and PD-L1 across the three cohorts examined (Figs. 3D–F). This suggests that the GMS could serve as a reliable marker for prognostic prediction in patients with GECs undergoing immunotherapy.

Fig. 3

Univariate and multivariate Cox regression analyses of the GMS and other characteristics. A GMS subjected to univariate and multivariate Cox regression analyses in the MSK cohort. B GMS subjected to univariate and multivariate Cox regression analyses in the Janjigian and Pender cohort. C GMS subjected to univariate and multivariate Cox regression analyses in the PUCH cohort. D Comparison of GMS performance with other clinical and molecular variables for prognosis prediction in the MSK cohort. E Comparison of GMS performance with other clinical and molecular variables for prognosis prediction in the Janjigian and Pender cohort. F Comparison of GMS performance with other clinical and molecular variables for prognosis prediction in the PUCH cohort

3.3 Extrinsic immune landscapes of the GMS

The potential of the GMS as an immune status marker was evaluated by analyzing its correlation with both immune cell infiltration and immune checkpoint expression. In the TCGA dataset, the low-risk group exhibited elevated infiltration and regulatory activity of immune cells, as depicted in Figs. 4A and B. A comparison of the 29 immune signatures between the groups showed that the low-risk group had a greater abundance of immune cells, such as CD8+ T and NK cells (P < 0.05) (Fig. 4C). To further ascertain whether the risk categories corresponded to high and low immune cell infiltration, the 29 immune signatures were used to perform unsupervised clustering of the TCGA patients. Two distinct immunological patterns were identified: one indicating high immune infiltration and one indicating low immune infiltration (Fig. 4D). Importantly, the low-risk category was more frequently observed in the cluster with higher immune infiltration (P < 0.001) (Fig. 4E). Furthermore, low-risk tumors had significantly higher CYT scores (P < 0.001) (Fig. 4F). These observations indicate a relatively inflamed and immunostimulated environment, which could potentially be amenable to immunotherapeutic interventions [26].

Fig. 4

Immune infiltration characteristics of the GMS. A The relationship between the GMS and infiltrating immune cell populations. B The association between the GMS and the expression of immune modulatory factors. C The relationship between the GMS and the scores of 29 immune signatures. D Unsupervised clustering based on 29 immune signatures. E The proportions of high and low immune infiltration in the high-risk and low-risk groups. F Comparison of cytolytic activity (CYT) scores between the high-risk and low-risk groups. NS, not significant; **p < 0.01; ***p < 0.001

3.4 Intrinsic immune landscapes of the GMS

To identify differences in the determinants of tumor immunogenicity between the two risk groups, mutation rate, TCR diversity, BCR diversity, CNV load, aneuploidy, and intratumoral heterogeneity were first evaluated. Compared with the high-risk group, the low-risk group demonstrated a significantly higher mutation rate (P < 0.05), as well as significantly greater TCR and BCR diversity (P < 0.05) (Fig. 5A). Conversely, the high-risk group exhibited a higher CNV load and a greater degree of aneuploidy than the low-risk group (P < 0.001) (Fig. 5A). These findings align with previous studies that have linked tumor aneuploidy to diminished immunotherapeutic responses and markers of immune evasion [27]. In addition, patients in the high-risk group exhibited greater intratumoral heterogeneity than those in the low-risk group (P < 0.001) (Fig. 5A). This finding is consistent with the hypothesis that tumors can undergo clonal evolution as a consequence of reduced cytolytic activity and fewer actively infiltrating immune cells, which leads to the emergence of heterogeneity. Thus, it was inferred that the elevated immunogenicity of the low-risk group may induce an extrinsic immune response. To gain a better understanding of the mutational processes underlying the distinction between the high-risk and low-risk groups, mutational signatures were extracted from the somatic mutation data. Two distinct mutational signatures were identified within the TCGA cohort (Fig. 5B). SBS6, which is characterized by C > T mutations and generally associated with defective DNA mismatch repair (MMR) (Fig. 5B), was more prevalent in the low-risk group than in the high-risk group (Fig. 5C).

Subsequently, enrichment scores for ten oncogenic pathways were calculated. Higher scores were observed in the high-risk group for the cell cycle, Wnt, and MYC pathways, whereas the TP53 pathway was enriched in the low-risk group (Fig. 5D). The Wnt pathway has been reported to be linked to immune exclusion [28].

Fig. 5

Exploration of potential intrinsic immune response and escape landscapes in the high-risk and low-risk groups. A Comparison of immunogenomic markers between the high-risk and low-risk groups. B Analysis of mutational activities of two extracted mutational signatures. C Comparison of the SBS6 signature activity between high-risk and low-risk groups. D Comparison of enrichment scores for 10 oncogenic pathways between high-risk and low-risk groups. NS, not significant; **p < 0.01; ***p < 0.001

3.5 Identification of small-molecule drugs negatively associated with the GMS

Using GDSC drug sensitivity data, trametinib was identified as the small-molecule drug with the most pronounced inverse correlation with the GMS score and the smallest P value (P < 0.01). Accordingly, it was hypothesized that trametinib might be more effective in high-risk patients. To test this hypothesis, the GMS scores of two cell lines from our laboratory were calculated (AGS: 62.32; MKN45: 4.75), and their sensitivity to trametinib was compared. The IC50 of trametinib for AGS and MKN45 was 3.59 nM and 47.20 nM, respectively (Fig. 6A). A colony formation assay provided additional evidence of the increased sensitivity of AGS to trametinib (Fig. 6B).

Fig. 6

Identification of small-molecule drugs negatively associated with the GMS. A IC50 of trametinib for AGS and MKN45. B The clonogenicity of AGS and MKN45 assessed using a colony formation assay

4 Discussion

The present study details the successful development and validation of a genomic classifier, denoted as the GMS, which is composed of 12 genes and built using an AI network. The classifier was able to improve prognostic prediction for patients with GECs receiving ICI therapy. The optimal model was obtained by using XGBoost for both gene selection and model fitting, resulting in the highest average C-index across the three cohorts. In addition, the predictive value of the GMS was independent of other variables, and its performance was consistent across all validation cohorts. The GMS also demonstrated a substantial level of sensitivity and specificity in forecasting OS at 6, 12, and 18 months, as evidenced by the ROC analysis. Moreover, the GMS demonstrated considerably greater predictive accuracy than clinical characteristics (such as gender) and molecular features (including MSI, TMB, and PD-L1 expression). This evidence indicates that the GMS has substantial potential for clinical translation and application.

By leveraging the TCGA cohort, the likely response of tumors to immunotherapy was also assessed in the current study. When extrinsic immune infiltration was characterized using a variety of algorithms, the low-risk group was found to contain a substantial quantity of immune cells. Furthermore, the intrinsic immune landscape indicated greater immunogenicity in the low-risk group, as evidenced by the higher mutation rate. Moreover, compared to the high-risk group, the low-risk group exhibited considerably increased expression of immune checkpoints, including PD-L1, PD-1, and CTLA-4, in addition to a higher CYT score. Thus, increased tumor immunogenicity, activated antitumor immunity, and elevated levels of PD-L1, PD-1, and CTLA-4 may explain why the low-risk group is more likely to benefit from ICI therapy than the high-risk group.

The innovative contributions and practical applications of the present study are also worth highlighting. The AI network was constructed using 297 combinations of 22 algorithms derived from classic regression, machine learning, and deep learning. The predictive performance of this network, which featured an extensive and diverse collection of algorithms, exceeded that of previous machine learning-based integrations [29,30,31]. Furthermore, the optimal algorithm pairing was found to be XGBoost in tandem with XGBoost, a combination that had not been observed in previous studies [29,30,31]. The use of algorithm combinations also reduced variable dimensionality, thereby rendering the GMS more user-friendly and practically viable. Moreover, implementing multibiomarker predictive frameworks requires an in-depth understanding of the factors that influence the accuracy and precision of high-throughput assays in a clinical setting. Among these factors, variability in biomarker measurements is a critical concern and can be attributed to technical factors associated with reliance on a particular platform. Several mRNA-based markers, such as the 18-gene T cell-inflamed gene-expression profile (GEP), have been developed to predict clinical efficacy in patients undergoing ICI therapy [32]. For such markers, mRNA expression is evaluated by relative quantification, with normalization against reference genes [33]. However, the risk score computations and cutoff values of these mRNA markers are not suited to validation on data from alternative measurement platforms. In the present study, gene mutations were used as predictors of the clinical success of ICI therapy. Thus, the GMS remains robust to technical variation even when diverse platforms are used across multiple centers.

Furthermore, by identifying patients who are anticipated to respond poorly, the GMS could spare them potential immune-related adverse effects and expedite their matching with treatments that may be more effective. Given that the mean cost of treatment exceeds $120,000 [34], biomarker strategies that improve diagnostic accuracy could also avoid the substantial costs associated with treatments of limited expected benefit. Finally, because mutations in these genes can be readily assessed in patient tumor samples via targeted NGS, the GMS offers additional advantages over TMB assessment, which is complex and expensive in routine practice.

In addition, using GDSC drug sensitivity data, trametinib was identified as the small-molecule drug with the lowest P value and a significant inverse correlation with the GMS. Trametinib, an orally bioavailable, reversible allosteric inhibitor of MEK1/2 [35, 36], is currently approved as monotherapy and in combination with dabrafenib for the treatment of patients with BRAF V600-mutated unresectable or metastatic melanoma. It is also approved in combination with dabrafenib as adjuvant treatment for patients who have undergone complete resection of stage III melanoma, and for the treatment of advanced non-small cell lung cancer and locally advanced or metastatic anaplastic thyroid carcinoma with a BRAF V600 mutation. In the present study, the AGS cell line, which was classified as high-risk, exhibited greater sensitivity to trametinib than the MKN45 cell line, which was classified as low-risk. It was therefore hypothesized that combining trametinib with ICIs might increase the efficacy of ICIs in the high-risk group. However, this hypothesis requires further validation through in vivo experiments.

The present study has several limitations. First, complete clinical records could not be assessed for every patient, which may have introduced analytical bias. Second, immunohistochemistry is needed to verify the presence of immune cell populations and the expression of immune checkpoints. To address these limitations, the GMS should be validated in a large, ethnically diverse group of patients with GECs undergoing ICI therapy. Such studies are likely to strengthen the findings and implications of our study.

5 Conclusions

The GMS developed in the current study is a promising biomarker for predicting the prognosis of patients with GECs undergoing immunotherapy. The GMS is likely to facilitate a cost-effective strategy for identifying patients who may benefit from immunotherapy, which is worth exploring in future studies. Furthermore, integrating the GMS could substantially aid in refining personalized treatment approaches and improving patient outcomes in GEC immunotherapy.