Identification hub genes of consensus molecular subtype correlation with immune infiltration and predict prognosis in gastric cancer

Gastric cancer (GC) has a high fatality rate, and prognostic biomarkers are still lacking. This study aimed to discover novel key biomarkers for GC. We screened for significantly altered genes associated with survival in two consensus molecular subtypes (CMS) of GC. Functional enrichment analysis showed that these genes are involved in many cancers. We then selected 6 hub genes that both encode proteins secreted in the tumor microenvironment and show enhanced expression in immune cells. Kaplan-Meier survival analysis and expression across tumor pathological stages were used to assess the prognostic value of these 6 hub genes. The results indicated that high expression of each of OGN, CHRDL2, C2orf40, THBS4, CHRDL1, and ANGPTL1 was significantly associated with poor overall survival (OS) in GC patients, and that their expression increased as the cancer advanced. Moreover, immune infiltration analysis showed that the expression of these hub genes was positively correlated with M2 macrophages, CD8+ T cells, most immunoinhibitors, and the majority of immunostimulators. In summary, our results suggest that OGN, CHRDL2, C2orf40, THBS4, CHRDL1, and ANGPTL1 are all potential biomarkers for GC prognosis and might also be potential therapeutic targets for GC. Supplementary Information: The online version contains supplementary material available at 10.1007/s12672-021-00434-5.

they all have been reported to promote the progression of GC [14][15][16][17]. Furthermore, we explored the characteristics of the 52 genes in GeneCards (https://www.genecards.org/) and the Human Protein Atlas (https://www.proteinatlas.org/). As shown in Table 2, 10 genes encode secreted proteins, 22 genes are enriched in immune cells, and only 7 genes have both characteristics (Supplementary Fig. S1e).

Survival outcomes in selected genes
To verify whether the 7 selected genes (OGN, CHRDL2, C2orf40, THBS4, CHRDL1, ANGPTL1, SLIT2) were correlated with worse overall survival (OS) in GC patients, survival analysis was performed in Kaplan-Meier plotter [18]. Because many articles have already shown that SLIT2 is overexpressed in GC and might be an independent risk factor for GC [19,20], we did not explore it further in this paper. The Kaplan-Meier plotter results showed that up-regulated expression of each of OGN, CHRDL2, C2orf40, THBS4, CHRDL1, and ANGPTL1 was correlated with poor OS in GC patients (Fig. 2a-f). We then examined the prognostic value of these six hub genes on the OncoLnc website [21] and obtained similar results (Fig. 3a-f). Among them, OGN and CHRDL2 are reported here for the first time to have expression related to the OS of GC patients. To further confirm the correlation between the expression of the 6 genes and tumor progression in GC, expression across tumor stages was analyzed in GEPIA [22]. The results indicated that the expression of OGN, CHRDL2, C2orf40, THBS4, CHRDL1, and ANGPTL1 gradually increased with GC stage (Supplementary Fig. S2a-f). These results revealed that these 6 hub genes were significantly related to the progression and survival of GC.

Correlation analysis between hub genes and immune markers
To better understand the immune function of these hub genes, we explored the relationship between their expression and immune markers in TISIDB [24]. Tumor immunology analysis showed that the expression of these 6 genes was positively correlated with most immunoinhibitors (Fig. 6a) and the majority of immunostimulators (Supplementary Fig. S4a). Furthermore, the genes with the most striking differences, such as CSF1R and CXCL12, were verified again in GEPIA, with similar results (Fig. 6b-g, Supplementary Fig. S4b-g). Finally, we found that the expression of each of OGN, C2orf40, CHRDL2, THBS4, CHRDL1, and ANGPTL1 correlated with different immune subtypes in GC (Fig. 7a-f). Molecular classification of GC helps patients receive precisely targeted therapies [25]. Previous studies separated GC patients into 5 subtypes: Epstein-Barr virus (EBV)-positive, hypermutated-single-nucleotide variant predominant (HM-SNV), hypermutated enriched for insertions/deletions (HM-indel, which leads to microsatellite instability), chromosomal instability (CIN), and genomically stable (GS) [25,26]. We then explored the expression of the six hub genes across these molecular subtypes of GC (Supplementary Fig. S5a-f). In summary, these results implied that these 6 hub genes might govern the recruitment and activation of immune cells in GC.

Discussion
Chemotherapy, radiotherapy, surgery, immunotherapy, and targeted therapy are effective against the progression of GC in appropriate indications. However, because OS remains limited, at no more than 30% in most countries, specific biomarkers are still needed to guide rational drug use, especially for immune checkpoint blockade (ICB) treatment.
OGN is a member of the small leucine-rich proteoglycan (SLRP) family, and its function may vary among tumors. This is the first time that a role for OGN in GC has been reported. There is experimental evidence that OGN is upregulated and functions as a tumor promoter in meningioma by inhibiting NF2 expression and triggering mTOR signaling [27], whereas in colorectal cancer OGN expression is reduced and OGN inhibits cell proliferation, invasion, and epithelial-to-mesenchymal transition through the EGFR/Akt pathway [28]. The mechanism of action of OGN in GC is not clear and needs to be explored further. Moreover, OGN, as one of the biologically active components of the vascular extracellular matrix, can be measured in plasma or serum and can act as a disease biomarker [29,30]. For instance, serum OGN was an independent risk predictor for patients with chronic kidney disease [29]. This led us to speculate that OGN might serve as a serological marker for the prognosis of GC, which needs to be confirmed by additional experiments. Furthermore, OGN can also modulate the immune response by mediating immune cell infiltration in cancer; for example, OGN expression was positively associated with CD8+ T cell recruitment and infiltration through inhibition of the HIF-1α/VEGF pathway in colorectal cancer [31]. Our results likewise indicated that OGN was significantly positively correlated with immune cell infiltration in GC, for example with M2 macrophages. OGN might therefore be an immune modulator in the TME in GC. In short, our study showed that OGN could be a new useful prognostic biomarker and immune regulator for GC.
CHRDL2, an antagonist of bone morphogenetic proteins (BMPs), acts as an oncogene in colorectal cancer [32] and osteosarcoma [33]. CHRDL2 was first discovered in 2003 under the name BNF-1 (breast tumor novel factor 1), and its overexpression in breast, lung, and colon tumors was detected by PCR in a small sample [34]. However, reports on CHRDL2 in tumors are very limited. Here, we show for the first time that CHRDL2 expression increased during GC progression and that its high expression was associated with poor prognosis in GC patients. We also found that the expression of this secreted protein was associated with immune cell infiltration, especially of M2 macrophages, and with immune inhibitory molecular markers in GC. Therefore, CHRDL2 might be a novel target for GC therapy.
C2orf40 encodes a protein called esophageal cancer-related gene-4 (ECRG4), which is down-regulated by hypermethylation of its promoter in diverse types of tumors, including hepatocellular carcinoma [35], GC [36], and breast cancer [37]. According to current reports, C2orf40 might act as a tumor suppressor gene. However, our results showed that high expression of ECRG4 was markedly associated with poor outcomes (HR = 1.95, log-rank P = 6.7e−10) and that its expression increased gradually as GC advanced. These opposite results may be due to different levels of evidence or different study cohorts [36,38]. The underlying reasons remain to be validated. ECRG4 has been reported to interact with TLR4 [36], and TLR4/PI3K/Akt signaling is an important route for promoting M2 polarization of macrophages [10]. Our results showed that C2orf40/ECRG4 expression was notably related to immune cell infiltration, including M2 macrophages. Thus, whether ECRG4 is involved in the development of GC by modifying the M2 polarization of macrophages through TLR4 also needs to be verified.
Fig. 4 Expression of six hub genes correlated with TAM polarization in the TME of GC. Scatterplots outline the relationship between the expression of six hub genes and various gene markers of TAMs (a), M1 macrophages (b), and M2 macrophages (c) by TIMER
THBS4 belongs to the thrombospondin protein family, a group of adhesive glycoproteins that mediate cell-to-cell and cell-to-matrix interactions. THBS4 tends to act as an oncogene in GC [39], colorectal cancer [40], prostate cancer [41], and hepatocellular carcinoma [42]. For example, inhibiting THBS4 expression could impede PI3K/Akt signaling and disturb cancer stem cell (CSC)-like properties in prostate cancer [41]. THBS4 might also serve as a biomarker for diffuse-type gastric adenocarcinomas [43] and as a potential indicator for risk assessment and prognosis prediction in GC, according to its polymorphisms [44] and bioinformatics analysis [45]. In addition, THBS4 is a secreted extracellular matrix protein that has been reported to mediate angiogenesis, adhesion, migration, and proliferation in response to TGF-β signaling [46]. Our results suggested that THBS4 expression was associated with TGF-β in GC. Whether THBS4 responds to the TGF-β signaling pathway in GC and participates in GC progression remains to be verified experimentally. Our study found that THBS4 expression increased notably as GC advanced, and TGF-β pathways might also facilitate THBS4 secretion to promote GC. THBS4 was also positively correlated with other immunoinhibitors and the majority of immunostimulators. The role of THBS4 in GC still needs more supporting evidence.
CHRDL1 is a paralog of CHRDL2 and has been reported in only two papers on GC. One paper reported that its expression was associated with CLIP4 DNA methylation and that CHRDL1 might be a prognostic signature gene [47]. The other reported that low expression of CHRDL1, an antagonist of bone morphogenetic protein 4 (BMP4), might promote GC cell proliferation and migration through BMP receptor II [48]. However, our results, based on two datasets, revealed that high expression of CHRDL1 was associated with poor OS in GC, and that CHRDL1 expression was associated with tumor stage, immune cell infiltration, immunoinhibitors, and immunostimulators.
Many papers have shown that ANGPTL1 acts as a tumor suppressor by inhibiting angiogenesis, cancer metastasis, and cancer stemness and by repressing sorafenib resistance during treatment [49,50]. However, its role in GC is still not clear [51]. Our results revealed that ANGPTL1 might be a potential target in GC, given that its high expression was related to poor OS and increased gradually with tumor stage, and given its possible relationship with immune cells and biomarkers in the TME.

Conclusions
Here, we searched a series of databases to find the molecules whose expression is most closely related to survival in GC. Functional enrichment analysis of survival-related CMS genes was performed. These genes are involved in many cancers, including GC, and in cancer processes such as angiogenesis. We then selected 6 genes that could be secreted in the TME and were enhanced in immune cells. Further survival analysis demonstrated that high expression of each of OGN, CHRDL2, C2orf40, THBS4, CHRDL1, and ANGPTL1 was significantly associated with poor OS in GC patients. OGN and CHRDL2 are reported in GC for the first time. Moreover, the expression of these 6 genes increased markedly with tumor pathological stage. In addition, we investigated the pathways of these hub genes by enrichment analysis. Finally, immune infiltration analysis showed that the expression of these hub genes was positively correlated with CD8+ T cells, M2 macrophages, most immunoinhibitors, and the majority of immunostimulators. These processes are closely related to tumor growth and metastasis. Therefore, high expression of the 6 hub genes, alone or synergistically, was associated with a poor prognosis in GC patients. More experiments are still needed to verify these findings.
Fig. 6 Relations between the abundance of tumor-infiltrating immunoinhibitors and expression of six hub genes. a Heatmap showing that the six hub genes were positively correlated with most immunoinhibitors, based on TISIDB. b-g CSF1R was correlated with the six hub genes, based on GEPIA

COMSUC
COMSUC [12] (http://comsuc.bioinforai.tech/home) is used to identify consensus molecular subtypes (CMS) by integrating multiple clustering results based on multiple platforms, multiple omics data, and multiple methods. In this study, we integrated clustering results of GC data from TCGA into two groups using three algorithms: K-means, hierarchical clustering (Hclust), and non-negative matrix factorization (NMF).
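COMSUC's internal integration procedure is not detailed in this section. As a rough illustration of how several clusterings of the same samples can be merged into one consensus partition, the following Python sketch uses a generic co-association approach: two samples are linked when a majority of the clusterings place them in the same cluster. The function name `consensus_groups`, the toy label vectors, and the majority threshold are assumptions made for illustration and are not COMSUC's implementation.

```python
def consensus_groups(labelings):
    """Merge several clusterings of the same samples into one partition.

    labelings: list of label lists, one per algorithm (hypothetical input).
    Two samples are linked if more than half of the clusterings agree that
    they belong together; linked samples end up in the same consensus group.
    """
    n_samples = len(labelings[0])
    n_methods = len(labelings)
    parent = list(range(n_samples))  # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for i in range(n_samples):
        for j in range(i + 1, n_samples):
            agree = sum(1 for labels in labelings if labels[i] == labels[j])
            if 2 * agree > n_methods:  # strict majority vote
                parent[find(i)] = find(j)

    # Relabel the roots as 0, 1, 2, ... in order of first appearance.
    group_ids = {}
    return [group_ids.setdefault(find(i), len(group_ids)) for i in range(n_samples)]


# Toy labels for six samples (not real TCGA data). Label names differ between
# methods, which is fine because only co-membership is compared.
kmeans_labels = [0, 0, 0, 1, 1, 1]
hclust_labels = [1, 1, 1, 0, 0, 0]
nmf_labels = [0, 0, 1, 1, 1, 1]
consensus = consensus_groups([kmeans_labels, hclust_labels, nmf_labels])
print(consensus)
```

Note that majority linking does not by itself guarantee exactly two groups; in practice the number of consensus groups depends on how strongly the individual clusterings agree.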

Metascape
Metascape [13] (http://metascape.org/gp/index.html#/main/step1) is a web server designed to provide a comprehensive gene list annotation and analysis resource for users. Enrichment analysis is an essential part of Metascape. Here, we performed enrichment analysis of the significantly changed genes of CMS1 using custom analysis.
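Gene list enrichment of this kind is commonly scored with a hypergeometric (over-representation) test. The Python sketch below shows that calculation in its simplest form, under the assumption of a single term and no multiple-testing correction. The function name `enrichment_p_value` and all counts are hypothetical, and the sketch is not a description of Metascape's exact pipeline.

```python
from math import comb


def enrichment_p_value(n_background, n_term, n_list, n_overlap):
    """Upper-tail hypergeometric p-value for over-representation.

    n_background: number of genes in the background set
    n_term:       background genes annotated to the term
    n_list:       genes in the submitted list
    n_overlap:    submitted genes annotated to the term
    Returns P(X >= n_overlap) when n_list genes are drawn at random.
    """
    total = comb(n_background, n_list)
    upper = min(n_term, n_list)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_list - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts, chosen only to show the call.
p = enrichment_p_value(n_background=20000, n_term=300, n_list=50, n_overlap=6)
print(f"enrichment p-value: {p:.3g}")
```

A real analysis would repeat this for every term and adjust the resulting p-values for multiple testing.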

Kaplan-Meier plotter
Kaplan-Meier plotter [18] (http://kmplot.com/analysis/) is a platform used to discover and validate survival biomarkers in four cancers, including gastric cancer. To analyze the prognostic value of OGN, CHRDL2, C2orf40, THBS4, CHRDL1, and ANGPTL1, the cohorts were divided into high- and low-expression groups. In this study, overall survival was compared between the two groups for each of the 6 hub genes in GC. Hazard ratios (HRs, with 95% confidence intervals) and log-rank P-values (< 0.05 considered significant) were calculated.
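The comparison described above can be sketched in a few lines of Python: split patients into high- and low-expression groups, then compare their survival with a two-group log-rank test. This is a minimal sketch, not the Kaplan-Meier plotter implementation. The median cutoff, the function names `split_by_median` and `logrank_test`, and the toy data are all assumptions, and the hazard ratio (usually obtained from a Cox model) is not computed here.

```python
from math import erfc, sqrt
from statistics import median


def split_by_median(expression):
    """Return sample indices with expression above / at-or-below the median."""
    cutoff = median(expression)
    high = [i for i, value in enumerate(expression) if value > cutoff]
    low = [i for i, value in enumerate(expression) if value <= cutoff]
    return high, low


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test (event flag: 1 = death observed, 0 = censored).

    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    samples = [(t, e, "a") for t, e in zip(times_a, events_a)]
    samples += [(t, e, "b") for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in samples if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for s_t, _, _ in samples if s_t >= t)  # at risk, both groups
        n_a = sum(1 for s_t, _, g in samples if s_t >= t and g == "a")
        d = sum(1 for s_t, e, _ in samples if s_t == t and e)  # deaths at t
        d_a = sum(1 for s_t, e, g in samples if s_t == t and e and g == "a")
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    p_value = erfc(sqrt(chi2 / 2))  # chi-square survival function, 1 df
    return chi2, p_value


# Toy cohort (not real patient data): expression, follow-up in months, event flag.
expression = [9.1, 8.4, 7.7, 2.3, 1.9, 1.2]
months = [1, 2, 3, 4, 5, 6]
died = [1, 1, 1, 1, 1, 1]

high, low = split_by_median(expression)
chi2, p_value = logrank_test(
    [months[i] for i in high], [died[i] for i in high],
    [months[i] for i in low], [died[i] for i in low],
)
print(f"chi2 = {chi2:.2f}, log-rank P = {p_value:.3f}")
```

In this toy cohort the high-expression patients die first, so the test reports a P-value below the 0.05 threshold used in the study.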

GEPIA
Gene Expression Profiling Interactive Analysis [22] (GEPIA, http://gepia.cancer-pku.cn/) uses a standard processing pipeline to analyze gene expression from RNA sequencing data for 8587 normal samples and 9736 tumors from the GTEx and TCGA projects. Here, GEPIA was used to investigate the relationship between the expression of the six hub genes and GC tumor stage in TCGA data. In addition, we evaluated the correlation between the expression of CSF1R/CXCL12 and the six hub genes based on GEPIA.

TIMER
TIMER [23] (https://cistrome.shinyapps.io/timer/) is a web tool used to investigate immune cell infiltration in diverse cancers, including GC. It provides data to evaluate the associations between expression levels of selected genes and infiltrating immune cells, such as B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages, and dendritic cells. In our study, we estimated the correlation of expression of OGN, CHRDL2, C2orf40, THBS4, CHRDL1, and ANGPTL1 with immune cell infiltration.

TISIDB
TISIDB [24] (http://cis.hku.hk/TISIDB/index.php) is an online web portal for assessing interactions between tumor-related genes and the immune system. In our study, we used TISIDB to examine the correlation between the expression of the six hub genes and the abundance of immunomodulators, as well as the distribution of the expression of the six hub genes across immune and molecular subtypes in GC. The correlations between the six hub genes and the immune system were measured by Spearman's test.
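For readers unfamiliar with Spearman's correlation, the following Python sketch computes the coefficient as the Pearson correlation of the ranks, with tied values given their average rank. It returns only the coefficient (rho), not the associated P-value. The function names and toy values are assumptions for illustration and do not reproduce TISIDB's internal code.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1  # extend the run of tied values
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values (not real data): hub gene expression vs. an immunomodulator level.
gene_expression = [1.2, 3.4, 2.2, 5.9, 4.1]
immunomodulator = [0.8, 2.9, 2.0, 6.5, 3.0]
rho = spearman_rho(gene_expression, immunomodulator)
print(f"Spearman rho = {rho:.3f}")
```

Because only ranks are used, the coefficient captures any monotonic relationship, not just a linear one.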