Introduction

Colon adenocarcinoma (COAD) is currently the second leading cause of cancer-related deaths, and its prevalence and mortality are rapidly increasing [1, 2]. However, the atypical early symptoms of colon cancer often lead to delays in diagnosis and treatment and even to missed opportunities for surgical intervention [3]. Therefore, it is important to find reliable diagnostic and therapeutic targets. Although immune checkpoint blockade (ICB) has shown beneficial effects in COAD patients in recent years, only a small proportion of patients, those with high microsatellite instability or mismatch repair defects, have responded to ICB [4]. Because of the heterogeneity of the tumor immune microenvironment (TIME), multi-omics approaches are being used to discover new options and directions for the diagnosis and treatment of colon cancer [5].

The unfolded protein response (UPR) primarily involves three transmembrane proteins: ATF6, IRE1α, and PERK [6]. Several studies have shown that the UPR influences the malignant progression of tumors by altering their immune microenvironment [7]. ATF6 mediates macrophage-induced resistance to 5-fluorouracil through membrane translocation of MRP1 [8]. IRE1α overexpression can induce apoptosis and enhance adaptive anti-cancer immune surveillance to suppress tumors [9]. Genetic or pharmacological targeting of PERK impairs NRF2 signaling in myeloid-derived suppressor cells and induces an antitumor immune response, thereby enhancing the efficacy of immune checkpoint blockade and T-cell immunotherapy [10]. These studies suggest that the UPR may influence tumor progression by altering the TIME.

Although the effect of the UPR on the TIME has been extensively studied, few studies have combined the two to screen for prognosis-related markers in COAD. In this study, we screened UPR-related genes and immune cells associated with the prognosis of COAD patients in the TCGA database and then obtained UPR-immune-related genes by weighted gene co-expression network analysis (WGCNA). Using single-cell analysis, we found that UPR-immune-related genes play a role in the tumor microenvironment. We then identified the key genes among the UPR-immune-related genes with machine learning algorithms. Finally, we experimentally verified the connection between the core gene and the UPR (Fig. 1).

Fig. 1
Workflow of the study design

Materials and methods

Data gathering and collation

We downloaded the mRNA data and clinical data of the TCGA COAD cohort from the TCGA database (https://portal.gdc.cancer.gov/) using the R package “TCGAbiolinks” (Version 2.29.6) [11]. The GSE39582 dataset [12] was obtained from the GEO database (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE39582). The TCGA dataset contained 478 tumor specimens and 41 normal specimens, while the GSE39582 dataset included 419 tumor specimens. The UPR gene set was downloaded from the human gene database GeneCards (www.genecards.org) [13].

Differentially expressed genes (DEGs) analysis

The R package “limma” (Version 3.50.3) was used to select DEGs [14]. Thresholds of |logFC| > 1 and adjusted p-value < 0.05 were used to identify DEGs for WGCNA, whereas |logFC| > 1.5 and adjusted p-value < 0.05 were used to select DEGs for the UPR-related genes (UPRs). The looser threshold was chosen for WGCNA because the analysis requires a larger input and because we wanted to screen broadly for genes related to both the UPR and immune cells. The stricter threshold was chosen for the UPRs because genes with larger expression differences narrow the set of UPRs used for prognostic scoring.
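The two thresholds can be illustrated with a minimal Python sketch. It is not the authors' limma workflow: the record type `GeneStat`, the helper `select_degs`, and the toy gene names and values are all hypothetical, and the statistics are assumed to have been computed beforehand.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    """One row of a differential-expression table (hypothetical layout)."""
    gene: str
    log_fc: float
    adj_p: float


def select_degs(stats, min_abs_log_fc, max_adj_p=0.05):
    """Keep genes with |logFC| > min_abs_log_fc and adjusted p < max_adj_p."""
    return [
        s.gene
        for s in stats
        if abs(s.log_fc) > min_abs_log_fc and s.adj_p < max_adj_p
    ]


# Toy values for illustration only; they are not taken from the study.
toy_stats = [
    GeneStat("GENE_A", 2.1, 0.001),
    GeneStat("GENE_B", -1.2, 0.010),
    GeneStat("GENE_C", 0.4, 0.0001),
    GeneStat("GENE_D", -1.8, 0.200),
]

wgcna_input = select_degs(toy_stats, min_abs_log_fc=1.0)  # looser threshold
upr_input = select_degs(toy_stats, min_abs_log_fc=1.5)    # stricter threshold
print(wgcna_input, upr_input)
```

In this toy table the stricter threshold drops GENE_B, which shows how the same filter yields a broad set for network analysis and a narrower one for scoring.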

Construction of the prognostic UPR_score

We initially screened the UPRs associated with patient prognosis by univariate Cox analysis (R package “survival”, Version 3.2.13), and further screened them by LASSO Cox analysis (R package “glmnet”, Version 4.1.6) and stepwise Cox analysis (R package “My.stepwise”, Version 0.1.0). We calculated the coefficient (coef) of each gene using multivariate Cox analysis (R package “survival”, Version 3.2.13). The UPR_score was calculated as follows:

$$\mathrm{UPR\_score} = \sum_{i} \mathrm{Exp}_i \times \mathrm{coef}_i$$

where $\mathrm{Exp}_i$ is the expression of gene $i$ in a sample and $\mathrm{coef}_i$ is its multivariate Cox coefficient. Based on the median UPR_score, we categorized COAD patients into a high UPR_score group (UPR_score > median value) and a low UPR_score group (UPR_score < median value).
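The weighted-sum score and the median split can be sketched in a few lines of Python. The coefficients, gene names, and expression values below are hypothetical placeholders, not the fitted values from the study. Placing a score exactly equal to the median in the low group is an assumption, because the text does not say how ties are handled.

```python
from statistics import median


def risk_score(expression, coefs):
    """Sum of expression * coefficient over the genes in the model."""
    return sum(expression[gene] * coef for gene, coef in coefs.items())


def split_by_median(scores):
    """Label each patient 'high' (score > median) or 'low' (otherwise)."""
    cut = median(scores.values())
    return {pid: ("high" if s > cut else "low") for pid, s in scores.items()}


# Hypothetical coefficients and expression values, for illustration only.
coefs = {"GENE_A": 0.5, "GENE_B": -0.25}
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "P2": {"GENE_A": 4.0, "GENE_B": 0.0},
    "P3": {"GENE_A": 1.0, "GENE_B": 8.0},
    "P4": {"GENE_A": 6.0, "GENE_B": 0.0},
}

scores = {pid: risk_score(expr, coefs) for pid, expr in patients.items()}
groups = split_by_median(scores)
print(scores)
print(groups)
```

The same two functions apply to any score of this form, with immune-cell ssGSEA scores taking the place of gene expression.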

Construction of the prognostic TIME_score

The gene sets of 28 tumor-infiltrating lymphocytes were acquired from the TISIDB website (http://cis.hku.hk/TISIDB/download.php), and the scores of the different immune cells were obtained using the ssGSEA algorithm (R package “GSVA”, Version 1.42.0). Subsequently, we identified the immune cells associated with patient prognosis by univariate Cox analysis and calculated the coefficients of these immune cells using multivariate Cox analysis (R package “survival”, Version 3.2.13). The TIME_score was calculated as follows:

$$\mathrm{TIME\_score} = \sum_{i} \mathrm{Sco}_i \times \mathrm{coef}_i$$

where $\mathrm{Sco}_i$ is the ssGSEA score of immune cell type $i$ in a sample and $\mathrm{coef}_i$ is its multivariate Cox coefficient. Based on the median TIME_score, we categorized COAD patients into a high TIME_score group (TIME_score > median value) and a low TIME_score group (TIME_score < median value).

Kaplan–Meier analysis

The R packages “survminer” (Version 0.4.9) and “survival” (Version 3.2.13) were used for Kaplan–Meier analysis based on gene expression. The cutoff value of gene expression was calculated with the surv_cutpoint function.
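For readers unfamiliar with the estimator, the following Python sketch computes a Kaplan–Meier survival curve from follow-up times and event indicators. It is a bare product-limit estimator, not a reimplementation of the “survival” or “survminer” packages, and it does not reproduce the cutpoint search of surv_cutpoint. The function name and toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  : follow-up time of each patient
    events : 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Patients censored at t still count as at risk at t.
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


# Toy follow-up data, for illustration only.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
curve = kaplan_meier(toy_times, toy_events)
for t, s in curve:
    print(t, round(s, 3))
```

Running the estimator separately on the high- and low-expression groups gives the two curves that a Kaplan–Meier plot compares.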

Weighted gene co-expression network analysis (WGCNA)

WGCNA was performed on 5708 DEGs [15] (R package “WGCNA”, Version 1.72.1). The soft-threshold power was chosen at a scale-free topology fit cutoff of R² = 0.88, and the weighted gene co-expression network was constructed with this power. The interconnectedness between genes was calculated by the topological overlap measure (TOM) [16].
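The two quantities behind this step, the soft-thresholded adjacency and the topological overlap, can be sketched as follows. This assumes an unsigned network (adjacency equal to the absolute correlation raised to the power beta), which the text does not state. The value beta = 2 and the toy correlation matrix are placeholders, since the text reports only the fit criterion and not the chosen power.

```python
def adjacency(cor, beta):
    """Unsigned soft-threshold adjacency: |cor|**beta, with a zero diagonal."""
    n = len(cor)
    return [
        [abs(cor[i][j]) ** beta if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]


def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    n = len(adj)
    k = [sum(row) for row in adj]  # connectivity; the diagonal is zero
    tom = [[1.0] * n for _ in range(n)]  # diagonal set to 1 by convention
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            shared = sum(
                adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j)
            )
            tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1 - adj[i][j])
    return tom


# Toy correlation matrix for three genes, for illustration only.
toy_cor = [
    [1.0, 0.9, 0.8],
    [0.9, 1.0, 0.7],
    [0.8, 0.7, 1.0],
]
toy_adj = adjacency(toy_cor, beta=2)  # beta is a placeholder value
toy_tom = topological_overlap(toy_adj)
print(round(toy_tom[0][1], 4))
```

Two genes receive a high overlap when they are strongly connected to each other and also share strongly connected neighbors, which is what makes TOM more robust than raw correlation for module detection.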

We divided the patients into four groups: UPR_high_TIME_high, UPR_high_TIME_low, UPR_low_TIME_high, and UPR_low_TIME_low. We analyzed the correlation between these four patient groups and the WGCNA modules and selected the genes in the pink module.
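A module-trait correlation of this kind is commonly computed by encoding each group as a 0/1 indicator and correlating it with the module eigengene. The Python sketch below illustrates that idea under this assumption. The eigengene values, patient identifiers, and helper names are hypothetical, and the code is not the WGCNA package's own routine.

```python
from math import sqrt


def combine_groups(upr_group, time_group):
    """Build labels such as 'UPR_high_TIME_low' from two high/low assignments."""
    return {
        pid: f"UPR_{upr_group[pid]}_TIME_{time_group[pid]}"
        for pid in upr_group
        if pid in time_group
    }


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical group assignments and module eigengene values.
upr = {"P1": "high", "P2": "high", "P3": "low", "P4": "low"}
tme = {"P1": "low", "P2": "low", "P3": "high", "P4": "high"}
eigengene = {"P1": 0.5, "P2": 0.4, "P3": -0.3, "P4": -0.6}

labels = combine_groups(upr, tme)
order = sorted(labels)


def trait_correlation(label):
    """Correlate the eigengene with a 0/1 indicator for one group label."""
    indicator = [1.0 if labels[p] == label else 0.0 for p in order]
    return pearson([eigengene[p] for p in order], indicator)


r_high_low = trait_correlation("UPR_high_TIME_low")
r_low_high = trait_correlation("UPR_low_TIME_high")
print(round(r_high_low, 3), round(r_low_high, 3))
```

A module that is strongly positive for one group and strongly negative for the opposite group, as in this toy case, is the pattern used to pick a module of interest.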

Single-cell analysis

We calculated the scores of UPR-immune-related genes (UPR-immune-score, UI-score) in GSE132465 [17] with the AddModuleScore function (R package “Seurat”, Version 4.3.0). CellChat was used to evaluate cellular communication (R package “CellChat”, Version 1.6.1) [18].

Tumor immune dysfunction and exclusion analysis

We employed Tumor Immune Dysfunction and Exclusion (TIDE, http://tide.dfci.harvard.edu) to predict patient response to immunotherapy.

Screening of hub genes

To further screen the UI-related genes, we used LASSO analysis (R package “glmnet”, Version 4.1.6), random forest analysis (R package “randomForest”, Version 4.7.1.1), and support vector machine–recursive feature elimination (SVM-RFE) (R package “e1071”, Version 1.7.13). For all three algorithms, the two classes were tumor samples and normal samples.

Cell culture

HCT116 and HCT8 cells were purchased from the Cell Bank of the Chinese Academy of Sciences. Lenti-X™ 293T cells were purchased from Takara Biomedical Technology Co., Ltd. (Beijing, China). HCT116 cells were cultured in McCoy’s 5A medium, while the Lenti-X™ 293T and HCT8 cells were cultured in DMEM. All media were supplemented with 10% fetal bovine serum, 2 mM L-glutamine, 100 U/mL penicillin, and 100 mg/mL streptomycin, and the cells were maintained at 37 °C in a humidified incubator with 5% CO2.

Lentivirus transduction and plasmid

Lentiviral plasmids (pLKO.1-puro-shATF6, pLKO.1-puro-shPERK, pLKO.1-puro-shIRE1), together with pCMV-dR8.2 and pCMV-VSV-G, were transfected into Lenti-X™ 293T cells. After transduction, cells were selected with 2 µg/mL puromycin for 7 days to establish stable cell lines. An unrelated shRNA sequence targeting luciferase was used as a negative control (shLUC). The shRNA sequences are listed as follows:

ATF6 Sense: GCAGCAACCAATTATCAGTTT

ATF6 Antisense: AAACTGATAATTGGTTGCTGC

PERK Sense: GTTGTGCTAGCAACCCTAATA

PERK Antisense: TATTAGGGTTGCTAGCACAAC

IRE1 Sense: CCCATCAACCTCTCTTCTGTA

IRE1 Antisense: TACAGAAGAGAGGTTGATGGG

Luciferase Sense: CGCTGAGTACTTCGAAATGTC

Luciferase Antisense: GACATTTCGAAGTACTCAGCG
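As a consistency check on the listed sequences, each antisense strand should be the reverse complement of its sense strand. The short Python sketch below performs that check on the sequences given above. The helper name `reverse_complement` is illustrative and not part of the authors' workflow.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# (sense, antisense) pairs as listed in the text.
shrna_pairs = {
    "ATF6": ("GCAGCAACCAATTATCAGTTT", "AAACTGATAATTGGTTGCTGC"),
    "PERK": ("GTTGTGCTAGCAACCCTAATA", "TATTAGGGTTGCTAGCACAAC"),
    "IRE1": ("CCCATCAACCTCTCTTCTGTA", "TACAGAAGAGAGGTTGATGGG"),
    "Luciferase": ("CGCTGAGTACTTCGAAATGTC", "GACATTTCGAAGTACTCAGCG"),
}

pair_is_consistent = {
    name: reverse_complement(sense) == antisense
    for name, (sense, antisense) in shrna_pairs.items()
}
print(pair_is_consistent)
```

All four pairs pass this check, and every strand is 21 nucleotides long.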

Western blot analysis

Proteins were separated by SDS-PAGE and then electrotransferred to a nitrocellulose membrane. The membranes were blocked with 5% nonfat milk at room temperature and then incubated overnight at 4 °C with primary antibody. Subsequently, the membranes were incubated with the appropriate HRP-labeled secondary antibody. A Tanon-5200 chemiluminescence apparatus (Tanon Science & Technology Co. Ltd., Shanghai, China) was used to visualize the bands. The primary antibodies used in this study were ATF6 (24169-1-AP, Proteintech, USA), PERK (24390-1-AP, Proteintech, USA), IRE1 (27528-1-AP, Proteintech, USA), and TIMP1 (ab211926, Abcam).

Immunohistochemistry (IHC)

In this study, paraffin tissues were obtained from the Northern Jiangsu People’s Hospital. The Ethics Committee of the Northern Jiangsu People’s Hospital approved the study.

Cell growth curves

Cell growth curves were drawn by counting cells after different culture times.

Statistical analysis

R 4.1.3 and GraphPad Prism 8.0.2 were used for data analysis and visualization. P < 0.05 was considered statistically significant.

Results

The combination of UPR and immune cells affected the prognosis of colon cancer patients

A total of 2867 differentially expressed genes were identified in the TCGA dataset with |logFC| > 1.5 and adjusted p-value < 0.05 (Supplementary File 1). Intersecting these genes with the UPR-related gene set yielded 511 UPR-related differential genes (Supplementary File 2). Univariate Cox analysis identified 121 UPR-related genes associated with the prognosis of colon cancer patients, and LASSO Cox analysis reduced these to 30 key prognostically relevant genes (Fig. 2A-B; Supplementary File 3). Stepwise Cox analysis, which selects the best candidate Cox proportional hazards model by evaluating combinations of candidate genes together with their coefficients, was then applied to the 30 genes and yielded 13 key UPR-related genes, which are presented as a bar chart (Fig. 2C). We calculated the coefficient of each gene by multivariate Cox analysis (Supplementary File 4). We calculated the infiltration abundance of 28 immune cell types in the TCGA dataset by ssGSEA and identified the immune cells associated with prognosis by univariate Cox analysis. The coefficients of the prognostically relevant immune cells were obtained by multivariate Cox analysis (Supplementary File 5). Kaplan–Meier analysis showed that patients in the high UPR_score group had a worse prognosis, as did patients in the low TIME_score group (Fig. 2D-E). We further categorized patients into four groups by combining UPR_score and TIME_score and found that patients with a high UPR_score combined with a low TIME_score had a worse prognosis (Fig. 2F). The same result was found in the GSE39582 dataset (Fig. 2G). The ROC curves showed AUCs of 0.7, 0.631, and 0.652 at three, five, and seven years in the TCGA dataset, and 0.632, 0.585, and 0.542 in GSE39582 (Fig. 2H-I). These results suggest that the UPR, in combination with immune cells, may influence the prognosis of colorectal cancer patients. However, the AUCs of the ROC curves were suboptimal, and the model requires further validation.
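To make the AUC values easier to interpret, the Python sketch below computes a plain AUC as the probability that a randomly chosen patient with the event has a higher score than a randomly chosen patient without it. This is a simplification: it treats the outcome at a fixed time point as a known 0/1 label and ignores censoring, which time-dependent ROC methods account for. The function name and toy data are hypothetical.

```python
def pairwise_auc(scores, labels):
    """AUC as the fraction of (event, non-event) pairs ranked correctly.

    Ties between an event score and a non-event score count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy risk scores and outcomes (1 = event by the time point), illustration only.
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
toy_labels = [1, 0, 1, 0, 0]
toy_auc = pairwise_auc(toy_scores, toy_labels)
print(round(toy_auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why values between roughly 0.54 and 0.70 indicate only modest discrimination.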

Fig. 2
UPR combined with immune cells to determine the prognosis of colorectal cancer patients. (A) Ten-fold cross-validation for tuning parameter selection in the LASSO-Cox model by R package “glmnet”. (B) LASSO coefficient profiling by R package “glmnet”. (C) Genes selected by stepwise Cox analysis (R package “My.stepwise”). (D) Survival curves comparing high-risk and low-risk groups based on UPR_score by R package “survival” in TCGA. (E) Survival curves comparing high-risk and low-risk groups based on TIME_score by R package “survival” in TCGA. (F) Survival curves comparing high-risk and low-risk groups based on TIME_score and UPR_score by R package “survival” in TCGA. (G) Survival curves comparing high-risk and low-risk groups based on TIME_score and UPR_score by R package “survival” in GSE39582. (H) ROC curve of 1, 3, 5-year survival by R package “timeROC” in TCGA. (I) ROC curve of 1, 3, 5-year survival by R package “timeROC” in GSE39582. UPR: unfolded protein response; TCGA: The Cancer Genome Atlas; ROC: Receiver Operating Characteristic; AUC: Area Under the Curve

Screening of prognostically relevant UPR-Immune-related genes by WGCNA and validation of their role in the microenvironment by single-cell data

WGCNA is used to find gene modules composed of highly correlated genes, to summarize these modules by their module signature genes or key genes, and to correlate the modules with clinical sample traits. We used WGCNA to select the genes most relevant to the UPR and the TIME, focusing on the UPR_high + TIME_low group and the UPR_low + TIME_high group. The pink module was most highly correlated with the UPR_high + TIME_low group and least correlated with the UPR_low + TIME_high group. Therefore, we selected the 100 genes in the pink module as genes associated with the UPR and immune cells (Fig. 3A, B; Supplementary File 6). Univariate Cox analysis identified 13 of these 100 genes as associated with prognosis (Supplementary File 7). Single-cell data showed significant changes in the proportions of cell types in colon cancer compared with normal tissue (Fig. 3C). The UI score was calculated at the single-cell level by applying the AddModuleScore function to the screened UPR-immune-related genes. The expression of UPR-immune-related genes changed significantly in stromal cells, which also showed the largest difference (Fig. 3D). Therefore, we categorized stromal cells into UPR-immune_high (UI_high) and UPR-immune_low (UI_low) stromal cells based on the UI score. Cell communication analysis was used to examine the communication between the target cell population and other cell populations. The number of interactions with bone marrow cells was higher for UI_low stromal cells than for UI_high stromal cells, whereas the communication strength with epithelial cells was greater for UI_high stromal cells (Fig. 3E, F). Visualization of the intercellular communication pathways showed that UI_high stromal cells communicated more frequently with other cells in the tumor microenvironment, especially with epithelial cells (Fig. 3G, H).

Fig. 3
Screening of prognostically relevant UPR-Immune-related genes by WGCNA and validation of their role in the microenvironment by single-cell data. (A) Dynamic Tree and hierarchical clustering modules with different colors by R package “WGCNA”. (B) Heatmap of correlation between modules and clinical features by R package “WGCNA”. (C) TSNE map of single-cell data by R package “Seurat”. (D) Expression of UIscore in the tumor microenvironment by R package “Seurat”. (E-F) The number of cell-to-cell interactions and the total interaction strength by R package “CellChat”. (G) Pathway analysis between cells by R package “CellChat”. (H) Interaction strength analysis by R package “CellChat”. UIscore: score of UPR-immune related genes; TSNE: T-distributed Stochastic Neighbor Embedding

TIMP1, CAV1, SERPINH1 were further screened from UPR-Immune-related genes by machine learning algorithms

We screened the UPR-immune-related genes by WGCNA, identified the prognosis-related genes among them by univariate analysis, and verified their role in the tumor microenvironment by single-cell analysis. To narrow these down to the key genes, we used three machine learning algorithms that are commonly applied to gene screening: LASSO, random forest, and SVM. The LASSO algorithm selected eight genes (TIMP1, SERPINH1, NOTCH3, FSTL3, CAV1, SFXN3, SNAI1, S1PR5) (Fig. 4A-B). The random forest algorithm selected six genes (TIMP1, CAV1, SERPINH1, SFXN3, SERPINE1, SNAI1), and the SVM algorithm selected five genes (TIMP1, SERPINH1, CAV1, NOTCH3, S1PR5) (Fig. 4C-E). The intersection of the three algorithms yielded three core genes (TIMP1, CAV1, SERPINH1) (Fig. 4F). KM curve analysis showed that all three genes were associated with the prognosis of colon cancer patients (Fig. 4G-L).
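The final intersection step can be reproduced directly from the three gene lists reported above. The Python sketch below uses plain set intersection. It illustrates only the Venn-diagram logic and does not rerun the LASSO, random forest, or SVM-RFE selections themselves.

```python
# Gene lists as reported in the text for each algorithm.
lasso_genes = {
    "TIMP1", "SERPINH1", "NOTCH3", "FSTL3", "CAV1", "SFXN3", "SNAI1", "S1PR5",
}
random_forest_genes = {
    "TIMP1", "CAV1", "SERPINH1", "SFXN3", "SERPINE1", "SNAI1",
}
svm_genes = {"TIMP1", "SERPINH1", "CAV1", "NOTCH3", "S1PR5"}

# Genes selected by all three algorithms.
core_genes = lasso_genes & random_forest_genes & svm_genes
print(sorted(core_genes))
```

Requiring agreement across all three methods is a conservative choice: a gene such as NOTCH3, selected by LASSO and SVM but not by random forest, is excluded.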

Fig. 4
Screening of hub UPR-immune-related genes. (A) Ten-fold cross-validation for tuning parameter selection in the LASSO-Cox model by R package “glmnet”. (B) LASSO coefficient profiling by R package “glmnet”. (C) Random forest by R package “randomForest”. (D-E) SVM-RFE algorithm for feature selection by R packages “e1071” and “randomForest”. (F) Venn diagram of the LASSO-Cox analysis, random forest analysis, and the SVM-RFE algorithm by R package “ggvenn”. (G-L) Survival curves for TIMP1, CAV1, and SERPINH1 in TCGA and GSE39582 by R package “survival”. LASSO: the Least Absolute Shrinkage and Selection Operator; SVM-RFE: Support Vector Machine–Recursive Feature Elimination

Knockdown of TIMP1 inhibited proliferation and migration of colon cancer cell lines

We ultimately screened three core genes (TIMP1, CAV1, SERPINH1) from the UPR-immune-related genes. TIMP1 had already appeared in the initial screening and had the highest importance in the random forest algorithm (Figs. 2C and 4C). Previous studies have investigated the links between CAV1 and SERPINH1 and both the UPR and immune cells. In contrast, although TIMP1 has been studied in relation to immune cells, its relationship with the UPR is still unknown. Therefore, we chose TIMP1 as the subject of this study. We constructed TIMP1 knockdown colon cancer cell lines (Fig. 5A, B). TIMP1 knockdown decreased the proliferation of HCT116 and HCT8 cells (Fig. 5C, D). In addition, a transwell assay showed that TIMP1 knockdown reduced colon cancer cell migration (Fig. 5E). These findings suggest that TIMP1 affects the proliferation and migration of colon cancer cells.

Fig. 5
TIMP1 regulated colorectal cancer growth and migration. (A-B) The levels of TIMP1 protein were determined by western blot analysis in HCT116 and HCT8. (C-D) Cell growth curves. (E) Transwell analysis to examine cell migration

TIMP1 was regulated by key UPR genes

There are three key regulatory genes in the UPR pathway: ATF6, PERK, and IRE1. To explore the connection between TIMP1 and the UPR, we constructed cell lines with knockdown of ATF6, PERK, or IRE1. TIMP1 expression was decreased in HCT116 and HCT8 cells after knockdown of ATF6, PERK, or IRE1 (Fig. 6A, B). Immunohistochemistry showed similar expression patterns of TIMP1 and of ATF6, PERK, and IRE1 in colon cancer tissues (Fig. 6C). These results suggest that TIMP1 is regulated by the key UPR genes.

Fig. 6
TIMP1 was regulated by ATF6, PERK, IRE1. (A-B) The levels of TIMP1 protein were determined by western blot analysis. (C) Immunohistochemistry analysis of TIMP1, ATF6, PERK, and IRE1

Discussion

In this study, we first found that a high UPR_score and a low TIME_score were each associated with poor patient prognosis. After combining UPR_score and TIME_score, we found a significant difference in prognosis between patients in the UPR_high + TIME_low group and those in the UPR_low + TIME_high group. Therefore, we used WGCNA to screen the genes related to the UPR_high + TIME_low and UPR_low + TIME_high groups, and used univariate analysis to identify the genes among them related to patient prognosis. Next, we verified the relationship between the screened genes and the immune microenvironment using a single-cell dataset. We then narrowed down the screened genes with machine learning algorithms and obtained TIMP1, CAV1, and SERPINH1. Because the links between CAV1 and SERPINH1 and both the UPR and immune cells have already been investigated, whereas the relationship between TIMP1 and the UPR is still unknown, we chose TIMP1 as the subject of this study. We experimentally verified the function of TIMP1 in colorectal cancer and its relationship with the key UPR genes.

Previous studies have shown that cancer cells undergoing the UPR can dramatically alter the recruitment and function of immune cells in the tumor microenvironment, coordinating various immune evasion mechanisms to promote malignant progression [19]. Our results are consistent with these studies. In the present study, patients with high UPR scores and low TIME scores had a poorer prognosis. After combining the two scores, we found an opposing relationship between the UPR score and the TIME score, which suggests that the UPR promotes the malignant behavior of the tumor whereas immune cells have a protective effect.

We finally screened three key UPR-immune-related genes (TIMP1, CAV1, SERPINH1). CAV1 is localized at the endoplasmic reticulum-mitochondrial interface, where it impairs the remodeling of endoplasmic reticulum-mitochondrial contacts, reduces protein kinase A signaling, and thereby enhances cell death [20]. Moreover, hypoxia-induced high expression of CAV1 enhanced the metastatic potential of hepatocellular carcinoma [21]. Schönle et al. found that CAV1 modulates TCR signaling strength and the differentiation of regulatory T cells into alloreactive T cells [22]. These studies suggest that CAV1 is associated with the UPR and immune cells, which is consistent with our findings.

SERPINH1 is a molecular chaperone of collagen that is located in the endoplasmic reticulum and plays a role in the proper folding of collagen [23]. Many studies have shown that HSP47, the protein encoded by SERPINH1, can promote the proliferation and metastasis of cancer cells [24, 25]. Recently, researchers have found that SERPINH1 can interact with IRE1α to sustain cancer cell growth and enhance metastatic potential [26, 27]. Interestingly, Miyamura et al. found that the presence of a large number of SERPINH1-positive fibroblasts in the stroma of lung cancer was a risk factor for postoperative recurrence [28]. Similar results were obtained in COAD [29], which supports the results of the single-cell analysis in this study.

TIMP1 belongs to the family of tissue inhibitors of metalloproteinases, which inhibit the proteolytic activity of matrix metalloproteinases [30]. Recently, researchers have found that TIMP1 promotes the progression of clear cell renal cell carcinoma by regulating anoikis to shape an immunosuppressive microenvironment [31]. TIMP1 overexpression drives lung adenocarcinoma progression through CD63 expression in tumor-associated fibroblasts [32]. In addition, tumor-exposed neutrophils can promote cancer metastasis via a CD90-TIMP1 proximal-paracrine loop [33].

Taken together, CAV1 and SERPINH1 have already been studied in conjunction with both the UPR and the immune microenvironment, and TIMP1 has been shown to be associated with the immune microenvironment. However, the link between TIMP1 and the UPR has not yet been studied, which is why we selected TIMP1 for further investigation. We verified the function of TIMP1 and found that it could promote the proliferation and migration of colorectal cancer cells. We then knocked down the key UPR genes and found that TIMP1 expression decreased after each knockdown, which indicates that TIMP1 is regulated by the key UPR genes in colorectal cancer. These results suggest that TIMP1 is a prognostic marker for colorectal cancer that is regulated by the UPR.

This study has several limitations. First, the prognostic model we constructed was not ideal, with AUCs between 0.542 and 0.7. Second, the final selection of genes is not sufficiently justified. Finally, the overall logic of the study needs to be strengthened.

Conclusion

By integrating the UPR with immune cells, we found that TIMP1 is a potential prognostic marker, and we experimentally confirmed that TIMP1 is regulated by the key UPR genes.