Background

Pancreatic cancer is a common malignant tumor of the digestive tract, which has the characteristics of difficult early diagnosis, rapid metastasis, and poor therapeutic effect. It is one of the most invasive and fatal malignant tumors in the world, and the prognosis is very poor. The five-year survival rate is less than 8% [1]. Surgery is the most important and comprehensive treatment for pancreatic cancer, and most patients have recurrence and metastasis earlier after operation. It is difficult to determine the overall postoperative annual survival rate. Postoperative adjuvant radiotherapy and chemotherapy aim to kill residual tumor cells to improve the curative effect; however, due to the difficulties of early diagnosis and the lack of effective adjuvant treatment, the effect is very little [2, 3]. Up to now, the prediction and prognosis of pancreatic cancer mainly depend on histopathological diagnosis and tumor staging system. However, traditional methods are not enough to accurately evaluate the prognosis of patients with pancreatic cancer and cannot meet the needs of clinicians to assist in diagnosis and treatment [4]. Therefore, it is imperative to develop reliable and accurate prognostic biomarkers to help clinicians optimize treatment strategies.

With the increased understanding on the key role of immune system in the development of pancreatic cancer, immunotherapy such as chimeric antigen receptor T cell therapy and immune checkpoint inhibitor (ICBs) is very promising for the treatment of pancreatic cancer patients. Exciting results have been achieved in preclinical and clinical trials of pancreatic cancer [5, 6]. However, according to the latest research, the efficacy of immunotherapy is affected by many factors, such as tumor microenvironment, microsatellite instability, and tumor mutation load. In addition, some patients develop drug resistance after a period of remission, and even the vast majority of patients are not sensitive to ICB [7]. To develop better strategies to enhance immunity, researchers have conducted several in-depth studies on the matrix response of pancreatic cancer and its interaction with the immune microenvironment of pancreatic cancer. Riquelme et al. found that the development of pancreatic cancer in the mouse model is related to the enrichment of specific strains of bacteria in the intestinal tract and tumor. These strains induce tolerance and immunosuppressive microenvironment, which is conducive to the progression of cancer and resistance to immunotherapy. Eliminating microorganisms with antibiotics can reshape the tumor microenvironment, induce T cell activation, improve immune monitoring, and improve the sensitivity of immunotherapy to established tumors [8]. Loss or inhibition of CXCR2 can improve T cell entry, and combined inhibition of CXCR2 and PD1 in mice with confirmed disease can significantly prolong survival and inhibit pancreatic tumorigenesis and pancreatic cancer metastasis [9]. Winograd et al. found that in pancreatic cancer (a non-immunogenic tumor), the baseline refractory of checkpoint inhibitors can be saved by inducing T cell response with α-CD40/ chemotherapy [10]. These studies have shown that the efficacy of immunotherapy for pancreatic cancer can be significantly improved by changing the tumor immune microenvironment, such as increasing the density of intratumoral effector T cells or inhibiting immunosuppressive cells and receptors. However, the immune microenvironment of pancreatic cancer and its role in the immune escape of cancer cells still need to be understood. It is also urgent to find more new targets to regulate the immune microenvironment to improve the efficacy of immune checkpoint therapy [11].

In recent years, gene expression database has been used to mine valuable therapeutic genes, identify promising prognostic factors, and analyze the molecular mechanisms of various cancers. We used univariate Cox proportional hazard regression analysis, the least absolute shrinkage and selection operator (LASSO), and multivariate COX regression analysis to screen 8 genes related to prognosis from the 314 immune-related genes in the TCGA pancreatic cancer cohort. These can be used as the potential prognostic indicators of pancreatic cancer. The genes were used to establish the optimal risk model; survival analysis, univariate Cox proportional hazard regression analysis, and multivariate Cox proportional hazard regression analysis were used to evaluate the prognostic value of risk score. ROC curve and principal component analysis (PCA) were used to evaluate the accuracy of the model. Patients were divided into high-risk and low-risk groups according to the median risk score. Gene ontology (GO), Kyoto gene and genome encyclopedia (KEGG), and gene set enrichment analysis (GSEA) were used to explore the differences of key signal pathways between high-risk and low-risk groups. Single sample gene set enrichment analysis (ssGSEA) method was used to quantify immune cell infiltration, and the relationship between immune risk genes and tumor immune microenvironment was analyzed. The objectives of this study are to provide a feasible method to evaluate the prognosis of pancreatic cancer, to provide a powerful means of tumor prevention and treatment for regulating the body’s immunity against tumor, and to add new content in the development of new adjuvant drugs targeted at tumor immunotherapy.

Methods

Data collection

IMMUNE_RESPONSE and IMMUNE_SYSTEM_PROCESS2 immune gene sets were obtained from Molecular Signatures Database (MSigDB), and a total of 314 immune-related genes were obtained. mRNA expression data and clinical data of 183 pancreatic cancer samples were obtained from the Cancer Genome Atlas (TCGA) database(https://portal.gdc.cancer.gov/).

Construction of prognostic signature for TCGA pancreatic cancer cohorts

Univariate Cox regression analysis of 314 immune-related genes was carried out to analyze the genes significantly related to the overall survival (OS) of pancreatic cancer. Then LASSO regression analysis was performed on these immune genes. LASSO regression analysis reduced the dimensionality of high-dimensional data by the sum of the absolute values of the limiting coefficients less than the predetermined value, and the variables with relatively small contributions will be given zero coefficients. Only the genes with non-zero coefficient in LASSO regression analysis were selected for further multivariate Cox regression analysis, and the resulting genes were used to build a predictive model. The risk score of each patient was calculated according to the mRNA expression level and risk coefficient of each risk gene, which was calculated by the following equation:

$$\begin{aligned} {\text{risk score}} & = {\text{Expression}}_{{{\text{mRNA1}}}} \times {\text{Coefficient}}_{{{\text{mRNA1}}}} + {\text{Expression}}_{{{\text{mRNA2}}}} \\ & \quad \times {\text{Coefficient}}_{{{\text{mRNA2}}}} + \cdots {\text{Expression}}_{{{\text{mRNAn}}}} \times {\text{Coefficient}}_{{{\text{mRNAn}}}} \\ \end{aligned}$$

Taking the median risk score as the cut-off value, the patients were divided into high-risk and low-risk groups for follow-up analysis.

Pathway analysis

The R language “EdgeR” package calculation was used to analyze the difference of mRNA between low-risk and high-risk groups. mRNA with FDR value less than 0.05 was annotated by Gene Ontology (GO, http://geneontology.org/), and the biological functions of different genes, including biological processes, cellular components, and molecular functions, were analyzed. The Kyoto Encyclopedia of genes and Genomes (KEGG, https://www.genome.jp/kegg/) analyzes a variety of biological information, including metabolic pathways, predicts the function of genes, and analyzes the roles of proteins and other macromolecules. The metabolic pathways and signal transduction pathways of significant enrichment in pancreatic cancer were revealed by pathway enrichment analysis (FDR < 0.05). Then, we performed Gene Set Enrichment Analysis (GSEA, http://software.Broadstitute.org/GSEA/) to reveal the signal pathways and biological processes in which differentially expressed genes were enriched between high-risk and low-risk subpopulations. The "ClusterProfiler" R package was used for visualization [12].

Calculate the composition of tumor immune microenvironment

To explore the role of immune genes and tumor immune microenvironment, we quantified the level of immune cell infiltration in TCGA pancreatic cancer cohort(TCGA-PAAD) samples. According to the immune cell marker genes, provided by Bindea et al. [13], we used R language "GSVA" package according to the expression of immune cell marker genes in TCGA-PAAD. Single-sample gene set enrichment analysis (ssGSEA) was used to quantify the infiltration level of 24 types of immune cells in the sample, such as T lymphocytes, dendritic cells, and natural killer cells [14]. Pearson's method was used to calculate the correlation between risk genes and immune cells. The TIMER database was used for verification [15].

Statistical analysis

All statistical analyses were carried out by R programming language (https://www.r-project.org/). R language "Survival" package and "survminer" package were used to draw Kaplan–Meier curve. Univariate and multivariate Cox proportional hazard regression analyses were also used to evaluate the relationship between risk scores and OS. ROC analysis was used to examine the sensitivity and specificity of survival prediction using the gene signature risk score. An area under the ROC curve (AUC) served as an indicator of prognostic accuracy. A P-value of less than 0.05 was set as statistically significant for all the analyses.

Results

Acquisition of immune risk genes

A total of 314 immune-related genes were collected from MSigDB. The expression level and prognosis data of PAAD-related genes were obtained from the TCGA database. Univariate Cox regression analysis was carried out. A total of 109 immune genes with P < 0.05 were selected for lasso regression analysis. When 16 immune gene models were obtained according to the lambda.min value, the performance of 16 immune gene model was the best (Fig. 1b). The 16 genes were analyzed by multivariate COX regression analysis. A total of 8 immune genes (ITGA7, RBM14, DENND4B, LQK1, ZNF709, COL7A1, SP1, NCBP2) were identified as independent prognostic factors of pancreatic cancer, which were used to construct a clinical predictive model (as shown in Table 1). The expression profiles of these genes in high- and low-risk groups are shown by heat map (Fig. 2c), and the Kaplan–Meier curves of these genes are drawn by R language "Survival" package and "survminer" package. The survival time of patients with high expression of ITGA7, RBM14, DENND4B, LQK1, and ZNF709 was significantly longer than that of the patients with low expression, and the prognosis of patients with high expression of COL7A1, SP1, and NCBP2 was worse (Fig. 3).

Fig. 1
figure 1

The establishment and verification of the model a LASSO coefficient profiles of the 109 immune-genes in TCGA-PAAD. b A coefficient profile plot was generated against the log (lambda) sequence. c Kaplan–Meier analysis of TCGA-PAAD patients stratified by the median risk score. d The sensitivity and specificity of the ROC curve were used to evaluate the model

Table 1 Univariate Cox regression analysis and multivariate COX regression analysis results of 8 immune genes
Fig. 2
figure 2

a The black dotted line is the best dividing line for dividing patients into low-risk and high-risk groups. b The distribution of survival status and survival time of patients with TCGA-PAAD. c Heatmap of immune gene expression profile in prognostic markers

Fig. 3
figure 3

Kaplan–Meier analysis of the effects of 8 immune genes on overall survival in TCGA-PAAD patients

Construction and verification of prognostic signature for TCGA-PAAD cohorts

The risk score of each patient was calculated according to the mRNA expression level of each risk gene and the risk coefficient. With the median risk score as the cut-off value, patients were divided into high- and low-risk groups. Kaplan–Meier accumulation curve showed that patients with low-risk score survived significantly longer than those with high-risk score (Fig. 1c). In the TCGA-PAAD cohort, the area under the curve (AUC value) of the risk score was 0.755 (Fig. 1d). Subsequently, we evaluated the prognostic value of the risk score, and univariate COX analysis showed that the TCGA-PAAD risk score was significantly correlated with the overall survival (OS) (HR 1.406, 95%CI 1.237–1.598, P < 0.001) (Fig. 4a). Multifactorial analysis showed that the risk score was an independent predictive factor (HR 1.400, 95%CI 1.287–1.522, P < 0.001) (Fig. 4b). Finally, based on the whole genome expression set, total immunity gene set, and immune risk gene set, we used principal component analysis (PCA) to study the different distribution patterns between low- and high-risk population. When PCA was performed based on the whole genome expression profile, there was no significant segregation in immune status in each group (Fig. 4c). According to the immune risk gene set, low-risk group and high-risk group tended to be divided into two groups (Fig. 4d). The risk score distribution of PAAD patients and the relationship between risk score and survival time are shown in Fig. 2. With the increase of risk value, the number of deaths increases significantly, and the survival time of low-risk group is significantly higher than that of the high-risk group.

Fig. 4
figure 4

Univariate Cox proportional hazard regression analysis (a) and multivariate Cox proportional hazard regression analysis (b) explored the correlation between risk score, age, sex, grade, T, N, M, smoking and OS. c The expression patterns of grouped samples were analyzed by PCA using all mRNAs. d The expression patterns of grouped samples were analyzed by PCA using Prognostic signature

Signaling pathway analysis

To explore the potential signal pathways related to the immune risk genes, we used edgeR package calculation to analyze the difference of mRNA between low-risk group and high-risk group and also to analyze the signal pathway of mRNA whose FDR value was less than 0.05. GO analysis showed that these genes could be classified into several basic biological processes, including biological development, hormone secretion, synthesis of transmembrane transporter complex, and important transmembrane transporter activity (Fig. 5a). KEGG analysis showed that these genes mainly interact with chemical carcinogenic, neuroactive ligand receptor interaction and cAMP signaling pathways (Fig. 5b, Table 2). By calculating the multiplication of mRNA expression levels of all protein-coding genes between high-risk group and low-risk group, and using GSEA analysis, it was found that the altered genes were significantly enriched in several common pathways. The high-risk group was positively correlated with epithelial-mesenchymal transformation, glycolysis, MTORC1 signal pathway, p53 channel, hypoxia, apoptosis, E2F targets, MYC targets v1, MYC targets v2, TNFA Signaling via NFKB, and inflammatory response, while the low-risk group was positively correlated with pancreas beta cells, allograft rejection and bile acid metabolism (Fig. 5c, Table 3).

Fig. 5
figure 5

Signaling pathway analysis a GO signal pathway enrichment analysis of differential mRNA between low-risk group and high-risk group b KEGG signal pathway enrichment analysis was used to analyze the difference of mRNA between low-risk group and high-risk group. c GSEA signal pathway enrichment analysis was used to analyze the difference of mRNA between low-risk group and high-risk group

Table 2 KEGG analysis of the main enriched signal pathways of differential genes between low-risk group and high-risk group
Table 3 GSEA analysis of main enrichment pathways of differential genes between low-risk group and high-risk group

Correlation with tumor immune microenvironment

After quantifying 24 types of immune cells, Pearson test was used to calculate the correlation between risk genes and immune cells. DENND4B, ITGA7, and T cell lines such as T helper cells, Tcm, Tem, T cells, CD8 T cells, and TReg were highly positively correlated, whereas it was s negatively correlated with Th2 cells. RBM14 was negatively correlated with macrophages infiltration level, ZNF709 was significantly and positively correlated with TFH, and NCBP2 was positively correlated with Th2 cells (Fig. 6a).We observed that the infiltration levels of DC, iDC, pDC, B cells, T cells, Tcm, Tem, TFH, Th17 cells, and cytotoxic cells in the low-risk group were significantly higher than those in the high-risk group.Th1 cells and NK CD56dim cells in the high-risk group increased significantly compared with the low-risk group (Figs. 6b, 7b). Correlation analysis showed that DENND4B and ITGA7 were highly positively correlated with T helper cells, Tcm, Tem, T cells, CD8 T cells, and TReg. ZNF709 and SP1 were highly positively correlated with CD8 T cells, macrophage, and DC (Fig. 6a). TIMER database was used to further verify our results. DENND4B, ITGA7 and ZNF709, SP1 and CD4 T cells, Neutrophil, DC and other cells are highly positively correlated. ZNF709 and SP1 are positively correlated with CD8 T cells, Macrophage, DC and other cells (Fig. 7A).

Fig. 6
figure 6

Correlation between prognostic markers and tumor immune microenvironment. a Correlations of 8 immune gene mRNAs and infiltration levels of 24 immune cells. b The heatmap shows the difference in infiltration levels of 24 immune cells in the high- and low-risk groups

Fig. 7
figure 7

a The correlation between immune genes and immune cells was verified by using the TIMER database. Each point represents a sample, and the blue line represents the relationship between the expression level of each gene and the immune cell content. b Comparison of the difference in immune cell infiltration levels between high-risk group and low-risk group, ****P ≦ 1e−04, ***P ≦ 0.001, **P ≦ 0.01, *P ≦ 0.05, ns P ≧ 1

Discussion

Pancreatic cancer is one of the most malignant tumors of the digestive tract. The 5-year survival rate of patients with pancreatic cancer is less than 8%, and the average survival rate is less than 6 months. Considering that the mortality rate has been high in the recent decades, there is an urgent need to find effective biomarkers to promote and evaluate the diagnosis, treatment, and prognosis of pancreatic cancer [16,17,18]. The immune response in the microenvironment plays a decisive role in the progression, metastasis, and recurrence of tumors. Studies have confirmed that the low-immunogenic and immunosuppressive tumor microenvironment of pancreatic cancer is an important cause of poor prognosis. Following surgery, chemotherapy, and radiotherapy, immunotherapy has been considered as the fourth mode of cancer treatment. More and more clinical data show that cancer immunotherapy is a key step in clinical cancer treatment [19, 20]. Cancer management methods, especially non-small cell lung cancer, melanoma, urothelial cancer, and kidney cancer, and some preclinical and clinical trials have confirmed that immunotherapy has achieved encouraging results in many malignancies, including pancreatic cancer [21,22,23]. However, based on available data, it is clear that ICB has limited success in pancreatic cancer, which is also related to the low immunogenicity and immunosuppressive tumor microenvironment of pancreatic cancer [24, 25]. Based on the important role of immune response in pancreatic cancer, it is urgent to find new targets to provide powerful tumor prevention and control methods for regulating the body's immunity against tumors, and to add new content in the development of new adjuvant drugs targeted at tumor immunotherapy.

Gene markers, also known as classifiers, are often used to predict prognosis, sometimes even better than TNM staging methods, and have been reported in a variety of cancers [26,27,28,29]. Given the importance of immunity in pancreatic cancer, it is reasonable to speculate that immune-related genes have great promise in predicting prognosis. Multigene signals obtained from reliable algorithms will be superior to single molecules in predicting OS in pancreatic cancer. We constructed and validated a new immune-related gene signature that may be a potential target for cancer treatment, and they can improve the individualized prognosis of patients with pancreatic cancer.

We obtained 8 immune genes identified as independent prognostic factors for pancreatic cancer (ITGA7, COL7A1, SP1, NCBP2, RBM14, DENND4B, LQK1, ZNF709) through single factor COX regression, Lasso regression, and multifactor COX regression, which significantly affects the prognosis of patients with pancreatic cancer (Fig. 3). Some of them have been closely related to pancreatic cancer. The classic transcription factor SP1 has been confirmed to be closely related to the expression of multiple genes and the development of multiple cancers including PDAC. Target gene transcription can promote cell proliferation of pancreatic cancer cells [30, 31].

Previous literatures have also reported similar findings. For example, Wei-Dong Shi identified 12 human pancreatic cancer highly metastatic cell lines, SW19 × 90HM cells, including ITGA7 through microarray analysis [32]. Medicherla et al. showed that SD-208, a small molecule inhibitor of TGF-β receptor I kinase, downregulates the expression of the TGF-β regulatory gene COL7A1 and improves intervention in the development of pancreatic cancer [33]. However, NCBP2, RBM14, LQK1, and ZNF709 have not been reported to be related to pancreatic cancer. These may be the potential targets for the treatment of pancreatic cancer. We then used these genes to construct a clinical prediction model and calculate the risk score for each patient. Univariate COX regression analysis and multivariate COX regression analysis showed that the risk score was an independent predictor of prognosis. The Kaplan–Meier cumulative curve showed low risk. Patients with low risk scores survived significantly longer than patients with high risk scores (Fig. 1c). The highest area under the curve (AUC) values for this model were 0.755 (Fig. 1d). These results confirm the reliability of the model.

Pathway analysis found that high-risk groups were significantly positively associated with well-known cancer signaling pathways such as hypoxia, glycolysis, epithelial-mesenchymal transition, MTORC1 signaling pathway, P53 channel, apoptosis, E2F Targets, MYC Targets V1, and MYC Targets V2. Hypoxia is a common feature of malignant tumors, which is the result of increased oxygen demand due to cancer cell proliferation, tumor vascular dysfunction, and insufficient blood supply. Hypoxic tumor cells mainly rely on glycolysis to obtain energy and participate in the process of tumorigenesis, metastatic invasion, and treatment tolerance. This existing research confirms that hypoxia can regulate tumor immune microenvironment by regulating a variety of immune cells. Low oxygen levels significantly reduce the proliferation and activation of T lymphocytes, reduce the NKG2D receptors on NK cells and thereby inhibit the killing function of NK cells, increase tumor-associated macrophages to induce angiogenesis, and reduce inflammation to promote tumor progression [34,35,36]. On the other hand, it can affect tumor immune microenvironment through glycolysis pathway to affect immune cell infiltration and functional activation, and it is closely related to the efficacy of immunotherapy [37]. Growth factor signal transduction pathway mediated by mTOR complex 1 (mTORC1) promotes cancer metabolism through key enzymes that regulate metabolic pathways. Inhibition of mTOR C1 reduces glycolysis of cancer cells [38, 39].EMT can promote tumor cell infiltration and tumor metastasis and may also make tumor cells escape apoptosis induced by some factors [40, 41]. This process also plays a key role in regulating cellular plasticity in normal human tissues and tumor tissues. Many different cell subsets can be formed by EMT, that is, the heterogeneity of tumor cells is formed [42, 43]. P53 channel, E2F, and MYC are also common important signal pathways that affect tumor progression. This finding suggests that these immune risk genes are involved in the regulation of multiple signal pathways in pancreatic cancer and may have important biological and clinical significance.

New evidence confirms that disturbances in the immune response in the tumor microenvironment play a decisive role in tumor development [44].As a tumor killer, immune cells can interfere with molecular signals and play an important role in tumor biology such as tumor proliferation, metastasis, and invasion [45]. In our study, this signal was related to the immune response, immune microenvironment, and tumor purity of pancreatic cancer. By quantifying 24 types of immune cells, we observed that the low-risk group DC, iDC, pDC, B cells, T cells, Tcm, Tem, TFH, Th17 cells, and cytotoxic cells had a higher level of infiltration. Th1 cells and NK CD56dim cells in the high-risk group increased significantly (Fig. 7b). Correlation analysis showed that DENND4B and ITGA7 were highly positively correlated with T cell lines such as T helper cells, Tcm, Tem, T cells, CD8 T cells, and TReg. ZNF709 and SP1 are highly positively correlated with CD8 T cells, Macrophage, and DC. The accuracy of our results was verified using the TIMER database (Fig. 7a). The higher the level of these cells, the greater the benefit to the patient, and the better the efficacy of ICT. This further illustrates the importance of immune genes as prognostic markers in immune regulatory responses. The above indicates that our results may provide targets for immunotherapy.

Conclusions

In conclusion, based on the TCGA-PAAD cohort, we identified immune-related markers with independent prognostic significance, verified, and analyzed their biological functions, to provide a feasible method to evaluate the prognosis of pancreatic cancer and provide valuable targets for immunotherapy of pancreatic cancer.