Introduction

Colorectal cancer remains a leading disease burden and ranks second in cancer-related mortality worldwide [1, 2]. Preoperative neoadjuvant chemoradiotherapy (NCRT) offers advantages in tumor downstaging, surgical resection rates, and anus conservation rates [3,4,5]. Currently, NCRT followed by radical surgery is a mainstay of treatment for patients with locally advanced rectal cancer (LARC). However, roughly 15–45% of patients exhibit therapy resistance and remain exposed to the potential complications and toxicity of NCRT, which must not be neglected [6]. Thus, identifying genes that regulate the response to NCRT is crucial to improving treatment efficacy.

Previous studies have reported differentially expressed genes (DEGs) between NCRT-sensitive and NCRT-resistant patients, as well as potential markers to predict tumor regression grading (TRG) [7,8,9,10]. However, several of these studies did not explore the prognostic value of the predictors. The neoadjuvant rectal (NAR) score has recently been proposed as a composite endpoint for predicting clinical outcomes in LARC patients [11, 12]. A low NAR score indicates a favorable treatment response and a better prognosis. A joint analysis of NAR score, TRG, and prognosis may therefore contribute to a better understanding of NCRT regulatory factors and clinical outcomes.

In this study, we performed gene expression profiling of tumor biopsy samples from patients with LARC undergoing NCRT. Weighted gene co-expression network analysis (WGCNA) was used to determine NAR score-related modules and identify candidate genes with predictive and prognostic significance. Patient tissue samples and external datasets were used for validation.

Methods

Patients and clinical data collection

A total of 64 patients with LARC who underwent NCRT between 2015 and 2018 at Fujian Medical University Union Hospital were enrolled. As we previously reported [13], radiotherapy consisted of 45 Gy in 25 fractions over five weeks, with a boost dose of 5.4 Gy to the tumor. Concurrent chemotherapy consisted of oral capecitabine 825 mg/m² twice per day for two weeks. Patients underwent radical surgery 6 to 8 weeks after the last dose of radiotherapy. All patients were recommended to receive adjuvant chemotherapy after surgery. Tumor biopsy samples, used for gene expression profiling, were obtained by colonoscopy before NCRT. In addition, LARC patients who underwent NCRT and radical surgery between 2012 and 2014 were also included. Their colonoscopy samples taken before neoadjuvant treatment were collected to validate and further examine the protein expression of candidate genes. This study was approved by the Institutional Review Board of Fujian Medical University Union Hospital (2019KY006).

TRG was used to assess the pathological response to NCRT: TRG 0, no residual tumor cells; TRG 1, near-complete regression with tumor cells present individually or in small groups; TRG 2, residual tumor cells with a desmoplastic response; and TRG 3, minimal or no regression. Patients with TRG 0 or TRG 1 were classified as NCRT-sensitive, and those with TRG 2 or TRG 3 as NCRT-resistant. The NAR score was calculated as [5 × ypN − 3 × (cT − ypT) + 12]² / 9.61 [12], where cT is the clinical T stage (value: 1, 2, 3, 4), ypT is the pathological T stage (value: 0, 1, 2, 3, 4), and ypN is the pathological nodal status (value: 0, 1, 2).
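
As an illustration only, the NAR equation and the TRG grouping described above can be written as a short Python sketch. The function and argument names are ours and are not part of the original analysis; the value ranges are those stated in the text.

```python
def nar_score(c_t: int, yp_t: int, yp_n: int) -> float:
    """NAR = [5*ypN - 3*(cT - ypT) + 12]^2 / 9.61, as given in the text."""
    if c_t not in (1, 2, 3, 4):
        raise ValueError("cT must be 1-4")
    if yp_t not in (0, 1, 2, 3, 4):
        raise ValueError("ypT must be 0-4")
    if yp_n not in (0, 1, 2):
        raise ValueError("ypN must be 0-2")
    return (5 * yp_n - 3 * (c_t - yp_t) + 12) ** 2 / 9.61


def ncrt_group(trg: int) -> str:
    """TRG 0-1 -> NCRT-sensitive, TRG 2-3 -> NCRT-resistant."""
    if trg not in (0, 1, 2, 3):
        raise ValueError("TRG must be 0-3")
    return "sensitive" if trg <= 1 else "resistant"


# Hypothetical patient: cT3 tumor with complete regression (ypT0, ypN0).
print(round(nar_score(3, 0, 0), 2), ncrt_group(0))
```

The divisor 9.61 scales the score so that the worst combination allowed by the stated ranges (cT1, ypT4, ypN2) gives exactly 100.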

Gene expression analysis

Total RNA was extracted from tumor biopsy samples using Trizol reagent (Invitrogen) according to the manufacturer’s instructions. RNA quality and quantity were assessed with a NanoDrop ND-1000. Labeling, hybridization, and scanning were carried out according to standard protocols. Data quality control and normalization were then performed.

DEGs identification and enrichment analysis

The DEGs were identified using the R “limma” package. Genes with |log fold change (logFC)| > 0.5 and p-value < 0.05 were defined as DEGs between the NCRT-sensitive and NCRT-resistant groups. DEGs were visualized and compared using volcano plots and heatmaps. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were performed with the R “clusterProfiler” package to explore the biological functions of the DEGs.
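
The thresholding step can be sketched as follows. This is not the limma workflow itself, only the filter applied to its output; the gene names and values are hypothetical, and we assume logFC is expressed as NCRT-resistant relative to NCRT-sensitive.

```python
def select_degs(rows, lfc_cutoff=0.5, p_cutoff=0.05):
    """Split genes into up- and downregulated DEGs.

    rows: iterable of (gene, logFC, p_value) tuples, e.g. exported from a
    limma result table. Assumption: logFC is resistant vs. sensitive.
    """
    up, down = [], []
    for gene, logfc, p_value in rows:
        # Both criteria from the text must hold: |logFC| > 0.5 and p < 0.05.
        if p_value < p_cutoff and abs(logfc) > lfc_cutoff:
            (up if logfc > 0 else down).append(gene)
    return up, down


# Hypothetical rows for illustration only.
table = [
    ("GENE_A", 0.82, 0.003),   # upregulated DEG
    ("GENE_B", -0.61, 0.020),  # downregulated DEG
    ("GENE_C", 0.90, 0.200),   # fails the p-value criterion
    ("GENE_D", 0.30, 0.001),   # fails the logFC criterion
]
up, down = select_degs(table)
print(up, down)
```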

WGCNA

WGCNA was carried out using the R “WGCNA” package [14]. Briefly, the soft threshold power was set, and the topological overlap matrix (TOM) was calculated. Modules were determined by the Dynamic Tree Cut method, and highly similar modules were then merged by clustering analysis. Finally, the correlation between modules and clinicopathological features was calculated.
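
To make the first two steps concrete, the sketch below computes an unsigned soft-threshold adjacency and the resulting TOM in plain Python. It is a simplified illustration, not the R “WGCNA” implementation: the power beta = 6 is an assumed value (the power used in this study is not stated here), the expression values are hypothetical, and module detection with Dynamic Tree Cut is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (non-constant input)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def topological_overlap(expr, beta=6):
    """expr: dict gene -> expression values across samples.

    Adjacency: a_ij = |cor(i, j)|^beta (unsigned network, a_ii = 0).
    TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij),
    where k_i is the connectivity (row sum of the adjacency).
    """
    genes = list(expr)
    n = len(genes)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a = abs(pearson(expr[genes[i]], expr[genes[j]])) ** beta
            adj[i][j] = adj[j][i] = a
    k = [sum(row) for row in adj]
    tom = [[1.0] * n for _ in range(n)]  # diagonal is 1 by convention
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(adj[i][u] * adj[u][j] for u in range(n))
            t = (shared + adj[i][j]) / (min(k[i], k[j]) + 1 - adj[i][j])
            tom[i][j] = tom[j][i] = t
    return genes, tom


# Hypothetical expression values for three genes in four samples.
toy = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [1, -1, 1, -1]}
genes, tom = topological_overlap(toy)
```

Hierarchical clustering is then typically run on the dissimilarity 1 − TOM, so that genes sharing many strong neighbors fall into the same module.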

Immunohistochemical analysis

The protein expression of the candidate genes was assessed by immunohistochemistry, performed as described earlier [15]. The following antibodies were used: anti-FBXO7 (203,049-T40, Sino Biological, China), anti-GSTT4 (bs-16345R, Bioss, China), anti-CCT5 (11603-1-AP, Proteintech, China), anti-ELF1 (22565-1-AP, Proteintech, China), and anti-SLC44A1 (14687-1-AP, Proteintech, China). Immunohistochemical results were scored using a semi-quantitative method. Specifically, data were collected from random visual fields in five different areas. Staining intensity was scored as 0 (negative), 1 (light yellow), 2 (brown), or 3 (deep brown). The percentage of positive cells was scored as 0 (< 5%), 1 (5–25%), 2 (25–50%), 3 (50–75%), or 4 (> 75%). The two values were multiplied to give the immunohistochemical score. Scores of 0–4 were considered low expression, and scores above 4 were considered high expression.
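
The scoring rule can be sketched in Python as below. Because the stated percentage bands share their endpoints (25%, 50%, 75%), the sketch assumes that a boundary value falls into the lower band; that choice is ours and is not specified in the text. Function names are illustrative.

```python
def percent_category(pct_positive: float) -> int:
    """Map the percentage of positive cells to the 0-4 category.

    Assumption: shared endpoints (25, 50, 75) go to the lower band.
    """
    if not 0 <= pct_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if pct_positive < 5:
        return 0
    if pct_positive <= 25:
        return 1
    if pct_positive <= 50:
        return 2
    if pct_positive <= 75:
        return 3
    return 4


def ihc_score(intensity: int, pct_positive: float) -> int:
    """Intensity (0-3) multiplied by the percentage category (0-4)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0-3")
    return intensity * percent_category(pct_positive)


def expression_level(score: int) -> str:
    """Scores of 0-4 are low expression; scores above 4 are high."""
    return "low" if score <= 4 else "high"


# Hypothetical sample: brown staining (2) in 60% of cells.
print(ihc_score(2, 60), expression_level(ihc_score(2, 60)))
```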

External validation datasets

External validation datasets were obtained from the Gene Expression Omnibus (GEO) database. GSE3493, which includes 46 LARC patients undergoing NCRT, was used to compare the expression of the candidate genes between the NCRT-sensitive and NCRT-resistant groups. After excluding samples with unknown TRG information, a total of 56 LARC patients undergoing NCRT in GSE119409 were also included in the external validation. GSE133057 (33 LARC patients) was used to validate survival outcomes.

Gene set enrichment analysis

Gene set enrichment analysis (GSEA) was employed to investigate the potential biological pathways associated with the candidate genes. LARC patients were divided into high- and low-expression groups for each candidate gene based on the median expression. A false discovery rate (FDR) < 0.25 and P < 0.05 were considered statistically significant.

Immune infiltration analysis

The relationship between gene expression and immune cell infiltration (including B cells, CD8+ T cells, CD4+ T cells, macrophages, neutrophils, and dendritic cells) was explored using the Tumor IMmune Estimation Resource (TIMER).

Drug sensitivity analysis

The drug sensitivity dataset was obtained from the CellMiner website [16]. Pearson’s correlation test determined the correlation between target genes and drug sensitivity. These correlation results were visualized using the R package ggplot2.

Statistical analysis

Statistical analysis was carried out with R software (version 4.1.2), GraphPad Prism 8, and SPSS (version 22). X-tile software was used to determine the cut-off points for the expression of the candidate genes. Differences between continuous variables were tested with t-tests or non-parametric tests. Categorical data were analyzed by Fisher’s exact or Chi-square tests. Kaplan–Meier (KM) analysis with a log-rank test was used to estimate survival outcomes. Correlations were evaluated using the Pearson correlation test. Receiver operating characteristics (ROC) analysis was performed, and the area under the curve (AUC) was calculated to assess the capacity of the candidate genes to predict TRG and survival. Cox proportional hazards regression analysis was used to identify independent risk factors for progression-free survival (PFS) and overall survival (OS). A nomogram was created from these factors using the R “rms” package, and a calibration curve was used to evaluate the performance of the model. A P-value < 0.05 was considered statistically significant.
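
For readers less familiar with ROC analysis, the AUC can be read as the probability that a randomly chosen positive case has a higher marker value than a randomly chosen negative case, with ties counted as one half. The sketch below computes it by direct pairwise comparison; it is an illustration with hypothetical data, not the software routine used in this study, and it does not compute the accompanying P-value.

```python
def roc_auc(scores, labels):
    """AUC by pairwise comparison (equivalent to the scaled Mann-Whitney U).

    scores: marker values, e.g. gene expression.
    labels: 1 for the positive class (e.g. NCRT-resistant), 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive case ranked above negative case
            elif p == q:
                wins += 0.5   # ties count as one half
    return wins / (len(pos) * len(neg))


# Hypothetical expression values and response labels.
auc = roc_auc([1, 2, 3, 4], [1, 0, 1, 0])
print(auc)
```

An AUC of 0.5 corresponds to no discrimination, which is why values such as 0.51–0.53 reported for some genes in the external datasets indicate essentially no predictive power.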

Results

DEGs and functional enrichment

Gene expression profiles based on gene chips were obtained from 64 LARC patients (Supplementary Table 1) before NCRT in our cohort. A total of 333 DEGs were identified between NCRT-sensitive and NCRT-resistant patients. Compared with the NCRT-sensitive group, 94 genes were upregulated and 239 genes were downregulated in the NCRT-resistant group. The volcano plot and heatmap are shown in Fig. 1A and B. We performed GO and KEGG analyses to explore the potential biological significance of the DEGs. GO analysis revealed that the DEGs were significantly enriched in serine phosphorylation of STAT protein, natural killer cell activation involved in immune response, response to exogenous dsRNA, and other terms (Fig. 1C). The top 10 enriched KEGG pathways are presented in Fig. 1D and include cytokine-cytokine receptor interaction, the Toll-like receptor signaling pathway, and the cytosolic DNA-sensing pathway.

Fig. 1

DEGs between NCRT-sensitive patients and NCRT-resistant patients and functional enrichment

In 64 LARC patients, (A) Volcano plot of DEGs; (B) Heatmap of DEGs; (C) Top 10 pathways of GO enrichment analysis; (D) Top 10 pathways of KEGG functional enrichment

LARC: locally advanced rectal cancer; DEGs: differentially expressed genes; NCRT: neoadjuvant chemoradiotherapy; GO: Gene Ontology; KEGG: Kyoto Encyclopedia of Genes and Genomes

Identification of candidate genes

A weighted gene co-expression network was constructed to identify NCRT-regulated genes (Fig. 2A). A total of 3 modules were identified after merging highly similar modules. As shown in Fig. 2B, the brown module was negatively correlated with TRG (r = -0.24, P = 0.06), and the blue module was positively correlated with carcinoembryonic antigen (CEA, r = 0.25, P = 0.05). The turquoise module had the highest positive correlation with the NAR score (r = 0.28, P = 0.02), while the brown module had the highest negative correlation (r = -0.47, P < 0.01). Next, we identified 1147 TRG-associated genes and 692 cancer progression (PFS-associated) genes (all P < 0.05). The intersection of the three sets was taken to determine candidate genes able to predict both pathological response and prognosis. Five genes overlapped among the three sets, namely FBXO7, CCT5, ELF1, GSTT4, and SLC44A1, and these were identified as the candidate genes (Fig. 2C). In addition, GO and KEGG pathway analyses were performed for the turquoise module (Fig. 2D and E) and the brown module (Fig. 2F and G) genes to gain a more comprehensive understanding of their biological effects.
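
The selection logic amounts to a three-way set intersection, sketched below. Apart from the five reported candidate genes, every gene name is a hypothetical placeholder, and the sets are far smaller than the real NAR-, TRG-, and PFS-associated gene lists.

```python
# Toy gene sets; names starting with "HYPO_" are placeholders, not real results.
nar_genes = {"FBXO7", "CCT5", "ELF1", "GSTT4", "SLC44A1", "HYPO_1", "HYPO_2"}
trg_genes = {"FBXO7", "CCT5", "ELF1", "GSTT4", "SLC44A1", "HYPO_2", "HYPO_3"}
pfs_genes = {"FBXO7", "CCT5", "ELF1", "GSTT4", "SLC44A1", "HYPO_1", "HYPO_3"}

# A candidate gene must appear in all three sets.
candidates = sorted(nar_genes & trg_genes & pfs_genes)
print(candidates)
```

Requiring membership in all three sets is a strict filter: a gene associated with only two of the three endpoints (such as the placeholders above) is excluded.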

Fig. 2

WGCNA analysis

In 64 LARC patients, (A) Identification of WGCNA modules by the Dynamic Tree Cut method; (B) The relationship between modules and clinical phenotypes; (C) Venn diagram showing the intersection of NAR-associated, TRG-associated, and cancer progression (PFS)-associated gene sets; (D) Top 10 pathways of GO enrichment analysis of the turquoise module; (E) Top 10 pathways of KEGG functional enrichment of the turquoise module; (F) Top 10 pathways of GO enrichment analysis of the brown module; (G) Top 10 pathways of KEGG functional enrichment of the brown module

LARC: locally advanced rectal cancer; WGCNA: weighted gene co-expression network analysis; NAR: neoadjuvant rectal; TRG: tumor regression grading; PFS: progression-free survival; GO: Gene Ontology; KEGG: Kyoto Encyclopedia of Genes and Genomes

Internal validation of predictive capabilities of candidate genes

Next, to test the discriminant power of the candidate genes for NCRT sensitivity, the correlations between candidate gene expression and TRG grade were calculated. The correlation coefficients for FBXO7, CCT5, ELF1, GSTT4, and SLC44A1 were 0.27 (P = 0.032), 0.27 (P = 0.028), 0.26 (P = 0.041), -0.25 (P = 0.047), and 0.21 (P = 0.092), respectively (Fig. 3A–E). We also evaluated the association between the candidate genes and the NAR score, and the results revealed significant correlations (Fig. 3F–J). Next, we used the expression of the five candidate genes to predict the pathological response to NCRT. As shown in Fig. 3K–O, the AUC values of FBXO7, CCT5, ELF1, GSTT4, and SLC44A1 for predicting NCRT response were 0.637 (P = 0.059), 0.654 (P = 0.034), 0.638 (P = 0.057), 0.632 (P = 0.071), and 0.672 (P = 0.018), respectively. The prognostic value of the five candidate genes was then explored. The cut-off points of gene expression for survival analysis were determined by X-tile software (Supplementary Fig. 1). KM analysis revealed that candidate gene expression at diagnosis could predict the survival of LARC patients undergoing NCRT. Specifically, high expression of FBXO7, CCT5, ELF1, and SLC44A1 was correlated with poor survival, while high expression of GSTT4 was associated with a better prognosis (Fig. 3P–Y). Notably, no significant difference in overall survival was observed between the high and low ELF1 expression groups (P = 0.242).

Fig. 3

Internal validation of candidate genes

In 64 LARC patients, the correlations between FBXO7 (A), CCT5 (B), ELF1 (C), GSTT4 (D), and SLC44A1 (E) expression and TRG grade; the association between FBXO7 (F), CCT5 (G), ELF1 (H), GSTT4 (I), and SLC44A1 (J) expression and NAR score; ROC analysis for the expression of FBXO7 (K), CCT5 (L), ELF1 (M), GSTT4 (N), and SLC44A1 (O) to predict NCRT response; KM survival curves for PFS (P–T) and OS (U–Y) of FBXO7, CCT5, ELF1, GSTT4, and SLC44A1.

LARC: locally advanced rectal cancer; TRG: tumor regression grading; NAR: neoadjuvant rectal; NCRT: neoadjuvant chemoradiotherapy; KM: Kaplan–Meier; ROC: receiver operating characteristics; AUC: the area under the curve

External validation analysis

The GSE3493 and GSE119409 datasets were collected from the GEO database for external validation. Gene expression was used to predict the treatment response (NCRT-sensitive). As depicted in Supplementary Fig. 2A–2D, in the GSE3493 dataset, the AUCs of FBXO7, CCT5, ELF1, and GSTT4 were 0.636 (P = 0.18), 0.512 (P = 0.91), 0.697 (P = 0.05), and 0.575 (P = 0.46), respectively. SLC44A1 was missing from GSE3493, so we were unable to verify its performance in this dataset. As displayed in Supplementary Fig. 2E–2I, in the GSE119409 dataset, CCT5 showed the highest power (AUC = 0.689, P = 0.03), followed by FBXO7 (AUC = 0.593, P = 0.29), SLC44A1 (AUC = 0.586, P = 0.33), GSTT4 (AUC = 0.527, P = 0.76), and ELF1 (AUC = 0.515, P = 0.86). We also evaluated the effects of the candidate genes on clinical outcomes. Based on X-tile, patients in GSE133057 were separated into high- and low-expression groups for each candidate gene. Patients in the CCT5 high-expression group had a significantly worse prognosis (P = 0.012), and there was a trend toward poor survival with high expression of FBXO7, ELF1, GSTT4, and SLC44A1 (all P > 0.05, Supplementary Fig. 2J–2N).

Immunohistochemistry validation of candidate genes

To further validate the expression of the five candidate genes in LARC patients undergoing NCRT, a total of 117 patients were enrolled in the validation cohort. Clinicopathological features of the validation cohort are shown in Supplementary Table 2. We assessed the protein expression of the five candidate genes by immunohistochemical staining of tumor biopsy samples taken from these patients before NCRT. Supplementary Fig. 3 illustrates the different expression levels of the five genes. The immunohistochemistry scores of CCT5 and ELF1 were higher in NCRT-resistant patients than in NCRT-sensitive patients (all P < 0.05, Fig. 4B and C), while no difference was observed in the expression of FBXO7, GSTT4, and SLC44A1 (Fig. 4A, D and E). ROC analysis indicated that the AUCs of CCT5 and ELF1 for predicting the treatment response to NCRT were 0.727 and 0.717 (Fig. 4G and H, all P < 0.05), whereas FBXO7, GSTT4, and SLC44A1 could not differentiate between the two groups (Fig. 4F, I and J, all P > 0.05). Moreover, CCT5 and ELF1 expression were significantly related to the NAR score (Fig. 4L and M), while this trend was not observed for FBXO7, GSTT4, and SLC44A1 (Fig. 4K, N and O). In survival analysis, high expression of FBXO7, CCT5, and ELF1 was associated with worse PFS (P = 0.010, P = 0.058, P = 0.018, respectively, Fig. 4P–R) and OS (P = 0.043, P = 0.069, P = 0.003, respectively, Fig. 4U–W). In contrast, there was no difference in survival between high and low GSTT4 or SLC44A1 expression (Fig. 4S, T, X and Y).

Fig. 4

Immunohistochemistry validation analysis

In the validation cohort of 117 LARC patients, immunohistochemistry scores of FBXO7 (A), CCT5 (B), ELF1 (C), GSTT4 (D), and SLC44A1 (E) in NCRT-sensitive and NCRT-resistant patients; ROC analysis for candidate genes to predict NCRT response (F–J); the correlation between FBXO7 (K), CCT5 (L), ELF1 (M), GSTT4 (N), and SLC44A1 (O) and NAR score; KM survival curves for PFS (P–T) and OS (U–Y) of FBXO7, CCT5, ELF1, GSTT4, and SLC44A1.

NCRT: neoadjuvant chemoradiotherapy; ROC: receiver operating characteristics; AUC: the area under the curve; KM: Kaplan–Meier

Construction of a risk score model

We further performed Cox analysis to determine the prognostic value of the genes. On univariate analysis, high CCT5 and high ELF1 expression were each associated with shorter OS (all P < 0.05). Multivariate analysis revealed that CCT5 (hazard ratio [HR], 1.141; 95% confidence interval [CI], 1.007–1.294; P = 0.039) and ELF1 (HR, 1.179; 95% CI, 1.067–1.304; P = 0.001) were independently associated with prognosis (Table 1). Hence, based on the coefficients obtained from Cox analysis, the risk score was computed as: 0.132 × CCT5 expression + 0.165 × ELF1 expression. We divided the validation cohort into two groups based on the median risk score, and the clinicopathologic characteristics of the two groups are presented in Supplementary Table 3. The risk score of the pathological complete response (pCR) group was significantly lower than that of the non-pCR group (Fig. 5A), with good predictive capacity (AUC = 0.780, P < 0.01, Fig. 5B). A significant correlation was observed between the risk score and the NAR score (R = 0.43, P < 0.01, Fig. 5C). Survival analysis showed that the risk score could discriminate clinical outcomes (Fig. 5D–H).
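
The weights in the risk score are consistent with the natural logarithms of the reported hazard ratios (ln 1.141 ≈ 0.132, ln 1.179 ≈ 0.165). The sketch below reproduces that arithmetic and the median split; the patient values are hypothetical immunohistochemistry scores, and assigning patients exactly at the median to the low-risk group is our assumption.

```python
import math
import statistics

# Cox coefficients are the natural logs of the reported hazard ratios.
COEF_CCT5 = round(math.log(1.141), 3)
COEF_ELF1 = round(math.log(1.179), 3)


def risk_score(cct5_expr: float, elf1_expr: float) -> float:
    """Risk score = 0.132 x CCT5 expression + 0.165 x ELF1 expression."""
    return COEF_CCT5 * cct5_expr + COEF_ELF1 * elf1_expr


def split_by_median(scores):
    """Label each score 'high' or 'low' relative to the cohort median.

    Assumption: scores equal to the median are labeled 'low'.
    """
    cut = statistics.median(scores)
    return ["high" if s > cut else "low" for s in scores]


# Hypothetical (CCT5, ELF1) immunohistochemistry scores for four patients.
cohort = [(6, 8), (2, 3), (9, 12), (4, 4)]
scores = [risk_score(c, e) for c, e in cohort]
groups = split_by_median(scores)
print(COEF_CCT5, COEF_ELF1, groups)
```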

Table 1 Cox analysis of OS for candidate genes in the validation cohort of 117 LARC patients
Fig. 5

Development of a risk score model

In the validation cohort of 117 LARC patients, the risk score in non-pCR and pCR patients (A) and the capacity of the risk score to predict pCR (B), the relationship between risk score and NAR (C) and survival (D), KM analysis between high-risk and low-risk groups in PFS (E) and OS (F), ROC analysis for the risk score to predict PFS (G) and OS (H).

LARC: locally advanced rectal cancer; pCR: pathological complete response; NAR: neoadjuvant rectal; KM: Kaplan–Meier; ROC: receiver operating characteristics; AUC: the area under the curve; PFS: progression-free survival; OS: overall survival

Prognostic analysis and nomogram development

We then evaluated the prognostic value of the risk score. Univariate analysis revealed that the ypTNM stage (P < 0.001), TRG (P < 0.001), pathological type (P = 0.036), risk score (P < 0.001), and NAR score (P = 0.006) were significantly associated with PFS. Multivariate analysis showed that the ypTNM stage (P = 0.040) and risk score (P = 0.025) were independent risk factors for PFS (Table 2). Furthermore, as shown in Table 3, the ypTNM stage (P = 0.031) and risk score (P = 0.008) were significantly correlated with OS on multivariate analysis. Based on the results of the Cox analysis, nomograms were constructed to predict 1-year, 3-year, and 5-year PFS (Fig. 6A) and OS (Fig. 6C), and good calibration was also confirmed (Fig. 6B and D).

Table 2 Univariate and multivariable Cox analyses of PFS in the validation cohort of 117 LARC patients
Table 3 Univariate and multivariable Cox analyses of OS in the validation cohort of 117 LARC patients
Fig. 6

Construction of nomograms for PFS and OS

In the validation cohort of 117 LARC patients, nomograms to predict PFS (A) and OS (B) for LARC patients undergoing NCRT; calibration curves used for model validation for PFS (C) and OS (D).

PFS: progression-free survival; OS: overall survival; LARC: locally advanced rectal cancer; NCRT: neoadjuvant chemoradiotherapy

External validation for risk score

In the GSE3493 cohort, the risk score of the NCRT-resistant group was considerably higher than that of the NCRT-sensitive group (Supplementary Fig. 4A), with an adequate capacity to discriminate NCRT-sensitive patients (AUC = 0.701, P = 0.046, Supplementary Fig. 4C). However, the risk score showed no statistically significant difference between response groups in the GSE119409 cohort (Supplementary Fig. 4B, 4D) and no significant association with prognosis in the GSE133057 cohort (all P > 0.05, Supplementary Fig. 4E).

GSEA analysis

We then investigated the potential mechanisms by which CCT5 and ELF1 might affect NCRT sensitivity. Groups with high and low CCT5 and ELF1 expression were defined by the median expression level. According to GSEA, colorectal cancer, apoptosis, and DNA replication were enriched in the high CCT5 expression group (Fig. 7A). Furthermore, the high ELF1 expression group was significantly enriched in apoptosis, cell cycle, mTOR signaling pathway, and cancer pathway gene sets (Fig. 7B). These pathways contribute to a better understanding of the mechanism of NCRT resistance.

Fig. 7

GSEA analysis

In 64 LARC patients, potential biological pathways enriched in the high CCT5 expression group (A) and the high ELF1 expression group (B)

LARC: locally advanced rectal cancer

Drug sensitivity analysis

We then examined the association of CCT5 and ELF1 with drug sensitivity using the CellMiner database. Figure 8 shows the results of the drug sensitivity analysis. For instance, ELF1 expression was positively associated with sensitivity to vorinostat, artesunate, nilotinib, and selumetinib (all P < 0.05).

Fig. 8

Drug sensitivity analysis

The relationship between CCT5 and ELF1 and drug sensitivity using the CellMiner database. The x-axis represents the gene expression, and the y-axis is drug sensitivity

Analysis of immune infiltration

Next, we investigated immune cell infiltration. As shown in Fig. 9A, positive correlations were observed between CCT5 expression and the infiltration of CD8+ T cells (P < 0.01) and neutrophils (P < 0.05). B cell, CD8+ T cell, macrophage, and neutrophil infiltration were significantly associated with ELF1 expression (all P < 0.05, Fig. 9B).

Fig. 9

Analysis of immune infiltration

The correlations of CCT5 (A) and ELF1 (B) expression with immune cell infiltration, using the Tumor IMmune Estimation Resource (TIMER) database

Discussion

Previous studies have demonstrated the superiority of NCRT in LARC treatment, but the mechanism of NCRT resistance remains unclear. Given the treatment resistance seen in some patients and the potential complications and toxicity, further research is warranted. In this study, the expression profiles of pre-treatment tumor samples from our cohort were used for the analyses. WGCNA was used to determine NAR score-related modules and genes. To find genes with a high ability to predict both sensitivity to NCRT and prognosis, TRG-related and cancer progression (PFS)-related genes were also identified. Genes in the intersection of the three gene sets were regarded as candidate genes for additional validation at the protein level. According to immunohistochemical staining, the protein expression of CCT5 and ELF1 differed significantly between the NCRT-sensitive and NCRT-resistant groups. Furthermore, high expression of CCT5 and ELF1 was linked to a poor prognosis.

The NAR score has recently been considered a potential surrogate endpoint for the prognosis of LARC patients undergoing NCRT [11]. The score formula was established on the basis of the nomogram of Valentini et al. for predicting overall survival in LARC patients, and it considers both pretreatment tumor burden and posttreatment tumor regression [17]. Previous research has demonstrated that the NAR score can accurately predict clinical outcomes and assist in assessing the necessity of adjunctive therapy [18,19,20]. Thus, we identified NAR score-related genes using WGCNA, a more biologically oriented approach to correlating modules with phenotypic features. However, there is some debate as to the prognostic value of the NAR score. A study from the Netherlands demonstrated that a combination model based on clinical and pathological data outperformed the NAR score in assessing survival outcomes [21]. For this reason, cancer progression (PFS)-associated genes were also identified. In addition, discerning therapy response before surgery aids clinical decision-making. After adequate evaluation, Watch & Wait and local excision might become suitable options for NCRT-sensitive patients, maximizing patient benefit and enhancing quality of life [22]. Hence, we also identified TRG-associated genes, and the intersection of the three gene sets helped determine genes with value in predicting both prognosis and treatment response.

FBXO7, CCT5, ELF1, GSTT4, and SLC44A1 were identified as biomarkers for treatment response and prognosis in our cohort. The value of CCT5 and ELF1 in predicting response to therapy was validated in the GSE3493 and GSE119409 cohorts. The survival analysis of the GSE133057 cohort showed that increased expression of FBXO7, CCT5, ELF1, and SLC44A1 implies poor survival, whereas high expression of GSTT4 predicts the opposite outcome. These survival trends match those obtained in our cohort. CCT5 and ELF1 were ultimately selected after IHC validation at the protein level.

CCT5 is a protein folding subunit of the chaperonin containing TCP1 complex (also known as TCP1-ring complex). Accumulating evidence indicates that CCT5 is implicated in tumor progression, and high expression of CCT5 has been found in a range of malignancies and associated with worse survival [23,24,25]. Notably, CCT5 is involved in chemotherapy resistance. In breast cancer, knockdown of CCT5 leads to increased apoptosis after docetaxel treatment [26]. CCT5 expression was elevated in multidrug-resistant gastric carcinoma cells [27]. Nevertheless, the functions of CCT5 in NCRT of LARC patients remain elusive. Our study demonstrates that high CCT5 expression correlated with NAR score, NCRT resistance, and poor prognosis in LARC patients.

ELF1 is a transcription factor that belongs to the ETS family and plays contrasting roles in different tumors. Some reports indicate that ELF1 has tumor-promoting effects in glioma [28, 29], oral squamous cell carcinoma [30], choroidal melanoma [31], endometrial carcinoma [32], acute myeloid leukemia [33], nasopharyngeal carcinoma [34], and osteosarcoma [35], while others have reported a tumor-suppressive function in Hodgkin lymphoma [36] and prostate cancer [37]. Starr et al. found that ELF1 regulates the expression of TM9SF2, an oncogene in colorectal cancer [38]. Our findings indicated that ELF1 expression was higher in NCRT-resistant patients and that ELF1 was an independent predictor of survival. In addition, GSEA revealed that apoptosis, cell cycle, and mTOR signaling pathways were significantly enriched in the high ELF1 expression group, and those pathways have been implicated in treatment resistance [39,40,41].

The advantages of risk models and nomograms over a single prognostic factor have previously been reported [42, 43]. In our study, we built a risk score model based on CCT5 and ELF1 IHC expression that effectively discriminated between non-pCR and pCR patients and between survival outcomes. Nomograms were created from the results of the Cox regression analysis. These models are helpful in predicting the prognosis of LARC patients undergoing NCRT.

Furthermore, a growing body of studies has revealed the impact of the tumor microenvironment on chemoradiotherapy sensitivity. One study reported increased CD163+ tumor-associated macrophages (TAM) in the non-pCR group of LARC patients [44]. TAM infiltration has also been associated with chemoradiotherapy resistance in oral squamous cell carcinoma and cervical cancer [45, 46]. In addition, several previous studies have revealed that increased neutrophils were associated with poor chemoradiotherapy response and worse clinical outcomes [47,48,49,50]. Our study also explored the potential role of CCT5 and ELF1 in the tumor microenvironment and found that the infiltration levels of CD8+ T cells and neutrophils were positively associated with CCT5 and ELF1 expression, and the same trend was observed between B cell and macrophage infiltration and ELF1 expression. However, there are limited studies on how CCT5 or ELF1 might influence immune infiltration, and the specific mechanism remains to be elucidated. Nonetheless, these findings provide potential directions for future research.

There are limitations to this study. More external datasets are required to confirm and validate our findings. In addition, although bioinformatic analyses suggested probable mechanisms for CCT5 and ELF1, further investigation is needed to clarify their biological role and their impact on the cancer microenvironment.

Conclusion

In summary, CCT5 and ELF1 were identified as biomarkers for NCRT treatment response and prognosis through internal and external validation. A risk score model and nomograms were constructed to predict survival. These findings may contribute to personalized clinical decision-making for LARC patients undergoing NCRT.