Introduction

Clear-cell renal cell carcinoma (ccRCC) represents the most prevalent histological subtype of renal cell carcinoma (RCC), accounting for over 70% of the cases worldwide [1,2,3,4,5]. With its high invasiveness and recurrence, ccRCC annually causes more than 175,000 mortality rates worldwide and a poor 5-year survival probability of approximately 50% [6,7,8,9,10]. Although partial or radical nephrectomy is currently the optimal choice for ccRCC, about 25% of patients present with lymph node or distant metastasis at the first diagnosis and are unable to undergo surgery, while over 30% of ccRCC patients experience relapse and metastases after surgery [11,12,13]. The considerable exploration of prognostic models of ccRCC and the wide application of immune checkpoint-blocking therapies have gradually shown powerful potential and may offer new therapeutic options for ccRCC patients [14,15,16,17,18]. Emerging evidence has proven that immune cells serve an essential function in ccRCC pathogenesis, providing a novel outlook into immunotherapy for ccRCC [19,20,21,22]. The previously constructed immune/stromal scores have indicated a strong association between ccRCC prognosis monitoring and precision immunotherapy [23]. And the 3 immune infiltration patterns of ccRCC showed that ccRCC had the highest immune infiltration and T cell infiltration score [24]. Importantly, the complete single-cell chromatin chart of immune cells in ccRCC assisted us in better understanding immune cell functional states in ccRCC [25]. Meanwhile, several single-cell sequencing analyses provided an in-depth knowledge of the immune landscape and immune cell infiltration patterns of ccRCC, which contributed to setting a foundation for the theoretical basis of targeted therapy [26,27,28].

Recently, natural killer (NK) cells have been shown to play a critical role in immune surveillance against viruses, bacterial infections, and tumors [29,30,31]. NK cells encode both activating and inhibitory receptors and further integrate diverse signaling pathways to exert corresponding functional outcomes, especially in the production of cytokines and chemokines [32,33,34]. The significant advances in NK-cell biology and the gradual unveiling of molecular mechanisms have provided novel strategies involving the regulation of NK activation and cancer cell recognition, which are considered as important for cancer immunotherapy [35,36,37,38,39]. Moreover, immune cell-based prognostic models have been widely explored in clinical settings for guiding cancer management [40,41,42]. An increasing number of NK cell-related genes are being widely studied for their important roles in various types of tumors. For example, NKG2D, NKp30, and NKp46 trigger cytotoxic effects by binding to ligands on the surface of tumor cells, thus exerting anti-tumor effects [43,44,45]. And NK cells are recognized as a significant prognostic factor for multiple cancers. In gastric cancer, researchers have found that abnormal percentage of NK cell in peripheral blood predicts patients' poor survival rates [46]. In triple-negative breast cancer, a 5-NK cell-related gene model also exhibit powerful value for prognosis prediction and immunotherapy evaluation [47]. However, there is currently no comprehensive understanding on the role of NK cell-related genes in ccRCC.

Therefore, the objective of this study is to investigate the role of NK cells in the carcinogenesis process of ccRCC, and to evaluate the potential value of targeting NK cell-related genes for optimizing risk stratification and predicting treatment efficacy in ccRCC. We estimated associations between the expression features and functional features of NK cell-related genes with ccRCC, and constructed a prognostic risk model to estimate the tumor immune microenvironment, predict prognosis, explore the treatment response, and allow for risk stratification. Our findings can aid in accurate therapeutic decisions being made for ccRCC patients.

Materials and methods

Data collection and processing

The transcriptome RNA-sequencing (RNA-seq) data and relevant clinical information on ccRCC were retrieved from TCGA-KIRC project of the Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/projects/TCGA-KIRC). Total 134 genes associated with natural killer cells were obtained from the Immunology Database and Analysis Portal (ImmPort) database (https://www.immport.org/home) [48]. The RECA-EU database of the International Cancer Genome Consortium (ICGC) database was searched to obtain the corresponding RNA-seq data and clinical information (https://dcc.icgc.org/projects/RECA-EU) as validation set.

Cell culture and qRT-PCR analysis

Human ccRCC ACHN cells and kidney proximal tubular epithelial (HK2) cells were cultivated in Dulbecco's modified Eagle’s medium (DMEM) with 10% fetal bovine serum (FBS) and 1% penicillin–streptomycin. All cells were cultured under humidified conditions at 5% CO2 and at 37 °C. We conducted a qRT-PCR to quantify the expression of the genes selected to be used in the model. Total RNA was obtained from the cultured cells using TRIzol reagent (Invitrogen, CA, USA). The PrimeScript™ RT reagent kit (Takara, Shiga, Japan) was applied to reverse transcribe the RNA into complementary DNA (cDNA), following the manufacturer’s instructions [49]. Sequences of all primers used are presented in Additional file 1: Table S1. The experiments on each gene were performed in triplicate and the average cycle threshold (Ct) was computed. The mRNA levels were normalized to that of the housekeeping gene, GAPDH, using the 2−ΔΔCt method.

Identification of NK cell-related genes specific to ccRCC prognosis and the establishment of a molecular cluster

A univariate Cox regression model analysis was performed on the TCGA database to select prognostic genes among the NK cell-related genes using the R package survival “coxph” function. Then, the R package, ‘ConsensusClusterPlus’, was used to construct a consistency matrix and classify the ccRCC samples in TCGA dataset into distinct molecular clusters. The optimal number of clustering was identified using the cumulative distribution function (CDF) curve [50]. Kaplan–Meier survival analysis was performed to estimate the prognosis of patients in each molecular cluster.

Relationships between molecular clusters and clinical features, immune infiltration, and treatment response

Moreover, we also explored the distribution of clinical-pathological features among the distinct molecular clusters using a Chi-square test and one-way analysis of variance (ANOVA) using SPSS version 20.0 (SPSS Inc., Chicago, IL). To describe the differences of immune cell infiltration and the abundance of stromal cells in each distinct cluster, we implemented the R package, microenvironment cell populations-counter (MCP-counter) [51], single-sample gene set enrichment analysis (ssGSEA) [52], as well as the Estimation of Stromal and Immune cells in Malignant Tumors using Expression data (ESTIMATE) algorithm [53]. Meanwhile, we assessed the immunotherapeutic response of each cluster to immune checkpoint blockade (ICB) using the Tumor Immune Dysfunction and Exclusion (TIDE) algorithm [54]. In addition, IC50 concentrations were also assessed using the R package, ‘pRRophetic’, to measure the sensitivity of the ccRCC patients in each cluster to several chemotherapeutic agents in the GDSC database (Genomics of Drug Sensitivity in Cancer, https://www.cancerrxgene.org/).

Establishment of a NK cell-related model

Based on the above molecular clusters, we performed differential gene expression (DGE) analysis on each couple of clusters using the R package, ‘limma’. The selection criteria were log fold change (FC) > 1 and adjusted p-value < 0.05. Then, we took the intersection between the significantly differential expressed NK cell-related genes (DE-NKRGs) of each cluster. The univariate Cox analysis was used to identify the DE-NKRGs with a threshold of p < 0.001. To construct the prognostic model, least absolute shrinkage and selection operator (LASSO) regression were performed on the TCGA database using the R package, ‘glmnet’. Based on the minimum criteria, the optimal tuning parameter λ and coefficients were calculated through tenfold cross-validation. The multivariable stepwise Cox regression model was further used to identify parameter values using the Akaike information criterion (AIC). Then, a prognostic gene model was constructed based on the linear combination of the LASSO regression coefficient (β) weighted by its mRNA expression level. The patients were divided into high-risk or low-risk groups based on the median score. Meanwhile, the ROC curve was constructed using the R package, ‘time ROC’ to estimate model performance. The Kaplan–Meier (KM) curve was analyzed using the R package, ‘survival’ to analyze the outcomes of both groups. Finally, we used the ICGC database to validate the prognostic value and robustness of the model.

Relationships between risk score and clinical characteristics, immune infiltration, and therapeutic effects

The relationships between the risk score and multiple clinical factors, including neoadjuvant status, sex, and age, were estimated. Subsequently, we employed the MCP-counter method to compare the degree of immune cell infiltration between the high- and low-risk groups of the TCGA cohort. It is generally known that differences in the expression of immune checkpoint genes influence the response after immune checkpoint inhibitor treatment in malignant tumors. Therefore, we evaluated differences in the expression levels of the immune checkpoints downloaded from the HisgAtlas database (http://biokb.ncpsb.org/HisgAtlas/) between two risky groups [55]. The TIDE algorithm was further used to analyze the efficacy of immunotherapy. To explore the sensitivity to different types of treatments in each risk group of TCGA cohort, ridge regression was applied to assess the IC50 of the chemotherapy drugs.

Predictive decision-tree model and nomogram construction

To explore the importance of the risk score and clinical factors, 4 clinical factors, age, gender, grade, and TNM stage, were extracted from the above analysis and used as input features, and were used along with the risk score to construct the decision tree model using R package, ‘rpart’. Univariate and multivariate Cox regression analyses were further performed to identify significant prognostic factors for ccRCC. Furthermore, the nomogram model, which integrated clinical features (age, gender, and tumor stage) and the risk score, was constructed using the R package, ‘rms’, to clearly and precisely predict the 1-, 3-, and 5-year outcomes of the patients [56]. Subsequently, the calibration curves were used to test the performance of the nomogram against the actual survival rate. Finally, time-dependent (tROC) analysis was performed using the R package, ‘survivalROC’ to estimate the predictive power of the nomogram, stage, grade, TNM stage, age, and risk score [57].

Statistical analysis

Statistical analyses were performed using R version 4.0.4 and SPSS 26.0 software, while figures were drawn using GraphPad Prism 8.4.2 software. Differences between groups were compared using the Wilcoxon test or Kruskal–Wallis test. A p value < 0.05 was considered to indicate statistical significance in all analyses.

Results

Molecular clusters of ccRCC based on the NK cell-related genes

First, univariate Cox regression analysis was used to screen the NK cell-related genes that were significantly associated with the prognosis of ccRCC data obtained from the TCGA database. Overall, 55 genes were selected based on their prognostic value, including 18 hazardous genes and 37 protective genes (Fig. 1a). Based on the expression levels of these 55 genes, we conducted a consensus clustering analysis on the ccRCC patients to better understand the role of the NK cell-related genes in ccRCC. We continuously increased the clustering variable k from 2 to 9, and found that k = 3 produced the ideal cumulative distribution function (CDF) value and delta area (Fig. 1b, c). Therefore, the patients in the TCGA cohort were divided into 3 clusters of NK-related genes: C1, C2, and C3 (Fig. 1d). KM analysis illustrated that C2 genes increased overall survival (OS) and that the C3 patients had relatively poorer outcomes (Fig. 1e).

Fig. 1
figure 1

Development of molecular clusters based on the expression of NK cell-related gene in ccRCC using the TCGA database. A Univariate Cox analysis of 134 NK cell-related genes. B CDF curves for k ranging from 2 to 9. C Corresponding change in the area under the CDF curve. D Consensus clustering matrix for the optimal, k = 3. E Kaplan–Meier survival analysis of patients in cluster 1, cluster 2, and cluster 3

Comparison of clinical features, immune characteristics of TME, and the therapeutic response between the 3 molecular clusters

The distribution of clinical features among the 3 clusters showed that differences exist based on the stage, grade, T stage, M stage, age, and gender of the 3 clusters in the TCGA database. Patients younger than 60 years with less advanced tumor stages were more concentrated in the C2 cluster (Additional file 1: Fig. S1A). Additionally, the interaction between the tumor immune microenvironment and cancer cells determined cancer progression and the efficacy of immunotherapy. The MCP-Counter algorithm was applied to identify the degree of infiltration of 10 immune cells and it was found that the immune scores were highest in the C2 cluster (Additional file 1: Fig. S1B). Through the ESTIMATE method, we also found significant differences in the infiltration of stromal and immune cells into tumor tissues. The results revealed that the C2 cluster possessed higher stromal and immune scores (Additional file 1: Fig. S1C). Based on these immune features, we determined the expression of immune checkpoint genes and most immune checkpoint genes were found to be highly expressed in the C2 cluster (Additional file 1: Fig. S1D). To further evaluate patient populations that may benefit from immunotherapy, the TIDE algorithm found that the C3 cluster had a higher TIDE score than the other 2 clusters (Additional file 1: Fig. S1E). The C2 cluster was found to be more sensitive to traditional chemotherapies, such as rapamycin, sunitinib, paclitaxel, sorafenib, crizotinib, and AKT inhibitor VIII (Additional file 1: Fig. S1F).

Determination of crucial NK cell-related genes

We conducted differential gene expression analyses separately on each pair of clusters using the “limma” R package and obtained 1183 common differentially expressed NK-related genes (DE-NKRGs) for the intersection between the results. Furthermore, univariate COX regression analysis of the DE-NKRGs indicated that 518 genes were mainly involved in patient prognosis and included 3 risk-related genes and 515 protective genes (Fig. 2a). Then, the LASSO Cox regression was employed to shrink the scope of the genes and the model was found to be optimal when the value of lambda was 0. 0473 (Fig. 2b, c). The expression of 6 genes included 2 risk-related genes (DPYSL3 and SLPI) and 4 protective genes (SLC44A4, ZNF521, LIMCH1, and AHR) was estimated in two cell lines by qRT-PCR (Fig. 2d). The qRT-PCR results exhibited that the risk-related gene DPYSL3 was highly expressed and the protective gene SLC44A4 was lowly expressed in ACHN cells, which was consistent with expectations. Meanwhile, ZNF521, LIMCH1, and AHR expression were upregulated in the ACHN cells compared with the HK2 cells. The SLPI expression was no clear difference between ACHN and HK2 cells (Fig. 2e).

Fig. 2
figure 2

Construction and validation of a NK cell-related risk score model. A Univariate COX regression analysis of the DE-NKRGs in the TCGA database. B LASSO coefficient profiles. C Lambda selection through tenfold cross-validation. D Coefficient of the 6 screened genes. E Expression profiles of DPYSL3, SLPI, SLC44A4, ZNF521, LIMCH1, and AHR in the ACHN cells and HK2 cells.

Construction of a NK cell-related risk signature

The selected 6 DE-NKRGs were incorporated into the risk score: risk score = − 0.375*LIMCH1-0.384*AHR-0.287*ZNF521 + 0.412*DPYSL3-0.137*SLC44A4 + 0.121*SLPI. Based on the median value, we separated the patients into low-risk and high-risk groups. It was observed that patients with a high-risk score had a poorer prognosis than that those with a low-risk score (Fig. 3a) [58]. The ROC analysis was conducted to analyze the prognostic efficiency of the risk score. The AUC of the TCGA cohort reached 0.78,95 after 1 year, 0.74,95 after 3 years, and 0.75,95 after 5 years. Similarly, the AUC of the ICGC cohort achieved 0.7,95 after 1 year, 0.69,95 after 3 years, and 0.66,95 after 5 years. Kaplan–Meier analyses of both TCGA and ICGC cohorts consistently showed that low-risk patients tended to have a more favorable outcome (Fig. 3b).

Fig. 3
figure 3

Exploration of the correlation of risk score with ccRCC prognosis. A Kaplan–Meier analysis and ROC curves at 1, 3, and 5 years of high- and low-risk groups in TCGA database. B Kaplan–Meier analysis and ROC curves at 1, 3, and 5 years of high- and low-risk groups in ICGC database

Relationship between the risk score and clinical features, immune microenvironment, and treatment response

We further analyzed the distribution of the clinical features of patients in the TCGA cohort [59]. A remarkable difference exists between the 2 groups in terms of tumor stage. The C2 cluster showed a significantly lower risk score (Additional file 1: Fig. S2A). To elucidate the relationship between the risk score and the immune microenvironment of ccRCC, the MCP-counter tool was used to investigate differences in the infiltration of 10 immune cells between the high- and low-risk groups. The results highlighted that the low-risk group had a higher level of immune cell infiltration (Additional file 1: Fig. S2B) [60]. Moreover, we performed TIDE analysis to further explore the clinical utility of the risk score. The low-risk group possessed a lower TIDE score compared with the high-risk group (Additional file 1: Fig. S2C). In addition, patients in the low-risk group exhibited a higher degree of sensitivity to conventional drugs, such as rapamycin, sunitinib, sorafenib, crizotinib, and AKT inhibitor VIII (Additional file 1: Fig. S2D).

Building a survival decision tree and a predictive nomogram to predict ccRCC survival

Additionally, validated the prognostic model using decision tree analysis integrating risk score and clinicopathological features, including age, gender, T stage, N stage, M stage, stage, and grade. Age, stage, grade, and risk score were entered as inputs into the survival decision tree to obtain 7 risk subgroups (Fig. 4a). Survival analyses also revealed statistically significant differences between the 7 risk subgroups (Fig. 4b). Next, we used univariate and multivariate Cox regression analyses and found that the risk score was the most significant factor for the prediction of ccRCC prognosis (Fig. 4c, d). To create a clinical approach for the prognostic estimation of ccRCC patients, the factors mentioned in the decision tree analysis were further applied to construct a nomogram (Fig. 4e). The results of the nomogram showed that the risk score had a dramatic influence on the survival prediction of ccRCC patients. The calibration plots for 1-year, 3-year, and 5-year survival probabilities showed high consistency between the results of the constructed nomogram and actual clinical conditions (Fig. 4f). Moreover, the tROC curves with high AUC values above 0.7 indicated that the nomogram and risk score provided an accurate prediction of ccRCC prognosis (Fig. 4g). These results demonstrate the favorable predictive value of our risk score.

Fig. 4
figure 4

Establishment of a decision tree and nomogram by combining the risk score and clinical features of TCGA database. A Framework of a decision tree. B Survival analysis of the 7 clusters obtained using the decision tree. C Results of the univariate Cox analysis. D Results of the multivariate Cox analysis. E The nomogram constructed to predict the survival probabilities. F Calibration curves of the nomogram. G The tROC curves of the nomogram.

Discussion

Emerging evidence suggests that NK cells play essential and dual roles in diverse disease models such as infection, transplantation, autoimmunity, as well as cancer. Studies have reported that NK cells function as crucial components of tumor immunosurveillance, exerting a potent cytolytic effect by recognizing ligands presented on tumor cells and inducing the involvement of a series of NK activating receptors [61, 62]. For ccRCC, immune-based therapies (including targeted PD-1/PD-L1) have emerged as the main management modality since it has a restricted response to conventional chemotherapy [63,64,65]. Thus, a more comprehensive understanding of the features of immune cells of ccRCC patients is imperative for developing effective and improved immunotherapies [25].

Research has uncovered the heterogeneity of infiltrating NK cells in ccRCC, revealing their intricate association with tumor metastasis and differences in the efficacy of immunotherapy [66]. Sierra et al. observed an increased expression of PD-L1 on NK cells and a corresponding high infiltration inccRCC, which paradoxically correlated with worsen survival outcomes of ccRCC patients [67]. Ziblat et al. found that tumor-infiltrating NK cells (TINK) in ccRCC represent an activated residency phenotype, contributing to desensitizing NK cells to tumor cells and further limiting their anti-tumor effect [68].

To identify NK cell-related prognostic biomarkers for ccRCC, we initially identified three prognostic clusters based on NK cell-related gene expression from the TCGA and ICGC databases. Bioinformatics analysis of ccRCC revealed distinct clinicopathological characteristics, prognosis, immune infiltration properties, and therapeutic responses among these three NK cell-related clusters. Risk scores have been proven to allow clinicians to adopt more personalized clinical approaches and facilitate prognostic outcomes as high-throughput genomic technologies become increasingly integrated into clinical practice [69,70,71,72]. Zhang et al. structured a risk score model for oral squamous cell carcinoma based on the expression of the Shelterin complex gene (SG) to evaluate the prognosis and immunotherapy responses of patients [73]. Additionally, Cao et al. identified a risk model based on immune-related gene signatures to predict prognosis and guide individualized therapies for advanced RCC [74]. Considering the distinctive characteristics of the molecular clusters, we constructed a six-gene based NK-related risk score model to explore the association between NK-related genes and ccRCC development. Our research reported that the high-risk score group significantly implicated poorer outcomes, advanced clinical stages, reduced immune cell infiltration, and increased insensitivity to anti-tumor immune responses. Given its role in representing the clinical decision-making processes, we implemented a decision tree model to further validate the performance of risk scores. Importantly, we indicated that a proposed nomogram integrating risk scores with clinical characteristics showed superior effective predictive efficacy for ccRCC prognosis.

Recent advancements in imaging technology and genomics have significantly aided clinical physicians in stratifying risk, selecting treatment plans, developing follow-up strategies, and predicting patient outcomes for various diseases. The combination of radiomics features and genomic data has demonstrated promising applications in clinical settings, such as RCC [75, 76]. In this study, we primarily analyzed NK cell-related gene expression in ccRCC and established a corresponding prognostic model. We validated the model's robustness and efficacy by utilizing public ccRCC patient databases. Based on individual risk characteristics, the model can be utilized to stratify patients, predict treatment responses, and evaluate clinical prognoses. Further integration of our prognostic model with radiomics offers a meaningful direction for future research. By combining clinical radiological information of ccRCC patients, our model holds potential to provide more personalized risk assessment, elevated diagnostic accuracy, and refined treatment plans and follow-up strategies for patients.

However, our research has several limitations. First, the gene expression data used in our study were obtained from online platforms, and there is a lack of data from our own center to validate the constructed prognostic model. Second, we focused only on 6 genes with the most significant correlation with ccRCC prognosis to establish the model. Although we found evidence supporting the importance of the expression and prognosis of these 6 genes, their functional mechanisms during ccRCC development are worthy of further experimental studies and validation in vivo. The qRT-PCR merely confirmed the mRNA expression profiles of the 6 genes and further analysis is needed to expand the exploration of 6 genes at the protein level and also on the animal models.

Conclusion

Taken together, we identified three distinct molecular clusters and proposed a generalized and robust prognostic model for ccRCC based on NK-related gene signatures. The risk score was confirmed to exert a powerful role for the prediction of ccRCC prognosis and can provide more appropriate guidance for the treatment of ccRCC patients.