Introduction

Traditionally, patients with small renal masses (SRMs) are managed by radical or partial nephrectomies (PNs). However, laparoscopic PNs (LPNs) are associated with significant complication rates (~20%) [1]. Percutaneous image-guided radio-frequency ablation (RFA) in SRMs was first reported in 1997 [2]. The adoption of image-guided ablation (IGA) has rapidly increased in the management of SRMs due to its minimally invasive nature and the theoretical ability to offer preservation of renal function and lower complication rate when compared to PN [3]. Other energy sources have been adopted to manage SRMs, including cryoablation (CRYO) [4], microwave ablation [5], and, more recently, irreversible electroporation [6].

The current European Association of Urology (EAU) guidelines suggest strong evidence to perform PN for T1 renal masses, and weak evidence to only offer IGA to those with significantly co-morbidity and frailty [7]. The EAU guidelines have also suggested IGA to be associated with higher rates of recurence, although unlikely after 5 years, based on limited evidence [7]. On the other hand, the American Urological Association (AUA) guidelines suggest thermal ablation as an alternative approach in managing cT1a tumours; however, the lack of high-quality literature with long follow-up periods of patients with confirmed histology was emphasised [8]. The AUA guidelines also specify the importance of long follow-up periods (> 5 years) to accurately assess for late local recurrences. While there are some non-randomised evidence base to perform PN over radical nephrectomy, there is only one study by Andrews et al, showing comparable long-term oncological outcomes of IGA and LPN for SRMs for up to 5 years[3, 9]. Chang et al, had also shown comparable 5-year outcomes between laparoscopic or imaged-guided RFA and PN [10]. Furthermore, the overall quality of studies comparing IGA and LPN is limited. Single-arm studies have suggested effective long-term cancer control in patients undergoing percutaneous RFA at 10 years [11].

While there is a desperate need for a high-quality randomised controlled trial to compare RFA, CRYO, and LPN, prospective recruitment has proven to be difficult as seen by the SURAB feasilibity study and the CONSERVE trial, which both failed in recruitment [12, 13]. This study aims to provide 10 years of experience and evidence to inform guidelines for long-term oncological outcomes in patients undergoing image-guided CRYO or RFA and LPN for biopsy- or histology-proven T1aN0M0 and T1bN0M0 renal cell carcinomas (RCCs).

Methods

Study design

This is a retrospective analysis of a prospectively maintained registry from 2003 to 2016. Following institutional health and research authority approval, consecutive adult patients who underwent image-guided CRYO, RFA, or laparosocpic LPN for cT1N0M0 histology-confirmed RCC were included for the study. The patient selection process at our institution was previously described [14]. cT1 renal masses were defined as a maximum tumour diameter of ≤ 7 cm limited to the kidney on radiographic imaging according to the American Joint Committee on Cancer staging manual [15]; with cT1 further divided to cT1a (≤ 4 cm) and cT1b (> 4 cm and ≤ 7 cm). Patients presenting with multiple renal tumours, recurrence, inherited RCC syndromes, or a solitary kidney were excluded from the analysis [16]. Patients with a history of LPN, CRYO, or RFA of the same kidney were also excluded from analysis. Primary outcome of the study was to evaluate and compare the long-term local recurrence-free survival (LRFS) between CRYO, RFA, and LPN. Secondary outcomes include overall survival (OS), cancer-specific survival (CSS), metastasis-free survival (MFS), rate and severity of complications, and change in renal function peri-operatively. The detailed methods of the performance of IGA and LPN are outlined in the supplementary appendix.

Patient follow-up

The follow-up protocol for IGA was previously described in detail [14]. All patients were followed at 1, 3, and 6 months after the procedure and annually onwards for a period of 10 years using MRI or CT. Local recurrence was defined as new area(s) of enhancement in the zone of ablation after at least one imaging study had shown complete lack of enhancement in the treated area. Metastatic disease was defined as extra-renal disease on imaging confirmed or suspicioned to have originated from the kidney. Cancer-specific death was defined as any deaths from RCC.

Clinical features, variables, covariates, and data acquisition

Patient clinical features such as age, sex, treatment date, follow-up details, histopathological details, R.E.N.A.L. nephrometry score [17], co-morbidities (according to the Charlson Comorbidity Index [CCI][18]), procedure details, complications (according to the Clavien Dindo Classification [19]), and estimated glomerular filtration rate (eGFR; CKD-EPI [20]) were extracted from the prospectively maintained database. Utilising the National Health Service (NHS) patient records, the patients were followed for their living status and cause of death until 25th January 2021.

Outcomes and data synthesis

Differences in baseline characteristics were evaluated using the chi-square test and the Kruskal-Wallis test. CSS, OS, LRFS, and MFS were evaluated from the time of treatment to the time of event using the Kaplan-Meier method. Ten-year survival rates and corresponding 95% confidence intervals (95% CI) were reported. The Cox proportional hazard regression model was utilised to evaluate survival in CRYO, RFA, and LPN patients, reporting as hazard ratios (HRs), 95% CI, and p-values. To allow evaluation of HRs when no events were observed in an arm, an event was artificially created at the latest follow-up for that arm. Complication rates and severity were evaluated using the chi-squared test and logistic regression. Changes in peri-operative renal function were evaluated using the Kruskal-Wallis test and the Wilcoxon matched pairs signed rank sum test. Propensity score matching [21, 22] was intended to be used to compare RFA and CRYO with LPN. However, the groups were so different for a number of the key matching variables that this approach became impractical, as detailed in the “Results” section, and in the supplementary appendix. In order to facilitate comparison of the two groups, we therefore used Cox’s proportional hazards model [23], including all the variables we had intended to use in the matching analysis (age, sex, laterality, CCI, R.E.N.A.L nephrometry score, lesion size, RCC type, grade, and t-stage), to adjust for imbalances in these variables between the various treatment groups. This is, of course, not a substitute for a randomised trial, and the results have to be interpreted with a degree of caution. Furthermore, there are relatively few events, creating sensitivity issues with the model results. However, given the difficulties in undertaking such a randomised trial, and the time that such a trial would take to complete, we decided that this approach provides an appropriate means of performing and interpreting inter-group comparisons. Amongst 10 patients with missing CCI or R.E.N.A.L. nephrometry score, median CCI or nephrometry score was imputed. Sensitivity analyses have shown identical results and hence all patients were included in the final analyses. All analyses are two-tailed at a significance level of 0.05. All statistical analyses were performed on STATA/MP 16.0 (StataCorp).

Results

A total of 290 patients were included in the analysis. Supplementary figure 1 shows how these patients were selected for inclusion in the study.

Oncological outcomes in T1a patients using univariate analysis

Baseline characteristics of T1a patients

A summary of the clinical and pathological characteristics of the 238 T1a patients included in the analysis is given in Table 1. RCC histology, Fuhrman grade, age, tumour size, R.E.N.A.L nephrometry score, baseline eGFR and CCI were found to be significantly different between the three groups. The median (IQR) follow-up time was 75.6 (66.8–86.5) months, 106.0 (61.2–135.1) months, and 72 (64.6–99.7) months in CRYO, RFA, and LPN patients, respectively.

Table 1 Baseline characteristics of T1a patients

Event-specific outcomes

Totals of 204, 238, 233, and 233 patients were evaluated for CSS, OS, LRFS, and MFS, respectively, with exclusions being for lack of follow-up (LRFS: 5, MFS: 5), and unknown causes of death (CSS: 4) in the LPN group only. Results were comparable between the 3 groups for all 4 endpoints (Figs. 1 and 2). Only two RCC-related deaths were observed: one in the RFA group and one in the LPN group. A total of 31 deaths were observed (CRYO: 13, RFA: 9, LPN: 9). Ten local recurrences were observed (CRYO: 2, RFA: 5, LPN: 3). Five metastatic events were observed (CRYO: 0, RFA: 2, LPN: 3). A total of 72 and 87 patients were evaluated for CRYO and RFA for all outcomes, respectively. A total of 75, 79, 74, and 74 patients undergoing LPN were evaluated for CSS, OS, LRFS, and MFS, respectively.

Fig. 1
figure 1

Forest plot summary of all oncological outcomes in T1a and T1b patients undergoing cryoablation or RFA compared to LPN using the univariate Cox proportional hazard model; 95% CI, 95% confidence interval; CSS, cancer-specific survival; OS, overall survival; LRFS, local recurrence-free survival; MFS, metastasis free-survival; RFA, radio-frequency ablation; LPN, partial nephrectomy

Fig. 2
figure 2

a Cancer-specific survival, (b) overall survival, (c) local recurrence-free survival, and (d) metastasis-free survival in T1a patients

Oncological outcomes in T1b patients on univariate analysis

A total of 58 T1b patients were included in this study. A summary of their clinical and pathological characteristics are outlined in Table 2. RCC histology, Fuhrman grade, age, tumour size, R.E.N.A.L nephrometry score, baseline eGFR, and CCI were found to be significantly different between the three groups. The median (IQR) follow-up duration is 72.5 (42.0–100.9) months, 59.5 (27.5–99.39) months, and 67.9 (50.8–91.3) months for CRYO, RFA, and LPN, respectively. CSS, OS, LRFS, and MFS are all comparable between patients undergoing CRYO, RFA, or LPN (Figs. 1 and 3). The details of the results are outlined in the supplementary appendix.

Table 2 Baseline characteristics of T1b patients
Fig. 3
figure 3

a Cancer-specific survival, b overall survival, c local recurrence-free survival, and d metastasis-free survival in T1b patients

Post-operative complications

The rate and severity of post-operative complications for all three modalities were found to be similar in both cT1a (CRYO: 11.1%, RFA: 18.4%, LPN: 14.1%) and cT1b patients (CRYO: 19.4%, RFA: 15.4%, LPN: 7.7%). Both logistic regression and multinomial logistic regression did not show significant difference between the three groups’ rate and severity of complications (Supplementary Table 1 and 2). A summary of all complications occurring during the study period are reported in Supplementary Table 3.

Change in renal function

The post-operative eGFR and change in eGFR peri-operatively of T1a and T1b patients undergoing CRYO, RFA, and LPN are shown in Table 3. Only small changes in eGFR were found in patients undergoing CRYO and RFA, as compared to substantial falls in eGFR in LPN patients (Wilcoxon matched pairs signed rank sum Z and p-values; CRYO: 3.0, 0.003, RFA: 2.4, 0.02, LPN: 6.0, < .0001). When comparing the change in renal function peri-operatively using the Wilcoxon 2-sample rank sum test, in both T1a (Z = 4.1, p < .0001) and T1b (Z = 2.5, p = .01) patients, those undergoing IGA had a significantly smaller median change in eGFR compared to LPN (Table 3).

Table 3 Peri-operative change in eGFR in T1a and T1b patients undergoing image-guided cryoablation, RFA, and PN

Results of propensity-score matching and multivariate analysis

Initially, it was intended to explore the propensity score matching approach, as described in the “Methods” section. However, this proved to be infeasible due to large differences in baseline factors between the treatment groups, most substantially in age (Supplementary Figure 4; Tables 1 and 2). Further details, results, and explanation are given in supplementary Figures 2 and 3. Therefore, as described in the “Methods” section, the Cox multivariate method was used to adjust for these imbalances and compare the treatment arms (Table 4). As events are relatively scarce in this study, sensitivity analyses were performed by replacing an event with censoring at that time (results not presented). Minimal differences to the results presented were observed for all of the outcomes, demonstrating that the results are relatively insensitive to such small changes, and are therefore relatively robust. Certainly, the overall findings would be unchanged as a result of a single patient having a different outcome.

Table 4 Oncological outcomes in T1a and combined T1a/T1b patients in multivariate Cox proportional hazards model

In univariate Kaplan-Meier analyses, IGA and LPN were shown to have comparable LRFS. However, given that the CRYO and RFA groups consist of patients with considerably worse prognostic factors, after multivariate adjustment, CRYO and RFA appear to be superior to LPN for LRFS. The magnitude of the effect in the two ablative therapy groups is almost identical (see Supplementary Figure 5) so a combined group analysis, stratified by group, was performed, demonstrating ablative therapies to be superior to LPN for LRFS (HR 0.006, 95% CI 0.00–0.15, p = 0.002). Note that the RFA/LPN comparison reaches statistical significance on its own (Table 4; p = 0.003), and, although the CRYO/LPN result is not statistically significant (Table 4, p = 0.087), this is largely a result of paucity of patient and event numbers. Although effect sizes (HR) appear to be substantial for statistically significant outcomes (LRFS, MFS), suggesting extreme advantage to IGA patients, they are unlikely to reflect real effect sizes due to a combination of the extreme selection bias. Finally, the lower 90% confidence interval on the hazard ratio is less than 1 for CRYO (Supplementary Table 5), which demonstrates at least 90% confidence that CRYO is as good as LPN for LRFS. For clarity, characteristics of all patients with T1a tumours and subsequent local recurrences are shown in Table 5.

Table 5 Characteristics of all patients with T1a tumour and local recurrences

Discussion

The number of high-quality studies comparing the use of IGA and LPN is scarce, with most limited by extreme selection bias and short follow-up periods [3, 24].

The univariate analysis results hereby reported are similar to that reported by Andrews et al [9] in 2019 and a recent published meta-analysis [3] as CSS, LRFS, and MFS were found to be comparable amongst the three modalities in both T1a and T1b patients. However, the available studies only assessed outcomes up to 5 years. In this cohort, as a result of serious selection bias, where LPN patients are signifcantly younger and less comorbid and have smaller tumours, propensity score matching was impossible. Therefore, a multivariate Cox proportional hazards model approach was utilitsed. In the multivariate analysis, we have found all oncological outcomes are at least comparable. Although LRFS is shown to be superior in T1a and T1b patients undergoing RFA (p = 0.011) and CRYO (p = 0.026), given the small number of events, model sensivity issues, and the fact that this is not a randomised trial, it is perhaps inappropriate to think that the results demonstrate superiority for ablative therapies. However, it seems reasonable to conclude that IGAs are at least as good as the surgical alternative. Furthermore, in contrast to the EAU’s guidance [7], our results have shown that recurrences after 5 years may have been more common than usually perceived, with five recurrences observed after 5 years (Table 5).

Despite selection bias, in contradiction to previous cohorts [9, 25] and a recent meta-analysis [3], our study did not find OS to be significantly different in the three treatment arms in both T1a and T1b patients. Andrews et al have reported 5-year OS to be significantly worse in CRYO and RFA patients with T1a/T1b disease even after propensity matching and subgroup analysis in patients with RCC [9]. The positive finding in our study could be the result of the extended follow-up time, offsetting potential selection bias arising over age of the included patients. Furthermore, life expectancy in the UK is significantly higher than that in the USA, further offsetting the age selection bias in the study [26]. While age is commonly regarded as a confounder in similar studies, our study found it be a significant, but only small predictor of overall survival in patients with T1a tumours in this cohort (HR 1.05, 95% CI 1.01–1.08, p = 0.016), explaining the minimal effect of selection bias on our results.

The rate and severity of complications in our study was not significantly different amongst the three modalities. This is in line with recent studies and with the meta-analysis of percutaneous IGA and PN [3]. While no theoretical advantage of reduced complications is observed in the literature, the learning curve for both LPN and percutaneous IGA is at about 100 cases [14, 27, 28], and few results have been reported for centres significantly beyond the learning curve [3].

As expected, as renal parachyma is better preserved in CRYO and RFA, our study found little or no change in eGFR in patients undergoing CRYO and RFA, as compared to a significant fall of eGFR in LPN patients. Although not investigated in this study, this will help inform treatment decisions in those with solitary kidneys or impaired renal function.

The strengths of this study include long-term follow-up, inclusion of R.E.N.A.L nephrometry scores and confirmed RCC status. While the results may be positive, this study does not come without its limitations. Firstly, our sample size (especially with T1b patients) is too small to be well-powered statistically. Secondly, the study is also limited by strong selection bias owing to the retrospective study design. This is evident by the inability to perform propensity score matching, with attempted mitigation using multivariate analysis. However, despite the seletion bias, the results are still positive. Thirdly, it is recognised that treatment options may depend on the location of the tumour and the nephrometry score may not be a complete representation of tumour complexity for treatment. Ultimately, it may not be entrirely safe to treat central tumours with CRYO, RFA, or even LPN, and radical nephrectomy may remain an option for some of the patients. Finally, the inclusion of only LPN may not be representative of patients undergoing robotic PN or open PN, as complication profiles and oncological outcomes may significantly differ [29].

The optimal investigations and management of small RCCs are debated, and there are factors that must be taken into consideration in order to compile evidence to allow better patient care. For example, the use of renal tumour biopsies should be considered, at least in a research context, as this allows evaluation of the treatment effects of malignant lesions without the biases arising from a proportion of benign results [30, 31] The use of active surveillance to manage small RCCs is becoming increasingly popular [32, 33], and the use of renal tumour biopsy prior to both active surveillance and IGA will allow for better comparison between the different managements of small RCCs.

This study reported long-term outcomes of patients undergoing CRYO, RFA, or LPN for T1a/T1b RCC. Although, in this cohort, patients undergoing CRYO and RFA have superior LRFS and comparable oncological outcome in general, the extreme selection bias and lack of events suggest the cautious conclusion that CRYO and RFA are at least as good as LPN in oncological outcomes. However, this study can conclude that CRYO and RFA have better renal function preservation compared to LPN. Therefore, percutaneous IGA, CRYO, and RFA should be potentially reflected in guidelines to be considered first-line treatment along with LPN for small RCCs providing more promising outcomes from larger prospective and multicentre cohorts can be made avaliable evaluating both the peri-operative and long-term outcomes for both T1a and T1b RCCs. The highly anticipated NEST trial [34], a RCT comparing LPN and IGA in T1a is designed to address the much needed level one evidence in this area.