Background

Radical prostatectomy (RP) is the most commonly used treatment for patients with localised prostate cancer and life expectancy greater than 10 years [1]. Open radical prostatectomy (ORP) and robot-assisted radical prostatectomy (RARP) are the two main RP approaches performed in Victoria [2]. The increased availability of robotic surgery has led to the use of a robotic approach in the majority of RPs in Victoria [3].

The American Urological Association and the European Association of Urology guidelines for localised prostate cancer state that there is no evidence of different urinary or sexual functional outcomes between ORP and RARP; however, the evidence level is rated low in both [4, 5]. The largest randomised controlled trial (RCT) comparing ORP and RARP found no difference in oncological, urinary or sexual functional outcomes at 12 and 24 months following surgery [6, 7]. However, minimally invasive surgeries such as RARP have gained popularity globally due to the potential for reduced morbidity [8]. This may be due to improved perioperative outcomes, such as decreased blood loss; improved short-term postoperative outcomes, such as reduced postoperative pain and length of hospital stay [6]; and clinician and patient preference [8]. Marketing campaigns, hospital and urologist competition, and centralised health systems may have also contributed to the perception that RARP is superior [9, 10]. Furthermore, patient and surgeon characteristics may differ significantly between ORP and RARP cohorts [11]. Given the conflicting evidence and the increasing adoption of RARP [12], ongoing comparison of outcomes, focussed on issues that impact quality of life, is warranted. For this reason, observational studies and prostate cancer clinical registries have grown in number [13]. Population-based clinical registries allow stronger inferences to be made about the population compared to single- or multi-centre studies due to volume-outcome relationships [14].

This paper used a population-based clinical registry to compare urinary and sexual patient-reported outcomes (PROs), measured with the Expanded Prostate Cancer Index Composite-26 (EPIC-26) questionnaire, between men undergoing ORP and RARP at one year following surgery.

Methods

PCOR-Vic

Data for this study were obtained from the Prostate Cancer Outcomes Registry-Victoria (PCOR-Vic), which collects clinical and patient-reported data on men diagnosed with prostate cancer from contributing institutions in Victoria [15]. Recruitment and data collection methods of the PCOR-Vic have previously been described [15]. At approximately 12 months after diagnosis for men on active surveillance or watchful waiting, and 12 months after initial active treatment, participants are invited to complete the EPIC-26 quality of life questionnaire [16] via telephone, email or paper form [15].

Data collection

PCOR-Vic provided a dataset with a range of demographic, diagnostic and follow-up variables. Socioeconomic status was calculated using the patient’s postcode to determine the Australian Bureau of Statistics index of relative socio-economic advantage and disadvantage (IRSAD) for the 2016 year [17]. Year of attaining surgical specialisation was collected from the Australian Health Practitioner Regulation Agency website or directly from surgeons.

Patients

Patients were included if they attended a participating institution of the PCOR-Vic, did not opt out of the registry, received an ORP or RARP between January 2014 and May 2018, and completed at minimum the urinary and sexual bother items of the EPIC-26 questionnaire between 0.7 and 1.3 years following RP.

Outcomes

Urinary outcomes were the EPIC-26 urinary bother item (dichotomised into moderate/big bother, and small/very small/no bother, consistent with cut-off points reported elsewhere) [14, 18], urinary incontinence and urinary irritative/obstructive domain scores, and pad usage (dichotomised into ‘no pads per day’ and ‘≥ 1 pad per day’) [19]. Sexual outcomes were the EPIC-26 sexual bother item (dichotomised into moderate/big bother, and small/very small/no bother) and sexual domain scores [19].
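The dichotomisation described above can be illustrated with a minimal sketch. The response labels and the numeric pad count are assumptions made for illustration; the registry's actual coding of EPIC-26 responses is not described here, and the function names are hypothetical.

```python
# Assumed response labels (hypothetical coding, not the registry's own).
BOTHERED = {"moderate", "big"}
NOT_BOTHERED = {"none", "very small", "small"}


def dichotomise_bother(response):
    """Return 1 for moderate/big bother, 0 for small/very small/no bother.

    Missing responses (None) are passed through as None.
    """
    if response is None:
        return None
    key = response.strip().lower()
    if key in BOTHERED:
        return 1
    if key in NOT_BOTHERED:
        return 0
    raise ValueError(f"Unrecognised bother response: {response!r}")


def dichotomise_pads(pads_per_day):
    """Return 1 for one or more pads per day, 0 for no pads per day.

    Assumes pad usage is available as a numeric count, which is an
    illustrative simplification.
    """
    if pads_per_day is None:
        return None
    return 1 if pads_per_day >= 1 else 0


# Toy responses, not study data.
bother_flags = [dichotomise_bother(r) for r in ["Big", "small", "None", "Moderate"]]
pad_flags = [dichotomise_pads(n) for n in [0, 1, 3]]
```

Keeping the cut-off in one small function makes it straightforward to rerun the analysis with alternative cut-offs in a sensitivity analysis.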

Statistical analysis

Descriptive analyses included medians and interquartile ranges for continuous data. Associations were determined using independent t-tests for continuous data and chi-squared tests for categorical data. A two-tailed 5% significance level was used throughout.

Propensity score matching was used to assess differences in urinary and sexual outcomes between ORP and RARP after accounting for the differences in patient and surgeon characteristics. Detailed steps involved in propensity score matching are described elsewhere [20, 21]. The propensity score refers to the estimated probability of receiving one of the treatment options; here it was defined as the probability of receiving RARP, as opposed to ORP. Propensity scores were calculated for each patient using a logistic regression model based on the preoperative factors of age at surgery, prostate-specific antigen (PSA) at surgery, surgeon’s years since specialisation, National Comprehensive Cancer Network (NCCN) risk category, hospital location (metropolitan vs. regional), hospital type (public vs. private), IRSAD quintile and year of surgery.
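As an illustration of how a propensity score can be estimated, the following Python sketch fits a logistic regression by gradient ascent and returns the predicted probability of receiving RARP. It is not the code used in this study, which was run in Stata. The two covariates (standardised age and a private-hospital indicator), the toy data and all function names are assumptions chosen for brevity.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def linear_predictor(beta, x):
    """Intercept beta[0] plus the dot product of beta[1:] and covariates x."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


def fit_logistic(X, t, lr=0.5, n_iter=5000):
    """Fit P(T=1 | x) by batch gradient ascent on the mean log-likelihood."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for x, ti in zip(X, t):
            resid = ti - sigmoid(linear_predictor(beta, x))
            grad[0] += resid
            for j, v in enumerate(x):
                grad[j + 1] += resid * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity_scores(X, beta):
    """Estimated probability of treatment (here RARP) for each patient."""
    return [sigmoid(linear_predictor(beta, x)) for x in X]


# Toy data (not study data): [standardised age, private hospital (1/0)].
X = [[-1.0, 1], [-0.5, 1], [0.0, 1], [0.5, 1], [1.0, 1],
     [1.0, 0], [0.5, 0], [-0.5, 0], [0.0, 0], [-1.0, 0]]
t = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]  # 1 = RARP, 0 = ORP

beta = fit_logistic(X, t)
scores = propensity_scores(X, beta)
```

In practice a statistical package would estimate the same model with a more efficient optimiser; the sketch only shows that the score is a fitted probability between 0 and 1 for each patient.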

The propensity score matching estimator used was a nearest neighbour 1:1 matching model, with replacement and no calipers. Therefore, each patient in both treatment groups was matched to the patient in the other treatment group with the closest propensity score. This model was used to define similarity in order to find the closest matches across the population. The matching estimator imputed the missing potential outcome for each individual (i.e. the outcome if they had received the other treatment option). Each potential outcome became an observation in the data.
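A minimal Python sketch of this kind of estimator is given below, assuming the propensity scores are already available. Each patient is matched to the nearest patient in the other group (with replacement, no caliper), the match's outcome is used as the imputed potential outcome, and the individual differences are averaged. Ties are broken by taking the first candidate, which is a simplification, and the function name and toy values are hypothetical.

```python
def nearest_neighbour_ate(scores, treated, outcomes):
    """Average treatment effect from 1:1 nearest-neighbour matching.

    scores   : propensity score for each patient
    treated  : 1 if the patient received the treatment (e.g. RARP), else 0
    outcomes : observed outcome for each patient
    Matching is with replacement and without a caliper.
    """
    groups = {0: [], 1: []}
    for i, t in enumerate(treated):
        groups[t].append(i)
    if not groups[0] or not groups[1]:
        raise ValueError("Both treatment groups must be non-empty")

    diffs = []
    for s, t, y in zip(scores, treated, outcomes):
        # Closest patient in the opposite group on the propensity score.
        match = min(groups[1 - t], key=lambda k: abs(scores[k] - s))
        y_imputed = outcomes[match]  # imputed missing potential outcome
        y_treated, y_control = (y, y_imputed) if t == 1 else (y_imputed, y)
        diffs.append(y_treated - y_control)
    return sum(diffs) / len(diffs)


# Toy values (not study data).
toy_scores = [0.20, 0.30, 0.60, 0.80, 0.22, 0.70]
toy_treated = [0, 0, 0, 1, 1, 1]
toy_outcomes = [20, 25, 30, 40, 28, 35]

ate = nearest_neighbour_ate(toy_scores, toy_treated, toy_outcomes)
```

Because matching is with replacement, the same patient can serve as the match for several patients in the other group, as happens for the patient with score 0.22 in the toy data.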

In this manner, two groups of patients were formed that were similar on their propensity scores. After matching, the covariates used in calculating the propensity scores were checked for balance across the two treatment groups using a maximum standardised difference of 10%, as previously recommended [20, 22].
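The balance check can be sketched as follows for a continuous covariate, using the common definition of the standardised difference as the difference in means divided by the pooled standard deviation, expressed as a percentage. This is an illustrative assumption about the formula; the sketch ignores the weights that arise when matching with replacement, and the function names and toy ages are hypothetical.

```python
import math
import statistics


def standardised_difference(group_a, group_b):
    """Standardised difference (%) between two groups for one covariate.

    Computed as 100 * (mean_a - mean_b) / sqrt((var_a + var_b) / 2),
    using sample variances. Assumes the pooled variance is non-zero.
    """
    mean_a = statistics.mean(group_a)
    mean_b = statistics.mean(group_b)
    pooled_sd = math.sqrt(
        (statistics.variance(group_a) + statistics.variance(group_b)) / 2.0
    )
    return 100.0 * (mean_a - mean_b) / pooled_sd


def is_balanced(group_a, group_b, threshold=10.0):
    """True if the absolute standardised difference is within the threshold."""
    return abs(standardised_difference(group_a, group_b)) <= threshold


# Toy ages (not study data).
rarp_age = [61.0, 63.0, 65.0, 67.0]
orp_age = [62.0, 64.0, 66.0, 68.0]
d = standardised_difference(rarp_age, orp_age)
```

A covariate would be flagged as imbalanced when `is_balanced` returns `False`, which is the case for the toy ages above.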

The average treatment effect for each outcome was estimated before and after propensity score matching. These were expressed as risk differences (RARP minus ORP) for binary variables and as mean differences (RARP minus ORP) for continuous variables.
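For the unmatched comparison, the two effect measures reduce to simple differences between groups. The sketch below shows the sign convention (RARP minus ORP) with made-up counts and scores; the function names are hypothetical and the numbers are not study data.

```python
def risk_difference(rarp_events, rarp_n, orp_events, orp_n):
    """Risk difference in percentage points, RARP minus ORP."""
    return 100.0 * (rarp_events / rarp_n - orp_events / orp_n)


def mean_difference(rarp_scores, orp_scores):
    """Difference in mean scores, RARP minus ORP."""
    rarp_mean = sum(rarp_scores) / len(rarp_scores)
    orp_mean = sum(orp_scores) / len(orp_scores)
    return rarp_mean - orp_mean


# Toy inputs: 12 of 100 RARP patients and 15 of 100 ORP patients with the event.
rd = risk_difference(12, 100, 15, 100)
md = mean_difference([30.0, 40.0, 50.0], [25.0, 35.0, 45.0])
```

With this convention, a negative risk difference means the event was less frequent after RARP, and a positive mean difference means higher domain scores after RARP.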

All analyses were conducted using Stata version 15.1, with p values ≤ 0.05 considered statistically significant. The matched analyses used the teffects command. Ethics approval was obtained from the Monash University Human Research Ethics Committee (ID: 19196).

Results

Of the 3826 patients included in this study, 1047 (27%) underwent ORP and 2779 (73%) underwent RARP.

Differing baseline characteristics included that RARP patients were more likely to have low or intermediate NCCN disease risk (78.3% vs. 72.0%, p < 0.001) and reside in postcodes in the top quintile of the IRSAD (43.9% vs. 34.3%, p < 0.001), compared to ORP patients (Table 1). RARP patients were also more likely to have surgery at a private institution (85.3% vs. 60.6%, p < 0.001) and have surgery at a metropolitan institution (93.5% vs. 71.5%, p < 0.001), compared to ORP patients.

Table 1 Baseline patient characteristics at diagnosis and surgery

Table 2 shows the covariate balance in the propensity score matched cohort, compared to the unmatched cohort. After propensity score matching, no variables exceeded a standardised difference of 8%.

Table 2 Covariate balance table and standardised differences of variables included in the propensity score model

Urinary bother

In the unmatched cohort, there was no statistically significant difference in the risk of reporting moderate/big urinary bother in RARP compared to ORP patients (Rd = − 1.69%, P = 0.125) (Table 3). After adjusting for baseline characteristics in the propensity score matched cohort, there was also no significant difference in reporting moderate/big urinary bother (Rd = 0.47%, P = 0.707) (Table 3).

Table 3 Propensity score matching results for EPIC-26 items

Urinary incontinence

In the unmatched cohort, there was no significant difference in urinary incontinence domain scores in RARP patients compared to ORP patients (Coeff = 1.27, P = 0.195). In the propensity score matched cohort, there was also no significant difference in scores (Coeff = − 0.84, P = 0.506).

Pad usage

In the unmatched cohort, there was no significant difference in the risk of wearing ≥ 1 pad per day in RARP patients (Rd = − 3.36%, P = 0.074). In the propensity score matched cohort, there remained no significant difference in wearing ≥ 1 pad per day (Rd = 0.75%, P = 0.771). Sensitivity analysis with different cut-offs for binary variables can be seen in Additional file 1: Table S1, showing no difference between ORP and RARP in terms of pad usage, urinary bother and sexual bother.

Sexual bother

In the unmatched cohort, there was no significant difference in the risk of reporting moderate/big sexual bother in RARP compared to ORP patients (Rd = − 0.33%, P = 0.860). In the propensity score matched cohort, the difference remained non-significant (Rd = − 0.89%, P = 0.731).

Sexual domain

In the unmatched cohort, sexual domain scores were superior for men undergoing RARP compared to those undergoing ORP (30.15 vs. 23.48, P < 0.001). In the propensity score matched cohort, this superiority persisted but was less pronounced (29.57 vs. 25.92, respectively, P = 0.005).

Discussion

This study used a large, registry-based cohort of patients and evaluated one-year urinary and sexual PROs. The unmatched cohort had different baseline characteristics, which we propose is due to a disparity in access to RARP. RARP patients were more likely to undergo surgery at metropolitan hospitals than ORP patients (93.5% vs. 71.5%). RARP patients were also more likely to undergo surgery at private hospitals than ORP patients (85.3% vs. 60.6%). RARP patients had lower risk disease, which is potentially due to earlier screening.

Propensity score matching was used to decrease these and other baseline differences. After propensity score matching, no variables exceeded a standardised difference of 8%. Many covariates had major reductions in standardised difference, including hospital location, hospital type and IRSAD category. The differences in four out of six PROs decreased once patient and surgeon characteristics were more evenly distributed between the matched groups. This may be partly due to an increased sample size from matching but is more likely due to a reduction in baseline differences.

Our results show that there is no statistically significant difference between ORP and RARP in any urinary outcome, including urinary bother, urinary incontinence domain scores, urinary irritative/obstructive domain scores and pad usage. Furthermore, there was no statistically significant difference between ORP and RARP in reporting sexual bother. Sexual domain scores were statistically superior in the RARP group. However, the absolute difference in the matched cohort (29.57 vs. 25.92, i.e. 3.65 points) was less than 4 points on a 100-point scale and was deemed unlikely to be clinically significant [23]. In large datasets, including those collected by clinical registries, small differences in outcomes between treatment groups may be statistically significant even though they are not clinically meaningful [24].

Urinary outcomes

The largest RCT in the field reported no difference in urinary incontinence domain scores at 6 weeks, 12 weeks, and 6, 12 and 24 months [6, 7]. However, the trial involved only two surgeons with varying levels of experience, and therefore its findings may be confounded by surgeon differences. Three studies have used propensity score matching to compare postoperative urinary outcomes between ORP and RARP [18, 25, 26]. Each reported no difference in urinary outcomes between ORP and RARP (Additional file 1: Table S1) [18, 25, 26].

The majority of studies in the field are observational and have retrospectively compared ORP and RARP. A previous study from the PCOR-Vic found no difference in 12-month urinary bother [18]. In contrast, Herlemann et al. concluded that men undergoing ORP were more likely to report superior EPIC-26 urinary incontinence domain scores and less bother than RARP patients within one year of surgery, in unadjusted analysis [27]. However, they also noted that the ORP group had significantly lower risk scores, Gleason grades and pT stages, which introduced selection bias. Results like these should be adjusted for baseline differences through techniques such as propensity score matching or regression analysis. Herlemann et al. hypothesised that the worse urinary outcomes for RARP patients in the first year may be influenced by increased expectations of RARP, especially in America [27].

One Swedish study found no difference between ORP and RARP in the number of pads worn per day for all daily cut-offs (≥ 1, ≥ 2, ≥ 4 and ≥ 6) [28]. We also found no difference between ORP and RARP patients in the unmatched and matched cohorts, with cut-offs at ≥ 1, ≥ 2 and ≥ 3 pads per day (Additional file 1: Table S2).

Sexual outcomes

Our finding of marginal improvement in sexual domain scores at 12 months in the RARP group is comparable with two observational studies that used propensity score matching [25, 29]. An Italian study by Antonelli et al. found a significant difference in 6-month unadjusted sexual function scores in favour of RARP by 12.41 points (p < 0.001), using the University of California Los Angeles Prostate Cancer Index (UCLA-PCI) questionnaire [25]. However, there was no difference at 12 months [25]. Antonelli et al. noted a clear trend towards centralisation of RARPs in Italy and therefore surgeon factors and volume-outcome relationships might explain the superior outcomes for RARP [25]. Similarly, RARP is centralised to metropolitan regions in Victoria, with greater access in private hospitals, which may skew views of its superiority.

Sooriakumaran et al. reported that patients in low- or intermediate-risk groups recovered earlier from RARP, based on a single five-point penile stiffness question [29]. This was attributed to RARP being a more flexible technique that has a greater capacity for nerve-sparing [29].

In a large American cohort, O’Neil et al. found a strong relationship between baseline function and function at later timepoints using common items from the UCLA-PCI and EPIC questionnaires to form a modified sexual domain summary score [30]. Above a baseline sexual function score of 62/100, RARP patients were found to have significantly improved sexual function at 12 months compared to ORP patients by a magnitude of 2.70 to 10.31 points [30]. This may indicate that RARP is able to preserve function in patients with higher preoperative function.

Strengths and limitations

Our study included data from 100 surgeons across metropolitan and regional Victoria. The use of propensity score matching allowed us to mimic an RCT by using matching in place of randomisation to create ORP and RARP groups that were alike on important baseline factors. It is important to understand the intricacies of each adjustment model and the variables selected in each model to assess the significance of outcomes. In contrast to other propensity score studies that used inverse probability of treatment weighting [18, 25, 26, 29], we used 1:1 nearest neighbour matching, with replacement and no calipers. We included a number of known and novel factors in our propensity score model. Given the evidence that surgeon experience and skill are undervalued determinants of patient outcomes [31], our model included a novel variable, the number of years since the surgeon’s specialisation. Furthermore, hospital location (metropolitan vs. regional) was included, which has not yet been examined as a predictor of quality of life. The effect of hospital type (private vs. public) will vary depending on each country’s healthcare system. However, we included a hospital type variable as Victorian private sector patients are more likely to receive RARP than public sector patients [32].

This study is not without limitations. First, the observational study design introduces the potential for unmeasured confounding and bias. The analysis was limited to outcome data collected at one timepoint and therefore no inferences could be made regarding time to recovery. Second, the absence of PRO data at baseline prevents adjusting for preoperative function. Participants of the PCOR-Vic are identified and contacted for consent a few months following diagnosis. Therefore, collection of baseline data is not possible, as many men will have had active management by the time they enter the registry. Furthermore, other variables, such as previous urological surgery and comorbidity data, were not collected by PCOR-Vic and were therefore not included in the analysis. Third, surgeries in the public sector are often performed by trainees under the supervision of nominal surgeons, and surgeon learning curve factors were not assessed. Therefore, our stratification of surgeon experience may not be truly representative. Fourth, the PCOR-Vic database had 75% coverage of all RPs in Victoria in 2013 [33] and 89% population coverage in 2019 [34], and captured data from patients at 61 sites (30/33 public and 31/41 private). We believe the available data are representative of the Victorian population; however, the findings may not be applicable to other settings. Fifth, the PCOR-Vic database lacks standardised data collection for specimen handling and histological subtyping across all sites.

We suggest that future studies should capture surgeon-specific factors such as experience, skill and technique as well as ancillary services offered in the public and private sectors. Assessment of centralising services to high-volume centres is required as it may pool resources to better manage needs, yet may increase disparities in access to care [2]. Furthermore, the emergence of new surgical and radiological techniques, adjuvant treatments and medications, and demographic changes in patients and surgeons has the potential to change outcomes following RP and therefore requires ongoing assessment.

Conclusion

In this state-wide cohort of patients receiving ORP or RARP, there were no clinically significant differences in urinary PROs at one year following RP. In terms of sexual function, sexual domain scores were slightly superior in the RARP group, but the difference was not deemed clinically significant.