MRI-derived PRECISE scores for predicting pathologically-confirmed radiological progression in prostate cancer patients on active surveillance

Objectives To assess the predictive value and correlation to pathological progression of the Prostate Cancer Radiological Estimation of Change in Sequential Evaluation (PRECISE) scoring system in the follow-up of prostate cancer (PCa) patients on active surveillance (AS).
Methods A total of 295 men enrolled on an AS programme between 2011 and 2018 were included. Baseline multiparametric magnetic resonance imaging (mpMRI) was performed at AS entry to guide biopsy. The follow-up mpMRI studies were prospectively reported by two sub-specialist uroradiologists with 10 years and 13 years of experience. PRECISE scores were dichotomized at the cut-off value of 4, and the sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were calculated. Diagnostic performance was further quantified using the area under the receiver operating characteristic curve (AUC), based on the results of targeted MRI-US fusion biopsy. Univariate analysis using Cox regression was performed to assess which baseline clinical and mpMRI parameters were related to disease progression on AS.
Results Progression rate of the cohort was 13.9% (41/295) over a median follow-up of 52 months. With a cut-off value of category ≥ 4, the PRECISE scoring system showed sensitivity, specificity, PPV and NPV for predicting progression on AS of 0.76, 0.89, 0.52 and 0.96, respectively. The AUC was 0.82 (95% CI = 0.74–0.90). Prostate-specific antigen density (PSA-D), Likert lesion score and index lesion size were the only significant baseline predictors of progression (each p < 0.05).
Conclusion The PRECISE scoring system showed good overall performance, and the high NPV may help limit the number of follow-up biopsies required in patients on AS.
Key Points
• PRECISE scores 1–3 have high NPV which could reduce the need for re-biopsy during active surveillance.
• PRECISE scores 4–5 have moderate PPV and should trigger either close monitoring or re-biopsy.
• Three baseline predictors (PSA density, lesion size and Likert score) have a significant impact on the progression-free survival (PFS) time.
Electronic supplementary material The online version of this article (10.1007/s00330-020-07336-0) contains supplementary material, which is available to authorized users.


Introduction
Active surveillance (AS) is now the recommended management option for men with localized and low-risk prostate cancer (PCa) [1,2]. The aim of AS is to reduce overtreatment whilst appropriately identifying when progression occurs in order to trigger deferred treatment during the window of curability [3].
Multiparametric magnetic resonance imaging (mpMRI) has become established as an integral part of patient selection for AS [2][3][4][5]; however, follow-up is typically based on a clinical combination of prostate-specific antigen (PSA), digital rectal examination and re-staging biopsies [3]. The invasive nature of protocol-driven biopsies may limit patient uptake of AS [6,7], with MRI potentially offering a means to avoid or limit the number of interventions. The role of mpMRI during AS follow-up is evolving; nevertheless, agreement is lacking on what constitutes radiological progression and whether a positive mpMRI can be used as a stand-alone tool to prompt treatment [8]. A key reason for this is a lack of robust published data, in particular due to inconsistent reporting of follow-up mpMRI findings for patients on AS, thus precluding any meaningful analysis and comparison of the data between the studies [8,9].
In 2016, a panel of experts in urology, radiology and oncology developed the Prostate Cancer Radiological Estimation of Change in Sequential Evaluation (PRECISE) recommendations in order to standardize reporting and to facilitate data collection regarding the natural history of mpMRI findings in men on active surveillance. The cornerstone of the recommendations was a proposed 5-point Likert scoring scale to standardize the language used to convey the likelihood of radiologic progression, potentially removing any ambiguity in this message [8]. However, its clinical utility is yet to be validated. The aim of our study was to assess the value of the PRECISE scoring system in follow-up of prostate cancer patients on AS and its correlation to disease progression. In addition, we investigated the association between baseline clinical and mpMRI features and progression on AS.

Active surveillance enrolment
Patients with newly diagnosed low-to-intermediate-risk prostate cancer who were selected for active surveillance management at our institution were prospectively entered into an AS study from 2011. The local ethics committee waived the need for informed consent for retrospective analysis from this database (Cambridge University Hospital Trust, Cambridge, UK; registration number: 3592). Enrolment criteria included men aged 50-80 years with Gleason 3 + 3 = 6 or Gleason 3 + 4 = 7 with 10% or less Gleason pattern 4 overall (equivalent ISUP grades 1-2), involving < 50% of all cores, with < 50% involvement of any single core and ≤ 2 cores containing Gleason pattern 4; clinical stages T1-T2; PSA ≤ 20 ng/ml; and who were otherwise medically fit for radical treatment options. Exclusion criteria were a diagnosis of PCa that did not meet the pathologically defined enrolment criteria, or previous treatment for PCa. A baseline mpMRI was performed at AS entry, either prior to biopsy or following a standard 12-core systematic TRUS biopsy. In cases where there was discordance between an initial biopsy result and subsequent mpMRI findings, a repeat targeted transperineal (TP) biopsy was performed within 3 months; any patients upgraded on the basis of this biopsy and no longer matching local AS criteria were considered not to have enrolled for AS.

Active surveillance follow-up and progression
The follow-up protocol incorporated PSA testing every 3 months, annual mpMRI and yearly clinic appointments. Re-biopsies were performed at protocol-driven time points (12 months and 36 months) or were triggered earlier by a clinical suspicion of progression based on three consecutive rises in PSA level or suspected MRI progression. Suspected MRI progression was defined as a PRECISE score ≥ 4 or, for scans which predated the PRECISE scoring system, by MRI-based criteria (increase in the number of lesions, increase in lesion size or stage progression), as previously reported [10]. In cases where an MRI lesion was visible, a targeted MRI-US image-fusion TP biopsy was performed with 2-4 cores per target in addition to acquiring 24 background systematic cores (2 per each of 12 anatomic sectors) [11]. Progression on AS was defined as pathological progression at re-biopsy or stage progression on mpMRI (from T2 to T3). Pathological progression was defined as a Grade Group increase between diagnostic and repeat biopsy and no longer meeting pathological AS enrolment criteria. Patients with evidence of progression who chose not to undergo treatment, and thus changed to watchful waiting management, were considered to be progressing from the date of repeat biopsy. Patients leaving the programme without pathological evidence of progression were excluded from analysis, for instance owing to patient choice, or to clinician choice based on PSA progression alone or on an MRI increase in lesion size with no confirmatory biopsy. To ensure adequate follow-up and outcome evaluation, patients were followed up for a minimum of 12 months after their last MRI.
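The early re-biopsy triggers described above can be sketched in a few lines of Python. This is an illustrative sketch only and not part of the study protocol software: the function names are hypothetical, and we assume that "three consecutive rises" means three successive increases anywhere in the ordered PSA series (the text does not specify whether only the most recent values count, or whether a minimum rise is required).

```python
from typing import Optional, Sequence


def three_consecutive_rises(psa: Sequence[float]) -> bool:
    """Return True if the ordered PSA series contains three successive increases.

    Assumption: any strict increase counts as a "rise"; no minimum
    magnitude is applied because the text does not give one.
    """
    rises = 0
    for prev, curr in zip(psa, psa[1:]):
        rises = rises + 1 if curr > prev else 0
        if rises >= 3:
            return True
    return False


def early_rebiopsy_triggered(psa: Sequence[float],
                             precise_score: Optional[int] = None) -> bool:
    """Early trigger: suspected MRI progression (PRECISE >= 4) or PSA rises."""
    mri_flag = precise_score is not None and precise_score >= 4
    return mri_flag or three_consecutive_rises(psa)


# Hypothetical PSA values (ng/ml), not patient data.
print(early_rebiopsy_triggered([4.0, 4.5, 5.1, 5.8]))        # PSA trigger
print(early_rebiopsy_triggered([5.0, 4.8], precise_score=4))  # MRI trigger
print(early_rebiopsy_triggered([5.0, 4.8], precise_score=3))  # no trigger
```

The protocol-driven biopsies at 12 and 36 months are scheduled independently of these triggers and are deliberately left out of the sketch.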

Multiparametric MRI
Patients underwent prostate MRI on a 3-T Discovery MR750 HDx or a 1.5-T MR450 scanner (GE Healthcare) using a 16-32-channel coil, respectively (Supplemental Tables 1 and 2). Axial fast spin-echo T1-weighted images of the pelvis, along with T2-weighted fast recovery fast spin-echo images of the prostate, were acquired in the axial, sagittal and coronal planes, with an axial slice thickness of 3-3.5 mm and a gap of 0-0.5 mm. Diffusion-weighted (DW) imaging was performed using a spin-echo echo-planar imaging pulse sequence (slice thickness 3-4 mm; gap 0 mm), with b values of 150, 750, 1000 and 1400 s/mm² (additional 2000 s/mm² at 3 T) and apparent diffusion coefficient (ADC) maps automatically calculated. Dynamic contrast-enhanced (DCE) imaging was performed at baseline, but not in follow-up studies.

Image analysis
All baseline MRIs were reported by two sub-specialist uroradiologists (B.C.K. and T.B.), with 10 years and 13 years of experience in reporting prostate MRI, respectively, and subsequently reviewed in a multidisciplinary team setting. mpMRI findings were evaluated using a Likert scale, which was initially based on the Prostate Imaging-Reporting and Data System (PI-RADS) v.1 structured scoring criteria developed by the European Society of Urogenital Radiology (ESUR) and, after 2015, on version 2, together with clinical information [12,13]. The final score was defined by combining all scores for T2WI, DWI and DCE sequences, as is now recommended in PI-RADS (version 2.1) [14]: 1 = cancer highly unlikely, 2 = cancer unlikely, 3 = equivocal for cancer, 4 = cancer likely and 5 = cancer highly likely. A lesion scored Likert ≥ 3, of any size, on baseline imaging was considered to be an MRI-positive lesion for the purposes of subsequent analysis. The prostate volume was calculated using the MRI-based prolate ellipsoid formula (three diameters measured directly on the MRI images, volume = length × width × height × π / 6). Prostate-specific antigen density (PSA-D) was then calculated using the MRI-derived gland volume and baseline PSA. Index lesion size was defined at baseline mpMRI as the maximum diameter (mm) on axial T2WI.
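As a worked illustration of the volume and PSA-D calculations, the following Python sketch applies the prolate ellipsoid formula given above. The function names and the three diameters are hypothetical; we assume diameters in centimetres, so that the volume is in millilitres, and PSA-D is taken as baseline PSA divided by gland volume. The PSA value of 5.6 ng/ml is the cohort median reported in the Results and is used only as an example input.

```python
import math


def ellipsoid_volume_ml(length_cm: float, width_cm: float, height_cm: float) -> float:
    """Prolate ellipsoid volume: length x width x height x pi / 6.

    Assumption: diameters are in cm, so the result is in cm^3 (= ml).
    """
    return length_cm * width_cm * height_cm * math.pi / 6


def psa_density(psa_ng_ml: float, volume_ml: float) -> float:
    """PSA density = baseline PSA (ng/ml) / MRI-derived gland volume (ml)."""
    if volume_ml <= 0:
        raise ValueError("gland volume must be positive")
    return psa_ng_ml / volume_ml


# Hypothetical diameters (cm); PSA of 5.6 ng/ml is the reported cohort median.
volume = ellipsoid_volume_ml(5.0, 4.5, 4.0)
density = psa_density(5.6, volume)
print(f"volume = {volume:.1f} ml, PSA-D = {density:.2f} ng/ml/ml")
# prints "volume = 47.1 ml, PSA-D = 0.12 ng/ml/ml"
```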
MRI studies during follow-up were scored on a 5-point scale according to the PRECISE system: (1) resolution of suspicious MRI features (e.g. previous area with restricted diffusion no longer shows it), (2) reduction in volume/conspicuity of MRI features (e.g. reduction in the size of previously seen lesion which remains suspicious for clinically significant cancer), (3) stable MRI appearance (either no suspicious features or all lesions stable in size and appearance), (4) significant increase in the size/conspicuity of features suspicious for PCa (e.g. significant increase in the size of the previously seen lesion or new area of restricted diffusion) and (5) definitive radiologic stage progression (features of extracapsular extension, seminal vesicle involvement or lymph node/bone involvement) [8] (Figs. 1, 2, 3 and 4). PRECISE scores were prospectively reported from June 2016 onwards (n = 428). For MRIs performed prior to this period (n = 255), PRECISE scores were retrospectively assigned by a single uroradiologist (T.B.) for the 153 cases in which a lesion was present (22.4% of all follow-up studies).
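For readers wishing to encode the scale in an analysis script, a minimal Python sketch is given below. The class and member names are our own shorthand for the five categories listed above and are not part of the PRECISE recommendations; the default cut-off of 4 follows the dichotomization used in this study.

```python
from enum import IntEnum


class Precise(IntEnum):
    """PRECISE categories; member names are illustrative shorthand only."""
    RESOLUTION = 1         # resolution of suspicious MRI features
    REDUCTION = 2          # reduction in volume/conspicuity
    STABLE = 3             # stable MRI appearance
    INCREASE = 4           # significant increase in size/conspicuity
    STAGE_PROGRESSION = 5  # definitive radiologic stage progression


def is_radiological_progression(score: int, cutoff: int = 4) -> bool:
    """Dichotomize a PRECISE score at the study cut-off (>= 4 is positive).

    Precise(score) raises ValueError for values outside 1-5.
    """
    return Precise(score) >= cutoff


for s in Precise:
    print(s.value, s.name, is_radiological_progression(s))
```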

Statistical analysis
Statistical analysis was performed using SPSS Statistics 17.0 (IBM Corporation). The Mann-Whitney U test was performed to compare continuous baseline parameters (age, PSA, gland volume, PSA density and index lesion size) between patients who showed evidence of disease progression and those who remained on AS. Pearson's chi-square test was used for an intergroup comparison of baseline Likert and Gleason scores, treated as ordinal variables. Univariate Cox regression analysis was used to calculate hazard ratios with 95% CIs to identify the prognostic utility of each of the aforementioned baseline parameters. PRECISE scores were dichotomized at a cut-off value of 4, with their diagnostic performance evaluated at the per-patient level by the calculation of sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and accuracy. Kaplan-Meier curves were used to describe progression-free survival outcomes for patients with dichotomized PRECISE scores. Time was measured from the date of enrolment on AS and censored at the date of the last follow-up. P values < 0.05 were considered statistically significant.
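To make the progression-free survival estimate concrete, the following is a minimal Kaplan-Meier sketch in plain Python. It is not the SPSS procedure used in the study; the function name and the six example patients are hypothetical. Times are months from AS enrolment, and an event flag of False denotes censoring at last follow-up. By the usual convention, a patient censored at the same time as an event is still counted as at risk for that event.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival) pairs at each time with at least one event."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # progressions observed at time t
        n_leaving = 0  # progressions plus censorings at time t
        while i < len(data) and data[i][0] == t:
            n_events += int(data[i][1])
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up (months) and progression flags, not study data.
months = [12, 24, 24, 36, 48, 60]
progressed = [True, False, True, False, True, False]
for t, s in kaplan_meier(months, progressed):
    print(f"PFS at {t} months: {s:.3f}")
```

Comparing the curves of two groups (for example PRECISE 1-3 versus 4-5) would additionally require a log-rank test, which is omitted here.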

Results
Three hundred nine men were identified from our database, and 14 patients were excluded from analysis due to commencing treatment without evidence of pathological progression: 10 due to clinician choice (PSA and/or MRI progression only) and 4 due to patient choice (Fig. 5). Two hundred ninety-five men were assessed, with a baseline median age of 66 years (IQR 61-69), PSA of 5.6 ng/ml (IQR 4-7.9) and median baseline PSA density of 0.10 (IQR 0.1-0.2). Two hundred forty-eight (84%) men had baseline Grade Group 1 (Gleason 3 + 3), and 47 (16%) had Grade Group 2 disease (Gleason 3 + 4) (Table 1). Nine hundred seventy-eight MRI studies were performed, including 683 follow-up studies with PRECISE scores. One hundred thirty-six (46%) cases had a negative MRI at baseline, and of the

Baseline parameters and AS outcome
PSA-D, index lesion size and Likert score were all significantly higher, and gland volume was significantly lower for patients who progressed compared with those remaining on AS (p < 0.05, Table 2); baseline Gleason score was not a significant predictor of outcome (p = 0.33). Univariate Cox regression analysis revealed that baseline PSA density, index lesion size and baseline Likert score had a significant effect on mean progression-free survival (PFS) time, with hazard ratios of 2.3, 1.1 and 1.9 (p < 0.01), respectively (Table 3).

MRI lesion presence
Four of 136 (2.9%) patients with no MRI-visible tumour at baseline progressed, whilst progression was observed in 37/159 (23.3%) patients who had visible disease (Likert ≥ 3) at baseline (p < 0.001). The Kaplan-Meier curve showed significantly higher PFS at 60 months for patients with non-visible versus visible lesions at baseline, at 97.1% and 76.1%, respectively (p < 0.01) (Fig. 6). In addition, among patients who only ever scored PRECISE 3, PFS at 5 years differed significantly between those with no MRI lesion and those with a visible baseline MRI lesion, at 100.0% versus 91.1%, respectively (p = 0.001) (Supplemental Figure). (Table 4). For overall PRECISE scoring, the AUC was 0.82 (95% CI 0.74-0.90), and at a cut-off PRECISE score of ≥ 4, the sensitivity, specificity and accuracy were 75.6%, 88.6% and 86.8%, respectively (Table 5).
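The per-patient performance figures follow directly from a 2 × 2 table of dichotomized PRECISE score against progression, as the Python sketch below illustrates. Note that the cell counts are not stated in this text: the values used here are back-calculated by us as an assumption, chosen because they are consistent with the 41/295 progression count and with the reported sensitivity, specificity, PPV, NPV and accuracy. The function name is hypothetical.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Standard 2 x 2 diagnostic accuracy measures, as proportions."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# ASSUMPTION: back-calculated counts, not taken from the paper's tables.
# tp + fn = 41 progressors; tp + fp + fn + tn = 295 patients.
metrics = diagnostic_metrics(tp=31, fp=29, fn=10, tn=225)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

With these assumed counts, the sketch returns 75.6%, 88.6%, 51.7%, 95.7% and 86.8%, matching the reported values to one decimal place.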

Discussion
Our work serves to validate the proposed MRI-based PRECISE scoring system for follow-up assessment of prostate cancer patients on active surveillance. We report a good overall diagnostic performance with a high NPV of PRECISE in predicting progression on AS in a prospective clinical setting. We also demonstrate that diagnostic Likert score, index lesion size and baseline PSA-D are significant baseline predictors of progression on AS on univariate analysis. In addition, patients with MRI-visible lesions have a significantly lower progression-free survival than those with MRI non-visible lesions. Although a marginally higher proportion of Grade Group 2 (17%) versus Grade Group 1 (13%) cancers progressed, this was not statistically significant.
Previous work has also shown that baseline MRI lesion score and PSA-D are significant predictors of AS progression [15,16], and evaluation of baseline risk factors is key in selecting patients for AS and in tailoring follow-up [17]. However, baseline factors will not predict the time point when changes may occur, which is potentially offered by MRI-based PRECISE scoring as part of a follow-up programme. PRECISE scoring had an AUC of 0.82 and, at a cut-off value of ≥ 4, an overall accuracy of 86.8% in predicting AS progression. In addition, the overall NPV for PRECISE scores 1-3 was high at 95.7%, with sub-analysis showing that among 41 patients who progressed, only 2 had PRECISE scores 1-2, whilst NPV reached 100% in cases with no MRI-visible lesion. The presence of an MRI lesion is known to predict upgrading in AS patients [18,19], and it is notable in our cohort that only 2.9% of patients with no lesion progressed compared to 23.3% with an MRI-visible lesion. Conversely, the PPV of PRECISE ≥ 4 was only moderate (51.7%). This is consistent with previous studies showing MRI to have a PPV of 34-69% and an NPV of 70-93% in predicting progression on AS [20][21][22][23][24]; however, a direct comparison is limited by the variable criteria for radiological progression employed by these studies. The high NPV of PRECISE scores 1-3 may reduce the need for follow-up biopsies, whilst the moderate PPV of PRECISE score 4 for predicting true pathological change should, depending on PSA-D and Likert score, trigger either close monitoring or re-biopsy rather than a direct treatment switch.
It is notable that PRECISE scores 1, 2 and 5 were rarely assigned (combined 7.3% of all studies), with score 3 being the most commonly assigned (83.7% of cases). Given the rarity of the extreme scores 1 and 5 (1.6%), the system essentially became a 3-point scoring system, i.e. radiological improvement versus stability versus radiological progression. Of note, the appearance of new lesions is not separately defined within the current PRECISE system and we scored these prospectively as PRECISE-4. Importantly, the PPV for progression of new lesions (23.5%) was significantly lower than that of PRECISE scores 4-5 assigned to already existing lesions (62.8%). Our findings also highlighted that PRECISE 3 has significantly different outcomes for patients with and without a baseline MRI-visible lesion; thus, refinements to the scoring could be considered in the next guideline update.
Another important finding of our study is the low progression rate of 13.9% over a median follow-up of 52 months. This compares favourably with previous studies reporting higher progression rates between 20 and 36% over shorter follow-up periods (1.8-3.9 years) [25][26][27][28][29] and likely reflects the stringent enrolment criteria employed, incorporating MRI and early re-biopsy for discordant historadiological findings.
The data from the recent ASIST trial support our results: its authors reported a lower rate of pathological progression and 50% fewer AS failures over a 2-year follow-up in the cohort which incorporated baseline MRI and targeted biopsy [30]. Overall, our strategy should limit cases of "pseudo-progression" due to baseline misclassification; therefore, discontinuation of AS likely reflected true pathological progression and enabled more accurate evaluation of the PRECISE system.
Our study benefits from prospective PRECISE scoring in a large AS cohort, with close follow-up and robust outcome data. There are, however, several limitations, including the single-centre and retrospective nature of the analysis. PRECISE scores were prospectively recorded from 2016; however, 22% of studies required a PRECISE score to be retrospectively assigned. This was essential because outcome evaluation requires longer follow-up for AS cohorts. In 8 cases, PRECISE score 5 triggered a direct switch to treatment. One of these 8 patients underwent radical prostatectomy, where T3a was confirmed at final pathology, whereas the other 7 were treated by radiotherapy, so pathological progression was not definitively confirmed; however, the specificity of MRI for T staging is known to be high [31]. In addition, prospective assignment of PRECISE scores did not allow a multi-reader approach to image interpretation or evaluation of inter-reader agreement; however, the main aim of our study was to test the scoring system against real-world outcomes. A further limitation was the use of different slice thickness and gap parameters at different magnet strengths; however, the protocols remained within the technical specifications of the PI-RADS guidelines, and this was done to ensure optimal imaging quality on both 3-T and 1.5-T scanning systems. Finally, we employed a Likert scoring system rather than PI-RADS; however, PI-RADS scoring can only be used for baseline evaluation and cannot be used for the follow-up assessment of patients on AS [32], and outcome data in biopsy-naïve patients have shown Likert-based scoring to perform well [33][34][35]. Future prospective studies assessing the predictive value of PRECISE with standardized AS end-points are required to address these limitations [16].
In conclusion, this study validates the MRI-based PRECISE scoring system in a prospective clinical cohort. Overall performance of PRECISE was considered good in predicting disease progression on AS. Our results show PRECISE scores 1-3 have high NPV which may reduce the need for re-biopsy, whilst PRECISE scores 4-5 have moderate PPV and could trigger either close monitoring or re-biopsy.