False positives in PIRADS (V2) 3, 4, and 5 lesions: relationship with reader experience and zonal location

The purpose of this study was to investigate the effect of reader experience and zonal location on the occurrence of false positives (FPs) in PIRADS (V2) 3, 4, and 5 lesions on multiparametric (MP)-MRI of the prostate. This retrospective study included 139 patients who had consecutively undergone an MP-MRI of the prostate in combination with a transrectal ultrasound MRI fusion-guided biopsy between 2014 and 2017. MRI exams were prospectively read by a group of inexperienced radiologists (cohort 1; 54 patients) and an experienced radiologist (cohort 2; 85 patients). Multivariable logistic regression analysis was performed to determine the association of experience of the radiologist and zonal location with a FP reading. FP rates were compared between readings by inexperienced and experienced radiologists according to zonal location, using Chi-square (χ²) tests. A total of 168 lesions in 139 patients were detected. Median patient age was 68 years (interquartile range (IQR) 62.5–73), and median PSA was 10.9 ng/mL (IQR 7.6–15.9) for the entire patient cohort. According to multivariable logistic regression, inexperience of the radiologist was significantly (P = 0.044, odds ratio 1.927, 95% confidence interval [CI] 1.017–3.651) and independently associated with a FP reading, while zonal location was not (P = 0.202, odds ratio 1.444, 95% CI 0.820–2.539). In the transition zone (TZ), the FP rate of the inexperienced radiologists (59%, 17/29) was significantly higher (χ² P = 0.033) than that of the experienced radiologist (33%, 13/40). Inexperience of the radiologist is significantly and independently associated with a FP reading, while zonal location is not. Inexperienced radiologists have a significantly higher FP rate in the TZ.


Introduction
Multiparametric (MP) MRI of the prostate is the best option for local diagnosis of prostate cancer (PCa) [1][2][3]. It can be useful in various clinical settings, such as for detection or staging purposes, guiding biopsies, detection of local recurrence, and as a tool to select candidates for active surveillance.
A negative transrectal-ultrasound (TRUS)-guided prostate biopsy with persistent clinical suspicion of PCa is still the most frequent reason to perform MP-MRI in clinical practice. Together with the introduction of commercially available TRUS-MRI fusion-guided prostate biopsy systems, the use of MP-MRI of the prostate has increased tremendously in recent years. However, along with the increasing popularity of this diagnostic test, serious concerns about quality have been raised [4,5]. These quality concerns include image acquisition, interpretation, and reporting. The Prostate Imaging Reporting and Data System (PIRADS) has been introduced to address some of these issues. Nonetheless, even with the use of the PIRADS version (V) 2 system, interreader variability is still a common present-day problem [6]. Reader experience is probably an important issue in this setting. Previous studies have already emphasized the importance of subspecialty reading in prostate MRI [7,8]. Strongly related to this subject are the pitfalls encountered on MP-MRI of the prostate. Awareness of the causes of false positives (FPs) can theoretically improve the diagnostic performance of the radiologist and decrease interreader variability. Besides the experience of the radiologist, zonal location can also be a source of FPs [9]. For example, detection of PCa in the transition zone (TZ) is often perceived as a difficult task. Pictorial reviews and case series have touched on the topic of FPs [10][11][12], but systematic studies specifically investigating FPs in a pure clinical setting are scarce and from the pre-PIRADS era [13]. We hypothesized that, when not properly trained, the novice reader will, in daily clinical practice, cause unnecessary biopsies, especially in the TZ.
Therefore, the purpose of this study was to investigate the effects of reader experience and zonal location on the occurrence of FPs in PIRADS (V2) 3, 4, and 5 lesions on MP-MRI of the prostate, with targeted TRUS-MRI fusion-guided prostate biopsy as reference standard.

Patients
This retrospective study was approved by the local ethics committee, and the need for informed consent was waived (registration number 201700780). All patients (n = 186) who had consecutively undergone MP-MRI of the prostate in combination with a TRUS-MRI fusion-guided prostate biopsy between 2014 and 2017 were potentially eligible for inclusion. MP-MRI scans of the prostate were obtained either from in-house records or from referring centers. Patients whose MRI scans were reported before the introduction of standard reporting according to PIRADS version 2 [14] (n = 38) were excluded. An additional 9 patients were excluded because of a missing PIRADS score (n = 7), an inconclusive histopathology report (n = 1), or a history of radiotherapy (n = 1). Finally, 139 unique patients were included in this study, of whom 54 patients were prospectively read by the inexperienced radiologists (cohort 1), and 85 patients were read by the experienced radiologist (cohort 2) (Fig. 1).

MRI acquisition and analysis
Ninety-six patients were referred from outside hospitals (total of 9 different institutions), while 43 patients were scanned in our institution. In total, 88 exams were performed on a 1.5T scanner (none with an endorectal coil) and 51 exams were performed on a 3T MRI scanner (10 with an endorectal coil). Of the 51 exams performed on a 3T scanner, five were initially scanned on a 1.5T scanner in outside hospitals. Due to technical limitations (e.g., motion artifacts, inadequate quality), these were re-obtained in our institution. The MP-MRI examinations of the prostate of 131 patients comprised an axial T1-weighted image, a high-resolution multiplanar T2-weighted image (slice thickness of 3 mm), a diffusion-weighted image (DWI) with at least three b-values (ranging from 0 to 2000 s/mm², with a highest b-value of at least 800 s/mm²) and a calculated apparent diffusion coefficient (ADC) map, and a dynamic contrast-enhanced (DCE) sequence. The remaining 8 patients had undergone the same sequences, except for the DCE sequence. All MRI examinations were read at our tertiary referral center. Radiologists were not blinded to clinical data or external reports, completely in line with clinical practice. MRI data were read prospectively by four different radiologists, with different levels of experience in MP-MRI prostate reporting. One trained genitourinary radiologist (with 5 years of experience and > 500 case readings with histopathologic correlation) was defined as the experienced reader, while the other three radiologists (with 1-2 years of experience and < 100 case readings without histopathologic correlation) were categorized as inexperienced readers. The MRI scans were reported by using PIRADS V2, with PIRADS 1-5 representing an incremental scoring system of very low likelihood to very high likelihood of clinically significant PCa [14].
All prostate MRI reports were analyzed for reader experience, PIRADS score, and zonal location (peripheral zone (PZ), TZ, or central zone (CZ)). Furthermore, the FP lesions were retrospectively investigated by the experienced radiologist for other sources of error such as known mimickers of PCa (e.g., prostatitis, anatomic pitfalls) or possible technical inaccuracies (e.g., needle delivery, fusion misregistration).

TRUS-MRI fusion-guided prostate biopsy
All patients with at least one lesion with a PIRADS score of 3 or higher were biopsied according to our institutional standard, with up to a maximum of three biopsied lesions per patient. The Dynacad Uronav fusion biopsy system (Invivo, Gainesville, Florida, USA) was used for all biopsies.
Targeted biopsies were performed in all patients, while two patients also received an additional systematic biopsy. Biopsy procedures were undertaken by four different urologists (with 4, 4, 1, and 1 year(s) of experience with fusion biopsy, respectively). The number of cores per lesion was at the discretion of the urologist performing the biopsy.

Histopathology
The targeted biopsies were analyzed by a specialist uropathologist according to the International Society of Urological Pathology (ISUP) 2014 recommendations [16]. Biopsy results containing cancer (Gleason ≥ 3 + 3) were categorized as true positives, while the ones containing no cancer were categorized as FPs.

Statistical analysis
The Shapiro-Wilk test was used to test whether the continuous variables age, PSA, PSA density, prostate volume, number of cancer-suspicious lesions per patient, and number of cores taken per patient were normally distributed. These variables were then compared between the patient cohort read by the inexperienced radiologists (cohort 1) and the patient cohort read by the experienced radiologist (cohort 2), using the unpaired t test for normally distributed data, and the Mann-Whitney U test for not normally distributed data. The FP rate was calculated as the number of biopsied lesions containing no PCa divided by the total number of biopsied lesions. FP rates were compared between inexperienced and experienced radiologists according to zonal location, using Chi-square (χ²) tests. Multivariable logistic regression analysis was performed to determine the association of the experience of the radiologist (inexperienced vs. experienced) and zonal location (TZ vs. PZ) with a FP reading. P-values less than 0.05 (two-sided) were regarded as statistically significant. All statistical analyses were performed using MedCalc Statistical Software version 18.5 (MedCalc Software bvba, Ostend, Belgium).
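The FP rate and χ² comparison described above can be illustrated with a minimal sketch. It uses the TZ counts reported in the abstract (17/29 for the inexperienced readers and 13/40 for the experienced reader). The function names are hypothetical, and the choice of a Pearson χ² statistic without continuity correction is an assumption; the statistical package used in the study may apply a different variant, so the resulting P value should be close to, but need not be identical to, the reported one.

```python
import math


def fp_rate(n_false_positive, n_biopsied):
    """FP rate: biopsied lesions without PCa divided by all biopsied lesions."""
    return n_false_positive / n_biopsied


def chi_square_2x2(fp_a, n_a, fp_b, n_b):
    """Pearson chi-square test for a 2x2 table (assumption: no continuity correction).

    Group A has fp_a false positives among n_a lesions, group B has fp_b among n_b.
    Returns the chi-square statistic and the two-sided P value (1 degree of freedom).
    """
    tp_a, tp_b = n_a - fp_a, n_b - fp_b
    n = n_a + n_b
    numerator = n * (fp_a * tp_b - tp_a * fp_b) ** 2
    denominator = n_a * n_b * (fp_a + fp_b) * (tp_a + tp_b)
    chi2 = numerator / denominator
    # For 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# TZ counts taken from the abstract: inexperienced 17/29, experienced 13/40.
chi2_tz, p_tz = chi_square_2x2(17, 29, 13, 40)
print(f"FP rate inexperienced: {fp_rate(17, 29):.2f}")
print(f"FP rate experienced:   {fp_rate(13, 40):.2f}")
print(f"chi-square = {chi2_tz:.3f}, P = {p_tz:.3f}")
```

The closed-form P value only holds for a 2x2 table (one degree of freedom); larger tables would need a general χ² distribution function.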

Results
The 139 included patients had a total of 168 PIRADS 3-5 lesions. Median patient age was 68 years (interquartile range (IQR) 62.5-73) and median PSA was 10.9 ng/mL (IQR 7.6-15.9) for the entire patient cohort. Age, PSA, PSA density, prostate volume, number of cancer-suspicious lesions per patient, and number of cores taken per patient were not significantly different (P > 0.101) between the patient cohort read by the inexperienced radiologists and the patient cohort read by the experienced radiologist (Table 1).
According to multivariable logistic regression analysis, inexperience of the radiologist was significantly (P = 0.044, odds ratio 1.927, 95% confidence interval [CI] 1.017-3.651) and independently associated with a FP reading, while zonal location was not (P = 0.202, odds ratio 1.444, 95% CI 0.820-2.539) (Table 2). Overall, the FP rate was 50% for the inexperienced radiologists and 32% for the experienced radiologist (P = 0.020). The FP rate in the TZ for the inexperienced radiologists (59%, 17/29) was significantly higher (P = 0.033) than that for the experienced radiologist (33%, 13/40). On the other hand, the FP rate in the PZ for the inexperienced radiologists (43%, 16/37) was not significantly different (P = 0.164) from that for the experienced radiologist (29%). Corresponding results are also displayed in Table 3.
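As a consistency check on the regression results, the reported odds ratios and 95% CIs can be converted back to a Wald z statistic and a two-sided P value. The sketch below assumes that the intervals are Wald intervals on the log-odds scale (an assumption; the fitting method is not described beyond the software used), and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def wald_from_odds_ratio(odds_ratio, ci_low, ci_high, level=0.95):
    """Recover the coefficient, standard error, z and two-sided P value.

    Assumes a Wald interval: log(OR) +/- z_crit * SE on the log-odds scale.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p_value


# Values as reported in the text (rounded to three decimals there).
_, _, z_exp, p_exp = wald_from_odds_ratio(1.927, 1.017, 3.651)
_, _, z_zone, p_zone = wald_from_odds_ratio(1.444, 0.820, 2.539)
print(f"reader inexperience: z = {z_exp:.2f}, P = {p_exp:.3f}")
print(f"zonal location:      z = {z_zone:.2f}, P = {p_zone:.3f}")
```

Because the inputs are rounded, the recovered P values can differ from the reported ones in the third decimal; under the stated assumption they should otherwise agree with P = 0.044 and P = 0.202.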
After a further breakdown of the 17 FP lesions in the TZ for the inexperienced radiologists, all were, retrospectively, perceived to be misclassifications of benign prostatic hyperplasia (BPH) nodules (Fig. 2). Of the 16 FP lesions in the PZ, 10 could have retrospectively been avoided (e.g., anatomic variants of the CZ (Fig. 3), low-grade prostatitis with very little-to-no diffusion restriction, BPH nodule in PZ). In six cases, a FP reading was perceived as not avoidable (3 cases of prostatitis with substantial diffusion restriction mimicking PCa, and 3 cases with, also in retrospect, MR imaging abnormalities but with normal histopathology results). These latter three cases (9%, 3/33) were interpreted as likely due to technical inaccuracies (e.g., needle delivery, inaccurate fusion of MR and TRUS images). For the experienced radiologist, of the 13 FP lesions in the TZ, 6 were perceived as misclassifications (of BPH nodules and prominent anterior fibromuscular stroma). Two were prospectively called granulomatous prostatitis (however, biopsy was still advised), and five were perceived as MR imaging abnormalities but with normal histopathology results. Of the 17 FP lesions in the PZ, five were high-grade PIN and/or prostatitis, and two were perceived as misclassifications (very little diffusion restriction). Ten were perceived, also in retrospect, as MR imaging abnormalities, but with normal histopathology results. Altogether for both the TZ and PZ, 15 (50%, 15/30) FP lesions were likely due to technical inaccuracies.

Discussion
To our knowledge, this study is the first to investigate the FP findings of PIRADS (V2) 3-5 lesions in a pure clinical setting with emphasis on reader experience and zonal location, in a patient population undergoing TRUS-MRI fusion-guided targeted prostate biopsy. Our results show that inexperience is independently and significantly associated with a FP reading while zonal location is not. Furthermore, the evaluation of the TZ is mainly what makes the difference between the inexperienced and experienced radiologist. In general, the TZ is an anatomic area that is considered difficult when interpreting MP-MRI of the prostate. The challenge of detecting PCa in the TZ probably stems from its heterogeneous appearance, which is mainly due to the BPH it contains. Our study shows that the evaluation of the TZ is even more difficult for the inexperienced radiologist. We also found that, to a lesser degree, other potential sources of a FP reading for the inexperienced reader are anatomic variants of the CZ and low-grade prostatitis. One of the main advantages of MP-MRI is its potential to reduce unnecessary biopsies [17,18]. Nonetheless, in case of an inexperienced radiologist and a suspected TZ lesion, MP-MRI could lead to more unnecessary biopsies. The PIRADS 3 classification had a high number of FPs, for both the inexperienced radiologists and the experienced radiologist. Even though this is likely to be expected as PIRADS is a Likert scale, efforts for reducing the number of unnecessary biopsies in this specific group should be made. A possible solution could be the addition of PSA density [19]. Consequently, biopsy could be withheld for all lesions with a PIRADS score of 3 and a PSA density below a cut-off value of, for example, 0.15. However, before implementation this needs to be investigated further, with special attention to the number of significant cancers missed with this approach.
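The PSA density rule suggested above could be sketched as follows. This is an illustration only, not a validated decision rule: the function names are hypothetical, PSA density is taken as serum PSA divided by prostate volume, and the units (ng/mL for PSA, mL for volume) are an assumption, since the text gives the example cut-off of 0.15 without units.

```python
def psa_density(psa_ng_ml, prostate_volume_ml):
    """PSA density: serum PSA divided by prostate volume (assumed units: ng/mL and mL)."""
    if prostate_volume_ml <= 0:
        raise ValueError("prostate volume must be positive")
    return psa_ng_ml / prostate_volume_ml


def could_defer_biopsy(pirads_score, psa_ng_ml, prostate_volume_ml, cutoff=0.15):
    """Hypothetical rule: a PIRADS 3 lesion with PSA density below the cut-off.

    The default cut-off of 0.15 is only the example value mentioned in the text.
    """
    return pirads_score == 3 and psa_density(psa_ng_ml, prostate_volume_ml) < cutoff


# Illustrative inputs (not patient data from this study).
print(could_defer_biopsy(3, 6.0, 60.0))   # PSA density 0.10
print(could_defer_biopsy(3, 10.9, 50.0))  # PSA density about 0.22
print(could_defer_biopsy(4, 6.0, 60.0))   # rule applies to PIRADS 3 only
```

Restricting the rule to PIRADS 3 keeps PIRADS 4 and 5 lesions on the biopsy pathway regardless of PSA density, which matches the scope of the suggestion in the text.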
Also noteworthy, FP rates depend not only on factors attributable to the reader, but also on inaccuracies related to the biopsy technique itself, most likely in case of smaller lesions [20][21][22]. Nonetheless, when analyzing the FPs in the patient cohort of this study that were read by the inexperienced radiologists, the majority appeared to be classical examples of misclassification for the TZ as well as for the PZ. In our study, we found that technical issues could have been the reason for a FP lesion in only 9% of the FP cases for the inexperienced radiologists and in 50% of the FP cases for the experienced radiologist. A study by Sheriden et al. [23] reported that 28% of the FP PIRADS 5 lesions could have been missed at biopsy. The difference with our study is that they only investigated PIRADS 5 lesions, which are usually large and therefore technique-wise less likely to be missed at biopsy. The findings of this study have two potential clinical implications. First, the relative underperformance of inexperienced readers underlines that reading MP-MRI of the prostate should be reserved for experienced radiologists. The importance of subspecialty training is a well-acknowledged issue in breast and cardiac imaging. This should not be different for MRI of the prostate. After internal evaluation of the results of this study, we decided to reduce the number of radiologists reporting MP-MRI of the prostate and educate internally. Since then, only two of six specialized abdominal radiologists have been reporting MP-MRI of the prostate at our institution. The second potential implication of our findings is that training sessions for inexperienced radiologists should, along with a sufficient caseload with histopathological correlation, pay special attention to the TZ [24,25], CZ, and low-grade prostatitis.
The overall detection (or true positive) rate of the experienced reader in this study (68% overall detection rate in PIRADS 3-5 lesions) was slightly higher than that of other expert centers performing MRI-targeted prostate biopsy (59%-62%) [26,27]; however, both studies were performed using PIRADS V1, which could also have accounted for the slight differences. In 2013, Bratan et al. [13] were the first to investigate FPs. However, this publication was from the pre-PIRADS era, and they did not assess different levels of experience. In another study published in 2017, the issue of experience was addressed [5]. The authors reported that reader experience may help to reduce overcalling and avoid overtargeting of lesions, which is in concordance with our results. Nevertheless, none of these studies specifically looked at the combination of reader experience and zonal location in FP lesions.
Our study had several limitations. First, because of its retrospective design, MRI protocols were heterogeneous (i.e., different magnetic field strengths and slightly varying MRI sequence settings), and possibly of suboptimal quality (e.g., the majority of scans were obtained with a 1.5T scanner and no endorectal coil, and imaging parameters could not be controlled due to referrals from outside hospitals). However, this is completely in line with routine clinical practice, and this issue is also frequently encountered in multicenter studies [5,18], which in fact increases the generalizability of our results. Second, we were not able to analyze false negatives and their relation to reader experience, which could have been done if the reference standard had been prostatectomy specimens or if long-term follow-up data had been available. Third, this study did not investigate other factors potentially associated with a FP reading, such as clinical variables and apex-base location [23]. Fourth, this study consisted of patients who had received prior biopsy (either negative or with cancer but on active surveillance), possibly reducing the future generalizability of the results. It is expected that as we move forward, more biopsy-naïve patients will undergo MP-MRI and targeted biopsy. Fifth, we did not include Gleason 3 + 3 lesions in our FP definition. In the era of the increasing number of active surveillance candidates, we are well aware of the importance of being able to discriminate between clinically significant (currently regarded as Gleason ≥ 3 + 4) and insignificant PCa (currently regarded as Gleason ≤ 3 + 3). However, including 3 + 3 in the FP definition on MP-MRI would, unfairly, imply that there are proven MRI features that are prospectively able to discriminate Gleason 3 + 3 from 3 + 4.
There are retrospective and validation studies that have investigated the relationship between ADC values and Gleason grades, and they have shown promising results [28]. Yet substantial overlaps between the different grades and ADC values exist [28], making ADC values not very useful in a clinical setting. Due to the lack of robust discriminatory imaging features, publications addressing pitfalls on prostate MRI usually only contain examples of benign disease or FPs related to more technical issues [10,11]. Likewise, the clinically important decision between Gleason 3 + 3 and 3 + 4 can also be challenging on histopathology slides in biopsy specimens. There is substantial interobserver variability together with shortcomings of the Gleason grading system itself [29]. This is probably also the reason why some supposedly Gleason 3 + 3 tumors, even though not very common, can metastasize or show aggressive behavior (such as extraprostatic extension or even seminal vesicle invasion) [30,31]. Furthermore, targeted biopsies obtained from the cancer-suspicious lesions detected on MP-MRI might not accurately represent the true aggressiveness found in a prostatectomy specimen [32][33][34], with some studies even reporting upgrading from 3 + 3 to 3 + 4 in 67.4% of the cases [32]. So even though there is an absolute clinical need for discriminating significant from insignificant PCa, it is radiologically not meaningful when assessing FPs. Future studies should focus on this unsolved issue.
In conclusion, inexperience of the radiologist is significantly and independently associated with a FP reading, while zonal location is not. Inexperienced radiologists have a significantly higher FP rate in the TZ.