To benchmark the performance of a calibrated 3D convolutional neural network (CNN) applied to multiparametric MRI (mpMRI) for risk assessment of clinically significant prostate cancer (csPCa) using decision curve analysis (DCA).
We retrospectively analyzed 499 patients who had positive mpMRI (PI-RADSv2 ≥ 3) and MRI-targeted biopsy. The training cohort comprised 449 men, including a calibration set of 50 men. Three biopsy decision strategies were compared: biopsy guided by CNN risk estimates (original and calibrated), biopsy in men with PI-RADSv2 ≥ 4 only, and biopsy in men with PI-RADSv2 ≥ 4 plus men with PI-RADSv2 3 and PSA density (PSAd) ≥ 0.15 ng/ml/ml. Discrimination, calibration and clinical usefulness in the unseen test cohort (n = 50) were assessed using the C-statistic, calibration plots and DCA, respectively.
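The net benefit that DCA plots for each strategy can be made concrete with a short sketch. It follows the standard definition: true positives per patient minus false positives per patient, weighted by the odds of the risk threshold. The function names and the toy data are illustrative assumptions, not the study's code or data.

```python
def net_benefit(outcomes, risks, threshold):
    """Net benefit of biopsying every man whose estimated risk >= threshold.

    outcomes: 1 if csPCa was found at biopsy, else 0.
    risks:    estimated probabilities; a rule such as PI-RADSv2 >= 4 can be
              encoded as 1.0 (biopsy) or 0.0 (no biopsy).
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcomes)
    true_pos = sum(1 for y, r in zip(outcomes, risks) if r >= threshold and y == 1)
    false_pos = sum(1 for y, r in zip(outcomes, risks) if r >= threshold and y == 0)
    odds = threshold / (1.0 - threshold)  # harm-to-benefit weight
    return true_pos / n - (false_pos / n) * odds


def net_benefit_biopsy_all(outcomes, threshold):
    """Reference strategy: biopsy every patient regardless of estimated risk."""
    return net_benefit(outcomes, [1.0] * len(outcomes), threshold)


# Toy data (hypothetical, not from the study).
outcomes = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
risks = [0.90, 0.80, 0.30, 0.40, 0.20, 0.10, 0.05, 0.05, 0.05, 0.05]

for pt in (0.10, 0.25):
    print(pt,
          round(net_benefit(outcomes, risks, pt), 3),
          round(net_benefit_biopsy_all(outcomes, pt), 3))
```

A strategy is useful at a given threshold only if its net benefit exceeds both the "biopsy all" reference and "biopsy none", whose net benefit is 0 by definition.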
The calibrated CNN achieved moderate calibration (Hosmer-Lemeshow calibration test, p = 0.41) and good discrimination (C = 0.85). DCA revealed consistently higher net benefit and net reduction in biopsies for the calibrated CNN compared with the original CNN, PI-RADSv2 ≥ 4 and the combined strategy of PI-RADSv2 and PSAd. Original CNN predictions were severely miscalibrated (p < 0.0001), resulting in net harm compared with a ‘biopsy all patients’ strategy. At risk thresholds ≥ 10%, the calibrated CNN and the combined strategy reduced the number of biopsies by an estimated 201 and 55, respectively, per 1000 men at risk, without missing csPCa, whereas the original CNN and PI-RADSv2 ≥ 4 could not achieve a net reduction in biopsies.
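The "per 1000 men" figures are net reductions in biopsies relative to biopsying everyone. A minimal sketch of the standard DCA conversion follows; the function name is an assumption and the counts are hypothetical, chosen only to show one favourable and one harmful case.

```python
def net_biopsies_avoided_per_1000(true_neg, false_neg, n, threshold):
    """Net reduction in biopsies per 1000 men versus 'biopsy all'.

    true_neg:  men without csPCa who are spared a biopsy.
    false_neg: men with csPCa whose biopsy is withheld (missed cancers).
    Each missed cancer is penalised by (1 - threshold) / threshold avoided
    biopsies, so the result counts biopsies avoided at no net cost in
    missed csPCa.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    penalty = (1.0 - threshold) / threshold
    return 1000.0 * (true_neg / n - (false_neg / n) * penalty)


# Hypothetical counts, not taken from the study.
favourable = net_biopsies_avoided_per_1000(true_neg=30, false_neg=1, n=100, threshold=0.10)
harmful = net_biopsies_avoided_per_1000(true_neg=5, false_neg=1, n=100, threshold=0.10)
print(round(favourable), round(harmful))
```

A negative value means the strategy does worse than biopsying all patients at that threshold, which is the sense in which a strategy fails to achieve a net reduction.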
DCA revealed that our calibrated 3D-CNN resulted in fewer unnecessary biopsies compared with using PI-RADSv2 alone or in combination with PSAd. CNN calibration is important in achieving clinical utility.
• A 3D deep learning model applied to multiparametric MRI may help to prevent unnecessary prostate biopsies in patients eligible for MRI-targeted biopsy.
• Owing to miscalibration, original risk estimates by the deep learning model require prior calibration to enable clinical utility.
• Decision curve analysis confirmed a net benefit of using our calibrated deep learning model for biopsy decisions compared with alternative strategies, including PI-RADSv2 alone and in combination with prostate-specific antigen density.
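Recalibration on a held-out set can be sketched as follows. The example uses logistic recalibration (Platt scaling) as an assumed stand-in: this abstract does not state which calibration method the study used, and all names and data below are hypothetical.

```python
import math


def _logit(p, eps=1e-6):
    p = min(max(p, eps), 1.0 - eps)  # clip so the logit stays finite
    return math.log(p / (1.0 - p))


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(outcomes, risks, eps=1e-12):
    """Mean negative log-likelihood of binary outcomes under the given risks."""
    total = 0.0
    for y, p in zip(outcomes, risks):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(outcomes)


def fit_recalibration(outcomes, risks, lr=0.05, steps=5000):
    """Fit slope a and intercept b of sigmoid(a * logit(risk) + b)
    by plain gradient descent on the log loss."""
    x = [_logit(p) for p in risks]
    a, b = 1.0, 0.0  # identity mapping: start from the original risks
    n = len(x)
    for _ in range(steps):
        residuals = [_sigmoid(a * xi + b) - y for xi, y in zip(x, outcomes)]
        a -= lr * sum(r * xi for r, xi in zip(residuals, x)) / n
        b -= lr * sum(residuals) / n
    return a, b


def recalibrate(risks, a, b):
    """Apply a fitted slope and intercept to new risk estimates."""
    return [_sigmoid(a * _logit(p) + b) for p in risks]


# Hypothetical calibration set with overconfident raw risk estimates.
cal_outcomes = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
cal_risks = [0.99, 0.97, 0.95, 0.90, 0.90, 0.85, 0.80, 0.70, 0.60, 0.40]

a, b = fit_recalibration(cal_outcomes, cal_risks)
calibrated = recalibrate(cal_risks, a, b)
print(round(log_loss(cal_outcomes, cal_risks), 3),
      round(log_loss(cal_outcomes, calibrated), 3))
```

In practice the slope and intercept would be fitted on the calibration set only and then applied unchanged to the unseen test cohort, so that the calibration assessment stays independent of the fitting data.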
ADC: Apparent diffusion coefficient
AUC: Area under the receiver-operating characteristic curve
CNN: Convolutional neural network
DCA: Decision curve analysis
MRI: Magnetic resonance imaging
PI-RADSv2: Prostate Imaging Reporting and Data System version 2.0
PSAd: Prostate-specific antigen density
Guarantors of the integrity of the entire study, D.D., F.K. and M.A.H.; study concepts/study design or data acquisition or data analysis/interpretation, all authors; manuscript drafting or manuscript revision for important intellectual content, all authors; approval of the final version of the submitted manuscript, all authors; literature research, D.D., N.A., F.K. and M.A.H.; clinical studies, D.D., L.M. and M.A.H.; statistical analysis, D.D. and X.D.; and manuscript editing, D.D., F.K. and M.A.H.
This study received funding from the Ontario Institute for Cancer Research and a Deutsche Forschungsgemeinschaft (DFG; German Research Foundation) fellowship [DE 3207/1-1].
The scientific guarantor of this publication is Masoom A. Haider.
Conflict of interest
The authors of this manuscript declare no relationships with any companies whose products or services may be related to the subject matter of the article.
Statistics and biometry
One of the authors, Xin Dong, has a Master of Science degree in Mathematics with significant statistical expertise.
Written informed consent was waived by the Institutional Review Board.
Institutional Review Board approval was obtained.
Study subjects or cohorts overlap
Some study subjects or cohorts have been previously reported in Yoo S, Gujrathi I, Haider MA, Khalvati F (2019) Prostate cancer detection using deep convolutional neural networks. Sci Rep 9:19518. https://doi.org/10.1038/s41598-019-55972-4.
• diagnostic or prognostic study
• performed at one institution
Cite this article
Deniffel, D., Abraham, N., Namdar, K. et al. Using decision curve analysis to benchmark performance of a magnetic resonance imaging–based deep learning model for prostate cancer risk assessment. Eur Radiol 30, 6867–6876 (2020). https://doi.org/10.1007/s00330-020-07030-1
- Artificial intelligence
- Deep Learning
- Magnetic resonance imaging
- Prostatic neoplasms
- Decision analysis