Abstract
Objectives
Prostate volume (PV) in combination with prostate-specific antigen (PSA) yields PSA density, an increasingly important biomarker. Calculating PV from MRI is a time-consuming, radiologist-dependent task. The aim of this study was to assess whether a deep learning algorithm can replace the PI-RADS 2.1–based ellipsoid formula (EF) for calculating PV.
Methods
Eight different measures of PV were retrospectively collected for each of 124 patients who underwent radical prostatectomy and preoperative MRI of the prostate (multicenter, multi-scanner MRIs at 1.5 and 3 T). Agreement between volumes obtained from the deep learning algorithm (PVDL) and from the ellipsoid formula applied by two radiologists (PVEF1 and PVEF2) was evaluated against the reference standard PV obtained by manual planimetry by an expert radiologist (PVMPE). A sensitivity analysis was performed using the prostatectomy specimen as the reference standard. Inter-reader agreement was evaluated between the radiologists using the ellipsoid formula and between the expert and inexperienced radiologists performing manual planimetry.
Results
PVDL showed better agreement and precision than PVEF1 and PVEF2 using the reference standard PVMPE (mean difference [95% limits of agreement] PVDL: −0.33 [−10.80; 10.14], PVEF1: −3.83 [−19.55; 11.89], PVEF2: −3.05 [−18.55; 12.45]) or the PV determined based on specimen weight (PVDL: −4.22 [−22.52; 14.07], PVEF1: −7.89 [−30.50; 14.73], PVEF2: −6.97 [−30.13; 16.18]). Inter-reader agreement was excellent between the two experienced radiologists using the ellipsoid formula and was good between expert and inexperienced radiologists performing manual planimetry.
Conclusion
A deep learning algorithm performs similarly to radiologists in the assessment of prostate volume on MRI.
Key Points
• A commercially available deep learning algorithm performs similarly to radiologists in the assessment of prostate volume on MRI.
• The deep learning algorithm was previously untrained on this heterogeneous, multicenter, day-to-day practice MRI data set.
Introduction
Prostate volume (PV) is an important parameter in the workup of benign and malignant prostate diseases [1,2,3]. Combining the prostate-specific antigen (PSA) value and PV yields the PSA density (PSAD) [4,5,6]. A higher PSAD, often using a threshold of 0.15 ng/mL², indicates a higher risk of prostate cancer [7, 8]. PSAD is an increasingly important factor in deciding which patients will undergo biopsies [4]. The recent paradigm shift towards "MRI first" [9, 10], meaning that the patient undergoes magnetic resonance imaging (MRI) before biopsies, has made MRI a cornerstone for determining PV. PI-RADS [11] recommends the ellipsoid formula method for determining PV: the radiologist measures the prostate height, depth, and width. This ellipsoid formula (EF) is considered the gold standard and has been shown to be relatively accurate [12], but it has limitations. The EF is time-consuming, reader-dependent, and prone to multiplication errors because the prostate is not a symmetrical ellipsoid body, which poses anatomical challenges in delineating the apex and ventral portion [4, 13, 14]. The most accurate method for assessing PV on MRI is manual planimetry [12, 15, 16], in which the radiologist uses external software to manually outline the prostate boundaries on T2-weighted MRI in three planes. However, manual planimetry is too time-consuming to be a realistic alternative in clinical routine [17].
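As a concrete illustration of the PSAD calculation described above, a minimal Python sketch (the patient values and function name are hypothetical, chosen only for illustration; the 0.15 ng/mL² threshold is the one cited in the text):

```python
def psa_density(psa_ng_ml: float, prostate_volume_ml: float) -> float:
    """PSAD = serum PSA divided by prostate volume (units: ng/mL per mL, i.e., ng/mL^2)."""
    return psa_ng_ml / prostate_volume_ml

# Hypothetical patient: PSA 6.9 ng/mL, PV 55 mL
psad = psa_density(6.9, 55.0)
high_risk = psad > 0.15  # threshold often used in risk stratification [7, 8]
```

Note that PSAD therefore depends directly on how PV is measured: a systematic under- or overestimation of PV shifts PSAD and can move a patient across the decision threshold.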
In the last few years, there has been growing interest in the development of artificial intelligence (AI)–based algorithms in radiology. The most commonly used AI method in imaging is deep learning convolutional neural network–based algorithms [18]. Several studies have shown good performance of AI for automated PV assessment [18,19,20], but questions remain about external validation, generalizability, and how well an algorithm performs in a different clinical context with heterogeneous data. The number of algorithms cleared by the U.S. Food and Drug Administration (FDA) continues to grow [21], with increasing concern regarding algorithms' true performance in clinical settings beyond the institutions in which they were trained and validated. To address these issues, we designed this multicenter, multi-scanner study. We used a proprietary, commercially available deep learning–based system [22] not previously exposed to the data set. To make the comparison between PV methods as comprehensive as possible, we also compared them to PVs from transrectal ultrasound and prostatectomy specimens [23,24,25].
The primary aim of this study was to assess whether a previously unexposed deep learning algorithm can replace the PI-RADS 2.1–based ellipsoid formula for calculating PV from a heterogeneous prostate MRI data set.
The secondary aims were to evaluate the inter-reader agreement between two radiologists using PI-RADS 2.1–based ellipsoid formula and between experienced and inexperienced radiologists performing manual planimetry.
Materials and methods
Study design and population
This retrospective multicenter study was approved by the local ethics review committee at Lund University (entry no. 2014-886) and the Swedish Ethical Review Authority (entry no. 2019-03674). All consecutive patients who underwent robot-assisted radical prostatectomy at Malmö University Hospital in 2018 were identified and assessed for eligibility. Patients were included if they had undergone MRI of the prostate less than 1 year before the surgery. Two patients were excluded due to MRI at a hospital outside our health care region and patient withdrawal, resulting in the inclusion of 124 patients in the study. The data collection algorithm is presented in Fig. 1. Eight different PVs were calculated per patient (Table 1).
MRI technique
The MRI scans were performed at seven different hospitals using eight different scanners, comprising seven different scanner models from two vendors, two different field strengths (1.5 and 3T), and two different T2 transaxial (axial) slice angulations. Different imaging acquisition parameters were used at different sites according to local routines. All protocols included transversal, coronal, and sagittal T2-weighted turbo spin-echo images, which were used for ellipsoid formula calculations, and the T2 axial, which was used for manual planimetry and deep learning contouring. The parameters for the T2 axial are listed in Supplemental Table 1 in electronic supplementary material.
Prostate volume calculations from imaging
All MRI exams were retrospectively read by three consultant radiologists (E.T., J.B., and J.E.) with at least 5 years of experience with prostate MRI. One radiologist (J.E.; highly experienced in manual planimetry from 5 years of fusion biopsy planning at a tertiary referral center) performed manual planimetry by manually tracing the prostate boundaries on the T2 images in three planes using external software (MIM Software, Inc.), blinded to all other volume calculations. This volume, PVMPE, was used as the reference standard, and the time consumption for the whole manual planimetry workflow was measured for a subset of the exams (14 patients).
Another radiologist (E.T.), inexperienced in manual planimetry, performed manual planimetry (PVMPU) using the same software, but on a different server to secure blinding. Two radiologists (E.T. and J.B.) calculated the PVs using the ellipsoid formula method ([width × depth × height] × [π/6]) according to PI-RADS [11], as shown in Fig. 2b, c. This was done independently and blinded to all other PVs; these volumes are abbreviated PVEF1 and PVEF2.
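The ellipsoid formula above can be sketched directly in code (function name and example diameters are illustrative only; with all three diameters in cm, the result is in mL since 1 cm³ = 1 mL):

```python
import math

def ellipsoid_volume_ml(width_cm: float, depth_cm: float, height_cm: float) -> float:
    """PI-RADS ellipsoid formula: (width x depth x height) x pi/6."""
    return width_cm * depth_cm * height_cm * math.pi / 6

# Illustrative measurements: 5.0 x 4.0 x 4.5 cm -> about 47.1 mL
pv_ef = ellipsoid_volume_ml(5.0, 4.0, 4.5)
```

Because the three diameters are multiplied, a small measurement error in any one of them propagates multiplicatively into the volume, which is the "multiplication error" limitation noted in the Introduction.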
PVs were also calculated from transrectal ultrasound according to the ellipsoid formula. The ultrasounds were obtained during a routine clinical visit to the urology department and collected from the patient records.
Deep learning algorithm
The algorithm used is a proprietary, commercially available product (AI-Rad Companion Prostate MR VA20A_HF02, Siemens Healthcare AG) based on a deep image-to-image convolutional neural network (DI2IN). The algorithm had not previously been exposed to the images in the current study cohort, and whole-gland segmentation was performed on non-annotated T2 axial images as described by Yang et al [26]. The contours and volume calculations (PVDL) were exported back to and saved in the Picture Archiving and Communication System (Sectra IDS7). The results were presented as burnt-in contours and the PV as text. Contours outlined by the deep learning algorithm and the expert radiologist are shown in Fig. 2a.
Prostatectomy specimens
Prostatectomy specimens were processed and prepared according to international standard pathological procedures [27], embedding all material, including seminal vesicles and extraprostatic tissue. Specimen dimensions in three planes and weight were obtained from pathology reports. Specimen volume was calculated using the ellipsoid formula (PVSD) [12]. Specimen weight and the prostate density coefficient (1.05 g/mL) [24, 25] were also used to calculate PV (PVSW).
Statistical analysis
Descriptive statistics were used to describe the study cohort as medians and ranges. A box plot was used to present the distribution of volumes according to the different methods.
Agreement between volume measurement methods was evaluated using Bland–Altman plots. First, we compared the deep learning and ellipsoid formula–determined volumes (i.e., PVDL vs. PVEF1 and PVEF2) against the manual planimetry reference standard (i.e., PVMPE). Second, we performed a sensitivity analysis comparing the deep learning and ellipsoid formula–determined volumes against the volumes determined from specimen weight (i.e., PVSW).
Using Bland–Altman plots, inter-reader agreement was analyzed between the experienced and inexperienced radiologists performing manual planimetry (i.e., PVMPE vs. PVMPU) and between the two radiologists using the ellipsoid formula (i.e., PVEF1 vs. PVEF2). For the latter, inter-reader agreement was also measured with the intraclass correlation coefficient (ICC), using a random-effects, absolute-agreement, single-rater model. A paired t-test was used to compare the mean differences between the experienced and inexperienced radiologists performing manual planimetry, and paired t-tests with Bonferroni correction were used to compare volume methods. All statistical analyses were performed in R version 4.0.2 [28].
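The Bland–Altman quantities reported throughout the Results (bias and 95% limits of agreement) follow the standard definition of mean difference ± 1.96 SD of the paired differences. A minimal sketch, assuming paired per-patient volumes from two methods (the data here are invented toy numbers, not study data; the analyses in the paper were done in R):

```python
import statistics

def bland_altman(method_a: list, method_b: list):
    """Return bias (mean of paired differences) and 95% limits of agreement (bias +/- 1.96 SD)."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)

# Toy example: four paired volume measurements (mL)
bias, (loa_low, loa_high) = bland_altman([10, 12, 11, 13], [9, 11, 12, 12])
```

Note that unlike a correlation coefficient, the limits of agreement are expressed in the measurement's own units (mL), which is what allows direct clinical interpretation of the agreement between methods.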
Results
The cohort consisted of 124 patients with a median age of 66 years (range 45–76 years) and median preoperative PSA of 6.90 μg/L (min 0.88; max 39).
As shown in Fig. 3a, the observed mean difference between PVDL and PVMPE was lower than the observed mean difference between PVEF and PVMPE (mean difference [95% limits of agreement] PVDL: −0.33 mL [−10.80; 10.14 mL], PVEF1: −3.83 mL [−19.55; 11.89 mL], PVEF2: −3.05 mL [−18.55; 12.45 mL]). The limits of agreement were slightly narrower for PVDL than PVEF, indicating better precision.
A sensitivity analysis using PVSW as the reference standard is presented as a Bland–Altman plot in Fig. 3b. As when PVMPE was used as the reference standard, the mean difference (bias) was lower for PVDL than for PVEF1 or PVEF2, and the corresponding 95% limits of agreement (mean ± 1.96 SD) were narrower for PVDL (mean difference [95% limits of agreement] PVDL: −4.22 mL [−22.52; 14.07 mL], PVEF1: −7.89 mL [−30.50; 14.73 mL], PVEF2: −6.97 mL [−30.13; 16.18 mL]). In both Bland–Altman plots (Fig. 3a, b), there was a tendency for both deep learning and the ellipsoid formula to underestimate the volumes of enlarged prostates compared to manual planimetry and specimen weight, but deep learning seemed to underestimate large volumes to a lesser extent than the ellipsoid formula.
The inter-reader agreement between the two radiologists performing manual planimetry is shown in Fig. 4a, indicating that the inexperienced reader systematically calculated lower volumes than the experienced reader, but with better precision than between the two radiologists using the ellipsoid formula (mean difference [95% limits of agreement] PVMPE vs. PVMPU: −3.73 mL [−11.90; 4.45 mL], p<0.001, paired t-test; 95% confidence interval [CI] −4.47; −2.99). The inter-reader agreement between PVEF1 and PVEF2 is shown in Fig. 4b. The mean difference (95% limits of agreement) between reader 1 and reader 2 was −0.78 mL (−15.08; 13.51 mL). The ICC (95% CI) was 0.93 (0.96; 0.953).
Table 2 and the box-and-whisker plot in Fig. 5 show the mean, median, minimum, and maximum values for all compared volumes. Supplemental Table 2 in the electronic supplementary material shows the combinations of compared volumes for which statistically significant differences were found. Timed observations for planimetry by the experienced reader were recorded in 14/124 patients; the mean time consumption per case was 8 min 4 s.
Discussion
This study shows that the PVs obtained from MRI using a commercially available deep learning algorithm have better agreement and precision with two reference standard volumes than the currently recommended gold standard method performed by a radiologist. There was good inter-reader agreement between two experienced radiologists using the ellipsoid formula. Furthermore, this study indicates a small but significant mean difference (low bias) with good precision when evaluating inter-reader agreement between experienced and inexperienced radiologists performing manual planimetry.
In 2013, Turkbey et al [29] showed highly accurate PV estimates by a non-commercially available deep learning algorithm compared to specimen volumes in 99 patients using correlation and Dice similarity coefficients [30] (Pearson coefficient 0.88–0.91, Dice 0.89). They reported no difference compared to manual planimetry. This is similar to the present study, in which the bias between the deep learning algorithm and manual planimetry was close to zero (−0.33 mL). In 2018, Bezinque et al [12] showed in 99 patients that the most reliable method for PV measurement was manual planimetry by an inexperienced reader (ICC 0.91), followed by the ellipsoid formula on MRI (ICC 0.90), compared to manual planimetry by an experienced reader. The authors concluded that automated segmentations (ICC 0.38) made by commercially available software must be individually assessed for accuracy, contradicting the results of the present study. The amount of training data available for algorithms is rapidly increasing, which may explain the difference in performance between the current study and older studies. In 2020, Lee et al [19] evaluated a non-commercially available deep learning algorithm on 330 MRIs (260 training and 70 test cases) using manual planimetry as the reference standard. The authors concluded that the algorithm provided reliable PV estimates (ICC 0.90) compared to those obtained with the ellipsoid formula (ICC 0.92). The mean error between the algorithm and manual planimetry was 2.5 mL (0.33 mL in our study) and 3.3 mL between the ellipsoid formula and manual planimetry (3.05–3.83 mL in our study). In 2021, Salvaggio et al [31] used manual planimetry as the reference standard when evaluating two deep learning algorithms for prostate segmentation in a cohort of 103 patients. The authors concluded that the presence of median lobe enlargement may lead to overestimation by the ellipsoid formula, recommending a segmentation method.
In 2021, Cuocolo et al [32] compared different deep learning algorithms in 204 patients (99 training and 105 test sets) using manual planimetry as the reference standard. The efficient neural network (ENet) showed the best performance, with the lowest relative volume difference compared to the reference standard.
In our study, the mean difference compared to specimen weight volumes (using the specimen gravity formula) was less than zero for both the deep learning algorithm and the two ellipsoid formula measurements (−4.22 mL, −7.89 mL, and −6.97 mL, respectively), indicating a systematic underestimation in line with Bezinque et al [12]. On the other hand, Paterson et al [13] showed that the ellipsoid formula overestimated by a mean 1.4 mL and Lee et al [19] that it overestimated by 2.4%. The discrepant results may be related to a different proportion of cases with median lobe hypertrophy or measurement difficulties and inconsistencies when dealing with extraprostatic tissue.
As described by Salvaggio [31], the correlation between PVs obtained from MRI or radical prostatectomy specimens is dependent on the PV itself, a tendency that we also saw in our material, with overestimation of small prostate gland volumes and underestimation of large prostate gland volumes.
Our study showed good inter-reader agreement between two radiologists estimating PVs with the ellipsoid formula according to PI-RADS v2.1 (mean difference −0.78 mL), with ICCs in line with two previous studies [17, 33].
Compared to the present study, Bezinque [12] reported somewhat better agreement between experienced and inexperienced radiologists, with a mean difference of −1.00 mL and ICC of 0.91. Differences in study cohorts and design make comparisons between the studies difficult. A small amount of bias can be acceptable as long as the precision is good (as shown by narrow limits of agreement), which seems to be the case with our results.
In this study, we evaluated the agreement between deep learning and ellipsoid formula–determined volumes against two different reference standards. The first evaluation used manual planimetry as the reference standard, as also done by Cuocolo, Lee, and Bezinque [12, 19, 32]. To avoid the results relying mainly on the similar methodology between deep learning and manual planimetry (i.e., whole-gland segmentation by outer contours), we performed a sensitivity analysis changing the reference standard to the PV based on specimen weight, as used by Turkbey, Mazaheri, and Bulman [17, 23, 29]. For both reference standards, the deep learning PVs had lower bias and narrower limits of agreement, meaning better precision than the ellipsoid formula volumes.
To compare agreement between methods for PV measurement, we based our statistical analysis and visualizations on Bland–Altman plots. Several previous studies comparing methods for measuring PV [12, 17, 29] have used statistical methods based on correlation, and several have used Dice similarity coefficients [19, 32]. Correlation coefficients describe the linear relationship but not the agreement, which is indicated by the limits of agreement. In our opinion, Dice coefficients do not add value to the comparison of methods as evaluated in this study, but they play a key role when studying the quality of outer contour delineation for MRI/ultrasound fusion biopsy and brachytherapy planning. This is the topic of a planned future study by our research group.
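For readers unfamiliar with the Dice coefficient mentioned above, it measures spatial overlap between two segmentations rather than agreement between scalar volumes, which is why it suits contour-quality questions but not this study's method comparison. A minimal sketch, representing each binary mask as a set of voxel indices (an illustrative simplification of how masks are usually stored):

```python
def dice(mask_a: set, mask_b: set) -> float:
    """Dice similarity coefficient: 2|A & B| / (|A| + |B|), in [0, 1]."""
    if not mask_a and not mask_b:
        return 1.0  # two empty masks agree perfectly by convention
    return 2 * len(mask_a & mask_b) / (len(mask_a) + len(mask_b))

# Two masks can enclose identical volumes (same size) yet overlap poorly,
# which is exactly the information a volume comparison cannot capture.
overlap = dice({1, 2, 3}, {2, 3, 4})  # same size, partial overlap
```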
Our study has several strengths. The deep learning algorithm was previously unexposed to the data and, to the best of our knowledge, the test set was larger than in previous studies [12, 17, 19, 29, 32]. To reflect a true clinical context, we used a heterogeneous MRI data set with a multicenter, multi-scanner setup. We used a proprietary, commercially available deep learning algorithm, whereas Lee [19] evaluated a non-commercially available 3D deep learning convolutional neural network. Cuocolo [32] compared three deep learning networks (ENet, ERFNet, and U-Net), concluding that deep learning networks can accurately segment the prostate and that ENet performed best. In that study, all MRI exams were resampled to isotropic voxel size and identical matrix resolution to facilitate deep learning segmentation; in our study, no resampling was necessary despite variations in MRI protocols.
We investigated the possibility of validating the deep learning algorithm on publicly available datasets. The available public datasets with manual planimetry as the reference standard [34, 35] had been included in the early training of the algorithm, which is why a performance test could not be performed, as data sets for training, validation, and testing must always be unique and separated. It is reasonable to believe this also applies to other commercially available deep learning algorithms, which emphasizes the importance of study designs like the current one, using heterogeneous, clinically relevant datasets when testing the robustness of AI models.
This study has some limitations. First, only one algorithm was tested. In addition, although the MRI data set is heterogeneous, one vendor is dominant, which is also the company behind the evaluated algorithm. Furthermore, the cohort comprised only prostatectomy patients, which does not reflect the clinical setting, where a larger variation of both cancer and non-cancer patients is scanned. The evaluated algorithm does not offer separate sub-segmentation of the transition and peripheral zones. Sub-segmentation would enable more elaborate density calculations, such as the prostate volume index and transition zone PSAD [36, 37]. Those measurements can add prognostic value in a clinical setting with mixed patients (with cancer, no cancer, and inflammation); however, in the current study, with cancer cases only, they play a minor role. We plan future studies addressing these limitations by evaluating several algorithms with different scanner vendors and a more heterogeneous patient group. A future study should ideally be designed as a non-inferiority study with a power calculation for adequate cohort size. Only one experienced radiologist performed manual planimetry; however, inter-reader variability in manual planimetry is known to exist [38]. We took this into consideration by letting a second, less experienced radiologist also perform manual planimetry and by performing the sensitivity analysis with specimen volumes.
Clinical implications
The number of U.S. FDA-approved commercial AI/deep learning algorithms is rapidly increasing [21], but there is a concern regarding the challenges accompanying the application of those algorithms in the clinical routine. There is a need for structured monitoring and follow-up when we start using those algorithms in day-to-day practice.
To the best of our knowledge, no earlier studies have evaluated a proprietary commercially available deep learning algorithm on such a heterogeneous MRI data set as in this study. The current study setup with a multicenter, multi-scanner, multi-parameter protocol resembles the true clinical situation in a diversified national or international setting.
Conclusion
A deep learning algorithm is at least as good as the PI-RADS 2.1–based ellipsoid formula for assessing PV when compared to in vivo and ex vivo reference standards. This is a promising step towards algorithms helping reallocate radiologist resources towards more complex work tasks than manually measuring PVs.
Change history
06 December 2022
A Correction to this paper has been published: https://doi.org/10.1007/s00330-022-09308-y
Abbreviations
- AI: Artificial intelligence
- ICC: Intraclass correlation
- MRI: Magnetic resonance imaging
- PIRADS: Prostate Imaging Reporting and Data System
- PSA: Prostate-specific antigen
- PSAD: Prostate-specific antigen density
- PV: Prostate volume
- PVDL: Prostate volume obtained from deep learning algorithm
- PVEF: Prostate volume obtained from radiologist using ellipsoid formula
- PVMPE: Prostate volume obtained from manual planimetry by expert radiologist
- PVMPU: Prostate volume obtained from manual planimetry by inexperienced radiologist
- PVSD: Prostate volume obtained from specimen dimensions
- PVSW: Prostate volume obtained from specimen weight
- PVTRUS: Prostate volume obtained from transrectal ultrasound
- EF: PI-RADS 2.1–based ellipsoid formula
References
Garvey B, Türkbey B, Truong H et al (2014) Clinical value of prostate segmentation and volume determination on MRI in benign prostatic hyperplasia. Diagn Interv Radiol 20:229–233
Heidler S, Drerup M, Lusuardi L et al (2018) The correlation of prostate volume and prostate-specific antigen levels with positive bacterial prostate tissue cultures. Urology 115:151–156
Kim YM, Park S, Kim J et al (2013) Role of prostate volume in the early detection of prostate cancer in a cohort with slowly increasing prostate specific antigen. Yonsei Med J 54:1202–1206
Sim KC, Sung DJ, Kang KW et al (2017) Magnetic resonance imaging–based prostate-specific antigen density for prediction of Gleason Score upgrade in patients with low-risk prostate cancer on initial biopsy. J Comput Assist Tomogr 41:731–736
Nordström T, Akre O, Aly M et al (2018) Prostate-specific antigen (PSA) density in the diagnostic algorithm of prostate cancer. Prostate Cancer Prostatic Dis 21:57–63
Fascelli M, Rais-Bahrami S, Sankineni S et al (2016) Combined biparametric prostate magnetic resonance imaging and prostate-specific antigen in the detection of prostate cancer: a validation study in a biopsy-naive patient population. Urology 88:125–134
Loeb S, Bruinsma SM, Nicholson J et al (2015) Active surveillance for prostate cancer: a systematic review of clinicopathologic variables and biomarkers for risk stratification. Eur Urol 67:619–626
Blackwell KL, Bostwick DG, Myers RP et al (1994) Combining prostate specific antigen with cancer and gland volume to predict more reliably pathological stage: the influence of prostate specific antigen cancer density. J Urol 151:1565–1570
Ahmed HU, El-Shater Bosaily A et al (2017) Diagnostic accuracy of multi-parametric MRI and TRUS biopsy in prostate cancer (PROMIS): a paired validating confirmatory study. Lancet 389:815–822
Kasivisvanathan V, Rannikko AS, Borghi M et al (2018) MRI-targeted or standard biopsy for prostate-cancer diagnosis. N Engl J Med 378:1767–1777
Turkbey B, Rosenkrantz AB, Haider MA et al (2019) Prostate Imaging Reporting and Data System Version 2.1: 2019 Update of Prostate Imaging Reporting and Data System Version 2. Eur Urol 76:340-351
Bezinque A, Moriarity A, Farrell C et al (2018) Determination of prostate volume: a comparison of contemporary methods. Acad Radiol 25:1582–1587
Paterson NR, Lavallée LT, Nguyen LN et al (2016) Prostate volume estimations using magnetic resonance imaging and transrectal ultrasound compared to radical prostatectomy specimens. Can Urol Assoc J 10:264
Karademir I, Shen D, Peng Y et al (2013) Prostate volumes derived from MRI and volume-adjusted serum prostate-specific antigen: correlation with Gleason score of prostate cancer. Am J Roentgenol 201:1041–1048
Cheng R, Lay NS, Roth HR et al (2019) Fully automated prostate whole gland and central gland segmentation on MRI using holistically nested networks with short connections. J Med Imaging 6:024007
Jeong CW, Park HK, Hong SK et al (2008) Comparison of prostate volume measured by transrectal ultrasonography and MRI with the actual prostate volume measured after radical prostatectomy. Urol Int 81:179–185
Bulman JC, Toth R, Patel AD et al (2012) Automated computer-derived prostate volumes from MR imaging data: comparison with radiologist-derived MR imaging and pathologic specimen volumes. Radiology 262:144–151
Cuocolo R, Cipullo MB, Stanzione A et al (2019) Machine learning applications in prostate cancer magnetic resonance imaging. Eur Radiol Exp 3:1–8
Lee DK, Sung DJ, Kim CS et al (2020) Three-dimensional convolutional neural network for prostate MRI segmentation and comparison of prostate volume measurements by use of artificial neural network and ellipsoid formula. AJR Am J Roentgenol 214:1229–1238
Ma L, Guo R, Zhang G et al (2017) Automatic segmentation of the prostate on CT images using deep learning and multi-atlas fusion. Medical Imaging 2017: image processing. International Society for Optics and Photonics, p 101332O
Allen B, Dreyer K, Stibolt R Jr et al (2021) Evaluation and real-world performance monitoring of artificial intelligence models in clinical practice purchase: try it, buy it, check it. J Am Coll Radiol. https://doi.org/10.1016/j.jacr.2021.08.022
Winkel DJ, Heye T, Weikert TJ et al (2019) Evaluation of an AI-based detection software for acute findings in abdominal computed tomography scans: toward an automated work list prioritization of routine CT examinations. Invest Radiol 54:55–59
Mazaheri Y, Goldman DA, Di Paolo PL et al (2015) Comparison of prostate volume measured by endorectal coil MRI to prostate specimen volume and mass after radical prostatectomy. Acad Radiol 22:556–562
Ohlsén H, Ekman P, Ringertz H (1982) Assessment of prostatic size with computed tomography. Methodologic aspects. Acta Radiol Diagn (Stockh) 23:219-223
Varma M, Morgan JM (2010) The weight of the prostate gland is an excellent surrogate for gland volume. Histopathology 57:55–58
Yang D, Xu D, Zhou SK et al (2017) Automatic liver segmentation using an adversarial image-to-image network. International conference on medical image computing and computer-assisted intervention. Springer, pp 507-515
Egevad L, Srigley JR, Delahunt B (2011) International society of urological pathology consensus conference on handling and staging of radical prostatectomy specimens. Adv Anat Pathol 18:301–305
R Core Team (2020) R: a language and environment for statistical computing. Version 4.0.2. Vienna, Austria
Turkbey B, Fotin SV, Huang RJ et al (2013) Fully automated prostate segmentation on MRI: comparison with manual segmentation methods and specimen volumes. Am J Roentgenol 201:W720–W729
Dice LR (1945) Measures of the amount of ecologic association between species. Ecology 26:297–302
Salvaggio G, Comelli A, Portoghese M et al (2021) Deep learning network for segmentation of the prostate gland with median lobe enlargement in T2-weighted MR images: comparison with manual segmentation method. Curr Probl Diagn Radiol. https://doi.org/10.1067/j.cpradiol.2021.06.006
Cuocolo R, Comelli A, Stefano A et al (2021) Deep learning whole-gland and zonal prostate segmentation on a public MRI dataset. J Magn Reson Imaging 54:452–459
Ghafoor S, Becker AS, Woo S et al (2020) Comparison of PI-RADS Versions 2.0 and 2.1 for MRI-based calculation of the prostate volume. Acad Radiol. https://doi.org/10.1016/j.acra.2020.07.027
Litjens G, Toth R, van de Ven W et al (2014) Evaluation of prostate segmentation algorithms for MRI: the PROMISE12 challenge. Med Image Anal 18:359–373
Armato SG 3rd, Huisman H, Drukker K et al (2018) PROSTATEx Challenges for computerized classification of prostate lesions from multiparametric magnetic resonance images. J Med Imaging (Bellingham) 5:044501
Porcaro AB, Tafuri A, Sebben M et al (2019) Prostate volume index is able to differentiate between prostatic chronic inflammation and prostate cancer in patients with normal digital rectal examination and prostate-specific antigen values <10 ng/mL: results of 564 Biopsy Naïve Cases. Urol Int 103:415–422
Schneider AF, Stocker D, Hötker AM et al (2019) Comparison of PSA-density of the transition zone and whole gland for risk stratification of men with suspected prostate cancer: a retrospective MRI-cohort study. Eur J Radiol 120:108660
Becker AS, Chaitanya K, Schawkat K et al (2019) Variability of manual segmentation of the prostate in axial T2-weighted MRI: a multi-reader study. Eur J Radiol 121:108716
Acknowledgements
We gratefully thank secretary Kajsa Trens at the department of translational medicine for helping out during data collection and statistician Andrea Dahl Sturedahl at Forum Söder for statistical support.
Funding
Open access funding provided by Lund University. This study has received funding by grants from Governmental funding for clinical research (ALF), grants for PhD students from Region Skåne and scholarship from Stig and Ragna Gorthon Foundation.
Ethics declarations
Guarantor
The scientific guarantor of this publication is Professor Sophia Zackrisson.
Conflict of interest
The authors of this manuscript declare relationships with the following companies:
Sophia Zackrisson has received speaker fees from Siemens Healthcare AB and Pfizer AB. Anders Bjartell has received grants/research support from Ferring, Bayer and Merck. AB has received honoraria or consultation fees from Astellas, AstraZeneca, Bayer, Janssen, Merck, Recordati and Sandoz. AB has participated in company-sponsored speakers' bureaus for Astellas, Bayer, IPSEN, Janssen, Recordati and Sandoz. AB is a stock shareholder in LIDDS Pharma, Glactone Pharma and WntResearch.
All other authors declare no conflicts of interest.
Statistics and biometry
Andrea Dahl Sturedahl kindly provided statistical advice for this manuscript.
No complex statistical methods were necessary for this paper.
Informed consent
Written informed consent was waived by the Institutional Review Board.
Ethical approval
Institutional Review Board approval was obtained.
Methodology
• retrospective
• diagnostic or prognostic study
• multicenter study
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The original online version of this article was revised: the author's name Erik Thimansson was incorrectly given as 'Erick Thimansson'.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Thimansson, E., Bengtsson, J., Baubeta, E. et al. Deep learning algorithm performs similarly to radiologists in the assessment of prostate volume on MRI. Eur Radiol 33, 2519–2528 (2023). https://doi.org/10.1007/s00330-022-09239-8