Abstract
Purpose
The AAST Organ Injury Scale is widely adopted for splenic injury severity but suffers from only moderate inter-rater agreement. This work assesses SpleenPro, a prototype interactive explainable artificial intelligence/machine learning (AI/ML) diagnostic aid to support AAST grading, for effects on radiologist dwell time, agreement, clinical utility, and user acceptance.
Methods
Two ad hoc expert panelists in trauma radiology independently performed timed AAST grading on 76 admission CT studies with blunt splenic injury, first without AI/ML assistance and then, after a 2-month washout period and randomization, with AI/ML assistance. To evaluate user acceptance, three versions of the SpleenPro user interface with increasing explainability were presented to four independent expert panelists, with four example cases each. A structured interview consisting of Likert scales and free responses was conducted, with specific questions addressing diagnostic utility (DU); mental support (MS); effort, workload, and frustration (EWF); trust and reliability (TR); and likelihood of future use (LFU).
Results
SpleenPro significantly decreased interpretation times for both raters. Weighted Cohen’s kappa increased from 0.53 to 0.70 with AI/ML assistance. During user acceptance interviews, increasing explainability was associated with improvement in Likert scores for MS, EWF, TR, and LFU. Expert panelists indicated the need for a combined early notification and grading functionality, PACS integration, and report autopopulation to improve DU.
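For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how a weighted Cohen's kappa between two raters can be computed for ordinal grades. It is an illustration under stated assumptions, not the study's analysis code: the abstract does not say whether linear or quadratic weights were used, so both are offered, and the function name `weighted_kappa` and the example grades are hypothetical.

```python
def weighted_kappa(ratings_a, ratings_b, categories, weights="linear"):
    """Weighted Cohen's kappa for two raters on an ordinal scale.

    `categories` lists the ordered grade levels (e.g. AAST grades 1-5).
    `weights` is "linear" or "quadratic"; the weighting scheme used in
    the study is not stated in the abstract, so this is an assumption.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if weights not in ("linear", "quadratic"):
        raise ValueError("weights must be 'linear' or 'quadratic'")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two categories")

    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)

    # Observed joint proportions: rows are rater A, columns are rater B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    power = 1 if weights == "linear" else 2
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            # Disagreement weight: 0 on the diagonal, 1 at maximum distance.
            w = (abs(i - j) / (k - 1)) ** power
            weighted_observed += w * observed[i][j]
            weighted_expected += w * row[i] * col[j]

    if weighted_expected == 0.0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical grades for eight cases (illustrative only, not study data).
rater_1 = [1, 2, 2, 3, 4, 5, 3, 2]
rater_2 = [1, 2, 3, 3, 4, 4, 3, 1]
kappa = weighted_kappa(rater_1, rater_2, categories=[1, 2, 3, 4, 5])
print(f"linear weighted kappa: {kappa:.2f}")
```

Weighting matters for an ordinal scale such as AAST grading because a one-grade disagreement is penalized less than a disagreement spanning several grades, which unweighted kappa would treat identically.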
Conclusions
SpleenPro was useful for improving objectivity of AAST grading and increasing mental support. Formative user research identified generalizable concepts including the need for a combined detection and grading pipeline and integration with the clinical workflow.
Data Availability
Data for this study can be made available upon reasonable request.
Funding
This research was supported by the National Institutes of Health, grant numbers NIH-K08-EB027141-01A1 and NIH-R01-GM148987-01.
Ethics declarations
Ethics approval
The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Institutional Review Board of the University of Maryland Baltimore.
Consent to participate
Patient consent was waived by the Institutional Review Board of the University of Maryland Baltimore because the study posed no more than minimal risk to patients.
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Sarkar, N., Kumagai, M., Meyr, S. et al. An ASER AI/ML expert panel formative user research study for an interpretable interactive splenic AAST grading graphical user interface prototype. Emerg Radiol 31, 167–178 (2024). https://doi.org/10.1007/s10140-024-02202-8
DOI: https://doi.org/10.1007/s10140-024-02202-8