Abstract
Objectives
To compare the diagnostic accuracy of a deep learning artificial intelligence (AI) algorithm for cervical spine (C-spine) fracture detection on CT with that of attending radiologists, and to assess which undetected fractures were injuries in need of stabilising therapy (IST).
Methods
This single-centre, retrospective diagnostic accuracy study included consecutive patients (age ≥18 years; 2007–2014) screened for C-spine fractures with CT. To validate the ground truth, one radiologist and three neurosurgeons independently examined scans positive for fracture. Negative scans were followed up until 2022 through patient files, and two radiologists reviewed negative scans that were flagged positive by AI. The neurosurgeons determined which fractures were ISTs. The diagnostic accuracies of the AI algorithm and the attending radiologists (index tests) were compared using McNemar's test.
Results
Of the 2368 scans analysed (median age 48 years, interquartile range 30–65; 1441 men), 221 (9.3%) contained C-spine fractures, 133 of which were ISTs. AI detected 158/221 scans with fractures (sensitivity 71.5%, 95% CI 65.5–77.4%) and 2118/2147 scans without fractures (specificity 98.6%, 95% CI 98.2–99.1%). In comparison, attending radiologists detected 195/221 scans with fractures (sensitivity 88.2%, 95% CI 84.0–92.5%, p < 0.001) and 2130/2147 scans without fractures (specificity 99.2%, 95% CI 98.8–99.6%, p = 0.07). Of the fractures undetected by AI, 30/63 were ISTs, versus 4/26 for the radiologists. AI detected 22/26 of the fractures undetected by the radiologists, including 3/4 of the undetected ISTs.
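The reported proportions and the McNemar comparison can be reproduced from the counts above. The sketch below (Python, standard library only) is illustrative rather than the authors' analysis code: it assumes a simple Wald interval for the 95% CIs (the paper does not state which interval method was used) and derives the discordant-pair counts from the reported overlap — 22 scans detected only by AI, hence 195 − (158 − 22) = 59 detected only by the radiologists.

```python
import math

def wald_ci(k, n, z=1.96):
    """Point estimate and Wald 95% CI for a proportion k/n (an assumed, simple choice)."""
    p = k / n
    se = math.sqrt(p * (1 - p) / n)
    return p, p - z * se, p + z * se

def mcnemar(b, c):
    """McNemar chi-square (1 df, no continuity correction) for discordant pairs b and c."""
    chi2 = (b - c) ** 2 / (b + c)
    p = math.erfc(math.sqrt(chi2 / 2))  # survival function of a 1-df chi-square
    return chi2, p

# Counts from the Results section
sens_ai, lo, hi = wald_ci(158, 221)   # AI sensitivity, ~71.5% (65.5-77.4%)
sens_rad, _, _ = wald_ci(195, 221)    # radiologist sensitivity, ~88.2%
chi2, p = mcnemar(22, 59)             # paired comparison of sensitivities, p < 0.001
```

The derived p-value is far below 0.001, consistent with the significance level reported in the abstract.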
Conclusion
Compared to attending radiologists, the AI algorithm had lower sensitivity and missed more fractures in need of stabilising therapy; however, it detected most of the fractures undetected by the radiologists, including fractures in need of stabilising therapy.
Clinical relevance statement
The artificial intelligence algorithm missed more cervical spine fractures on CT than attending radiologists did, but it detected 84.6% of the fractures undetected by radiologists, including fractures in need of stabilising therapy.
Key Points
- How artificial intelligence for cervical spine fracture detection on CT affects fracture management is unknown.
- The algorithm detected fewer fractures than attending radiologists, but detected most fractures undetected by the radiologists, including almost all fractures in need of stabilising therapy.
- The artificial intelligence algorithm shows potential as a concurrent reader.
Notes
van der Kolk BYM, van den Wittenboer GJ, Warringa N et al (2022) Assessment of cervical spine CT scans by emergency physicians: a comparative diagnostic accuracy study in a non-clinical setting. JACEP Open 3:e12609
Abbreviations
- AI: Artificial intelligence
- CI: Confidence interval
- C-spine: Cervical spine
- IST: Injury in need of stabilising therapy
- NPV: Negative predictive value
- PPV: Positive predictive value
References
Izzo R, Popolizio T, Balzano RF, Pennelli AM, Simeone A, Muto M (2019) Imaging of cervical spine traumas. Eur J Radiol 117:75–88
Bokhari AR, Sivakumar B, Sefton A et al (2019) Morbidity and mortality in cervical spine injuries in the elderly. ANZ J Surg 89(4):412–417
Gupta R, Siroya HL, Bhat DI, Shukla DP, Pruthi N, Devi BI (2022) Vertebral artery dissection in acute cervical spine trauma. J Craniovertebr Junction Spine 13(1):27–37
Shank CD, Walters BC, Hadley MN (2017) Management of acute traumatic spinal cord injuries. Handb Clin Neurol 140:275–298
Holmes J, Akkinepalli R (2005) Computed tomography versus plain radiography to screen for cervical spine injury: a meta-analysis. J Trauma Acute Care Surg 58(5):902–905
Davis JW, Phreaner DL, Hoyt DB, Mackersie RC (1993) The etiology of missed cervical spine injuries. J Trauma 34(3):342–346
Poonnoose PM, Ravichandran G, McClelland MR (2002) Missed and mismanaged injuries of the spinal cord. J Trauma 53(2):314–320
Levi AD, Hurlbert RJ, Anderson P et al (2006) Neurologic deterioration secondary to unrecognized spinal instability following trauma–a multicenter study. Spine 31(4):451–458
Guven R, Akca AH, Caltili C et al (2018) Comparing the interpretation of emergency department computed tomography between emergency physicians and attending radiologists: a multicenter study. Niger J Clin Pract 21(10):1323–1327
Van Zyl HP, Bilbey J, Vukusic A et al (2014) Can emergency physicians accurately rule out clinically important cervical spine injuries by using computed tomography? CJEM 16(2):131–135
Stengel D, Ottersbach C, Matthes G et al (2012) Accuracy of single-pass whole-body computed tomography for detection of injuries in patients with major blunt trauma. Can Med Assoc J 184(8):869–876
van der Kolk BYM, van den Wittenboer GJ, Warringa N et al (2022) Assessment of cervical spine CT scans by emergency physicians: a comparative diagnostic accuracy study in a non-clinical setting. J Am Coll Emerg Physicians Open 3(1):e12609
Small JE, Osler P, Paul AB, Kunst M (2021) CT cervical spine fracture detection using a convolutional neural network. AJNR Am J Neuroradiol 42(7):1341–1347
Guermazi A, Tannoury C, Kompel AJ et al (2022) Improving radiographic fracture recognition performance and efficiency using artificial intelligence. Radiology 302(3):627–636
Brooks L, Harris I (2022) RSNA announces results of cervical spine fracture AI challenge. RSNA. Available via https://press.rsna.org/timssnet/media/pressreleases/14_pr_target.cfm?ID=2399. Accessed 15 May 2023
Aidoc Medical Ltd. Radiologists' go-to AI solution. Available via https://www.aidoc.com/solutions/radiology/. Accessed 15 May 2023
Voter AF, Larson ME, Garrett JW, Yu JJ (2021) Diagnostic accuracy and failure mode analysis of a deep learning algorithm for the detection of cervical spine fractures. AJNR Am J Neuroradiol 42(8):1550–1556
Hosny A, Parmar C, Quackenbush J, Schwartz LH, Aerts HJWL (2018) Artificial intelligence in radiology. Nat Rev Cancer 18(8):500–510
Chomutare T, Tejedor M, Svenning TO et al (2022) Artificial intelligence implementation in healthcare: a theory-based scoping review of barriers and facilitators. Int J Environ Res Public Health 19(23):16359
Yousefi Nooraie R, Lyons PG, Baumann AA, Saboury B (2021) Equitable implementation of artificial intelligence in medical imaging: what can be learned from implementation science? PET Clin 16(4):643–653
Bossuyt PM, Reitsma JB, Bruns DE et al (2015) STARD 2015: an updated list of essential items for reporting diagnostic accuracy studies. BMJ 351:h5527
Mongan J, Moy L, Kahn CE (2020) Checklist for artificial intelligence in medical imaging (CLAIM): a guide for authors and reviewers. Radiol Artif Intell 2(2):e200029
Lehman R, Riew D, Schnake K (2023) Occipitocervical trauma. AO Spine website. Available via https://surgeryreference.aofoundation.org/spine/trauma/occipitocervical. Accessed 15 May 2023
Schnake KJ, Schroeder GD, Vaccaro AR, Oner C (2017) AOSpine classification systems (subaxial, thoracolumbar). J Orthop Trauma 31(Suppl 4):S14–S23. https://doi.org/10.1097/BOT.0000000000000947
Trajman A, Luiz RR (2008) McNemar χ2 test revisited: comparing sensitivity and specificity of diagnostic examinations. Scand J Clin Lab Invest 68(1):77–80
Dratsch T, Chen X, Rezazade Mehrizi M et al (2023) Automation bias in mammography: the impact of artificial intelligence BI-RADS suggestions on reader performance. Radiology 307(4):e222176
Funding
The authors state that this work has not received any funding.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Guarantor
The scientific guarantor of this publication is M.F. Boomsma.
Conflict of interest
The authors of this manuscript declare no relationships with any companies, whose products or services may be related to the subject matter of the article.
Statistics and biometry
One of the authors has significant statistical expertise, namely I.M. Nijholt.
Informed consent
Written informed consent was waived by the Institutional Review Board.
Ethical approval
Institutional Review Board approval was obtained.
Study subjects or cohorts overlap
In a previous study, we reported on 411 patients who were also included in the current study. That study evaluated the diagnostic accuracy of emergency physicians compared to radiologists in detecting cervical spine injuries and is titled “Assessment of cervical spine CT scans by emergency physicians: a comparative diagnostic accuracy study in a non-clinical setting” (Footnote 1). The current study uses 2368 patients to compare the diagnostic accuracy of an artificial intelligence algorithm with that of attending radiologists.
Methodology
• retrospective
• diagnostic study
• performed at one institution
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
van den Wittenboer, G., van der Kolk, B.Y.M., Nijholt, I.M. et al. Diagnostic accuracy of an artificial intelligence algorithm versus radiologists for fracture detection on cervical spine CT. Eur Radiol (2024). https://doi.org/10.1007/s00330-023-10559-6