Abstract
Objective
Radiographic bone age assessment (BAA) is used in the evaluation of pediatric endocrine and metabolic disorders. We previously developed an automated artificial intelligence (AI) deep learning algorithm to perform BAA using convolutional neural networks. We compared the BAA performance of a cohort of pediatric radiologists with and without AI assistance.
Materials and methods
Six board-certified, subspecialty-trained pediatric radiologists interpreted 280 age- and gender-matched bone age radiographs ranging from 5 to 18 years. Three of those radiologists then performed BAA with AI assistance. Bone age accuracy and root mean squared error (RMSE) were used as measures of accuracy. The intraclass correlation coefficient (ICC) was used to evaluate inter-rater variation.
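The two accuracy measures named above can be illustrated with a minimal sketch. It assumes that "accuracy" means the fraction of readings that match a reference bone age, either exactly or within a tolerance such as 1 year; the abstract does not spell out this definition. The function names and the toy values are hypothetical and are not data from the study.

```python
import math


def accuracy_within(estimates, references, tolerance_years=0.0):
    """Fraction of estimates within `tolerance_years` of the reference.

    tolerance_years=0.0 is an exact match (assumed meaning of "overall"
    accuracy); tolerance_years=1.0 corresponds to "within 1 year".
    """
    if len(estimates) != len(references) or not estimates:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(
        abs(e - r) <= tolerance_years for e, r in zip(estimates, references)
    )
    return hits / len(estimates)


def rmse(estimates, references):
    """Root mean squared error, in the same units as the inputs (years)."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(e - r) ** 2 for e, r in zip(estimates, references)]
    return math.sqrt(sum(squared) / len(squared))


# Toy values (hypothetical, in years): reference standard vs. one reader.
reference_ages = [5.0, 8.0, 10.0, 13.0]
reader_ages = [5.0, 9.0, 10.0, 11.0]

print(accuracy_within(reader_ages, reference_ages))        # 0.5
print(accuracy_within(reader_ages, reference_ages, 1.0))   # 0.75
print(round(rmse(reader_ages, reference_ages), 3))         # 1.118
```

Unlike the accuracy fractions, RMSE penalizes large misses more heavily than small ones, which is presumably why both measures are reported.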
Results
AI BAA accuracy was 68.2% overall and 98.6% within 1 year, and the mean six-reader cohort accuracy was 63.6% overall and 97.4% within 1 year. AI RMSE was 0.601 years, while mean single-reader RMSE was 0.661 years. With AI assistance, pooled RMSE decreased from 0.661 to 0.508 years, and the RMSE of each individual reader also decreased. ICC without AI was 0.9914 and with AI was 0.9951.
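The abstract does not state which form of the ICC was used. As an illustration only, the sketch below computes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), from a table with one row per radiograph and one column per reader. The choice of this form, the function name, and the toy ratings are assumptions, not details from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one per subject (radiograph), each row
    holding one rating per rater (reader). Assumes a complete table with
    at least two subjects and two raters.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete n x k table with n, k >= 2")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Mean squares from the two-way ANOVA decomposition.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Toy ratings (hypothetical, in years): 4 radiographs read by 3 readers.
toy = [
    [5.0, 5.5, 5.0],
    [8.0, 8.0, 8.5],
    [11.0, 10.5, 11.0],
    [14.0, 14.0, 13.5],
]
print(icc_2_1(toy))
```

Values close to 1 indicate that most of the variation in the ratings comes from real differences between radiographs rather than from disagreement between readers, so an increase in ICC is consistent with lower inter-rater variability.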
Conclusions
AI improves radiologists’ bone age assessment by increasing accuracy and decreasing variability and RMSE. The utilization of AI by radiologists improves performance compared to AI alone, a radiologist alone, or a pooled cohort of experts. This suggests that AI may optimally be utilized as an adjunct to radiologist interpretation of imaging studies to improve performance.
Ethics declarations
Conflict of interest
None
Additional information
This work has been accepted for presentation at RSNA 2017 and awarded an RSNA Trainee Research Prize.
About this article
Cite this article
Tajmir, S.H., Lee, H., Shailam, R. et al. Artificial intelligence-assisted interpretation of bone age radiographs improves accuracy and decreases variability. Skeletal Radiol 48, 275–283 (2019). https://doi.org/10.1007/s00256-018-3033-2
DOI: https://doi.org/10.1007/s00256-018-3033-2