Automatic Determination of the Need for Intravenous Contrast in Musculoskeletal MRI Examinations Using IBM Watson’s Natural Language Processing Algorithm
Magnetic resonance imaging (MRI) protocoling can be time- and resource-intensive, and protocols can be suboptimal depending on the expertise or preferences of the protocoling radiologist. Providing a best-practice recommendation for an MRI protocol has the potential to improve efficiency and decrease the likelihood of a suboptimal or erroneous study. The goal of this study was to develop and validate a machine learning-based natural language classifier that automatically assigns the use of intravenous contrast for musculoskeletal MRI protocols based on the free-text clinical indication of the study, thereby improving the efficiency of the protocoling radiologist and potentially decreasing errors. We utilized a deep learning-based natural language classification system from IBM Watson, a question-answering supercomputer that gained fame after challenging the best human players on Jeopardy! in 2011. We compared this solution to a series of traditional machine learning-based natural language processing techniques that utilize a term-document frequency matrix. Each classifier was trained with 1240 MRI protocols plus their respective clinical indications and validated with a test set of 280. Ground truth for contrast assignment was obtained from the clinical record. To evaluate inter-reader agreement, a blinded second-reader radiologist analyzed all cases and determined contrast assignment based only on the free-text clinical indication. In the test set, Watson demonstrated an overall accuracy of 83.2% when compared to the original protocol. This was similar to the overall accuracy of 80.2% achieved by an ensemble of eight traditional machine learning algorithms based on a term-document matrix. When compared to the second reader's contrast assignment, Watson achieved 88.6% agreement. When evaluating only the subset of cases where the original protocol and second reader were concordant (n = 251), agreement climbed further to 90.0%.
The classifier was relatively robust to spelling and grammatical errors, which were frequent. Implementing this automated MR contrast determination system as a clinical decision support tool may save the radiologist considerable time and effort while potentially decreasing error rates, and would require no change in order entry or workflow.
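As a rough illustration of the term-document matrix approach mentioned above, the following minimal Python sketch builds a count matrix from free-text indications and fits a multinomial naive Bayes classifier on it. It is not the study's implementation: the choice of naive Bayes, the function names, the label strings, and the six toy indications are all assumptions made for illustration, and the eight-algorithm ensemble and the Watson classifier are not reproduced here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters as terms."""
    return re.findall(r"[a-z]+", text.lower())


def build_term_document_matrix(docs):
    """Return (vocab, matrix); matrix[i][j] counts term j in document i."""
    vocab = sorted({term for doc in docs for term in tokenize(doc)})
    index = {term: j for j, term in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in docs]
    for i, doc in enumerate(docs):
        for term in tokenize(doc):
            matrix[i][index[term]] += 1
    return vocab, matrix


def train_naive_bayes(matrix, labels):
    """Fit multinomial naive Bayes with add-one (Laplace) smoothing."""
    n_terms = len(matrix[0])
    log_prior, log_likelihood = {}, {}
    for cls in sorted(set(labels)):
        rows = [row for row, y in zip(matrix, labels) if y == cls]
        log_prior[cls] = math.log(len(rows) / len(labels))
        totals = [sum(col) for col in zip(*rows)]  # per-term counts in class
        denom = sum(totals) + n_terms
        log_likelihood[cls] = [math.log((t + 1) / denom) for t in totals]
    return log_prior, log_likelihood


def predict(text, vocab, model):
    """Return the most probable label; terms outside the vocabulary are ignored."""
    log_prior, log_likelihood = model
    index = {term: j for j, term in enumerate(vocab)}
    counts = Counter(t for t in tokenize(text) if t in index)
    scores = {
        cls: log_prior[cls]
        + sum(n * log_likelihood[cls][index[t]] for t, n in counts.items())
        for cls in log_prior
    }
    return max(scores, key=scores.get)


# Hypothetical toy indications and labels, invented for this sketch only.
train_docs = [
    "evaluate for osteomyelitis of the foot",
    "soft tissue mass in thigh, rule out tumor",
    "suspected infection and abscess near knee",
    "knee pain after fall, evaluate meniscal tear",
    "shoulder pain, rule out rotator cuff tear",
    "ankle sprain with ligament injury",
]
train_labels = ["contrast"] * 3 + ["no_contrast"] * 3

vocab, matrix = build_term_document_matrix(train_docs)
model = train_naive_bayes(matrix, train_labels)
print(predict("possible abscess and osteomyelitis", vocab, model))
print(predict("meniscal tear after fall", vocab, model))
```

Because this sketch matches terms exactly, a misspelled term simply falls outside the vocabulary and contributes nothing to the score. A practical system would need some additional handling for the frequent spelling errors noted in the abstract.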
Keywords: IBM Watson; Machine learning; Artificial intelligence; Deep learning; Natural language processing (NLP); Imaging protocol; Workflow efficiency; Quality improvement
IRB: Institutional Review Board
MRI: Magnetic resonance imaging
NLP: Natural language processing
HT was supported by an NIH T32 Fellowship, 5T32EB001631-10. JHS was supported by the NVIDIA academic grant program.
Compliance with Ethical Standards
Conflict of Interest
The authors declare that they have no conflict of interest.