
Automatic Determination of the Need for Intravenous Contrast in Musculoskeletal MRI Examinations Using IBM Watson’s Natural Language Processing Algorithm


Magnetic resonance imaging (MRI) protocoling can be time- and resource-intensive, and protocols are often suboptimal, depending on the expertise or preferences of the protocoling radiologist. A best-practice recommendation for an MRI protocol has the potential to improve efficiency and decrease the likelihood of a suboptimal or erroneous study. The goal of this study was to develop and validate a machine learning-based natural language classifier that automatically determines whether intravenous contrast is needed for musculoskeletal MRI protocols from the free-text clinical indication of the study, thereby improving the efficiency of the protocoling radiologist and potentially decreasing errors. We used a deep learning-based natural language classification system from IBM Watson, a question-answering supercomputer that gained fame after challenging the best human players on Jeopardy! in 2011. We compared this solution with a series of traditional machine learning-based natural language processing techniques that use a term-document frequency matrix. Each classifier was trained with 1240 MRI protocols plus their respective clinical indications and validated with a test set of 280. Ground truth for contrast assignment was obtained from the clinical record. To evaluate inter-reader agreement, a blinded second-reader radiologist analyzed all cases and determined contrast assignment based only on the free-text clinical indication. In the test set, Watson demonstrated an overall accuracy of 83.2% when compared with the original protocol. This was similar to the overall accuracy of 80.2% achieved by an ensemble of eight traditional machine learning algorithms based on a term-document matrix. When compared with the second reader’s contrast assignment, Watson achieved 88.6% agreement. When evaluating only the subset of cases where the original protocol and second reader were concordant (n = 251), agreement climbed further to 90.0%. The classifier was relatively robust to spelling and grammatical errors, which were frequent. Implementing this automated MR contrast determination system as a clinical decision support tool may save the radiologist considerable time and effort while potentially decreasing error rates, and would require no change in order entry or workflow.
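To make the term-document matrix baseline concrete, here is a minimal sketch in Python. It is not the authors' pipeline: the abstract describes an ensemble of eight traditional algorithms, whereas this sketch swaps in a single multinomial naive Bayes classifier as a stand-in. The function and class names (`tokenize`, `term_document_matrix`, `NaiveBayes`, `agreement`), the label names, and the toy clinical indications are all hypothetical and invented for illustration; none of them come from the study data.

```python
import math


def tokenize(text):
    """Lowercase and split on non-letters; a deliberately crude tokenizer."""
    cleaned = "".join(ch if ch.isalpha() else " " for ch in text.lower())
    return cleaned.split()


def term_document_matrix(documents):
    """Return (vocabulary, rows) where rows[i][j] counts term j in document i."""
    vocabulary = sorted({tok for doc in documents for tok in tokenize(doc)})
    index = {term: j for j, term in enumerate(vocabulary)}
    rows = []
    for doc in documents:
        row = [0] * len(vocabulary)
        for tok in tokenize(doc):
            row[index[tok]] += 1
        rows.append(row)
    return vocabulary, rows


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing over term counts."""

    def fit(self, documents, labels):
        self.vocabulary, rows = term_document_matrix(documents)
        self.index = {t: j for j, t in enumerate(self.vocabulary)}
        self.labels = sorted(set(labels))
        n_terms = len(self.vocabulary)
        self.log_prior = {}
        self.log_likelihood = {}
        for label in self.labels:
            class_rows = [r for r, y in zip(rows, labels) if y == label]
            self.log_prior[label] = math.log(len(class_rows) / len(rows))
            # Column sums give the total count of each term within the class.
            totals = [sum(col) for col in zip(*class_rows)]
            denom = sum(totals) + n_terms
            self.log_likelihood[label] = [math.log((c + 1) / denom) for c in totals]
        return self

    def predict(self, text):
        scores = {}
        for label in self.labels:
            score = self.log_prior[label]
            for tok in tokenize(text):
                j = self.index.get(tok)
                if j is not None:  # terms never seen in training are ignored
                    score += self.log_likelihood[label][j]
            scores[label] = score
        return max(scores, key=scores.get)


def agreement(predicted, reference):
    """Fraction of cases in which two label sequences match."""
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


# Invented toy indications (NOT study data), for illustration only.
train_text = [
    "evaluate for osteomyelitis of the foot",
    "soft tissue mass thigh, r/o sarcoma",
    "concern for abscess, infection",
    "knee pain after fall, eval meniscal tear",
    "shoulder pain, rotator cuff tear",
    "ankle sprain, ligament injury",
]
train_label = ["contrast", "contrast", "contrast",
               "no_contrast", "no_contrast", "no_contrast"]

model = NaiveBayes().fit(train_text, train_label)
```

Calling `model.predict("rule out osteomyelitis")` on this toy model returns a label for a new indication, and `agreement(predictions, labels)` computes the same kind of proportion the abstract reports, whether the reference labels are the original protocol or the second reader. Note that a count-based matrix like this one simply ignores misspelled terms it has never seen, which is one reason robustness to spelling errors is worth reporting separately.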





Institutional Review Board
Magnetic resonance imaging
Natural language processing
With contrast




HT was supported by an NIH T32 Fellowship, 5T32EB001631-10. JHS was supported by the NVIDIA academic grant program.

Author information



Corresponding author

Correspondence to Jae Ho Sohn.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Electronic supplementary material


(DOCX 15 kb)


About this article


Cite this article

Trivedi, H., Mesterhazy, J., Laguna, B. et al. Automatic Determination of the Need for Intravenous Contrast in Musculoskeletal MRI Examinations Using IBM Watson’s Natural Language Processing Algorithm. J Digit Imaging 31, 245–251 (2018).



  • IBM Watson
  • Machine learning
  • Artificial intelligence
  • Deep learning
  • Natural language processing (NLP)
  • Imaging protocol
  • Workflow efficiency
  • Quality improvement