
BoneBert: A BERT-based Automated Information Extraction System of Radiology Reports for Bone Fracture Detection and Diagnosis

Part of the Lecture Notes in Computer Science book series (LNISA, volume 12695)


Radiologists diagnose bone fractures by examining X-ray radiographs and document their findings in radiology reports. Applying information extraction techniques to such reports to retrieve bone fracture diagnoses could yield a source of structured data for medical cohort studies, image labelling, and decision support concerning bone fractures. In this study, we propose an information extraction system for bone X-ray radiology reports that retrieves the details of bone fracture detection and diagnosis, based on a biomedically pre-trained Bidirectional Encoder Representations from Transformers (BERT) natural language processing (NLP) model by Google. The model, named BoneBert, was first trained on annotations automatically generated by a handcrafted rule-based labelling system over a dataset of 6,048 X-ray radiology reports and then fine-tuned on a small set of 4,890 expert annotations; the model was thus trained in a “semi-supervised” fashion. We evaluated the proposed model and compared it with the conventional rule-based labelling system on two typical tasks: Assertion Classification (AC) for bone fracture status detection (positive, negative, or uncertain) and Named Entity Recognition (NER) for the fracture type, the bone type, and the location where a fracture occurs. BoneBert outperformed the rule-based system on both tasks, showing great potential for automated extraction of bone fracture detection and diagnosis information from radiology reports, such as the clinical status, type, and location of a fracture, and related observations.
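The bootstrap stage described above relies on a handcrafted rule-based labeller whose output seeds the BERT model's first training pass. As a toy illustration of that idea (the cue lists below are our own assumptions for the sketch, not the rules used in the paper), a minimal assertion labeller for fracture status might look like:

```python
import re
from typing import Optional

# Illustrative cue lists for a rule-based assertion labeller, in the
# spirit of the bootstrap system described in the abstract. These
# patterns are assumptions for demonstration only.
FRACTURE = re.compile(r"\bfractures?\b", re.IGNORECASE)
NEGATION_CUES = re.compile(r"\b(no|without|negative for|absence of)\b", re.IGNORECASE)
UNCERTAIN_CUES = re.compile(
    r"\b(possible|probable|suspected|cannot exclude|may represent)\b", re.IGNORECASE
)

def label_sentence(sentence: str) -> Optional[str]:
    """Assign a fracture-status label to one report sentence.

    Returns 'positive', 'negative', or 'uncertain' when a fracture is
    mentioned, and None when the sentence does not mention a fracture.
    """
    if not FRACTURE.search(sentence):
        return None  # no fracture mention: nothing to label
    if NEGATION_CUES.search(sentence):
        return "negative"
    if UNCERTAIN_CUES.search(sentence):
        return "uncertain"
    return "positive"
```

Weak labels of this kind can be produced at scale over thousands of reports; the BERT model then learns from them before being corrected by the much smaller set of expert annotations.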


  • Electronic medical records
  • Machine learning
  • Natural language processing
  • Semi-supervised learning

  • DOI: 10.1007/978-3-030-74251-5_21
  • Chapter length: 12 pages






This work was supported in part by the SBRI competition: AI supporting early detection and diagnosis in heart failure management.

Author information



Corresponding author

Correspondence to Lianghao Han.



Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Dai, Z., Li, Z., Han, L. (2021). BoneBert: A BERT-based Automated Information Extraction System of Radiology Reports for Bone Fracture Detection and Diagnosis. In: Abreu, P.H., Rodrigues, P.P., Fernández, A., Gama, J. (eds) Advances in Intelligent Data Analysis XIX. IDA 2021. Lecture Notes in Computer Science, vol. 12695. Springer, Cham.


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-74250-8

  • Online ISBN: 978-3-030-74251-5

  • eBook Packages: Computer Science (R0)