BoneBert: A BERT-based Automated Information Extraction System of Radiology Reports for Bone Fracture Detection and Diagnosis

Dai, Zhihao; Li, Zhong; Han, Lianghao

doi:10.1007/978-3-030-74251-5_21

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12695)

Included in the following conference series:

International Symposium on Intelligent Data Analysis

Abstract

Radiologists diagnose bone fractures by examining X-ray radiographs and document their findings in radiology reports. Applying information extraction techniques to such reports to retrieve bone fracture diagnoses could yield a source of structured data for medical cohort studies, image labelling and decision support concerning bone fractures. In this study, we propose an information extraction system for bone X-ray radiology reports that retrieves the details of bone fracture detection and diagnosis. It is based on a biomedically pre-trained Bidirectional Encoder Representations from Transformers (BERT) natural language processing (NLP) model by Google. The model, named BoneBert, was first trained on annotations automatically generated by a handcrafted rule-based labelling system from a dataset of 6,048 X-ray radiology reports and then fine-tuned on a small set of 4,890 expert annotations. The model was thus trained in a “semi-supervised” fashion. We evaluated the proposed model and compared it with the conventional rule-based labelling system on two typical tasks: Assertion Classification (AC) for bone fracture status detection (positive, negative or uncertain) and Named Entity Recognition (NER) for the fracture type, the bone type and the location where a fracture occurs. BoneBert outperformed the rule-based system in both tasks, showing great potential for automated extraction of bone fracture detection and diagnosis information from radiology reports, such as the clinical status, type and location of a bone fracture, and other related observations.
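
The two-stage training described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' labelling system: the cue lists, the function names `weak_label` and `build_training_stages`, and the example sentences are all hypothetical, chosen only to show how a handcrafted rule-based labeller might produce automatic assertion labels (positive, negative or uncertain) that precede a smaller set of expert annotations in the training order.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical cue patterns; the abstract does not list the actual rules.
FRACTURE = re.compile(r"\bfractures?\b", re.IGNORECASE)
UNCERTAIN_CUES = re.compile(
    r"\b(?:possible|probable|suspected|cannot exclude|may represent)\b",
    re.IGNORECASE,
)
NEGATION_CUES = re.compile(r"\b(?:no|without|negative for)\b", re.IGNORECASE)


def weak_label(sentence: str) -> Optional[str]:
    """Assign a fracture status to one sentence, or None if no fracture is mentioned."""
    if not FRACTURE.search(sentence):
        return None
    # Uncertainty cues are checked first so that "cannot exclude fracture"
    # is not read as a plain negation.
    if UNCERTAIN_CUES.search(sentence):
        return "uncertain"
    if NEGATION_CUES.search(sentence):
        return "negative"
    return "positive"


def build_training_stages(
    sentences: List[str],
    expert_annotations: List[Tuple[str, str]],
) -> List[Tuple[str, List[Tuple[str, str]]]]:
    """Order the data as in the abstract: automatic labels first, expert labels second."""
    weak = []
    for sentence in sentences:
        label = weak_label(sentence)
        if label is not None:  # skip sentences with no fracture mention
            weak.append((sentence, label))
    return [
        ("rule_based_pretraining", weak),
        ("expert_fine_tuning", list(expert_annotations)),
    ]


reports = [
    "No fracture is seen.",
    "There is a transverse fracture of the distal radius.",
    "Possible fracture of the scaphoid.",
    "Normal alignment of the wrist.",
]
expert = [("Healing fracture of the left clavicle.", "positive")]
stages = build_training_stages(reports, expert)
```

A keyword labeller of this kind is brittle (for example, any sentence containing "without" would be marked negative), which is one reason a learned model fine-tuned on expert annotations may generalise better than the rules that produced its initial labels.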

Acknowledgements

This work was supported in part by the SBRI competition: AI supporting early detection and diagnosis in heart failure management.

Author information

Authors and Affiliations

Department of Computer Science, University of Warwick, Coventry, CV4 7AL, UK
Zhihao Dai
InterSystem, Eton, SL4 6BB, UK
Zhong Li
Department of Computer Science, Brunel University, Uxbridge, UB8 3PH, UK
Lianghao Han

Corresponding author

Correspondence to Lianghao Han.

Editor information

Editors and Affiliations

University of Coimbra, Coimbra, Portugal
Pedro Henriques Abreu
University of Porto, Porto, Portugal
Pedro Pereira Rodrigues
University of Granada, Granada, Spain
Alberto Fernández
University of Porto, Porto, Portugal
João Gama

About this paper

Cite this paper

Dai, Z., Li, Z., Han, L. (2021). BoneBert: A BERT-based Automated Information Extraction System of Radiology Reports for Bone Fracture Detection and Diagnosis. In: Abreu, P.H., Rodrigues, P.P., Fernández, A., Gama, J. (eds) Advances in Intelligent Data Analysis XIX. IDA 2021. Lecture Notes in Computer Science, vol 12695. Springer, Cham. https://doi.org/10.1007/978-3-030-74251-5_21

DOI: https://doi.org/10.1007/978-3-030-74251-5_21
Published: 13 April 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-74250-8
Online ISBN: 978-3-030-74251-5
eBook Packages: Computer Science, Computer Science (R0)
