Semantic Tree Driven Thyroid Ultrasound Report Generation by Voice Input

Liu, Lihao; Wang, Mei; Dong, Yijie; Zhao, Weiliang; Yang, Jian; Su, Jianwen

doi:10.1007/978-3-030-71051-4_32


Part of the book series: Transactions on Computational Science and Computational Intelligence (TRACOSCI)

Abstract

Automatic speech recognition has achieved good performance in the medical domain in recent years. However, practical solutions that account for the characteristics of real applications are still lacking. In this work, we develop an approach that automatically generates semantically coherent ultrasound reports from voice input. The solution includes key algorithms based on a proposed semantic tree structure. Radiologists do not need to follow fixed templates; they only need to speak their specific observations for individual patients. We carried out a set of experiments on a real-world thyroid ultrasound dataset of more than 40,000 reports from a reputable hospital in Shanghai, China. The experimental results show that the proposed solution can generate concise and accurate reports.
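The idea of filling a tree-shaped report structure from spoken observations can be illustrated with a small sketch. This is not the chapter's algorithm, which the abstract does not detail. The tree layout, the site and attribute names, the default values, and the functions `apply_observations` and `render_report` are all hypothetical, and the observations are assumed to have already been extracted from the speech recognizer's transcript.

```python
import copy

# Hypothetical semantic tree: anatomical site -> attribute -> default finding.
# The names and defaults are illustrative assumptions, not taken from the chapter.
DEFAULT_TREE = {
    "left lobe": {"size": "normal", "echo": "homogeneous", "nodule": "none"},
    "right lobe": {"size": "normal", "echo": "homogeneous", "nodule": "none"},
    "isthmus": {"thickness": "normal"},
}


def apply_observations(tree, observations):
    """Return a copy of the tree with spoken (site, attribute, value) overrides applied."""
    updated = copy.deepcopy(tree)
    for site, attribute, value in observations:
        if site not in updated or attribute not in updated[site]:
            raise KeyError(f"unknown tree path: {site} / {attribute}")
        updated[site][attribute] = value
    return updated


def render_report(tree):
    """Walk the tree in order and emit one sentence per anatomical site."""
    lines = []
    for site, attributes in tree.items():
        findings = ", ".join(f"{name} {value}" for name, value in attributes.items())
        lines.append(f"{site.capitalize()}: {findings}.")
    return "\n".join(lines)


if __name__ == "__main__":
    # The radiologist only speaks what differs from the defaults.
    spoken = [("left lobe", "nodule", "hypoechoic, 5 mm")]
    print(render_report(apply_observations(DEFAULT_TREE, spoken)))
```

In this sketch the unspoken attributes keep their defaults, so the radiologist states only patient-specific findings while the rendered report still covers every site in a fixed order.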

Acknowledgement

This work was supported by the National Key R&D Program of China under Grant 2019YFE0190500.

Author information

Authors and Affiliations

School of Computer Science and Technology, Donghua University, Shanghai, China
Lihao Liu, Mei Wang & Weiliang Zhao
Department of Ultrasound, Ruijin Hospital, School of Medicine Shanghai Jiao Tong University, Shanghai, China
Yijie Dong
School of Computer Science and Technology, Donghua University, Shanghai, China
Jian Yang
Computing Department, Macquarie University, Sydney, NSW, Australia
Jian Yang
Department of Computer Science, University of California, Santa Barbara, CA, USA
Jianwen Su

Editor information

Editors and Affiliations

Department of Computer Science, University of Georgia, Athens, GA, USA
Hamid R. Arabnia
School of Computing and Data Sciences, Wentworth Institute of Technology, Boston, MA, USA
Leonidas Deligiannidis
Graduate School of Information Science & Engineering, University of Electro-Communications, Chofu, Japan
Hayaru Shouno
Facultad de Informática - CIC PBA, Universidad Nacional de La Plata, La Plata, Argentina
Fernando G. Tinetti
Department of Computer Science, Southeastern Louisiana University, Hammond, LA, USA
Quoc-Nam Tran

About this paper

Cite this paper

Liu, L., Wang, M., Dong, Y., Zhao, W., Yang, J., Su, J. (2021). Semantic Tree Driven Thyroid Ultrasound Report Generation by Voice Input. In: Arabnia, H.R., Deligiannidis, L., Shouno, H., Tinetti, F.G., Tran, QN. (eds) Advances in Computer Vision and Computational Biology. Transactions on Computational Science and Computational Intelligence. Springer, Cham. https://doi.org/10.1007/978-3-030-71051-4_32

DOI: https://doi.org/10.1007/978-3-030-71051-4_32
Published: 01 March 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-71050-7
Online ISBN: 978-3-030-71051-4
eBook Packages: Computer Science, Computer Science (R0)
