Extracting Structured Data from Physician-Patient Conversations by Predicting Noteworthy Utterances

Krishna, Kundan; Pavel, Amy; Schloss, Benjamin; Bigham, Jeffrey P.; Lipton, Zachary C.

doi:10.1007/978-3-030-53352-6_14

Part of the book series: Studies in Computational Intelligence (SCI, volume 914)

Abstract

Despite concerted efforts to mine various modalities of medical data, the conversations between physicians and patients at the time of care remain an untapped source of insights. In this exploratory study, we investigate whether these conversations can be used to extract structured information that might assist physicians with post-visit documentation in electronic health records, potentially lightening the clerical burden. We describe a new dataset consisting of conversation transcripts, post-visit summaries, corresponding supporting evidence (in the transcript), and structured labels. We focus on the tasks of recognizing relevant diagnoses and abnormalities in the review of organ systems (RoS). One methodological challenge is that the conversations are long (around 1500 words), making it difficult for modern deep-learning models to use them as input. To address this challenge, we extract noteworthy utterances: parts of the conversation likely to be cited as evidence supporting some summary sentence. We find that by first filtering for (predicted) noteworthy utterances, we can significantly boost predictive performance for recognizing both diagnoses and RoS abnormalities.
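
The filter-then-classify idea in the abstract can be illustrated with a minimal sketch. This is not the chapter's implementation: the cue-word scorer and the term-matching classifier are toy stand-ins for the learned models, and every name here (`noteworthy_score`, `filter_noteworthy`, `predict_diagnoses`, `NOTEWORTHY_CUES`, the sample conversation) is hypothetical.

```python
from typing import Callable, Dict, List, Set


def tokenize(utterance: str) -> Set[str]:
    """Lowercase an utterance and strip basic punctuation from each token."""
    return {tok.strip(".,?!;:").lower() for tok in utterance.split()}


# Hypothetical cue words; a real system would use a trained classifier instead.
NOTEWORTHY_CUES = {"pain", "diagnosed", "medication", "cough"}


def noteworthy_score(utterance: str) -> float:
    """Toy stand-in for a learned noteworthiness model: 1.0 if any cue appears."""
    return 1.0 if tokenize(utterance) & NOTEWORTHY_CUES else 0.0


def filter_noteworthy(
    utterances: List[str],
    score_fn: Callable[[str], float],
    threshold: float = 0.5,
) -> List[str]:
    """Stage 1: keep only utterances scored at or above the threshold."""
    return [u for u in utterances if score_fn(u) >= threshold]


def predict_diagnoses(
    utterances: List[str], label_terms: Dict[str, Set[str]]
) -> List[str]:
    """Stage 2: toy classifier that matches label terms in the filtered text."""
    tokens: Set[str] = set()
    for u in utterances:
        tokens |= tokenize(u)
    return sorted(label for label, terms in label_terms.items() if tokens & terms)


# Made-up four-utterance conversation, for illustration only.
conversation = [
    "Good morning, how was the drive over?",
    "I have had chest pain since Tuesday.",
    "The weather has been nice lately.",
    "You were diagnosed with hypertension last year.",
]
label_terms = {
    "hypertension": {"hypertension"},
    "diabetes": {"diabetes", "insulin"},
}

filtered = filter_noteworthy(conversation, noteworthy_score)
diagnoses = predict_diagnoses(filtered, label_terms)
```

In this sketch the second stage only ever sees the utterances that survive the first, which is the property the abstract relies on: the downstream model receives a much shorter input than the full transcript.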

Acknowledgements

We gratefully acknowledge support from the Center for Machine Learning and Health, a joint venture between UPMC and Carnegie Mellon University, and from Abridge AI, who created the dataset that we used for this research.

Author information

Authors and Affiliations

Carnegie Mellon University, Pittsburgh, USA
Kundan Krishna, Amy Pavel, Jeffrey P. Bigham & Zachary C. Lipton
Abridge AI Inc., Pittsburgh, USA
Benjamin Schloss

Corresponding author

Correspondence to Kundan Krishna.

Editor information

Editors and Affiliations

Department of Pediatrics, College of Medicine, The University of Tennessee Health Science Center (UTHSC), Oak-Ridge National Lab (ORNL), Memphis, TN, USA
Arash Shaban-Nejad
School of Nursing, University of Minnesota, Minneapolis, MN, USA
Martin Michalowski
McGill Clinical & Health Informatics, Montreal, QC, Canada
David L. Buckeridge

About this chapter

Cite this chapter

Krishna, K., Pavel, A., Schloss, B., Bigham, J.P., Lipton, Z.C. (2021). Extracting Structured Data from Physician-Patient Conversations by Predicting Noteworthy Utterances. In: Shaban-Nejad, A., Michalowski, M., Buckeridge, D.L. (eds) Explainable AI in Healthcare and Medicine. Studies in Computational Intelligence, vol 914. Springer, Cham. https://doi.org/10.1007/978-3-030-53352-6_14

DOI: https://doi.org/10.1007/978-3-030-53352-6_14
Published: 03 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-53351-9
Online ISBN: 978-3-030-53352-6
eBook Packages: Intelligent Technologies and Robotics (R0)
