Abstract
Despite concerted efforts to mine various modalities of medical data, the conversations between physicians and patients at the time of care remain an untapped source of insights. In this exploratory study, we investigate whether these conversations can be used to extract structured information that might assist physicians with post-visit documentation in electronic health records, potentially lightening the clerical burden. We describe a new dataset consisting of conversation transcripts, post-visit summaries, corresponding supporting evidence (in the transcript), and structured labels. We focus on the tasks of recognizing relevant diagnoses and abnormalities in the review of organ systems (RoS). One methodological challenge is that the conversations are long (around 1500 words), making it difficult for modern deep-learning models to use them as input. To address this challenge, we extract noteworthy utterances: parts of the conversation likely to be cited as evidence supporting some summary sentence. We find that by first filtering for (predicted) noteworthy utterances, we can significantly boost predictive performance for recognizing both diagnoses and RoS abnormalities.
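The filter-then-classify idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword-based `keyword_scorer`, the `toy_classifier`, the cue words, and the 0.5 threshold are all hypothetical stand-ins for learned models, chosen only to show how a long transcript might be reduced to its predicted noteworthy utterances before a downstream classifier sees it.

```python
from typing import Callable, List

def filter_noteworthy(
    utterances: List[str],
    scorer: Callable[[str], float],
    threshold: float = 0.5,
) -> List[str]:
    """Keep utterances whose noteworthiness score meets the threshold,
    preserving their original order in the conversation."""
    return [u for u in utterances if scorer(u) >= threshold]

def keyword_scorer(utterance: str) -> float:
    """Hypothetical stand-in for a learned noteworthiness predictor:
    returns 1.0 if the utterance contains a clinical cue word, else 0.0."""
    cues = {"pain", "cough", "fever", "diabetes", "medication"}
    words = {w.strip(".,?!").lower() for w in utterance.split()}
    return 1.0 if words & cues else 0.0

def toy_classifier(text: str) -> List[str]:
    """Hypothetical stand-in for a diagnosis / RoS classifier."""
    return ["respiratory"] if "cough" in text.lower() else []

def predict_labels(
    utterances: List[str],
    classifier: Callable[[str], List[str]],
    scorer: Callable[[str], float],
    threshold: float = 0.5,
) -> List[str]:
    """Stage 1: filter for noteworthy utterances.
    Stage 2: classify the shortened text."""
    kept = filter_noteworthy(utterances, scorer, threshold)
    return classifier(" ".join(kept))

# Small illustrative conversation (invented for this sketch).
conversation = [
    "Hi, how are you today?",
    "I have had a dry cough for two weeks.",
    "Did you watch the game?",
    "Any fever with that?",
]
kept = filter_noteworthy(conversation, keyword_scorer)
labels = predict_labels(conversation, toy_classifier, keyword_scorer)
```

In this toy run, only the two clinically relevant utterances reach the classifier, so the input is much shorter than the full transcript. In practice both stages would be trained models rather than keyword rules.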
Acknowledgements
We gratefully acknowledge support from the Center for Machine Learning and Health, in a joint venture between UPMC and Carnegie Mellon University, and from Abridge AI, which created the dataset that we used for this research.
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Krishna, K., Pavel, A., Schloss, B., Bigham, J.P., Lipton, Z.C. (2021). Extracting Structured Data from Physician-Patient Conversations by Predicting Noteworthy Utterances. In: Shaban-Nejad, A., Michalowski, M., Buckeridge, D.L. (eds) Explainable AI in Healthcare and Medicine. Studies in Computational Intelligence, vol 914. Springer, Cham. https://doi.org/10.1007/978-3-030-53352-6_14
DOI: https://doi.org/10.1007/978-3-030-53352-6_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-53351-9
Online ISBN: 978-3-030-53352-6
eBook Packages: Intelligent Technologies and Robotics (R0)