Lessons Learned from the Usability Evaluation of a Simulated Patient Dialogue System

Campillos-Llanos, Leonardo; Thomas, Catherine; Bilinski, Éric; Neuraz, Antoine; Rosset, Sophie; Zweigenbaum, Pierre

doi:10.1007/s10916-021-01737-4

Education & Training
Published: 17 May 2021

Volume 45, article number 69, (2021)

Journal of Medical Systems

Leonardo Campillos-Llanos (ORCID: orcid.org/0000-0003-3040-1756)¹,²,
Catherine Thomas³,
Éric Bilinski¹,
Antoine Neuraz⁴,
Sophie Rosset¹ &
Pierre Zweigenbaum¹

Abstract

Simulated consultations through virtual patients allow medical students to practice history-taking skills. Ideally, applications should provide interactions in natural language and be multi-case, multi-specialty. Nevertheless, few systems handle or are tested on a large variety of cases. We present a virtual patient dialogue system in which a medical trainer types new cases and these are processed without human intervention. To develop it, we designed a patient record model, a knowledge model for the history-taking task, and a termino-ontological model for term variation and out-of-vocabulary words. We evaluated whether this system provided quality dialogue across medical specialities (n = 18), and with unseen cases (n = 29) compared to the cases used for development (n = 6). Medical evaluators (students, residents, practitioners, and researchers) conducted simulated history-taking with the system and assessed its performance through Likert-scale questionnaires. We analysed interaction logs and evaluated system correctness. The mean user evaluation score for the 29 unseen cases was 4.06 out of 5 (very good). The evaluation of correctness determined that, on average, 74.3% (sd = 9.5) of replies were correct, 14.9% (sd = 6.3) incorrect, and in 10.7% the system behaved cautiously by deferring a reply. In the user evaluation, all aspects scored higher in the 29 unseen cases than in the 6 seen cases. Although such a multi-case system has its limits, the evaluation showed that creating it is feasible; that it performs adequately; and that it is judged usable. We discuss some lessons learned and pivotal design choices affecting its performance and the end-users, who are primarily medical students.
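The correctness figures above are averages of per-dialogue proportions. As a minimal sketch of that kind of aggregation, assuming each system reply carries one of three labels and that the averaging is done per dialogue with a sample standard deviation (the article preview does not state either detail), the computation might look like the following. The function names and the annotated dialogues are hypothetical and are not taken from the study data.

```python
from collections import Counter
from statistics import mean, stdev

# Reply labels used in the correctness evaluation described in the abstract.
LABELS = ("correct", "incorrect", "deferred")


def reply_percentages(labels):
    """Return the percentage of each label among one dialogue's replies."""
    if not labels:
        raise ValueError("a dialogue must contain at least one reply")
    counts = Counter(labels)
    total = len(labels)
    return {name: 100.0 * counts[name] / total for name in LABELS}


def summarise(dialogues):
    """Average the per-dialogue percentages; return (mean, sample sd) per label.

    Assumption: averaging per dialogue with a sample standard deviation.
    """
    per_dialogue = [reply_percentages(d) for d in dialogues]
    summary = {}
    for name in LABELS:
        values = [p[name] for p in per_dialogue]
        spread = stdev(values) if len(values) > 1 else 0.0
        summary[name] = (mean(values), spread)
    return summary


# Hypothetical annotations for three dialogues (not the study data).
dialogues = [
    ["correct"] * 8 + ["incorrect"] * 1 + ["deferred"] * 1,
    ["correct"] * 7 + ["incorrect"] * 2 + ["deferred"] * 1,
    ["correct"] * 6 + ["incorrect"] * 2 + ["deferred"] * 2,
]
summary = summarise(dialogues)
for name, (avg, sd) in summary.items():
    print(f"{name}: mean = {avg:.1f}%, sd = {sd:.1f}")
```

Because the three labels partition every dialogue, the mean percentages sum to 100 before rounding, which is consistent with the reported 74.3%, 14.9% and 10.7% summing to 99.9%.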

Data availability

The dialogue data collected during development and evaluation are available at: https://pvdial.limsi.fr/data/PG-logs-eval.zip
A demonstration of the dialogue system can be tested at: http://vps-9069f76a.vps.ovh.net

Code availability

Not applicable.

Notes

1. We refer with this term to virtual standardised patients.
2. http://vps-9069f76a.vps.ovh.net
3. http://umvf.cerimes.fr/portail/ecn.php
4. https://pvdial.limsi.fr

Acknowledgements

We greatly thank all doctors who evaluated the system and gave valuable remarks, and also Dr. Aurélie Névéol for her helpful comments on the manuscript. We also thank the anonymous reviewers for their constructive suggestions. We developed the dialogue system in a collaborative project led by Interaction Healthcare, with VIDAL, Angers University Hospital, Voxygen and LIMSI as partners (footnote 4).

Funding

This work was funded by BPI (FUI Project PatientGenesys, F1310002-P) and by the Société d’Accélération de Transfert Technologique (SATT) Paris Saclay (PVDial project). The funding bodies did not take part in the design of the study, the analysis and interpretation of data, or the writing of the manuscript.

Author information

Leonardo Campillos-Llanos
Present address: ILLA - Consejo Superior de Investigaciones Científicas (CSIC), Madrid, Spain

Authors and Affiliations

Université Paris-Saclay, CNRS, LISN, Orsay, France
Leonardo Campillos-Llanos, Éric Bilinski, Sophie Rosset & Pierre Zweigenbaum
SATT Paris-Saclay, Orsay, France
Catherine Thomas
Assistance Publique-Hôpitaux de Paris, Paris, France
Antoine Neuraz

Contributions

Sophie Rosset (SR), Leonardo Campillos-Llanos (LC) and Catherine Thomas (CT) developed the VP dialogue system, and Pierre Zweigenbaum (PZ) contributed to the medical terminology components and patient record model. Éric Bilinski (EB) implemented the web evaluation tool and the online demonstration of the dialogue system. Antoine Neuraz (AN) helped to engage the evaluation participants and made valuable remarks about the system and article. SR and PZ designed the evaluation protocol, and LC collected and analysed the evaluation data. LC and SR double-checked a subset of the data. LC, SR and PZ wrote the manuscript, and all authors read and approved the final article.

Corresponding author

Correspondence to Leonardo Campillos-Llanos.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the Topical Collection on Education & Training

Appendix

Table 6 Examples of correct, incorrect and deferred replies (I: ‘input’; R: ‘system reply’); we show the English translation of dialogue interactions using the French system

Table 7 Results of prediction methods of part-of-speech (PoS) category and morphology data for out-of-vocabulary (OOV) words (in percentage); the number of instances per class is shown in brackets; results of morphology data were only computed on OOVs for which the PoS category was predicted correctly

Table 8 Analysis of incorrect replies with examples (I: ‘user input’; R: ‘system reply’); we show the English translation of dialogue interactions using the French system

Table 9 Sample clinical record (top) and sample of the output for OOV words in a new VP record (bottom); adj stands for ‘adjective’; fp, for ‘feminine plural’; the format is YAML

Table 10 Description of the seen cases used in the usability study

Table 11 Description of the unseen cases used in the usability study

Table 12 Summary of lessons learned from the development and usability evaluation and implications on design and development

About this article

Cite this article

Campillos-Llanos, L., Thomas, C., Bilinski, É. et al. Lessons Learned from the Usability Evaluation of a Simulated Patient Dialogue System. J Med Syst 45, 69 (2021). https://doi.org/10.1007/s10916-021-01737-4

Received: 23 December 2020
Accepted: 05 April 2021
Published: 17 May 2021
DOI: https://doi.org/10.1007/s10916-021-01737-4
