Lessons Learned from the Usability Evaluation of a Simulated Patient Dialogue System

Journal of Medical Systems

Abstract

Simulated consultations through virtual patients allow medical students to practice history-taking skills. Ideally, applications should provide interactions in natural language and be multi-case, multi-specialty. Nevertheless, few systems handle or are tested on a large variety of cases. We present a virtual patient dialogue system in which a medical trainer types new cases and these are processed without human intervention. To develop it, we designed a patient record model, a knowledge model for the history-taking task, and a termino-ontological model for term variation and out-of-vocabulary words. We evaluated whether this system provided quality dialogue across medical specialities (n = 18), and with unseen cases (n = 29) compared to the cases used for development (n = 6). Medical evaluators (students, residents, practitioners, and researchers) conducted simulated history-taking with the system and assessed its performance through Likert-scale questionnaires. We analysed interaction logs and evaluated system correctness. The mean user evaluation score for the 29 unseen cases was 4.06 out of 5 (very good). The evaluation of correctness determined that, on average, 74.3% (sd = 9.5) of replies were correct, 14.9% (sd = 6.3) incorrect, and in 10.7% the system behaved cautiously by deferring a reply. In the user evaluation, all aspects scored higher in the 29 unseen cases than in the 6 seen cases. Although such a multi-case system has its limits, the evaluation showed that creating it is feasible; that it performs adequately; and that it is judged usable. We discuss some lessons learned and pivotal design choices affecting its performance and the end-users, who are primarily medical students.
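The correctness figures above (mean percentage and standard deviation of correct, incorrect and deferred replies) can be illustrated with a minimal sketch. It assumes each system reply has been annotated with one of three labels and that percentages are computed per case and then averaged across cases; the label names, function names and toy data below are hypothetical, and the use of the sample standard deviation is an assumption, since the abstract does not state which estimator was used.

```python
from statistics import mean, stdev

# Hypothetical label set mirroring the three outcomes named in the abstract.
LABELS = ("correct", "incorrect", "deferred")


def reply_rates(case_labels):
    """Return the percentage of each label among one case's system replies."""
    total = len(case_labels)
    if total == 0:
        raise ValueError("a case needs at least one annotated reply")
    return {label: 100.0 * case_labels.count(label) / total for label in LABELS}


def summarise(cases):
    """Average the per-case percentages and report their spread across cases."""
    per_case = [reply_rates(labels) for labels in cases.values()]
    summary = {}
    for label in LABELS:
        values = [rates[label] for rates in per_case]
        # Sample standard deviation (an assumption); it needs at least two cases.
        sd = stdev(values) if len(values) > 1 else 0.0
        summary[label] = (mean(values), sd)
    return summary


# Toy annotations, invented for illustration only (not the study's data).
toy_cases = {
    "case_a": ["correct"] * 8 + ["incorrect"] * 1 + ["deferred"] * 1,
    "case_b": ["correct"] * 6 + ["incorrect"] * 2 + ["deferred"] * 2,
}

summary = summarise(toy_cases)
for label, (avg, sd) in summary.items():
    print(f"{label}: mean {avg:.1f}% (sd {sd:.1f})")
```

Averaging per-case percentages, rather than pooling all replies, gives each case equal weight regardless of dialogue length; pooling would be an equally plausible reading of the abstract.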

Data availability

The dialogue data collected during development and evaluation are available at: https://pvdial.limsi.fr/data/PG-logs-eval.zip

A demonstration of the dialogue system can be tested at: http://vps-9069f76a.vps.ovh.net

Code availability

Not applicable.

Notes

  1. We use this term to refer to virtual standardised patients.

  2. http://vps-9069f76a.vps.ovh.net

  3. http://umvf.cerimes.fr/portail/ecn.php

  4. https://pvdial.limsi.fr

References

  1. Washburn M., Bordnick P., Rizzo A. S.: A pilot feasibility study of virtual patient simulation to enhance social work students’ brief mental health assessment skills. Soc. Work Health Care 55 (9): 675–693, 2016

  2. Barnett S. G., Gallimore C. E., Pitterle M., Morrill J.: Impact of a paper vs virtual simulated patient case on student-perceived confidence and engagement. Am. J. Pharm. Educ. 80 (1): 16, 2016

  3. McCoy L., Pettit R. K., Lewis J. H., Allgood J. A., Bay C., Schwartz F. N.: Evaluating medical student engagement during virtual patient simulations: A sequential, mixed methods study. BMC Med. Educ. 16: 20, 2016

  4. Tait L., Lee K., Rasiah R., Cooper J. M., Ling T., Geelan B., Bindoff I.: Simulation and feedback in health education: A mixed methods study comparing three simulation modalities. Pharmacy (Basel) 6 (2): 41–57, 2018

  5. Courteille O., Fahlstedt M., Ho J., Hedman L., Fors U., von Holst H., Fellander-Tsai L., Moller H.: Learning through a virtual patient vs. recorded lecture: A comparison of knowledge retention in a trauma case. Int. J. Med. Educ. 9: 86–92, 2018

  6. Gupta A., Singh S., Khaliq F., Dhaliwal U., Madhu S. V.: Development and validation of simulated virtual patients to impart early clinical exposure in endocrine physiology. Adv. Physiol. Educ. 42 (1): 15–20, 2018

  7. de Cock C., Milne-Ives M., van Velthoven M. H., Alturkistani A., Lam C., Meinert E.: Effectiveness of conversational agents (virtual assistants) in health care: Protocol for a systematic review. JMIR Res. Protoc. 9 (3): e16934, 2020

  8. Ellaway R., Candler C., Greene P., Smothers V.: An architectural model for MedBiquitous virtual patients. 2006. http://groups.medbiq.org/medbiq/display/VPWG/MedBiquitous+Virtual+Patient+Architecture (accessed 1 Apr 2021)

  9. Sijstermans R., Jaspers M. W., Bloemendaal P., Schoonderwaldt E.: Training inter-physician communication using the dynamic patient simulator®. Int. J. Med. Inf. 76 (5–6): 336–343, 2007

  10. Danforth D. R., Procter M., Chen R., Johnson M., Heller R.: Development of virtual patient simulations for medical education. J. Virtual Worlds Res. 2 (2): 4–11, 2009

  11. Rombauts N.: Patients virtuels: pédagogie, état de l’art et développement du simulateur Alphadiag. PhD dissertation, Faculty of Medicine, Claude Bernard University, Lyon, France, 2014

  12. Menendez E., Balisa-Rocha B., Jabbur-Lopes M., Costa W., Nascimento J. R., Dósea M., Silva L., Junior D. L.: Using a virtual patient system for the teaching of pharmaceutical care. Int. J. Med. Inf. 84 (9): 640–646, 2015

  13. Lin C. J., Pao C. W., Chen Y. H., Liu C. T., Hsu H. H.: Ellipsis and coreference resolution in a computerized virtual patient dialogue system. J. Med. Syst. 40 (9): 206–221, 2016

  14. Laleye F. A., Blanié A., Brouquet A., Behnamou D., de Chalendar G.: Semantic similarity to improve question understanding in a virtual patient. In: Proceedings of the 35th Annual ACM Symposium on Applied Computing, 2020, pp 859–866

  15. Chen F., Lee Y., Hubal R.: Work-in-progress—testing of a virtual patient: Linguistic and display engagement findings. In: 2020 6th International Conference of the Immersive Learning Research Network (iLRN). IEEE, 2020, pp 348–350

  16. Candler C.: Effective use of educational technology in medical education. In: Colloquium on Educational Technology: Recommendations and Guidelines for Medical Educators. AAMC Institute for Improving Medical Education, Washington, DC, 2007, pp 1–19

  17. Schmidlen T., Schwartz M., DiLoreto K., Kirchner H. L., Sturm A. C.: Patient assessment of chatbots for the scalable delivery of genetic counseling. J. Genet. Couns. 28 (6): 1166–1177, 2019

  18. Chetlen A., Artrip R., Drury B., Arbaiza A., Moore M.: Novel use of chatbot technology to educate patients before breast biopsy. J. Am. Coll. Radiol. 16 (9 Pt B): 1305–1308, 2019

  19. Kokciyan N., Chapman M., Balatsoukas P., Sassoon I., Essers K., Ashworth M., Curcin V., Modgil S., Parsons S., Sklar E. I.: A collaborative decision support tool for managing chronic conditions. Stud. Health Technol. Inform. 264: 644–648, 2019

  20. Cook D. A., Erwin P. J., Triola M. M.: Computerized virtual patients in health professions education: A systematic review and meta-analysis. Acad. Med. 85 (10): 1589–1602, 2010. https://doi.org/10.1097/ACM.0b013e3181edfe13

  21. Wattanasoontorn V., Hernández R. J. G., Sbert M.: Embodied conversational virtual patients. In: (Diana P. M., Nieto I. P., Eds.) Conversational Agents and Natural Language Interaction: Techniques and Effective Practices. Information Science Reference, IGI Global, Hershey, 2011, pp 254–281. https://doi.org/10.4018/978-1-60960-617-6.ch011

  22. Rossen B., Lok B.: A crowdsourcing method to develop virtual human conversational agents. Int. J. Hum. Comput. Stud. 70 (4): 301–319, 2012

  23. Lelardeux C., Panzoli D., Alvarez J., Galaup M., Lagarrigue P.: Serious game, simulateur, serious play: État de l’art pour la formation en santé. In: Actes du colloque Serious Games en Médecine et Santé (SeGaMED) 2013. e-virtuoses, Nice, 2013, pp 27–38

  24. Wattanasoontorn V., Hernández R. J. G., Sbert M.: Serious games for e-health care. In: (Cai Y., Goei S., Eds.) Simulations, Serious Games and Their Applications. Springer, Singapore, 2014, pp 127–146. https://doi.org/10.1007/978-981-4560-32-0_9

  25. Reiswich A., Haag M.: Evaluation of chatbot prototypes for taking the virtual patient’s history. Stud. Health Technol. Inform. 260: 73–80, 2019

  26. Nirenburg S., Beale S., McShane M., Jarrell B., Fantry G.: Language understanding in Maryland virtual patient. In: Proceedings of the International Conference on Computational Linguistics. Citeseer, Manchester, 2008, pp 36–39

  27. Campillos-Llanos L., Bouamor D., Bilinski É., Ligozat A. L., Zweigenbaum P., Rosset S.: Description of the PatientGenesys dialogue system. In: Proceedings of SIGDIAL. Association for Computational Linguistics, Prague, 2015, pp 438–440

  28. Leuski A., Traum D.: Practical language processing for virtual humans. In: Proceedings on Innovative Applications of Artificial Intelligence Conference, Atlanta, 2010, pp 1740–1747

  29. Rizk Y., Kshoury K., Chehab M., Chidiac P., Awad M., Antoun J.: Virtual patient. In: Proceedings of WINLP, Vancouver, 2017, pp 1–3

  30. Datta D., Brashers V., Owen J., White C., Barnes L. E.: A deep learning methodology for semantic utterance classification in virtual human dialogue systems. In: (Traum D., Swartout W., Khooshabeh P., Kopp S., Scherer S., Leuski A., Eds.) Intelligent Virtual Agents, Los Angeles. Springer, Berlin, 2016, pp 451–455

  31. Jin L., White M., Jaffe E., Zimmerman L., Danforth D.: Combining CNNs and pattern matching for question interpretation in a virtual patient dialogue system. In: Proceedings on Workshop Innovative Use NLP Building Educational Applications. Copenhagen, 2017, pp 11–21

  32. Dickerson R., Johnsen K., Raij A., Lok B., Hernandez J., Stevens A., Lind D. S.: Evaluating a script-based approach for simulating patient-doctor interaction. In: Proceedings of the International Conference on Human-Computer Interface Advances Modeling and Simulation, New Orleans, 2005, pp 79–84

  33. Pence T. B., Dukes L. C., Hodges L. F., Meehan N. K., Johnson A.: The effects of interaction and visual fidelity on learning outcomes for a virtual pediatric patient system. In: IEEE International Conference on Healthcare Informatics (ICHI). IEEE, Philadelphia, 2013, pp 209–218. https://doi.org/10.1109/ICHI.2013.36

  34. Maicher K., Danforth D., Price A., Zimmerman L., Wilcox B., Liston B., Cronau H., Belknap L., Ledford C., Way D., et al.: Developing a conversational virtual standardized patient to enable students to practice history-taking skills. Simul. Healthc. 12 (2): 124–131, 2017. https://doi.org/10.1097/SIH.0000000000000195

  35. Talbot T. B., Sagae K., John B., Rizzo A. A.: Sorting out the virtual patient: How to exploit artificial intelligence, game technology and sound educational practices to create engaging role-playing simulations. Int. J. Gaming Comput. Mediat. Simul. 4 (3): 1–19, 2012. https://doi.org/10.4018/jgcms.2012070101

  36. Scherly D., Nendaz M.: Simulation du raisonnement clinique sur ordinateur: Le patient virtuel. In: (Boet S., Granry J., Savoldelli G., Eds.) La Simulation en Santé. De la Théorie à la Pratique. Springer, Paris, 2013, pp 43–50. https://doi.org/10.1007/978-2-8178-0469-9_5

  37. Hubal R. C., Kizakevich P. N., Guinn C. I., Merino K. D., West S. L.: The virtual standardized patient. Stud. Health Technol. Inform. 70: 133–138, 2000

  38. Stevens A., Hernandez J., Johnsen K., Dickerson R., Raij A., Harrison C., DiPietro M., Allen B., Ferdig R., Foti S., et al.: The use of virtual patients to teach medical students history taking and communication skills. Am. J. Surg. 191 (6): 806–811, 2006

  39. Kenny P., Rizzo A. A., Parsons T. D., Gratch J., Swartout W.: A virtual human agent for training novice therapists clinical interviewing skills. Annu. Rev. CyberTherapy Telemed. 5: 77–83, 2007. https://doi.org/10.1145/159544.159587

  40. Kenny P., Parsons T. D., Gratch J., Rizzo A. A.: Evaluation of Justina: A virtual patient with PTSD. In: (Prendinger H., Lester J., Ishizuka M., Eds.) Intelligent Virtual Agents. Springer, Berlin, 2008, pp 394–408

  41. Parsons T. D.: Virtual standardized patients for assessing the competencies of psychologists. In: Encyclopedia of Information Science and Technology, 3rd edn. IGI Global, 2015, pp 6484–6492. https://doi.org/10.4018/978-1-4666-5888-2.ch637

  42. Persad A., Stroulia E., Forgie S.: A novel approach to virtual patient simulation using natural language processing. Med. Educ. 50 (11): 1162–1163, 2016. https://doi.org/10.1111/medu.13197

  43. Gokcen A., Jaffe E., Erdmann J., White M., Danforth D.: A corpus of word-aligned asked and anticipated questions in a virtual patient dialogue system. In: LREC International Conference on Language Resources and Evaluation, Portorož, 2016, pp 3174–3179

  44. Talbot T. B., Kalisch N., Christoffersen K., Lucas G., Forbell E.: Natural language understanding performance and use considerations in virtual medical encounters. Stud. Health Technol. Inform. 220: 407–413, 2016

  45. Leleu J., Caillat-Grenier R., Pierard N., Rica P., Granry J. C., Lehousse T., Pereira S., Bretier P., Rosec O., Bilinski É., Bouamor D., Campillos-Llanos L., Grau B., Ligozat A. L., Zweigenbaum P., Rosset S.: Patient Genesys: Outil de création de cas cliniques de simulation médicale proposant des cas patients virtuels en 3D. In: Applications Pratiques de l’Intelligence Artificielle, Rennes, 2015, p 2

  46. Campillos-Llanos L., Bouamor D., Zweigenbaum P., Rosset S.: Managing linguistic and terminological variation in a medical dialogue system. In: LREC International Conference on Language Resources and Evaluation, Portorož, 2016, pp 3167–3173

  47. Campillos-Llanos L., Thomas C., Bilinski É., Zweigenbaum P., Rosset S.: Designing a virtual patient dialogue system based on terminology-rich resources: Challenges and evaluation. Nat. Lang. Eng. 26 (2): 183–220, 2020

  48. Bodenreider O.: The Unified Medical Language System (UMLS): Integrating biomedical terminology. Nucleic Acids Res. 32 (suppl 1): D267–D270, 2004

  49. Dybkjær L., Bernsen N. O.: Usability evaluation in spoken language dialogue systems. In: Proceedings of Workshop on Evaluation for Language and Dialogue Systems. Association for Computational Linguistics, 2001, pp 9–18

  50. Duplessis G. D., Letard V., Ligozat A. L., Rosset S.: Purely corpus-based automatic conversation authoring. In: LREC International Conference on Language Resources and Evaluation, Portorož, 2016, pp 2728–2735

  51. Campillos-Llanos L., Rosset S., Zweigenbaum P.: Automatic classification of doctor-patient questions for a virtual patient record query task. In: Proceedings of BioNLP. Association for Computational Linguistics, Vancouver, 2017, pp 333–341

Acknowledgements

We warmly thank all the doctors who evaluated the system and gave valuable feedback, and Dr. Aurélie Névéol for her helpful comments on the manuscript. We also thank the anonymous reviewers for their constructive suggestions. We developed the dialogue system in a collaborative project led by Interaction Healthcare, with VIDAL, Angers University Hospital, Voxygen and LIMSI as partners (see Note 4).

Funding

This work was funded by BPI (FUI Project PatientGenesys, F1310002-P) and by the Société d’Accélération de Transfert Technologique (SATT) Paris Saclay (PVDial project). The funding bodies took no part in the design of the study, the analysis and interpretation of the data, or the writing of the manuscript.

Author information

Contributions

Sophie Rosset (SR), Leonardo Campillos-Llanos (LC) and Catherine Thomas (CT) developed the VP dialogue system, and Pierre Zweigenbaum (PZ) contributed to the medical terminology components and patient record model. Éric Bilinski (EB) implemented the web evaluation tool and the online demonstration of the dialogue system. Antoine Neuraz (AN) helped to engage the evaluation participants and made valuable remarks about the system and article. SR and PZ designed the evaluation protocol, and LC collected and analysed the evaluation data. LC and SR double-checked a subset of the data. LC, SR and PZ wrote the manuscript, and all authors read and approved the final article.

Corresponding author

Correspondence to Leonardo Campillos-Llanos.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the Topical Collection on Education & Training

Appendix

Fig. 6 Procedures and weighting scheme to predict linguistic information for OOV items

Table 6 Examples of correct, incorrect and deferred replies (I: ‘input’; R: ‘system reply’); we show the English translation of dialogue interactions using the French system
Table 7 Results of prediction methods of part-of-speech (PoS) category and morphology data for out-of-vocabulary (OOV) words (in percentage); the number of instances per class is shown in brackets; results of morphology data were only computed on OOVs for which the PoS category was predicted correctly
Table 8 Analysis of incorrect replies with examples (I: ‘user input’; R: ‘system reply’); we show the English translation of dialogue interactions using the French system
Table 9 Sample clinical record (top) and sample of the output for OOV words in a new VP record (bottom); adj stands for ‘adjective’; fp, for ‘feminine plural’; the format is YAML
Fig. 7 Interface to input data to create a new virtual patient record

Table 10 Description of the seen cases used in the usability study
Table 11 Description of the unseen cases used in the usability study
Table 12 Summary of lessons learned from the development and usability evaluation and implications on design and development
Fig. 8 Overall functioning of the dialogue system and update components; further technical details are provided in [27, 46, 47]

Fig. 9 Graphical abstract

Cite this article

Campillos-Llanos, L., Thomas, C., Bilinski, É. et al. Lessons Learned from the Usability Evaluation of a Simulated Patient Dialogue System. J Med Syst 45, 69 (2021). https://doi.org/10.1007/s10916-021-01737-4

  • DOI: https://doi.org/10.1007/s10916-021-01737-4
