Designing, Choosing, and Using Assessment Tools in Healthcare Simulation Research

  • John Boulet
  • David J. Murray


Studies in healthcare simulation research are often based on performance scores. These scores can be used to compare provider groups, establish the efficacy of competing educational programs, and identify clinical skills deficiencies. This chapter provides an overview of the development and use of assessment tools. Researchers need to select tools that align with the purpose of the assessment. Where human evaluators are employed, they should have sufficient expertise in the domains being assessed. Training is also necessary to ensure that evaluators use the rubrics as intended. In the future, technology may help gather accurate data from, and provide standardized scoring for, various simulation-based assessments. Healthcare simulation researchers who employ assessment tools need to evaluate whether the scores produced represent reliable and valid estimates of ability. Without some assurance of the psychometric rigor of the scores, their use in any research study could be questioned.


Keywords: Simulation-based assessment · Scoring · Reliability · Validity



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Vice President, Research and Data Resources, Educational Commission for Foreign Medical Graduates; Foundation for Advancement of International Medical Education and Research, Philadelphia, USA
  2. Department of Anesthesiology, Washington University School of Medicine, St. Louis, USA
