Reliability and Validity

  • Rose Hatala
  • David A. Cook


The choice of outcome measure for a simulation study is a crucial element of the research design: without careful forethought and planning, how can we be confident that we are measuring what we intend to measure? In this chapter, we build on the concepts introduced in Chap. 25, outlining the key elements of developing and examining a validity argument for the outcome measure used in a simulation research study, with an emphasis on Kane's framework.


Keywords: Validity · Reliability · Assessment · Research design · Simulation



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Rose Hatala (1)
  • David A. Cook (2)
  1. Department of Medicine, St. Paul's Hospital, The University of British Columbia, Vancouver, Canada
  2. Mayo Clinic Multidisciplinary Simulation Center, Office of Applied Scholarship and Education Science, and Division of General Internal Medicine, Mayo Clinic College of Medicine and Science, Rochester, USA