Advances in Health Sciences Education, Volume 18, Issue 5, pp 1009–1027

An argument-based approach to the validation of UHTRUST: can we measure how recent graduates can be trusted with unfamiliar tasks?

  • M. Wijnen-Meijer
  • M. Van der Schaaf
  • E. Booij
  • S. Harendza
  • C. Boscardin
  • J. Van Wijngaarden
  • Th. J. Ten Cate


There is a need for valid methods to assess the readiness for clinical practice of medical graduates. This study evaluates the validity of the Utrecht Hamburg Trainee Responsibility for Unfamiliar Situations Test (UHTRUST), an authentic simulation procedure to assess whether medical trainees are ready to be entrusted with unfamiliar clinical tasks near the highest level of Miller’s pyramid. This assessment, in which candidates were judged by clinicians, nurses and standardized patients, addresses the question: can this trainee be trusted with unfamiliar clinical tasks? The aim of this paper is to provide a validity argument for this assessment procedure. We collected data from various sources during preparation and administration of a UHTRUST-assessment. In total, 60 candidates (30 from the Netherlands and 30 from Germany) participated. To provide a validity argument for the UHTRUST-assessment, we followed Kane’s argument-based approach for validation. All available data were used to design a coherent and plausible argument. Considerable data were collected during the development of the assessment procedure. In addition, a generalizability study was conducted to evaluate the reliability of the scores given by assessors and to determine the proportion of variance accounted for by candidates and assessors. It was found that most of Kane’s validity assumptions were defensible with accurate and often parallel lines of backing. UHTRUST can be used to compare the readiness for clinical practice of medical graduates. Further exploration of the procedures for entrustment decisions is recommended.


Argument-based approach; Assessment; Authentic simulation; Coping with unfamiliar clinical situations; Entrustment decisions; Readiness for clinical practice; Validity


  1. Arnold, L. (2002). Assessing professional behavior: Yesterday, today and tomorrow. Academic Medicine, 77, 502–515.
  2. Bakker, M. (2008). Design and evaluation of video portfolios. Reliability, generalizability, and validity of an authentic performance assessment for teachers. Leiden: Mostert & Van Onderen.
  3. Barton, J. R., Corbett, S., & Van der Vleuten, C. P. (2012). The validity and reliability of a direct observation of procedural skills assessment tool: Assessing colonoscopic skills of senior endoscopists. Gastrointestinal Endoscopy, 75(3), 591–597.
  4. Birenbaum, M., & Dochy, F. (Eds.). (1996). Alternatives in assessment of achievement, learning processes and prior knowledge. Boston: Kluwer.
  5. Boursicot, K., & Roberts, T. (2005). How to set up an OSCE. The Clinical Teacher, 2, 16–20.
  6. Boyce, P., Spratt, C., Davies, M., & McEvoy, P. (2011). Using entrustable professional activities to guide curriculum development in psychiatry training. BMC Medical Education, 11, 96.
  7. Brennan, R. L. (2006). Perspectives on the evolution and future of educational measurement. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 1–16). Westport, CT: American Council on Education and Praeger Publishers.
  8. Chang, A., Bowen, J. L., Buranosky, R. A., Frankel, R. M., Gosh, N., Rosenblum, M. J., Thompson, S., & Green, M. L. (2012). Transforming primary care training-patient-centered medical home entrustable professional activities for internal medicine residents. Journal of General Internal Medicine (early online).
  9. Chapelle, C. A., Enright, M. K., & Jamieson, J. (2010). Does an argument-based approach to validity make a difference? Educational Measurement: Issues and Practice, 29, 3–13.
  10. Cleland, J. A., Abe, K., & Rethans, J. (2009). The use of simulated patients in medical education: AMEE Guide no. 42. Medical Teacher, 31, 477–486.
  11. Cohen, A. S., & Wollack, J. A. (2006). Test administration, security, scoring and reporting. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 17–64). Westport, CT: American Council on Education and Praeger Publishers.
  12. Cronbach, L. J. (1971). Test validation. In R. L. Thorndike (Ed.), Educational measurement (2nd ed., pp. 443–507). Washington, DC: American Council on Education.
  13. Crossley, J., Johnson, G., Booth, J., & Wade, W. (2011). Good questions, good answers: Construct alignment improves the performance of workplace-based assessment scales. Medical Education, 45, 560–569.
  14. Cureton, E. E. (1951). Validity. In E. F. Lindquist (Ed.), Educational measurement (pp. 621–694). Washington, DC: American Council on Education.
  15. Dijksterhuis, M. G. K., Teunissen, P. W., Voorhuis, M., Schuwirth, L. W. T., Ten Cate, Th. J., Braat, D. D. M., et al. (2009). Determining competence and progressive independence in postgraduate clinical training. Medical Education, 43, 1156–1165.
  16. Durning, S. J., Artino, A., Boulet, J., La Rochelle, J., Van der Vleuten, C., Arze, B., et al. (2012). The feasibility, reliability and validity of a post-encounter form for evaluating clinical reasoning. Medical Teacher, 34, 30–37.
  17. Dwyer, C. A. (1995). Criteria for performance-based teacher assessments: Validity, standards and issues. In A. J. Shinkfield & D. Stufflebeam (Eds.), Teacher evaluation guide to effective practice (pp. 62–80). Boston: Kluwer.
  18. Epstein, R. M. (2007). Assessment in medical education. The New England Journal of Medicine, 356, 387–396.
  19. Fraser, S. W., & Greenhalgh, T. (2001). Coping with complexity: Educating for capability. BMJ, 323, 799–803.
  20. Freidson, E. (1970). Profession of medicine: A study of the sociology of applied knowledge. New York: Dodd, Mead & Company.
  21. Ginsburg, S. (2011). Respecting the expertise of clinician assessors: Construct alignment is one good answer. Medical Education, 45, 546–548.
  22. Ginsburg, S., McIlroy, J., Oulanova, O., Eva, K., & Regehr, G. (2010). Toward authentic clinical evaluation: Pitfalls in the pursuit of competency. Academic Medicine, 85, 780–786.
  23. Gipps, C. V. (1994). Beyond testing. Towards a theory of educational assessment. London: RoutledgeFalmer.
  24. Govaerts, M. J. B., Van der Vleuten, C. P. M., Schuwirth, L. W. T., & Muijtjens, A. M. (2007). Broadening perspectives on clinical performance assessment: Rethinking the nature of in-training assessment. Advances in Health Sciences Education, 12, 239–260.
  25. Harden, R. M., & Gleeson, F. A. (1979). Assessment of clinical competence using an objective structured clinical examination (OSCE). Medical Education, 13(1), 41–54.
  26. Harendza, S. (2011). “HUS” diary of a German nephrologist during the current EHEC outbreak in Europe. Kidney International, 80, 687–689.
  27. Hawkins, R. E., Katsufrakis, P. J., Holtman, M. C., & Clauser, B. E. (2009). Assessment of medical professionalism: Who, what, when, where, how, and… why? Medical Teacher, 31, 348–361.
  28. Holmboe, E. S., & Hawkins, R. E. (Eds.). (2008). Practical guide to the evaluation of clinical competence. Philadelphia: Mosby-Elsevier.
  29. Howley, L. D. (2004). Performance assessment in medical education: Where we’ve been and where we’re going. Evaluation and the Health Professions, 27, 285–301.
  30. Kane, M. T. (1992). An argument-based approach to validity. Psychological Bulletin, 112(3), 527–535.
  31. Kane, M. (2004). Certification testing as an illustration of argument-based validation. Measurement: Interdisciplinary Research & Perspective, 2, 135–170.
  32. Kane, M. (2006). Validation. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 17–64). Westport, CT: American Council on Education and Praeger Publishers.
  33. Knight, P. T. (2002). The Achilles’ heel of quality: The assessment of student learning. Quality in Higher Education, 8, 107–115.
  34. Kreiter, C. D., & Bergus, G. (2008). The validity of performance-based measures of clinical reasoning and alternative approaches. Medical Education, 43, 320–325.
  35. Lane, S., & Stone, C. A. (2006). Performance assessment. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 387–432). Westport, CT: American Council on Education and Praeger Publishers.
  36. Mercer, S. W., Maxwell, M., Heaney, D., & Watt, G. C. M. (2004). The consultation and relational empathy (CARE) measure: Development and preliminary validation and reliability of an empathy-based consultation process measure. Family Practice, 21, 699–705.
  37. Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13–104). New York: American Council on Education and Macmillan.
  38. Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons’ responses and performances as scientific inquiry into score meaning. American Psychologist, 50(9), 741–749.
  39. Miller, G. E. (1990). The assessment of clinical skills/competence/performance. Academic Medicine, 65(9), S63–S67.
  40. Newble, D. (2004). Techniques for measuring clinical competence: Objective structured clinical examinations. Medical Education, 38, 199–203.
  41. Nijveldt, M. (2007). Validity in teacher assessment. An exploration of the judgement processes of assessors. Enschede: Gildeprint.
  42. Norcini, J. J., Blank, L. L., Arnold, G. K., & Kimball, H. R. (1995). The Mini-CEX (Clinical Evaluation Exercise): A preliminary investigation. Annals of Internal Medicine, 123, 795–799.
  43. Sterkenburg, A., Barach, P., Kalkman, C., Gielen, M., & Ten Cate, O. T. J. (2010). When do supervising physicians decide to entrust residents with unsupervised tasks? Academic Medicine, 85, 1408–1417.
  44. Tavares, W., & Eva, K. W. (2012). Exploring the impact of mental workload on rater-based assessments. Advances in Health Sciences Education. doi: 10.1007/s10459-012-9370-3.
  45. Ten Cate, O. (2005). Entrustability of professional activities and competency-based training. Medical Education, 39, 1176–1177.
  46. Ten Cate, O., & Scheele, F. (2007). Competence-based postgraduate training: Can we bridge the gap between educational theory and clinical practice? Academic Medicine, 82, 542–547.
  47. Ten Cate, O., Snell, L., & Carraccio, C. (2010). Medical competence: The interplay between individual ability and the health care environment. Medical Teacher, 32, 669–675.
  48. Van der Vleuten, C. P. M. (1996). The assessment of professional competence: Developments, research and practical implications. Advances in Health Sciences Education, 1, 41–67.
  49. Wass, V., & Archer, J. (2011). Assessing learners. In T. Dornan, K. Mann, A. Scherpbier, & J. Spencer (Eds.), Medical education: Theory and practice (pp. 229–255). Toronto: Churchill Livingstone Elsevier.
  50. Wass, V., Van der Vleuten, C., Shatzer, J., & Jones, R. (2001). Assessment of clinical competence. The Lancet, 357, 945–949.
  51. Wetzel, A. P. (2012). Analysis methods and validity evidence: A review of instrument development across the medical education continuum. Academic Medicine, 87(8), 2012.
  52. Wittert, G. A., & Nelson, A. J. (2009). Medical Education: Revolution, devolution and evolution in curriculum philosophy and design. Medical Journal of Australia, 191, 35–37.
  53. Wijnen-Meijer, M., Van der Schaaf, M., Nillesen, K., Harendza, S., & Ten Cate, O. Essential FOCs that enable trust in graduates: A Delphi study among physician educators in the Netherlands. Journal of Graduate Medical Education (Accepted for publication).
  54. Wijnen-Meijer, M., Van der Schaaf, M., Nillesen, K., Harendza, S., & Ten Cate, O. Essential facets of competence that enable trust in medical graduates: A ranking study among physician educators in two countries. (Submitted).

Copyright information

© Springer Science+Business Media Dordrecht 2013

Authors and Affiliations

  • M. Wijnen-Meijer (1), corresponding author
  • M. Van der Schaaf (2)
  • E. Booij (1)
  • S. Harendza (3)
  • C. Boscardin (4)
  • J. Van Wijngaarden (5)
  • Th. J. Ten Cate (1, 4)

  1. Center for Research and Development of Education, University Medical Center Utrecht, Utrecht, The Netherlands
  2. Department of Education, Utrecht University, Utrecht, The Netherlands
  3. Department of Internal Medicine, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
  4. Department of Medicine, University of California, San Francisco, USA
  5. Department Clinical Skills Training, University Medical Center Utrecht, Utrecht, The Netherlands
