Abstract
The way the quality of assessment is perceived and assured has changed considerably over the past five decades. Originally, assessment was seen mainly as a measurement problem whose aim was to tell people apart: the competent from the not competent. Logically, reproducibility (or reliability) and construct validity were seen as necessary and sufficient for assessment quality, and the role of human judgement was minimised. Later, assessment moved back into the authentic workplace with various workplace-based assessment (WBA) methods. Although originally approached from the same measurement framework, WBA and other assessments gradually became processes that included or embraced human judgement, provided it was backed by good support and assessment expertise. Currently, assessment is treated as a whole-system problem in which competence is evaluated from an integrated rather than a reductionist perspective. Current research therefore focuses on how to support and improve human judgement, how to triangulate assessment information meaningfully, and how to construct fairness, credibility and defensibility from a systems perspective. Given the rapid changes in society, education and healthcare, however, yet another evolution in our thinking about good assessment is likely to be just around the corner.
Cite this article
Schuwirth, L.W.T., van der Vleuten, C.P.M. A history of assessment in medical education. Adv in Health Sci Educ 25, 1045–1056 (2020). https://doi.org/10.1007/s10459-020-10003-0
DOI: https://doi.org/10.1007/s10459-020-10003-0
Keywords
- Assessment
- History
- Programmatic assessment
- Workplace-based assessment