Abstract
Advocates of holistic assessment consider the in-training evaluation report (ITER) a more authentic way to assess performance. But this assessment format is subjective and therefore susceptible to rater bias. Our objective was to study the association between rater variables and ITER ratings. In this observational study, our participants were clerks at the University of Calgary and the preceptors who completed online ITERs between February 2008 and July 2009. Our outcome variable was the global rating on the ITER (rated 1–5), and we used a generalized estimating equation model to identify variables associated with this rating. Students were rated “above expected level” or “outstanding” on 66.4% of the 1050 online ITERs completed during the study period. Two rater variables attenuated ITER ratings: the log-transformed time taken to complete the ITER [β = −0.06, 95% confidence interval (−0.10, −0.02), p = 0.002] and the number of ITERs that a preceptor completed during the study period [β = −0.008, 95% confidence interval (−0.02, −0.001), p = 0.02]. We found evidence of leniency bias that resulted in two thirds of students being rated above the expected level of performance. This leniency bias appeared to be attenuated by delay in ITER completion and was also blunted in preceptors who rated more students. As all biases threaten the internal validity of the assessment process, further research is needed to confirm these and other sources of rater bias in ITER ratings and to explore ways of limiting their impact.
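Because completion time enters the model on a log scale, the coefficient β = −0.06 describes the change in rating per unit of log time, not per unit of time. The sketch below shows one way to read it as the expected rating change for a k-fold increase in completion time. It assumes a natural-log transform, which the abstract does not specify; the function name `rating_change` is hypothetical and used only for illustration.

```python
import math

# Values reported in the abstract for log-transformed completion time.
BETA_LOG_TIME = -0.06
CI_LOW, CI_HIGH = -0.10, -0.02

def rating_change(fold_increase, beta=BETA_LOG_TIME):
    """Expected change in the 1-5 global rating when completion time is
    multiplied by `fold_increase`.

    ASSUMPTION: the time variable was transformed with the natural log.
    With a base-10 log the same formula would use math.log10 instead.
    """
    return beta * math.log(fold_increase)

if __name__ == "__main__":
    for fold in (2, 10):
        point = rating_change(fold)
        # Applying the CI limits of beta gives a rough range for the change.
        bound_a = rating_change(fold, CI_LOW)
        bound_b = rating_change(fold, CI_HIGH)
        print(f"{fold}x longer: {point:+.3f} "
              f"(range {bound_a:+.3f} to {bound_b:+.3f})")
```

Under that assumption, doubling the completion time corresponds to a drop of roughly 0.04 points on the five-point scale, so the association is statistically detectable but small in absolute terms.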
Cite this article
Paget, M., Wu, C., McIlwrick, J. et al. Rater variables associated with ITER ratings. Adv in Health Sci Educ 18, 551–557 (2013). https://doi.org/10.1007/s10459-012-9391-y