Volume 14, Issue 5, pp 655-664
Date: 26 Nov 2008

Does scale length matter? A comparison of nine- versus five-point rating scales for the mini-CEX

Abstract

Educators must often decide how many points to use in a rating scale. No studies have compared interrater reliability for different-length scales, and few have evaluated accuracy. This study sought to evaluate the interrater reliability and accuracy of mini-clinical evaluation exercise (mini-CEX) scores, comparing the traditional mini-CEX nine-point scale to a five-point scale.

Methods: The authors conducted a validity study in an academic internal medicine residency program. Fifty-two program faculty participated. Participants rated videotaped resident-patient encounters using the mini-CEX with both a nine-point scale and a five-point scale. Some cases were scripted to reflect a specific level of competence (unsatisfactory, satisfactory, superior). Outcome measures included mini-CEX scores, accuracy (scores compared to scripted competence level), interrater reliability, and domain intercorrelation.

Results: Interviewing, exam, counseling, and overall ratings varied significantly across levels of competence (P < .0001). Nine-point scale scores accurately classified competence more often (391/720 [54%] for overall ratings) than five-point scores (316/723 [44%], P < .0001). Interrater reliability was similar for scores from the nine- and five-point scales (0.43 and 0.40, respectively, for overall ratings). With the exception of correlation between exam and counseling scores using the five-point scale (r = 0.38, P = .13), score correlations among all domain combinations were high (r = 0.46–0.89) and statistically significant (P ≤ .015) for both scales.

Conclusions: Mini-CEX scores demonstrated modest interrater reliability and accuracy. Although interrater reliability is similar for nine- and five-point scales, nine-point scales appear to provide more accurate scores. This has implications for many educational assessments.
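The accuracy figures reported for overall ratings can be checked with a few lines of arithmetic. The sketch below recomputes the two proportions from the counts given in the abstract (391/720 and 316/723) and, purely as an illustration, applies a pooled two-proportion z-test. The z-test is an assumption made here, not the authors' stated method: it treats every rating as independent, which the study design (the same faculty rating multiple videotaped encounters) does not guarantee. The function names are hypothetical.

```python
import math


def proportion(correct: int, total: int) -> float:
    """Fraction of ratings that matched the scripted competence level."""
    return correct / total


def two_proportion_z(c1: int, n1: int, c2: int, n2: int) -> tuple[float, float]:
    """Pooled two-proportion z-test (illustrative only; assumes independent ratings).

    Returns the z statistic and a two-sided p-value from the normal approximation.
    """
    p1, p2 = c1 / n1, c2 / n2
    pooled = (c1 + c2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of a standard normal: erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts for overall ratings, taken from the abstract.
nine_point = proportion(391, 720)
five_point = proportion(316, 723)
z, p_value = two_proportion_z(391, 720, 316, 723)

print(f"nine-point accuracy: {nine_point:.1%}")
print(f"five-point accuracy: {five_point:.1%}")
print(f"difference: {nine_point - five_point:.1%}")
print(f"z = {z:.2f}, two-sided p = {p_value:.2g}")
```

Under these simplifying assumptions the proportions round to the 54% and 44% quoted in the abstract, and the approximate p-value falls below .0001. An analysis that accounts for repeated ratings by the same rater would be more appropriate for the actual data.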