Abstract
Objective: To determine whether raters using the American Board of Internal Medicine (ABIM) Resident Evaluation Form can detect differences among residents in clinical competence.
Design: Cross-sectional study.
Setting: Inpatient general medicine service in a university-affiliated public hospital.
Participants: University-based internal medicine (UCIM) residents (ABIM certifying examination pass rate, 91%; mean score, 95th percentile), community hospital-based internal medicine (CHIM) residents (ABIM examination pass rate, 68%; mean score, 42nd percentile), and residents from three university-based non-internal medicine (UC non-IM) programs, all assigned to the same inpatient general medicine service over a three-year period. Four hundred eighty-nine evaluations of 110 postgraduate-year-one residents were analyzed.
Measurements and main results: Mean ratings for the UCIM residents were significantly higher than those for the CHIM or UC non-IM residents (analysis of variance [ANOVA], p<0.05). Variance was smallest for the UCIM residents (F test, p<0.01), and only the UCIM residents’ mean scores were in the “superior” range (7–9) in all evaluated categories. The mean ratings for the CHIM residents while at the university-affiliated hospital were not significantly different from the ratings of the same residents at their home hospital. The ratings for the CHIM residents at either site were significantly lower than those for the UCIM residents in all categories (ANOVA, p<0.05). Factor analysis revealed a single factor accounting for 76% of the variance among the ratings with all dimensions loading high on that factor (0.75–0.95), providing evidence for a “halo” effect. Mean interrater agreement over all variables was 0.87, indicating good consistency among raters.
Conclusions: Ratings on the ABIM Resident Evaluation Form detect global differences among residents in clinical competence in the expected direction based on type of training program and performance on the ABIM certification examination, but fail to differentiate among the nine evaluated dimensions of clinical care. This rating method may be valid for assessing overall clinical performance, but is less useful for providing feedback in specific areas to individual residents.
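The abstract reports a mean interrater agreement of 0.87 but does not say which index produced it. As an illustration only, the sketch below computes one common choice, the single-item within-group agreement index r_wg of James, Demaree, and Wolf (1984), which compares the observed variance of the raters' scores with the variance expected if raters chose uniformly at random on the scale. The function name `rwg`, the truncation of negative values to 0, and the example ratings are assumptions made for this sketch and are not taken from the study; the 9-point scale matches the form's 1–9 range.

```python
from statistics import variance


def rwg(ratings, scale_points=9):
    """Single-item within-group agreement index r_wg.

    ratings: scores given by different raters to the same resident
             on the same dimension (hypothetical data in this sketch).
    scale_points: number of points on the rating scale (9 for a 1-9 form).
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings to estimate agreement")
    # Variance of a discrete uniform distribution over A scale points:
    # (A^2 - 1) / 12. This is the "no agreement" reference variance.
    expected_var = (scale_points ** 2 - 1) / 12.0
    # Sample variance (n - 1 denominator) of the observed ratings.
    observed_var = variance(ratings)
    value = 1.0 - observed_var / expected_var
    # Assumption: negative values are truncated to 0, a common convention.
    return max(0.0, value)


# Hypothetical ratings of one resident by four raters.
close_ratings = [7, 8, 7, 8]    # raters largely agree
spread_ratings = [2, 9, 5, 7]   # raters disagree widely

print(round(rwg(close_ratings), 2))   # 0.95
print(round(rwg(spread_ratings), 2))  # 0.0
```

Values near 1 indicate that raters assigned nearly identical scores. High agreement of this kind says nothing about whether the nine dimensions are rated independently of one another, which is the separate question the factor analysis addresses.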
Additional information
Supported in part by grant PE 19179 for residency training in general internal medicine and general pediatrics from the Bureau of Health Professions, Health Resources and Services Administration of the Public Health Service.
About this article
Cite this article
Haber, R.J., Avins, A.L. Do ratings on the American Board of Internal Medicine Resident Evaluation Form detect differences in clinical competence? J Gen Intern Med 9, 140–145 (1994). https://doi.org/10.1007/BF02600028
DOI: https://doi.org/10.1007/BF02600028