Objective: To assess the internal consistency and inter-rater reliability of a clinical evaluation exercise (CEX) format designed to be easy to use yet sufficiently detailed to achieve uniform recording of the observed examination.
Design: A comparison of 128 CEXs conducted on 32 internal medicine interns by full-time faculty. Alpha coefficients are reported as measures of internal consistency, together with several measures of inter-rater reliability.
Setting: A university internal medicine program. Observations were conducted at the end of the internship year.
Participants: Participants were 32 interns; observers were 12 full-time faculty in the department of medicine. The entire intern group was chosen to maximize the spectrum of abilities represented. Patients were recruited by the chief resident from the inpatient medical service on the basis of their ability and willingness to participate.
Intervention: Each intern was observed twice, with two examiners present during each CEX. The examiners received standardized preparation and used a format developed over five years of pilot studies.
Measurements and main results: The format showed excellent internal consistency: alpha coefficients ranged from 0.79 to 0.99. However, multiple methods of determining inter-rater reliability yielded consistently modest results: intraclass correlations ranged from 0.23 to 0.50, and generalizability coefficients ranged from a low of 0.00 for the overall CEX rating to a high of 0.61 for the physical examination section. Transforming scores to eliminate rater effects and dichotomizing results into pass/fail did not appear to improve reliability.
Conclusions: Although the CEX is a valuable didactic tool, its psychometric properties preclude reliable assessment of clinical skills from a single observation.
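The two statistics reported above can be illustrated with a short sketch. This is not the study's code: the function names, the tiny synthetic score matrices, and the choice of a one-way random-effects ICC model are illustrative assumptions, shown only to make the formulas concrete.

```python
# Illustrative sketch (assumptions, not the study's analysis):
# Cronbach's alpha for internal consistency across items, and a
# one-way intraclass correlation, ICC(1,1), for inter-rater reliability.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: (n_subjects, n_items) matrix of item scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()      # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)        # variance of total scores
    return k / (k - 1) * (1 - item_vars / total_var)

def icc_oneway(scores: np.ndarray) -> float:
    """ICC(1,1), one-way random effects. scores: (n_targets, n_raters)."""
    n, k = scores.shape
    grand = scores.mean()
    row_means = scores.mean(axis=1)
    ms_between = k * ((row_means - grand) ** 2).sum() / (n - 1)
    ms_within = ((scores - row_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical data: 4 interns rated by 2 examiners on a 1-9 scale.
scores = np.array([[7., 6.], [5., 8.], [9., 7.], [4., 6.]])
print(icc_oneway(scores))
```

With two examiners per CEX, as in this study, the matrix has two columns; an ICC near the 0.23–0.50 range reported here means rater disagreement contributes a large share of the score variance.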