Abstract
Objective: To assess the sources of measurement error in an electrocardiogram (ECG) interpretation examination given in a third-year internal medicine clerkship.
Design: Three successive generalizability studies were conducted. (1) Multiple faculty rated student responses to a previously administered exam. (2) The rating criteria were revised and study 1 was repeated. (3) The examination was converted into an extended matching format including multiple cases with the same underlying cardiac problem.
Results: The discrepancies among raters (main effects and interactions) were dwarfed by the error associated with case specificity. The largest source of the differences among raters was in rating student errors of commission rather than student errors of omission. Revisions to the rating criteria may have increased inter-rater reliability slightly; however, due to case specificity, they had little impact on the overall reliability of the exam. The third study indicated that the majority of the variability in student performance across cases lay in performance across cases within the same type of cardiac problem rather than between different types of cardiac problems.
Conclusions: Case specificity was the overwhelming source of measurement error. The variation among cases came mainly from discrepancies in performance between examples of the same cardiac problem rather than from differences in performance across different types of cardiac problems. This suggests it is necessary to include a large number of cases even if the goal is to assess performance on only a few types of cardiac problems.
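For orientation, the following is a minimal sketch of the variance decomposition underlying a generalizability study, assuming a fully crossed persons (p) × cases (c) × raters (r) design; the notation and the specific designs used in the three studies here (for example, the nesting of cases within cardiac problem type in study 3) are assumptions, not details taken from the paper. Observed-score variance is partitioned into components for each facet and their interactions,

$$\sigma^2(X_{pcr}) = \sigma^2_p + \sigma^2_c + \sigma^2_r + \sigma^2_{pc} + \sigma^2_{pr} + \sigma^2_{cr} + \sigma^2_{pcr,e},$$

and the generalizability coefficient for relative decisions with $n_c$ cases and $n_r$ raters is

$$E\rho^2 = \frac{\sigma^2_p}{\sigma^2_p + \dfrac{\sigma^2_{pc}}{n_c} + \dfrac{\sigma^2_{pr}}{n_r} + \dfrac{\sigma^2_{pcr,e}}{n_c n_r}}.$$

In this framework, case specificity corresponds to a large person × case component $\sigma^2_{pc}$; because that term is reduced only by adding cases, sharpening rating criteria (which shrinks the rater terms) can improve inter-rater agreement while leaving the overall coefficient largely unchanged.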
About this article
Cite this article
Solomon, D.J., Ferenchick, G. Sources of Measurement Error in an ECG Examination: Implications for Performance-Based Assessments. Adv Health Sci Educ Theory Pract 9, 283–290 (2004). https://doi.org/10.1007/s10459-004-4844-6