Advances in Health Sciences Education, Volume 24, Issue 1, pp 45–63

How well is each learner learning? Validity investigation of a learning curve-based assessment approach for ECG interpretation

  • Rose Hatala
  • Jacqueline Gutman
  • Matthew Lineberry
  • Marc Triola
  • Martin Pusic


Learning curves can support a competency-based approach to assessment for learning. When interpreting repeated assessment data displayed as learning curves, a key assessment question is: “How well is each learner learning?” We outline the validity argument and investigation relevant to this question, for a computer-based repeated assessment of competence in electrocardiogram (ECG) interpretation. We developed an on-line ECG learning program based on 292 anonymized ECGs collected from an electronic patient database. After diagnosing each ECG, participants received feedback including the computer interpretation, cardiologist’s annotation, and correct diagnosis. In 2015, participants from a single institution, across a range of ECG skill levels, diagnosed at least 60 ECGs. We planned, collected and evaluated validity evidence under each inference of Kane’s validity framework. For Scoring, three cardiologists’ kappa for agreement on correct diagnosis was 0.92. There was a range of ECG difficulty across and within each diagnostic category. For Generalization, appropriate sampling was reflected in the inclusion of a typical clinical base rate of 39% normal ECGs. Applying generalizability theory presented unique challenges. Under the Extrapolation inference, group learning curves demonstrated expert–novice differences, performance increased with practice and the incremental phase of the learning curve reflected ongoing, effortful learning. A minority of learners had atypical learning curves. We did not collect Implications evidence. Our results support a preliminary validity argument for a learning curve assessment approach for repeated ECG interpretation with deliberate and mixed practice. This approach holds promise for providing educators and researchers, in collaboration with their learners, with deeper insights into how well each learner is learning.
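As a rough illustration of what a per-learner learning curve can look like in data terms, the sketch below computes a trailing moving average of accuracy over a learner's sequence of ECG diagnoses. This is only a minimal sketch under stated assumptions: the function name `moving_accuracy`, the window size, and the example responses are hypothetical, and the study itself may have modeled its curves differently.

```python
from typing import List


def moving_accuracy(correct: List[int], window: int = 10) -> List[float]:
    """Trailing moving average of 0/1 correctness.

    Each point averages the learner's most recent `window` cases
    (or all cases so far, early in the sequence).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    curve: List[float] = []
    running = 0
    for i, c in enumerate(correct):
        running += c
        if i >= window:
            # Drop the case that just fell out of the window.
            running -= correct[i - window]
        curve.append(running / min(i + 1, window))
    return curve


# Hypothetical learner (assumed data, not from the study):
# 1 = correct diagnosis, 0 = incorrect, in the order the ECGs were practiced.
responses = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1]
curve = moving_accuracy(responses, window=4)
print([round(p, 2) for p in curve])
```

Plotting `curve` against case number gives one simple view of whether a learner's accuracy rises with practice. A smoothing window trades responsiveness for stability, so a short window shows more case-to-case noise.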


Keywords: ECG interpretation, Longitudinal assessment, Learning curve, Validity evidence



This work was supported, in part, by a Royal College of Physicians and Surgeons of Canada Medical Education grant.

Supplementary material

Supplementary material 1: 10459_2018_9846_MOESM1_ESM.docx (DOCX, 14 kb)



Copyright information

© Springer Nature B.V. 2018

Authors and Affiliations

  1. Department of Medicine, St. Paul’s Hospital, University of British Columbia, Vancouver, Canada
  2. Institute for Innovations in Medical Education, New York University School of Medicine, New York, USA
  3. Zamierowski Institute for Experiential Learning, University of Kansas Medical Center and Health System, Kansas City, USA
  4. Ronald O. Perelman Department of Emergency Medicine, New York University School of Medicine, New York, USA
