Skip to main content

How well is each learner learning? Validity investigation of a learning curve-based assessment approach for ECG interpretation

Abstract

Learning curves can support a competency-based approach to assessment for learning. When interpreting repeated assessment data displayed as learning curves, a key assessment question is: “How well is each learner learning?” We outline the validity argument and investigation relevant to this question, for a computer-based repeated assessment of competence in electrocardiogram (ECG) interpretation. We developed an on-line ECG learning program based on 292 anonymized ECGs collected from an electronic patient database. After diagnosing each ECG, participants received feedback including the computer interpretation, cardiologist’s annotation, and correct diagnosis. In 2015, participants from a single institution, across a range of ECG skill levels, diagnosed at least 60 ECGs. We planned, collected and evaluated validity evidence under each inference of Kane’s validity framework. For Scoring, three cardiologists’ kappa for agreement on correct diagnosis was 0.92. There was a range of ECG difficulty across and within each diagnostic category. For Generalization, appropriate sampling was reflected in the inclusion of a typical clinical base rate of 39% normal ECGs. Applying generalizability theory presented unique challenges. Under the Extrapolation inference, group learning curves demonstrated expert–novice differences, performance increased with practice and the incremental phase of the learning curve reflected ongoing, effortful learning. A minority of learners had atypical learning curves. We did not collect Implications evidence. Our results support a preliminary validity argument for a learning curve assessment approach for repeated ECG interpretation with deliberate and mixed practice. This approach holds promise for providing educators and researchers, in collaboration with their learners, with deeper insights into how well each learner is learning.

This is a preview of subscription content, access via your institution.

Fig. 1
Fig. 2
Fig. 3
Fig. 4

References

  • Ashley, E. A., Raxwal, V. K., & Froelicher, V. F. (2000). The prevalence and prognostic significance of electrocardiographic abnormalities. Current Problems in Cardiology, 25(1), 1–72.

    Article  Google Scholar 

  • Boutis, K., Cano, S., Pecaric, M., Welch-Horan, T. B., Lampl, B., Ruzal-Shapiro, C., et al. (2016). Interpretation difficulty of normal versus abnormal radiographs using a pediatric example. Canadian Medical Education Journal, 7(1), e68–e77.

    Google Scholar 

  • Brennan, R. L. (2001). Multivariate unbalanced designs. In R. L. Brennan (Ed.), Generalizability theory (pp. 384–387). New York: Springer.

    Chapter  Google Scholar 

  • Chudgar, S. M., Engle, D. L., O’Connor, Grochowski C., & Gagliardi, J. P. (2016). Teaching crucial skills: An electrocardiogram teaching module for medical students. Journal of Electrocardiology, 49(4), 490–495.

    Article  Google Scholar 

  • Cook, D. A. (2015). Much ado about differences: Why expert-novice comparisons add little to the validity argument. Advances in Health Sciences Education, 20(3), 829–834.

    Article  Google Scholar 

  • Cook, D. A., Brydges, R., Ginsburg, S., & Hatala, R. (2015). A contemporary approach to validity arguments: A practical guide to Kane’s framework. Medical Education, 49(6), 560–575.

    Article  Google Scholar 

  • Cook, D. A., & Lineberry, M. (2016). Consequences validity evidence: Evaluating the impact of educational assessments. Academic Medicine, 91(6), 785–795.

    Article  Google Scholar 

  • De Bacquer, D., De Backer, G., Kornitzer, M., & Blackburn, H. (1998). Prognostic value of ECG findings for total, cardiovascular disease, and coronary heart disease death in men and women. Heart, 80(6), 570–577.

    Article  Google Scholar 

  • Ericsson, K. A. (2015). Acquisition and maintenance of medical expertise. Academic Medicine, 90(11), 1471–1486.

    Article  Google Scholar 

  • Ericsson, K. A., Krampe, R. T., & Tesch-Römer, C. (1993). The role of deliberate practice in the acquisition of expert performance. Psychological Review, 100(3), 363–406.

    Article  Google Scholar 

  • Fent, G., Gosai, J., & Purva, M. (2015). Teaching the interpretation of electrocardiograms: Which method is best? Journal of Electrocardiology, 48(2), 190–193.

    Article  Google Scholar 

  • Genders, T., Spronk, S., Stijnen, T., & Steyerberg, E. W. (2012). Methods for calculating sensitivity and specificity of clustered data: A tutorial. Radiology, 265, 910–916.

    Article  Google Scholar 

  • Guglin, M. E., & Thatai, D. (2006). Common errors in computer electrocardiogram interpretation. International Journal of Cardiology, 106(2), 232–237.

    Article  Google Scholar 

  • Hartman, N. D., Wheaton, N. B., Williamson, K., Quattromani, E. N., Branzetti, J. B., & Aldeen, A. Z. (2016). A novel tool for assessment of emergency medicine resident skill in determining diagnosis and management for emergent electrocardiograms: A multicenter study. Journal of Emergency Medicine, 51(6), 697–704.

    Article  Google Scholar 

  • Hatala, R. M., Brooks, L. R., & Norman, G. R. (2003). Practice makes perfect: the critical role of mixed practice in the acquisition of ECG interpretation skills. Advances in Health Sciences Education: Theory and Practice, 8(1), 17–26.

    Article  Google Scholar 

  • Jablonover, R. S., Lundberg, E., Zhang, Y., & Stagnaro-Green, A. (2014). Competency in electrocardiogram interpretation among graduating medical students. Teaching and Learning in Medicine, 26(3), 279–284.

    Article  Google Scholar 

  • Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50(1), 1–73.

    Article  Google Scholar 

  • Larsen, D. P., Butler, A. C., & Roediger, H. L., III. (2008). Test-enhanced learning in medical education. Medical Education, 42(10), 959–966.

    Article  Google Scholar 

  • Livingston, S. A., & Lewis, C. (1995). Estimating the consistency and accuracy of classifications based on test scores. Journal of Educational Measurement, 32, 179–197.

    Article  Google Scholar 

  • Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). New York: American Council on Education: Macmillan Publishing Company.

    Google Scholar 

  • Pecaric, M., Boutis, K., Beckstead, J., & Pusic, M. (2017). A big data and learning analytics approach to process-level feedback in cognitive simulations. Academic Medicine, 92(2), 175–184.

    Article  Google Scholar 

  • Pusic, M. V., Andrews, J. S., Kessler, D. O., Teng, D. C., Pecaric, M. R., Ruzal-Shapiro, C., et al. (2012a). Prevalence of abnormal cases in an image bank affects the learning of radiograph interpretation. Medical Education, 46(3), 289–298.

    Article  Google Scholar 

  • Pusic, M. V., Boutis, K., Hatala, R., & Cook, D. A. (2015a). Learning curves in health professions education. Academic Medicine, 90(8), 1034–1042.

    Article  Google Scholar 

  • Pusic, M. V., Boutis, K., Pecaric, M. R., Savenkov, O., Beckstead, J. W., & Jaber, M. Y. (2016). A primer on the statistical modeling of learning curves in health professions education. Advances in Health Sciences Education, 22(3), 741–759.

    Article  Google Scholar 

  • Pusic, M. V., Chiaramonte, R., Gladding, S., Andrews, J. S., Pecaric, M. R., & Boutis, K. (2015b). Accuracy of self-monitoring during learning of radiograph interpretation. Medical Education, 49(8), 838–846.

    Article  Google Scholar 

  • Pusic, M. V., Kessler, D., Szyld, D., Kalet, A., Pecaric, M., & Boutis, K. (2012b). Experience curves as an organizing framework for deliberate practice in emergency medicine learning. Academic Emergency Medicine, 19(12), 1476–1480.

    Article  Google Scholar 

  • Pusic, M., Pecaric, M., & Boutis, K. (2011). How much practice is enough? Using learning curves to assess the deliberate practice of radiograph interpretation. Academic Medicine, 86(6), 731–736.

    Article  Google Scholar 

  • Ramsay, C. R., Grant, A. M., Wallace, S. A., Garthwaite, P. H., Monk, A. F., & Russell, I. T. (2001). Statistical assessment of the learning curves of health technologies. Health Technology Assessment (Winchester, England), 5(12), 1–79.

    Google Scholar 

  • Rourke, L., Leong, J., & Chatterly, P. (2018). Conditions-based learning theory as a framework for comparative-effectiveness reviews: A worked example. Teaching and Learning in Medicine, 16, 1–9.

    Google Scholar 

  • Salerno, S. M., Alguire, P. C., & Waxman, H. S. (2003a). Competency in interpretation of 12-lead electrocardiograms: A summary and appraisal of published evidence. Annals of Internal Medicine, 138(9), 751–760.

    Article  Google Scholar 

  • Salerno, S. M., Alguire, P. C., & Waxman, H. S. (2003b). Training and competency evaluation for interpretation of 12-lead electrocardiograms: Recommendations from the American College of Physicians. Annals of Internal Medicine, 138, 747–750.

    Article  Google Scholar 

  • Schuwirth, L. W. T., & van der Vleuten, C. P. M. (2011a). General overview of the theories used in assessment: AMEE Guide No. 57. Medical Teacher, 33(10), 783–797.

    Article  Google Scholar 

  • Schuwirth, L. W. T., & van der Vleuten, C. P. M. (2011b). Programmatic assessment: From assessment of learning to assessment for learning. Medical Teacher, 33(6), 478–485.

    Article  Google Scholar 

  • Shah, A. P., & Rubin, S. A. (2007). Errors in the computerized electrocardiogram interpretation of cardiac rhythm. Journal of Electrocardiology, 40(5), 385–390.

    Article  Google Scholar 

  • Shute, V. (2008). Focus on formative feedback. Review of Educational Research, 78(1), 153–189.

    Article  Google Scholar 

  • Sibbald, M., Davies, E. G., Dorian, P., & Yu, E. H. C. (2014). Electrocardiographic interpretation skills of cardiology residents: Are they competent? Canadian Journal of Cardiology, 30(12), 1721–1724.

    Article  Google Scholar 

  • Wainer, H., & Mislevy, R. J. (2000). Item response theory, item calibration, and proficiency estimation. In H. Wainer (Ed.), Computerized adaptive testing (pp. 63–68). New Jersey: Lawrence Erlbaum & Associates.

    Chapter  Google Scholar 

  • Webb, N. M., Shavelson, R. J., & Haertel, E. H. (2006). Reliability coefficients and generalizability theory. In C. R. Rao & S. Sinharay (Eds.), Handbook of statistics (pp. 81–124). Amsterdam: Elsevier.

    Google Scholar 

Download references

Funding

This work was supported, in part, by a Royal College of Physicians and Surgeons of Canada Medical Education grant.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Rose Hatala.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOCX 14 kb)

Rights and permissions

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Hatala, R., Gutman, J., Lineberry, M. et al. How well is each learner learning? Validity investigation of a learning curve-based assessment approach for ECG interpretation. Adv in Health Sci Educ 24, 45–63 (2019). https://doi.org/10.1007/s10459-018-9846-x

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10459-018-9846-x

Keywords

  • ECG interpretation
  • Longitudinal assessment
  • Learning curve
  • Validity evidence