Abstract
Learning curves can support a competency-based approach to assessment for learning. When interpreting repeated assessment data displayed as learning curves, a key assessment question is: “How well is each learner learning?” We outline the validity argument and investigation relevant to this question, for a computer-based repeated assessment of competence in electrocardiogram (ECG) interpretation. We developed an on-line ECG learning program based on 292 anonymized ECGs collected from an electronic patient database. After diagnosing each ECG, participants received feedback including the computer interpretation, cardiologist’s annotation, and correct diagnosis. In 2015, participants from a single institution, across a range of ECG skill levels, diagnosed at least 60 ECGs. We planned, collected and evaluated validity evidence under each inference of Kane’s validity framework. For Scoring, three cardiologists’ kappa for agreement on correct diagnosis was 0.92. There was a range of ECG difficulty across and within each diagnostic category. For Generalization, appropriate sampling was reflected in the inclusion of a typical clinical base rate of 39% normal ECGs. Applying generalizability theory presented unique challenges. Under the Extrapolation inference, group learning curves demonstrated expert–novice differences, performance increased with practice and the incremental phase of the learning curve reflected ongoing, effortful learning. A minority of learners had atypical learning curves. We did not collect Implications evidence. Our results support a preliminary validity argument for a learning curve assessment approach for repeated ECG interpretation with deliberate and mixed practice. This approach holds promise for providing educators and researchers, in collaboration with their learners, with deeper insights into how well each learner is learning.
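As an illustration of the learning curve idea described above, the following is a minimal sketch of how one learner's sequence of scored ECG responses could be turned into a curve of accuracy against practice. It is not the authors' statistical model. The function name `moving_accuracy`, the 0/1 scoring, and the trailing-window size are all assumptions made for this example.

```python
from typing import List


def moving_accuracy(responses: List[int], window: int = 10) -> List[float]:
    """Trailing moving-average accuracy after each attempted ECG.

    responses: 1 for a correct diagnosis, 0 for an incorrect one, listed in
        the order the learner attempted the ECGs (hypothetical scoring).
    window: how many of the most recent ECGs to average over (an assumed
        smoothing choice, not taken from the study).
    """
    if window < 1:
        raise ValueError("window must be at least 1")

    curve: List[float] = []
    running = 0  # number of correct responses inside the current window
    for i, r in enumerate(responses):
        running += r
        if i >= window:
            # Drop the response that has just slid out of the window.
            running -= responses[i - window]
        n = min(i + 1, window)  # early points average over fewer items
        curve.append(running / n)
    return curve
```

Plotting the returned values against the item index gives a simple per-learner curve. A curve that rises with practice matches the group pattern reported above, while a flat or falling curve would mark the kind of atypical learner the abstract mentions.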
References
Ashley, E. A., Raxwal, V. K., & Froelicher, V. F. (2000). The prevalence and prognostic significance of electrocardiographic abnormalities. Current Problems in Cardiology, 25(1), 1–72.
Boutis, K., Cano, S., Pecaric, M., Welch-Horan, T. B., Lampl, B., Ruzal-Shapiro, C., et al. (2016). Interpretation difficulty of normal versus abnormal radiographs using a pediatric example. Canadian Medical Education Journal, 7(1), e68–e77.
Brennan, R. L. (2001). Multivariate unbalanced designs. In R. L. Brennan (Ed.), Generalizability theory (pp. 384–387). New York: Springer.
Chudgar, S. M., Engle, D. L., O’Connor Grochowski, C., & Gagliardi, J. P. (2016). Teaching crucial skills: An electrocardiogram teaching module for medical students. Journal of Electrocardiology, 49(4), 490–495.
Cook, D. A. (2015). Much ado about differences: Why expert-novice comparisons add little to the validity argument. Advances in Health Sciences Education, 20(3), 829–834.
Cook, D. A., Brydges, R., Ginsburg, S., & Hatala, R. (2015). A contemporary approach to validity arguments: A practical guide to Kane’s framework. Medical Education, 49(6), 560–575.
Cook, D. A., & Lineberry, M. (2016). Consequences validity evidence: Evaluating the impact of educational assessments. Academic Medicine, 91(6), 785–795.
De Bacquer, D., De Backer, G., Kornitzer, M., & Blackburn, H. (1998). Prognostic value of ECG findings for total, cardiovascular disease, and coronary heart disease death in men and women. Heart, 80(6), 570–577.
Ericsson, K. A. (2015). Acquisition and maintenance of medical expertise. Academic Medicine, 90(11), 1471–1486.
Ericsson, K. A., Krampe, R. T., & Tesch-Römer, C. (1993). The role of deliberate practice in the acquisition of expert performance. Psychological Review, 100(3), 363–406.
Fent, G., Gosai, J., & Purva, M. (2015). Teaching the interpretation of electrocardiograms: Which method is best? Journal of Electrocardiology, 48(2), 190–193.
Genders, T., Spronk, S., Stijnen, T., & Steyerberg, E. W. (2012). Methods for calculating sensitivity and specificity of clustered data: A tutorial. Radiology, 265, 910–916.
Guglin, M. E., & Thatai, D. (2006). Common errors in computer electrocardiogram interpretation. International Journal of Cardiology, 106(2), 232–237.
Hartman, N. D., Wheaton, N. B., Williamson, K., Quattromani, E. N., Branzetti, J. B., & Aldeen, A. Z. (2016). A novel tool for assessment of emergency medicine resident skill in determining diagnosis and management for emergent electrocardiograms: A multicenter study. Journal of Emergency Medicine, 51(6), 697–704.
Hatala, R. M., Brooks, L. R., & Norman, G. R. (2003). Practice makes perfect: The critical role of mixed practice in the acquisition of ECG interpretation skills. Advances in Health Sciences Education: Theory and Practice, 8(1), 17–26.
Jablonover, R. S., Lundberg, E., Zhang, Y., & Stagnaro-Green, A. (2014). Competency in electrocardiogram interpretation among graduating medical students. Teaching and Learning in Medicine, 26(3), 279–284.
Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50(1), 1–73.
Larsen, D. P., Butler, A. C., & Roediger, H. L., III. (2008). Test-enhanced learning in medical education. Medical Education, 42(10), 959–966.
Livingston, S. A., & Lewis, C. (1995). Estimating the consistency and accuracy of classifications based on test scores. Journal of Educational Measurement, 32, 179–197.
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). New York: American Council on Education: Macmillan Publishing Company.
Pecaric, M., Boutis, K., Beckstead, J., & Pusic, M. (2017). A big data and learning analytics approach to process-level feedback in cognitive simulations. Academic Medicine, 92(2), 175–184.
Pusic, M. V., Andrews, J. S., Kessler, D. O., Teng, D. C., Pecaric, M. R., Ruzal-Shapiro, C., et al. (2012a). Prevalence of abnormal cases in an image bank affects the learning of radiograph interpretation. Medical Education, 46(3), 289–298.
Pusic, M. V., Boutis, K., Hatala, R., & Cook, D. A. (2015a). Learning curves in health professions education. Academic Medicine, 90(8), 1034–1042.
Pusic, M. V., Boutis, K., Pecaric, M. R., Savenkov, O., Beckstead, J. W., & Jaber, M. Y. (2016). A primer on the statistical modeling of learning curves in health professions education. Advances in Health Sciences Education, 22(3), 741–759.
Pusic, M. V., Chiaramonte, R., Gladding, S., Andrews, J. S., Pecaric, M. R., & Boutis, K. (2015b). Accuracy of self-monitoring during learning of radiograph interpretation. Medical Education, 49(8), 838–846.
Pusic, M. V., Kessler, D., Szyld, D., Kalet, A., Pecaric, M., & Boutis, K. (2012b). Experience curves as an organizing framework for deliberate practice in emergency medicine learning. Academic Emergency Medicine, 19(12), 1476–1480.
Pusic, M., Pecaric, M., & Boutis, K. (2011). How much practice is enough? Using learning curves to assess the deliberate practice of radiograph interpretation. Academic Medicine, 86(6), 731–736.
Ramsay, C. R., Grant, A. M., Wallace, S. A., Garthwaite, P. H., Monk, A. F., & Russell, I. T. (2001). Statistical assessment of the learning curves of health technologies. Health Technology Assessment (Winchester, England), 5(12), 1–79.
Rourke, L., Leong, J., & Chatterly, P. (2018). Conditions-based learning theory as a framework for comparative-effectiveness reviews: A worked example. Teaching and Learning in Medicine, 16, 1–9.
Salerno, S. M., Alguire, P. C., & Waxman, H. S. (2003a). Competency in interpretation of 12-lead electrocardiograms: A summary and appraisal of published evidence. Annals of Internal Medicine, 138(9), 751–760.
Salerno, S. M., Alguire, P. C., & Waxman, H. S. (2003b). Training and competency evaluation for interpretation of 12-lead electrocardiograms: Recommendations from the American College of Physicians. Annals of Internal Medicine, 138, 747–750.
Schuwirth, L. W. T., & van der Vleuten, C. P. M. (2011a). General overview of the theories used in assessment: AMEE Guide No. 57. Medical Teacher, 33(10), 783–797.
Schuwirth, L. W. T., & van der Vleuten, C. P. M. (2011b). Programmatic assessment: From assessment of learning to assessment for learning. Medical Teacher, 33(6), 478–485.
Shah, A. P., & Rubin, S. A. (2007). Errors in the computerized electrocardiogram interpretation of cardiac rhythm. Journal of Electrocardiology, 40(5), 385–390.
Shute, V. (2008). Focus on formative feedback. Review of Educational Research, 78(1), 153–189.
Sibbald, M., Davies, E. G., Dorian, P., & Yu, E. H. C. (2014). Electrocardiographic interpretation skills of cardiology residents: Are they competent? Canadian Journal of Cardiology, 30(12), 1721–1724.
Wainer, H., & Mislevy, R. J. (2000). Item response theory, item calibration, and proficiency estimation. In H. Wainer (Ed.), Computerized adaptive testing (pp. 63–68). New Jersey: Lawrence Erlbaum & Associates.
Webb, N. M., Shavelson, R. J., & Haertel, E. H. (2006). Reliability coefficients and generalizability theory. In C. R. Rao & S. Sinharay (Eds.), Handbook of statistics (pp. 81–124). Amsterdam: Elsevier.
Funding
This work was supported, in part, by a Royal College of Physicians and Surgeons of Canada Medical Education grant.
About this article
Cite this article
Hatala, R., Gutman, J., Lineberry, M. et al. How well is each learner learning? Validity investigation of a learning curve-based assessment approach for ECG interpretation. Adv in Health Sci Educ 24, 45–63 (2019). https://doi.org/10.1007/s10459-018-9846-x
DOI: https://doi.org/10.1007/s10459-018-9846-x