Effect of clinically discriminating, evidence-based checklist items on the reliability of scores from an Internal Medicine residency OSCE

Advances in Health Sciences Education

Abstract

Objective structured clinical examinations (OSCEs) are used worldwide for summative examinations but often lack acceptable reliability. Research has shown that reliability of scores increases if OSCE checklists for medical students include only clinically relevant items. Also, checklists are often missing evidence-based items that high-achieving learners are more likely to use. The purpose of this study was to determine if limiting checklist items to clinically discriminating items and/or adding missing evidence-based items improved score reliability in an Internal Medicine residency OSCE. Six internists reviewed the traditional checklists of four OSCE stations, classifying items as clinically discriminating or non-discriminating. Two independent reviewers augmented checklists with missing evidence-based items. We used generalizability theory to calculate overall reliability of faculty observer checklist scores from 45 first- and second-year residents and to predict how many 10-item stations would be required to reach a Phi coefficient of 0.8. Removing clinically non-discriminating items from the traditional checklist did not affect the number of stations (15) required to reach a Phi of 0.8 with 10 items. Focusing the checklist on only evidence-based clinically discriminating items increased test score reliability, needing 11 stations instead of 15 to reach 0.8; adding missing evidence-based clinically discriminating items to the traditional checklist modestly improved reliability (needing 14 instead of 15 stations). Checklists composed of evidence-based clinically discriminating items improved the reliability of checklist scores and reduced the number of stations needed for acceptable reliability. Educators should give preference to evidence-based items over non-evidence-based items when developing OSCE checklists.
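
The station-count projection described above is a decision study: variance components from the generalizability analysis are plugged into the Phi formula while the number of stations is varied. The sketch below illustrates that calculation under stated assumptions. It assumes a design with persons crossed with items nested in stations, which the abstract does not specify, and the variance components are hypothetical placeholders, not values from the study. The function names are illustrative only.

```python
def phi_coefficient(var_p, var_s, var_ps, var_i_s, var_pi_s_e,
                    n_stations, n_items):
    """Phi (absolute) coefficient for an assumed p x (i:s) design.

    var_p      : person (resident) variance, the signal of interest
    var_s      : station variance
    var_ps     : person-by-station interaction variance
    var_i_s    : items-within-stations variance
    var_pi_s_e : person-by-item-within-station variance plus residual error
    """
    # Absolute error variance: every non-person component, divided by
    # the number of conditions it is averaged over.
    abs_error = ((var_s + var_ps) / n_stations
                 + (var_i_s + var_pi_s_e) / (n_stations * n_items))
    return var_p / (var_p + abs_error)


def stations_needed(components, n_items=10, target=0.8, max_stations=200):
    """Smallest number of stations whose projected Phi reaches the target.

    Returns None if the target is not reached within max_stations.
    """
    for n_stations in range(1, max_stations + 1):
        phi = phi_coefficient(*components,
                              n_stations=n_stations, n_items=n_items)
        if phi >= target:
            return n_stations
    return None


# Hypothetical variance components (NOT taken from the study), in the
# order var_p, var_s, var_ps, var_i_s, var_pi_s_e.
hypothetical_components = (0.010, 0.005, 0.020, 0.030, 0.100)

if __name__ == "__main__":
    n = stations_needed(hypothetical_components, n_items=10, target=0.8)
    print("Stations needed for Phi >= 0.8:", n)
```

With real variance components estimated for each checklist version, comparing the returned station counts would mirror the comparison reported in the abstract: a checklist with less error variance relative to person variance needs fewer stations to reach the same Phi.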

Acknowledgments

The authors wish to thank Amin Mousavi and Dr. Todd Rogers from the Centre for Research in Applied Measurement and Evaluation at the University of Alberta for their help with generalizability analyses. We are also grateful to our colleagues in the Department of Medicine at the University of Alberta: Dr. Cheryl Goldstein for her assistance in the literature review for evidence-based items, Dr. Zaeem Siddiqi for his input on the Parkinson’s station, and Drs. Bibiana Cujec, Adriana Lazarescu, Fiona Lawson, Fraulein Morales, Uwais Qarni, Irwindeep Sandhu, and Stephanie Smith for their assistance in classifying checklist items. This work was supported in part by the University of Alberta, Department of Medicine’s Medical Education Research Grant.

Conflict of interest

None.

Ethical standard

This study received approval from the Internal Review Boards at the University of Alberta and the University of Illinois at Chicago.

Author information

Corresponding author

Correspondence to Vijay J. Daniels.

About this article

Cite this article

Daniels, V.J., Bordage, G., Gierl, M.J. et al. Effect of clinically discriminating, evidence-based checklist items on the reliability of scores from an Internal Medicine residency OSCE. Adv in Health Sci Educ 19, 497–506 (2014). https://doi.org/10.1007/s10459-013-9482-4

  • DOI: https://doi.org/10.1007/s10459-013-9482-4
