
Clinical observed performance evaluation: a prospective study in final year students of surgery


Abstract

We report a prospective study of clinical observed performance evaluation (COPE) for 197 medical students in the pre-qualification year of clinical education. Psychometric quality was the main endpoint. Students were assessed in groups of 5 in 40-min patient encounters, with each student the focus of evaluation for 8 min. Each student had a series of assessments in a 25-week teaching programme. Over time, several clinicians from a pool of 16 surgical consultants and registrars evaluated each student by direct observation. A structured rating form was used to record assessment data. Variance component analysis (VCA), internal consistency and inter-rater agreement were used to estimate reliability. The predictive and convergent validity of COPE in relation to summative OSCE, long case, and overall final examination was estimated. The median number of COPE assessments per student was 7. Generalisability of a mean score over 7 COPE assessments was 0.66, equal to that of an 8-station final OSCE with 7.5-min stations. Internal consistency was 0.88–0.97 and inter-rater agreement 0.82. Significant correlations were observed with OSCE performance (R = 0.55 disattenuated) and long case (R = 0.47 disattenuated). Convergent validity was 0.81 by VCA. Overall final examination performance was linearly related to mean COPE score, with standard error 3.7%. COPE permitted efficient serial assessment of a large cohort of final year students in a real-world setting. Its psychometric quality compared well with that of conventional assessments and of other direct observation instruments reported in the literature. Effects on learning, and translation to clinical care, are directions for future research.
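
As context for the reported reliability figure (this is our back-of-envelope calculation, not one reported in the paper): inverting the Spearman–Brown relation recovers the implied generalisability of a single 8-min COPE encounter from the value of 0.66 over 7 assessments,

$$ E\rho^2_{(1)} \;=\; \frac{E\rho^2_{(7)}}{7 - 6\,E\rho^2_{(7)}} \;=\; \frac{0.66}{7 - 6 \times 0.66} \;\approx\; 0.22, $$

so any single encounter is only weakly reliable on its own, consistent with the serial-assessment design. The disattenuated correlations quoted above follow the standard correction $r_{\text{true}} = r_{\text{obs}} / \sqrt{r_{xx}\,r_{yy}}$, dividing the observed correlation by the square root of the product of the two instruments' reliabilities.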


Notes

  1. Standard error = SD_Y · √(1 − ρ_XY) = 21.99 · √(1 − 0.539).
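     Completing the arithmetic of Note 1 (the evaluation is ours; the 400-mark examination total is an assumption, not stated here, that would reconcile this with the 3.7% quoted in the abstract):

     $$ 21.99 \times \sqrt{1 - 0.539} \;\approx\; 21.99 \times 0.679 \;\approx\; 14.9 \text{ marks} \;\approx\; 3.7\% \text{ of } 400. $$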

  2. Standard error of measurement of the mean over n assessments = √(error variance/n). Multiplying this value by 1.96 gives the half-width of a 95% confidence interval for the mean score. Error variance for COPE is 1.0512 (Table 3); for mini-CEX see Norcini et al. (2003).
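     For illustration (a worked example of ours, using the COPE error variance above and the median of 7 assessments per student reported in the abstract):

     $$ \mathrm{SEM}_{\bar{x}} = \sqrt{\tfrac{1.0512}{7}} \approx 0.39, \qquad 1.96 \times 0.39 \approx 0.76, $$

     i.e. a 95% confidence interval of roughly ±0.76 rating-scale units around a student's mean COPE score.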

References

  • Alves de Lima, A., Barrero, C., Barratta, S., et al. (2007). Validity, reliability, feasibility and satisfaction of the Mini-Clinical Evaluation Exercise (Mini-CEX) for cardiology residency training. Medical Teacher, 29(8), 785–790.

  • Bloch, R. (2010). G_String. http://fhsperd.mcmaster.ca/g_string.

  • Brennan, R. L. (2001). urGENOVA. University of Iowa. http://www.education.uiowa.edu/casma/GenovaPrograms.htm.

  • Ericsson, K. A. (2004). Deliberate practice and the acquisition and maintenance of expert performance in medicine and related domains. Academic Medicine, 79(Suppl 10), S70–S81.

  • Harden, R. M., & Gleeson, F. A. (1979). Assessment of clinical competence using an objective structured clinical examination (OSCE). Medical Education, 13, 41–54.

  • Hasnain, M., Connell, J., et al. (2004). Toward meaningful evaluation of clinical competence: The role of direct observation in clerkship ratings. Academic Medicine, 79(10), S21–S24.

  • Hatala, R., & Norman, G. R. (1999). In-training evaluation during an internal medicine clerkship. Academic Medicine, 74(Suppl 10), S118–S120.

  • Hodges, B., & McIlroy, J. H. (2003). Analytic global OSCE ratings are sensitive to level of training. Medical Education, 37, 1012–1016.

  • Kane, M. T. (1982). A sampling model for validity. Applied Psychological Measurement, 6, 125–160.

  • Kane, M. T. (1992). The assessment of professional competence. Evaluation & the Health Professions, 15, 163–182.

  • Kogan, J. R., Holmboe, E. S., & Hauer, K. E. (2009). Tools for direct observation and assessment of clinical skills of medical trainees: A systematic review. JAMA, 302(12), 1316–1326.

  • Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.

  • Meskauskas, J. A. (1983). Studies of the oral examination: The examinations of the subspecialty Board of Cardiovascular Disease of the American Board of Internal Medicine. In J. S. Lloyd & D. G. Langsley (Eds.), Evaluating the skills of medical specialists. Chicago, IL: American Board of Medical Specialties.

  • Noel, G. L., Herbers, J. E., Caplow, M. P., Cooper, G. S., et al. (1992). How well do internal medicine faculty members evaluate the clinical skills of residents? Annals of Internal Medicine, 117, 757–765.

  • Norcini, J. J., Blank, L. L., Duffy, F. D., & Fortna, G. S. (2003). The Mini-CEX: A method for assessing clinical skills. Annals of Internal Medicine, 138, 476–481.

  • Reed, D., Price, E., Windish, D., et al. (2005). Challenges in systematic reviews of educational intervention studies. Annals of Internal Medicine, 142(12 Pt 2), 1080–1089.

  • Regehr, G., MacRae, H., Reznick, R., & Szalay, D. (1998). Comparing the psychometric properties of checklists and global rating scales for assessing performance on an OSCE-format examination. Academic Medicine, 73(9), 993–997.

  • Richards, M. L., Paukert, J. L., Downing, S. M., & Bordage, G. (2007). Reliability and usefulness of clinical encounter cards for a third year surgical clerkship. The Journal of Surgical Research, 140(1), 139–148.

  • Schuwirth, L. T., & Van der Vleuten, C. P. M. (2004). Changing education, changing assessment, changing research? Medical Education, 38, 805–812.

  • Swanson, D., Norman, G., & Linn, R. (1995). Performance-based assessment: Lessons from the health professions. Educational Researcher, 24(5), 5–11.

  • Torre, D. M., Simpson, D. E., Elnicki, D. M., Sebastian, J. L., & Holmboe, E. S. (2007). Feasibility, reliability and user satisfaction with a PDA-based Mini-CEX to evaluate the clinical skills of third-year medical students. Teaching and Learning in Medicine, 19(3), 271–277.

  • Turnbull, J., MacFadyen, J., van Barneveld, C., & Norman, G. (2000). Clinical work sampling: A new approach to the problem of in-training evaluation. Journal of General Internal Medicine, 15(8), 556–561.

  • Van der Vleuten, C. P. M. (1996). The assessment of professional competence: Developments, research and practical implications. Advances in Health Sciences Education, 1, 41–67.

  • Van der Vleuten, C. P. M., Norman, G. R., & De Graaff, E. (1991). Pitfalls in the pursuit of objectivity: Issues of reliability. Medical Education, 25, 110–118.

  • Van der Vleuten, C. P. M., & Schuwirth, L. T. (2005). Assessing professional competence: From methods to programmes. Medical Education, 39, 309–317.

  • Van der Vleuten, C. P. M., & Swanson, D. B. (1990). Assessment of clinical skills with standardized patients: State of the art. Teaching and Learning in Medicine, 2, 58–76.

  • Wass, V., Jones, R., & van der Vleuten, C. P. M. (2001). Standardized or real patients to test clinical competence? The long case revisited. Medical Education, 35, 321–325.

  • Wilkinson, T. J., Campbell, P. J., & Judd, S. J. (2008). Reliability of the long case. Medical Education, 42, 887–893.

  • Williams, R. G., Klamen, D. A., & McGaghie, W. C. (2003). Cognitive, social and environmental sources of bias in clinical performance ratings. Teaching and Learning in Medicine, 15, 270–292.

Author information

Corresponding author

Correspondence to G. C. Markey.


About this article

Cite this article

Markey, G.C., Browne, K., Hunter, K. et al. Clinical observed performance evaluation: a prospective study in final year students of surgery. Adv in Health Sci Educ 16, 47–57 (2011). https://doi.org/10.1007/s10459-010-9240-9
