Advances in Health Sciences Education, Volume 16, Issue 1, pp 47–57

Clinical observed performance evaluation: a prospective study in final year students of surgery



We report a prospective study of clinical observed performance evaluation (COPE) for 197 medical students in the pre-qualification year of clinical education. Psychometric quality was the main endpoint. Students were assessed in groups of 5 in 40-min patient encounters, with each student the focus of evaluation for 8 min. Each student had a series of assessments in a 25-week teaching programme. Over time, several clinicians from a pool of 16 surgical consultants and registrars evaluated each student by direct observation. A structured rating form was used for assessment data. Variance component analysis (VCA), internal consistency and inter-rater agreement were used to estimate reliability. The predictive and convergent validity of COPE in relation to summative OSCE, long case, and overall final examination was estimated. Median number of COPE assessments per student was 7. Generalisability of a mean score over 7 COPE assessments was 0.66, equal to that of an 8 × 7.5 min station final OSCE. Internal consistency was 0.88–0.97 and inter-rater agreement 0.82. Significant correlations were observed with OSCE performance (R = 0.55 disattenuated) and long case (R = 0.47 disattenuated). Convergent validity was 0.81 by VCA. Overall final examination performance was linearly related to mean COPE score with standard error 3.7%. COPE permitted efficient serial assessment of a large cohort of final year students in a real world setting. Its psychometric quality compared well with conventional assessments and with other direct observation instruments as reported in the literature. Effect on learning, and translation to clinical care, are directions for future research.
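The reliability figures above can be explored with a few lines of classical test theory. The sketch below is illustrative only: the study estimated generalisability by variance component analysis, whereas this uses the Spearman-Brown formula as a simplifying approximation, and it assumes every assessment behaves as a parallel measurement. The function names are hypothetical. The abstract does not state which reliabilities were used to disattenuate the correlations, so the 0.66 values passed to `disattenuate` are an assumption.

```python
import math


def single_from_mean(mean_reliability: float, n: int) -> float:
    """Invert Spearman-Brown: reliability of ONE assessment, given the
    reliability of a mean over n assessments (parallel-forms assumption)."""
    return mean_reliability / (n - (n - 1) * mean_reliability)


def spearman_brown(single_reliability: float, n: float) -> float:
    """Reliability of a mean over n assessments, given single reliability."""
    return n * single_reliability / (1 + (n - 1) * single_reliability)


def assessments_needed(single_reliability: float, target: float) -> int:
    """Smallest whole number of assessments whose mean reaches `target`."""
    n = target * (1 - single_reliability) / (single_reliability * (1 - target))
    return math.ceil(n)


def disattenuate(r_observed: float, rel_x: float, rel_y: float) -> float:
    """Correct an observed correlation for unreliability in both measures."""
    return r_observed / math.sqrt(rel_x * rel_y)


if __name__ == "__main__":
    # Reported: generalisability of a mean over 7 COPE assessments = 0.66.
    single = single_from_mean(0.66, 7)
    print(f"Implied single-assessment reliability: {single:.3f}")

    # How many assessments would a mean need to reach 0.80 under this model?
    print(f"Assessments needed for 0.80: {assessments_needed(single, 0.80)}")

    # ASSUMPTION: both measures have reliability 0.66. An observed
    # correlation of 0.363 would then disattenuate to 0.55.
    print(f"Disattenuated r: {disattenuate(0.363, 0.66, 0.66):.2f}")
```

Under these assumptions a single 8-min observation carries a reliability of roughly 0.22, which illustrates why serial assessment by several raters is needed before a mean score becomes dependable.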


Keywords: Assessment; Direct observation; Evaluation; Generalisability; Measurement; Psychometric; Reliability; Validity



Copyright information

© Springer Science+Business Media B.V. 2010

Authors and Affiliations

  • G. C. Markey (1, 3)
  • K. Browne (1)
  • K. Hunter (2)
  • A. D. Hill (1)

  1. Department of Surgery, RCSI/Beaumont Hospital, Dublin 9, Ireland
  2. Dublin, Ireland
  3. Department of Emergency Medicine, St James’s Hospital, Dublin 8, Ireland
