Journal of General Internal Medicine, Volume 15, Issue 8, pp 556–561

Clinical work sampling

A new approach to the problem of in-training evaluation
  • J. Turnbull
  • J. MacFadyen
  • C. van Barneveld
  • G. Norman
Innovations In Education And Clinical Practice


OBJECTIVE: Existing systems of in-training evaluation (ITE) have been criticized as being unreliable and invalid methods for assessing student performance during clinical education. The purpose of this study was to assess the feasibility, reliability, and validity of a clinical work sampling (CWS) approach to ITE. This approach focused on the following: (1) basing performance data on observed behaviors, (2) using multiple observers and occasions, (3) recording data at the time of performance, and (4) allowing for a feasible system to receive feedback.

PARTICIPANTS: Sixty-two third-year University of Ottawa students were assessed during their 8-week internal medicine inpatient experience.

MEASUREMENTS AND MAIN RESULTS: Four performance rating forms (Admission Rating Form, Ward Rating Form, Multidisciplinary Team Rating Form, and Patient’s Rating Form) were introduced to document student performance. Voluntary participation rates were variable (12%–64%), and the Patient’s Rating Form was excluded from the analysis because of its low response rate (12%). The mean number of evaluations per student per rotation (19) exceeded the number of evaluations needed to achieve sufficient reliability. Reliability coefficients were high for the Ward Form (.86) and the Admission Form (.73) but not for the Multidisciplinary Team Form (.22). There was an examiner effect (rater leniency), but this was small relative to real differences between students. Correlations between the Ward Form and the Admission Form were high (.47), while the correlations of these two forms with the Multidisciplinary Team Form were lower (.37 and .26, respectively). The CWS approach to ITE was considered to be content valid by expert judges.
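
The abstract does not say how the number of evaluations needed for sufficient reliability was derived. One standard way to reason about it is the Spearman-Brown prophecy formula, which relates the reliability of a single rating to the reliability of the mean of several ratings. The sketch below is illustrative only: the function names, the single-rating reliability of 0.25, and the target of 0.80 are assumptions chosen for the example, not values taken from the study.

```python
import math


def spearman_brown(r_single: float, n: float) -> float:
    """Reliability of the mean of n ratings, given the reliability of one rating."""
    return n * r_single / (1 + (n - 1) * r_single)


def ratings_needed(r_single: float, target: float) -> int:
    """Smallest whole number of ratings whose mean reaches the target reliability."""
    if not (0 < r_single < 1 and 0 < target < 1):
        raise ValueError("reliabilities must lie strictly between 0 and 1")
    n = target * (1 - r_single) / (r_single * (1 - target))
    # Subtract a tiny tolerance so floating-point noise does not push
    # an exact answer (e.g. 12.000000000000004) up to the next integer.
    return math.ceil(n - 1e-9)


# Hypothetical inputs (NOT from the study): single-rating reliability 0.25,
# target reliability 0.80 for the averaged rating.
needed = ratings_needed(0.25, 0.80)
achieved = spearman_brown(0.25, needed)
```

Under these assumed inputs, a student's observed number of ratings would be compared with `needed`; if the observed count is larger, the averaged rating is expected to meet the target reliability.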

CONCLUSIONS: The collection of ongoing performance data was reasonably feasible, reliable, and valid.

Key words

In-training evaluation; clinical work sampling





Copyright information

© Society of General Internal Medicine 2000

Authors and Affiliations

  • J. Turnbull (1)
  • J. MacFadyen (1)
  • C. van Barneveld (1)
  • G. Norman (2)
  1. Department of Medicine, University of Ottawa, Ottawa, Canada
  2. Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Canada
