Testing medical student knowledge in the pre-clinical years differs from testing during clinical rotations. The former relies predominantly on data derived from exams, assignments, team-based learning activities, and other objective measures [1]. Testing and grading during the clinical years pose a greater challenge for educators. While students still encounter standardized knowledge-based examinations, their clinical skills must also be assessed [2]. Assessing clinical skills during patient care experiences can be accomplished by both preceptor evaluations and Objective Structured Clinical Examinations (OSCEs) [3].

Preceptors gauge student strengths and weaknesses during the daily activities of patient care. Ideally, preceptors develop relationships with students, enabling comment on professional conduct, interpersonal skills, and patient care skills over the course of a clerkship. Drawbacks to relying exclusively on preceptor evaluations of clinical skills and professionalism include biases such as the “halo effect,” subjectivity, and variability in grading [4]. Medical students often work with multiple physicians, further adding to student concerns regarding grading accuracy. Faculty development to standardize preceptor grading requires time and resources. Objective assessment methods can address these issues [5].

OSCEs are time-limited examinations with standardized patients (SPs) that test clinical skills such as history-taking, physical examination, and clinical reasoning. Students also gain communication experience presenting findings, assessments, and plans. OSCEs further assess relationship building with patients, the level of empathy conveyed, and the rapport established [6, 7]. The use and validity of OSCEs as assessment methods in psychiatry clerkships have been studied [8]. Hodges et al. found that global process ratings (i.e., interpersonal skills) supported construct validity, while content checklists (i.e., history taking) supported concurrent validity [9]. Park et al. also supported the construct validity of psychiatry OSCEs in determining student competencies [10]. The use of OSCEs in psychiatry medical education continues to grow [11].

In 2016, curricular change at the University of North Carolina (UNC) School of Medicine included a shift to normative-based grading for clinical courses. Prior to the curricular change, the psychiatry clerkship utilized an observed interview and oral exam, in which a student interviewed a hospitalized psychiatric patient while observed by an attending psychiatrist in the room. The student would then present their findings and discuss their assessment, differential diagnosis, and treatment plan with the attending. While this simulated a “realistic” interview, using “live” patients posed multiple challenges, including the availability of faculty to participate in person, identification of suitable patients for interviews, variability in patient presentation, and space availability on the inpatient units.

With the transition to normative-based grading in 2016, student complaints regarding variability in patients increased, as did grade concerns and appeals. These were challenging to manage without any record of the encounter other than reports from the student and the faculty member grading the observed interview. The psychiatry course director worked with the UNC Clinical Skills Center to create a standardized patient OSCE that preserved evaluation of clinical skills, including history-taking, the mental status exam (MSE), oral presentation, clinical reasoning, and treatment planning. With the change in curriculum for the psychiatry clinical course, grading differences were analyzed between the academic years that utilized hospitalized patients and those that utilized trained standardized patients in OSCE format.

OSCE Development and Training

Please refer to Fig. 1 for an outline of the development and training process.

Fig. 1 Psychiatry OSCE development process

OSCE Format

Beginning in March 2017, observed interviews transitioned entirely to a standardized patient OSCE format. OSCE encounters occurred in the UNC Clinical Skills Center. During the OSCE, medical students were provided a “door note” with the chief complaint and instructions to take a patient history and complete a mental status exam in a 30-min encounter. After the encounter, SPs were prompted to exit the room. Students were given 5 min to prepare their presentation for “staffing” with resident physicians, and 25 min for presentation and discussion of assessment and plan.

Standardized Patients

SPs received training prior to participating in the OSCE. The cases were constructed to enable SPs to develop an individual “backstory” for a more realistic interview while maintaining a standardized psychiatric, medical, family, and social history. Training also focused on portraying the nuances of psychiatric symptoms in the history and in mental status exam findings, with both pertinent positives and pertinent negatives. OSCE training also involved observed practice of the assigned cases. SPs received feedback following each OSCE from the SP trainer in the Clinical Skills Center and the course director.

Resident Observers

A long-standing, highly rated interview skill building program that pairs second-year residents or above with clerkship students was leveraged to provide clinically experienced observers who could also “staff” the student’s presentation and discussion of the assessment and plan for the OSCE. As in their prior role in the interview skill building program, resident “tutors” actively prepared students for the OSCE during the clerkship, providing direct observation and feedback on interview skills, presentation of findings, the mental status exam, differential diagnosis, and formulation of the assessment and plan. During the OSCE, the paired resident tutor remotely observed the interaction between the medical student and the standardized patient via a video feed, which was recorded. The same paired resident tutor then joined the student in the examination room to “staff” the case: the student presented their findings, discussed their differential diagnosis, and presented a treatment plan. Resident tutors were trained to ask follow-up questions to explore aspects of the differential, assessment, and plan. Typically, the medical student’s paired resident tutor was present; at times, a neutral senior resident assisted during the OSCE when the student’s resident tutor was unavailable.

Residents participated in two training workshops prior to beginning their work as tutors and with the OSCE. Videos of a “gold-standard” example medical student interview and presentation/staffing were created for training. Residents were also provided workbooks with resources and preparatory materials. To address the specific skillset of “staffing,” a dedicated guide to the student oral presentation was developed, covering approaches to engaging the student, active listening, asking clarifying questions related to the history and mental status exam, and questions aimed toward development of an appropriate differential and plan (including safety, workup, medical, and psychosocial interventions). Each section included sample questions residents could use to structure how they staffed the OSCE. During the OSCE, residents were provided with case-specific notes listing key elements of the history, MSE, differential diagnosis, and treatment considerations.

Grading

OSCEs were video recorded to enable asynchronous grading by attending psychiatrists as well as later review. A cadre of up to fifteen faculty (per academic year) graded the OSCE. Importantly, the residents did not grade the students. As with the prior observed interview with live patients, students in the OSCE were graded on seven items: establishing rapport, eliciting historical information, mental status exam, language/facilitation, clarity/organization of presentation, differential diagnosis/formulation, and treatment plan. Because several versions of the course existed during the curricular transition, the weight of the observed interview in the final grade varied. Thus, the seven items were scored on a scale ranging from 1 (failure) to 21 (outstanding) from 2012 to 2015 and from 1 (failure) to 8 (outstanding) from 2015 to the present.

This analysis was reviewed by the Office of Human Research Ethics and granted IRB exemption. Data from 8 academic years were gathered to compare the former live patient observed interview/oral exam with the OSCE involving standardized patients and trained residents. To grade the OSCE, rating scales were used to evaluate student performance on the seven items described above. For the purposes of this study, the evaluation items were summed into a total OSCE score, which was then converted to a percentage for analysis. Independent samples t tests and analysis of variance were used to analyze differences between OSCE scores. Data analysis was conducted using IBM SPSS v26 (Chicago, IL).
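As an illustrative sketch only, the score preparation and pre/post comparison described above could be reproduced outside SPSS roughly as follows. The file name, column names (item_1 through item_7, academic_year, max_item_score), and the pre/post grouping of academic years are hypothetical stand-ins, not the study dataset or the SPSS syntax actually used.

```python
# Hypothetical re-implementation in Python (pandas + SciPy) of the analysis described above.
import pandas as pd
from scipy import stats

df = pd.read_csv("osce_scores.csv")  # hypothetical file of de-identified student records

# Sum the seven evaluation items into a total score, then convert to a percentage
# so scores from the 1-21 (2012-2015) and 1-8 (2015-present) scales are comparable.
item_cols = [f"item_{i}" for i in range(1, 8)]
df["total"] = df[item_cols].sum(axis=1)
df["percent"] = df["total"] / (7 * df["max_item_score"]) * 100

# Independent samples t test: live patient exam (pre) vs. standardized patient OSCE (post),
# using the pre/post year grouping described in the Discussion.
pre_years = {"2012-2013", "2013-2014", "2014-2015", "2015-2016", "2016-2017"}
pre = df.loc[df["academic_year"].isin(pre_years), "percent"]
post = df.loc[~df["academic_year"].isin(pre_years), "percent"]
t_stat, p_value = stats.ttest_ind(pre, post)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```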

Analysis

Data for this study consisted of student records from eight academic years (2012 to 2020). Table 1 details the number of students and average OSCE percentages by year.

Table 1 Average OSCE rating

An independent samples t test was used to compare OSCE scores pre-intervention to post-intervention. The mean percentage was 81.92 pre-intervention versus 75.47 post-intervention. The difference was significant (t = 9.06, p = 0.001).

A one-way between-subjects ANOVA was conducted to compare the average OSCE percentages across academic years. There was a significant difference at the p < 0.05 level [F(7, 1117) = 23.91, p = 0.001]. Pairwise comparisons demonstrated significant mean differences between AY 2012–2013, 2013–2014, and 2014–2015 and the remaining years (p < 0.001 for all pairwise comparisons). Additionally, Bonferroni post hoc tests indicated that AY 2016–2017 and AY 2017–2018 were significantly different (p = 0.015).
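Continuing the hypothetical dataframe sketched earlier, the between-years comparison could be approximated as below. Note that pairwise t tests with a Bonferroni multiplier only approximate SPSS’s pooled-error Bonferroni post hoc procedure; this is a sketch, not the analysis code used in the study.

```python
# One-way ANOVA across academic years, followed by Bonferroni-corrected pairwise t tests.
# Assumes the hypothetical `df` with a "percent" column built in the previous sketch.
from itertools import combinations
from scipy import stats

groups = {year: sub["percent"].values for year, sub in df.groupby("academic_year")}

# One-way between-subjects ANOVA across the eight academic years.
f_stat, p_anova = stats.f_oneway(*groups.values())
print(f"F = {f_stat:.2f}, p = {p_anova:.3f}")

# Bonferroni post hoc: raw pairwise p values multiplied by the number of comparisons.
pairs = list(combinations(sorted(groups), 2))
for year_a, year_b in pairs:
    t_stat, p_raw = stats.ttest_ind(groups[year_a], groups[year_b])
    p_adj = min(p_raw * len(pairs), 1.0)  # Bonferroni correction, capped at 1.0
    print(f"{year_a} vs {year_b}: t = {t_stat:.2f}, adjusted p = {p_adj:.3f}")
```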

Discussion

Transition from “live” patients to standardized patients afforded multiple benefits, perhaps most importantly a structured exam with reduced variability. Recording the encounters allowed faculty to review and grade after the OSCE, to consult the video recording when grade disputes arose, and to use the recordings for resident and faculty development.

We found significant differences in student performance, as measured by percentage grades, when comparing student encounters with live patients that were directly observed by an attending grader in the room (2012–2013 through 2016–2017) with remotely observed and graded student encounters with standardized patients (OSCE, 2017–2018 through 2019–2020). OSCE grades were lower than grades given for the live patient encounter exams. This difference could represent effects of shortened preparation time and/or other structural changes in the clerkship over several years. It may also result from the standardization of cases and patients in the OSCE encounters, which reduced variability in the student experience. The ability of graders to review videos as they graded may have reduced ambiguity and recall bias. The absence of a halo effect associated with bedside teaching may also have allowed for more objective grading of student performance in the OSCE.

Involving psychiatry residents who worked with the student as part of the OSCE is unique and has been well received by students. Psychiatry resident involvement allows the medical student to present their findings to a familiar face. It also affords residents practice in staffing cases. Rarely during training do residents get the opportunity to think critically about gaps in a presentation and about what additional information would help them better understand the formulation of a patient. This serves as one way to give residents that experience in a low-risk, structured setting. The ability to staff cases is a critical skill for residents who seek careers in academic medicine and/or supervisory roles. Psychiatry residents at our institution are rarely placed in a supervisory role, leaving no formalized opportunity to develop skills in staffing case presentations. Building on the highly rated tutoring experience has allowed resident training to advance in an enjoyable way. While the involvement of the paired resident tutor has been well received by students, it may introduce bias, which is mitigated by not having the residents grade students.

There are limitations to this work. This OSCE structure, with its resident involvement, was implemented, and the results presented were obtained, at a single institution with a relatively large resident contingent. Schools with smaller residency programs may find this model challenging. An additional potential limitation is interrater reliability. Experienced graders involved in both the “live” patient interviews and the video-recorded OSCEs using standardized patients may score differently than less experienced graders. This has been addressed to an extent by careful review of scores, identification of outliers, video review, and discussion with the grader. Additionally, the course director and coordinator annually review grading data, comparing each grader with the mean and sharing this information with the graders. There is further potential to promote internal consistency among graders, such as having the full cohort score the same video recording and comparing results.

Relatedly, use of a single OSCE encounter is a potential limitation, as it relies on a single grader in its current form. Our consistent format of a single encounter, however, allowed for the comparison made with fewer variables. While availability of resources such as SPs, staff, and space is a consideration, multi-station OSCEs can be developed to address this concern. Standardized patients could also be trained to evaluate student performance, which may offer additional data from multiple graders.

There are various ways to assess students’ clinical performance during clerkship rotations, with OSCEs increasingly being used to accomplish this goal [11]. Pivoting from a “live” patient interview to a standardized patient OSCE alleviated many challenges, namely faculty availability, identification of suitable patients for interviews, variability in patient presentations, and space constraints. Clear benefits of video recording include ease of grading, settling grade disputes, and providing training and development for graders. These benefits outweighed the time and effort put into developing cases and training standardized patients and residents.

The involvement of psychiatry residents in the OSCE, both as remote observers of the encounter and in an in-person role staffing the student presentation, is unique. It allows residents not only to facilitate the student’s exploration of the differential, assessment, and plan, but also to gain experience staffing cases in a simulated supervisory role.

Future directions include more structured evaluations of the OSCE experience by medical students and residents. The school’s course evaluations did not capture specific ratings of the OSCE. Students could evaluate the OSCE in relation to attainment of clinical skills and knowledge objectives. Data on student perceptions of the effectiveness of resident tutors could be used to measure residents’ growth and their preparation for teaching responsibilities. Additional data could focus on the resident experience in the OSCE, including the potential benefits of staffing cases and growth as an educator.

In fact, our experience with video-recorded encounters with standardized patients enabled our clerkship to transition easily to an entirely virtual psychiatry OSCE in 2020–2021, preserving a directly observed assessment in the face of a pandemic that made non-patient care–related contact with live patients a risk for learners and patients. Furthermore, direct observation and assessment of telemedicine clinical skills prepare students for this increasingly utilized mode of care.