Introduction

Clinical clerkships by their very nature are complex curricular events beholden to many factors outside the control of clerkship directors and medical education leadership, such as patient mix, healthcare team dynamics, ever-changing healthcare system policies, and external factors such as the COVID-19 pandemic [1]. The inability to completely standardize a clerkship experience for students makes it challenging to assess the impact of a curricular change on student performance. Beyond clerkship length, the literature identifies many factors that correlate with student performance in the third-year clinical clerkships, including sequence of rotations [2,3,4], USMLE Step 1 performance [5], student self-reported stress [6], pre-clerkship clinical evaluations [7], and even pre-matriculation factors [8, 9].

Clinical clerkship changes due to external forces such as COVID-19, or due to internal causes such as shortening preclinical curricular time, may result in anxiety for students enrolled in the shortened curriculum. Current literature describes conflicting results regarding the impact of shortening clerkship length on student National Board of Medical Examiners (NBME) subject exam performance. Several studies looking at NBME scores for students in shortened clinical rotations, in a variety of specialties, showed no effect or even at times positive effects [10,11,12,13]. In contrast, other studies showed a negative effect of shorter rotations on NBME subject exams [14,15,16]. These negative effects seemed more likely to occur with greater reductions in clerkship length [15], and at least one study of lengthening a clerkship showed an increase in scores [17]. Separately, multiple studies found that individual Step 1 performance correlates significantly with NBME subject scores [2, 10]. In one study, USMLE Step 1 scores identified students at risk of poor performance on NBME subject examinations [2]. Another study found no significant association between clerkship length and internal medicine exam performance but did find that Step 1 performance remained positively associated with internal medicine exam performance after controlling for all other variables in the model [10].

While there are many publications examining the impact of shortened duration in single clerkships, there are relatively few publications on the impact of an entire shortened clinical clerkship curriculum. Robert Wood Johnson published a study of its shortened curriculum in 1988 which showed a non-significant change in Step 2 CK scores [18]. Recently the University of Michigan published a study of their clinical clerkship curriculum changes which showed that a 25% decrease in clerkship length resulted in no significant student performance differences [19]. There is little published literature examining the impact of shortened clerkships on other performance measures, including performance on objective structured clinical examinations (OSCEs) and student satisfaction with the clerkship. Some studies show no difference in student satisfaction with shorter curricula or even better evaluation scores for shortened curricula [20]. Another study of a shortened surgery clerkship showed no difference in USMLE Step 2 CK scores but did show differences in OSCE performance and student satisfaction [21]. No published literature was found on the impact of shortened clinical clerkship curricula on performance on the Step 2 Clinical Skills (CS) exam.

Due to a change in the clinical curriculum during the 2018–2019 academic year, most clinical clerkships were shortened by an average of 20% to allow for effective transition to the new curriculum. We sought to add to the literature by assessing the impact across all clerkships while controlling for Step 1 scores and by evaluating a relatively comprehensive set of outcomes. We used the Kirkpatrick model as a framework to guide our analysis of outcomes to include aspects of learner reaction, learning, behavior change, and results. The goal of this study was to evaluate how shortening clerkship length affected student clerkship experience via (1) NBME subject examination performance, (2) Step 2 CK performance, (3) institutional OSCE performance, (4) clerkship evaluation data, (5) number of direct observations of history and physical examination, and (6) reported duty hour violations.

Methods

Participants

Students from two consecutive classes who successfully completed the preclinical curriculum, passed Step 1, and completed all eight Year 3 clerkship rotations and Step 2 CK “on-cycle” were included (inclusion criteria). Data were not included from “off-cycle” students from earlier cohorts who were returning to the curriculum after taking a leave of absence (e.g., participants enrolled in the MD/PhD program). These off-cycle students were excluded from this analysis to limit confounding variables when assessing differences between the two Year 3 curricula.

Setting

Wake Forest School of Medicine is a 4-year allopathic medical school with the previous curriculum consisting of 2 years composed of pre-clerkship coursework (e.g., anatomy, biochemistry, neuroscience, and a series of system pathophysiology blocks) and various longitudinal courses (e.g., clinical skills and population medicine). Students are required to pass the USMLE Step 1 examination before being promoted into the second half of the curriculum consisting of required third year clerkships and both required and elective fourth year clerkships. Our revised curriculum entitled “Wake Ready” shortened the pre-clerkship curricula to 18 months and integrated additional content threads of radiology, dermatology, and ultrasound. Students are still required to pass Step 1 prior to beginning their clerkships. The last year of the curriculum has been lengthened to allow greater individualization for students. The transition from Wake Forest’s traditional curriculum to Wake Ready created overlapping year 3 curricula, and therefore, students in the class of 2020 participated in a shortened 40-week year 3 curriculum to ensure sufficient clinical exposure for all students. Clerkship comparisons between the traditional and shortened curricula are presented in Table 1.

Table 1 Description of Traditional versus Shortened clerkship curriculum and clerkship changes

Measurements and Procedures

Objective scoring performance data on MCAT, USMLE Step 1, and USMLE Step 2 CK were obtained and used as outlined in the statistical analyses below. During each clerkship, both student cohorts completed NBME subject examinations. Throughout the curriculum, students complete OSCEs as noted above. Students perform a focused history and physical exam on a standardized patient and write a patient note which is submitted for evaluation. Like Step 2 CS, the patient note is composed of history, physical examination findings, a differential diagnosis list, and proposed workup plan. The physician rater scores each note section using a case-specific checklist. Each note section is 25% of the total score. Six OSCE cases were administered to both cohorts and were included in the analysis.
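As a rough illustration of the equal section weighting described above, the sketch below assumes each of the four note sections is scored as the fraction of its case-specific checklist items credited by the rater, and then contributes 25% of the total. The function name, the section keys, the checklist counts, and the fraction-based section score are illustrative assumptions; the actual rubric is not specified here.

```python
# Hypothetical section names for the four parts of the patient note.
SECTIONS = ("history", "physical_exam", "differential", "workup")


def note_total_score(items_credited, items_possible):
    """Return a 0-100 note score with each section weighted 25%.

    Assumption: a section score is the fraction of checklist items
    credited by the rater (not confirmed by the source).
    """
    total = 0.0
    for section in SECTIONS:
        fraction = items_credited[section] / items_possible[section]
        total += 25.0 * fraction
    return total


# Made-up checklist counts for a single note.
credited = {"history": 8, "physical_exam": 6, "differential": 2, "workup": 3}
possible = {"history": 10, "physical_exam": 8, "differential": 4, "workup": 4}
print(note_total_score(credited, possible))
```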

The institution collects data on the student’s impression of the overall educational experience at the end of each clerkship. Students are required to complete end-of-clerkship evaluations, but student evaluation data was included in this analysis only if they completed their evaluation within 30 days following the end of the clerkship. Students are required to self-report the number of instances they are directly observed performing a history and physical exam by a faculty member during each clerkship and self-report the number of duty hour violations and reason for each violation. These student-reported experiential data were also analyzed for differences between the two cohorts.

Statistical Analyses

Cohort demographics, MCAT, USMLE Step 1, NBME subject examination, and USMLE Step 2 CK scores were analyzed using descriptive statistics. MCAT and Step 1 scores were investigated to assess for differences between cohorts at entry into the preclinical and clinical phases of the curriculum.

An analysis of variance (ANOVA) was conducted to establish whether there were differences between the cohorts on Step 1, NBME subject examinations, clinical practice examinations (OSCEs), and Step 2 CK examination. Recognizing the important role Step 1 plays in NBME subject examination performance and out of concern that any potential change in NBME subject examination performance may drive change in Step 2 CK, we sought to control for these variables. Analysis of covariance (ANCOVA) explored USMLE Step 1 as a moderator on NBME subject examination performance and NBME subject examination performance as a moderator on Step 2 CK.
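The covariate adjustment described above can be sketched as a one-covariate ANCOVA, assuming a common regression slope across cohorts. The F statistic compares the residual sum of squares of a model with the covariate only against a model with the covariate plus cohort. The function name and the toy data are hypothetical, and this is not necessarily how the original analysis was run; obtaining a p value would additionally require the F distribution from a statistics package.

```python
def _mean(values):
    return sum(values) / len(values)


def _cross(xs, ys, x_bar, y_bar):
    """Sum of cross-products of deviations from the given means."""
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))


def ancova_f(groups):
    """One-covariate ANCOVA for the group effect (common-slope assumption).

    groups: list of (covariate_values, outcome_values), one pair per cohort.
    Returns (F, df_between, df_error).
    """
    all_x = [x for xs, _ in groups for x in xs]
    all_y = [y for _, ys in groups for y in ys]
    n_total, k = len(all_x), len(groups)

    # Reduced model (outcome ~ covariate): sums of squares about grand means.
    gx, gy = _mean(all_x), _mean(all_y)
    sxx_t = _cross(all_x, all_x, gx, gx)
    sxy_t = _cross(all_x, all_y, gx, gy)
    syy_t = _cross(all_y, all_y, gy, gy)
    sse_reduced = syy_t - sxy_t ** 2 / sxx_t

    # Full model (outcome ~ covariate + group): pooled within-group sums.
    sxx_w = sxy_w = syy_w = 0.0
    for xs, ys in groups:
        mx, my = _mean(xs), _mean(ys)
        sxx_w += _cross(xs, xs, mx, mx)
        sxy_w += _cross(xs, ys, mx, my)
        syy_w += _cross(ys, ys, my, my)
    sse_full = syy_w - sxy_w ** 2 / sxx_w

    df_between = k - 1
    df_error = n_total - k - 1  # one extra df spent on the covariate slope
    f_stat = ((sse_reduced - sse_full) / df_between) / (sse_full / df_error)
    return f_stat, df_between, df_error


# Toy data only: covariate (e.g., a Step 1-like score) and outcome per cohort.
toy = [([1, 2, 3], [2, 4, 7]), ([1, 2, 3], [4, 6, 7])]
print(ancova_f(toy))
```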

Descriptive statistics were used to report results of end-of-clerkship evaluations. Chi-square analyses were undertaken to explore for rating differences on student-reported evaluations. This study was approved by our Institutional Review Board (IRB00060907).

Results

A total of 120 students were initially included in the traditional curriculum cohort, and 127 students were initially included in the shortened curriculum cohort. Sixteen students in the traditional cohort and twenty-three students in the shortened cohort were subsequently categorized as off-cycle (e.g., MD/PhD program), which resulted in 104 students included in each cohort for analyses.

There were no statistically significant differences in total MCAT score between the two student cohorts (p = .566). No statistically significant differences in USMLE Step 1 scores were observed between the two student cohorts (p = .109). There were no statistically significant differences in Step 2 CK scores between the traditional curriculum cohort (M = 249.4, SD = 13.7) and the shortened curriculum cohort (M = 248.7, SD = 15.8), F(1,206) = .117, p = .732, and no difference in Step 2 CS pass rate (Table 2).

Table 2 Comparison of national licensing examination performance between two student cohorts
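The reported Step 2 CK comparison can be approximately reproduced from the summary statistics alone. The sketch below computes a one-way ANOVA F statistic from group sizes, means, and standard deviations, assuming all 104 students in each cohort contributed a Step 2 CK score; the function name is illustrative, and small discrepancies from rounding of the published means and SDs are expected.

```python
def anova_f_from_summary(groups):
    """One-way ANOVA from summary statistics.

    groups: list of (n, mean, sd) tuples, one per cohort.
    Returns (F, df_between, df_within).
    """
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Step 2 CK summary values from the text; n = 104 per cohort is assumed.
f_stat, df_b, df_w = anova_f_from_summary(
    [(104, 249.4, 13.7), (104, 248.7, 15.8)]
)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")  # F(1, 206) = 0.117
```

With two groups, this F statistic equals the square of the pooled-variance t statistic, which is why the degrees of freedom are (1, 206) for 208 students.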

No statistically significant differences were noted between cohorts on NBME subject examination results. The mean score, lower and higher confidence interval bounds for the subject examination scores, and the p value for both cohorts are presented in Fig. 1.

Fig. 1 Subject examination results comparing the mean scores and 95% confidence intervals between curriculum cohorts

Step 1 was used as a moderator on NBME subject examination performance. This analysis revealed a significant difference in the surgery clerkship. Surgery NBME exam performance showed a higher mean score for the traditional curriculum cohort (M = 74.5, SD = 7.6) compared to the shortened curriculum cohort (M = 73.6, SD = 8.8), F(1,204) = 5.905, p = .016. No statistically significant differences were noted when each NBME subject examination score was investigated as a moderator on Step 2 CK performance.

Statistically significant differences between the cohorts were observed on two of the six clinical OSCE cases. Results from the left lower quadrant (LLQ) abdominal pain case demonstrated a higher mean score for the traditional curriculum cohort (M = 69.4, SD = 10.5) compared to the shortened curriculum cohort (M = 64.3, SD = 12.4), F(1,207) = 9.998, p = .002. Conversely, the chest pain case demonstrated a higher mean score for the shortened curriculum cohort (M = 81.3, SD = 8.7) when compared to the traditional curriculum cohort (M = 77.1, SD = 11.6), F(1,207) = 10.611, p = .001. The remaining four OSCE cases did not show any differences between the two curricula. The mean score, lower and upper confidence interval bounds for each OSCE case, and p value for both cohorts are presented in Fig. 2.

Fig. 2 OSCE examination results comparing the mean scores and 95% confidence intervals between curriculum cohorts. *LLQ left lower quadrant, RUQ right upper quadrant

At least 97% of students completed the end-of-clerkship evaluation form for each clerkship in both student cohorts. In the traditional curriculum, at least 77% of students rated each clerkship as either “good” or “excellent.” In the shortened curriculum, at least 78% of students rated each clerkship as good or excellent. Three of the shortened clerkships (obstetrics and gynecology, pediatrics, and surgery) did not have 100% of students directly observed at least once within the clinical environment, compared with one traditional-length clerkship (surgery). The neurology and emergency medicine clerkships showed increases in duty hour violations during the shortened curriculum. Student clerkship ratings, direct observation data, and duty hour violations are reported in Table 3.

Table 3 Comparison of student cohort ratings on end-of-clerkship evaluations

Chi-square analysis was undertaken by combining good and excellent ratings and comparing the number to the combined “poor” and “fair” ratings. Statistically significant differences between cohorts were noted on the pediatrics (χ²(1, N = 206) = 4.079, p = .043) and psychiatry (χ²(1, N = 208) = 11.028, p = .001) clerkships, with pediatrics demonstrating higher ratings for the traditional clerkships and psychiatry demonstrating higher ratings for the shortened curriculum.
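The 2 × 2 comparison described above (good/excellent versus poor/fair, by cohort) can be sketched as a Pearson chi-square test with one degree of freedom. The version below applies no continuity correction, which is an assumption since the text does not say whether one was used; the cell counts in the example are made up because the underlying rating counts are not given here. For one degree of freedom, the p value has the closed form erfc(√(χ²/2)), which lets the reported p values be checked directly from the reported statistics.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]], e.g. rows = cohorts, columns = rating categories."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_df1(chi2):
    """Upper-tail p value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical counts, for illustration only.
print(chi_square_2x2(10, 20, 20, 10))

# Check the p values against the chi-square statistics reported in the text.
print(round(p_value_df1(4.079), 3))   # pediatrics: 0.043
print(round(p_value_df1(11.028), 3))  # psychiatry: 0.001
```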

Discussion

Students at our institution voiced concerns regarding the impact of shortened clinical clerkships on their education and clerkship performance. We had the unique opportunity to review the impact of a previous shortened clinical clerkship curriculum at our institution and its effect on student performance. In the spirit of comprehensive review, we examined all performance factors that were consistent between the two curricular cohorts. We chose to frame the multiple examined factors using Kirkpatrick’s model for curricular evaluation. We examined student satisfaction of the overall clerkship experience as a measure of Kirkpatrick’s Level 1 Reaction assessment [22]. We examined frequency of direct observations and OSCE performance as a measure of Level 2 Learning assessment. Duty hour violations were used as a measure of Kirkpatrick’s Level 3 Behavior assessment, and NBME Subject and Step 2 CK exam scores were used as a measure of Kirkpatrick’s Level 4 Results assessment.

With regard to Kirkpatrick’s Level 1 assessment, overall student satisfaction with the clerkships remained high during both the traditional and shortened curricula. In six of the eight clerkships, overall student satisfaction with their clerkship experience remained the same based on chi-square analysis. The psychiatry clerkship had statistically significant improvement in student clerkship satisfaction in the shortened curriculum, which may have been partially due to a change in clerkship leadership. A previous study of clerkship evaluations determined student perception of clerkship quality is primarily related to the clerkship’s organization, student integration, and supervision [23]. Shortening the clerkship time may have served as an impetus for clerkship directors to ensure the rotation was highly organized and to maximize student time with patients and supervising faculty.

Learning assessment was analyzed using performance on OSCEs and direct observation of students by faculty. The majority of the OSCE exams showed no difference in student performance between the two curricula. Despite the shortened curriculum, almost all the students were still observed at least once during the rotation. There was a small decrease in the percentage of students who reported being directly observed in the shortened curriculum, though it was not statistically significant. These results were similar to the findings in the Monrad paper [19]. The lack of cohort differences in the OSCE performance may be due to the focus of the OSCE on core clinical conditions, our institution’s robust pre-clerkship clinical skills course, or student time spent immersed in clinical patient care. Many of the changes to the individual clerkships involved rearranging student clinical time to maximize patient volumes and streamline student experience in different patient care settings. These thoughtful changes may have improved clerkship efficiency as it relates to student-patient interactions despite a decrease in rotation length. This is further reflected in the lack of significant disruption to student direct observation by teaching faculty.

Behavior assessment was analyzed using reported duty hour violations. Our data show an increase in the number of duty hour violations in two clerkships under the shortened clerkship curriculum. For both clerkships, most reported violations related to students finishing a day shift late in the evening and having to report for a required lecture the next morning. These reported violations may reflect the need to condense clinical shifts and didactics into a shorter time, though as noted above, this did not appear to impact the students’ overall perception of the clerkships. The impact of curricular length change on duty hour violations was not assessed by Monrad et al. and does not appear to have been studied extensively in the education literature [19]. However, that study did assess student behavior via student-reported well-being scales and found no differences in student well-being in the shortened curriculum.

Kirkpatrick’s Level 4 Assessment was analyzed using performance on standardized tests. Most clerkships saw no difference in NBME subject exam scores with and without controlling for Step 1 scores. The one exception to this was the surgery clerkship, which saw a decrease in scores during the shortened clerkship when controlling for Step 1. Our results show no difference in Step 2 CK scores, both via two-way ANOVA and when using NBME subject examination scores as a covariate. The largest body of existing medical education literature examines this relationship between clerkship length and standardized test performance, as outlined in the introduction of this paper. Students may be able to maintain test performance despite shortened clerkships for several reasons. Didactic content was not typically sacrificed in our curricular revision, but rather redundant didactics were eliminated. Students continued to use popular study resources outside of scheduled clerkship time to prepare for the exams. Core clerkship experiential content was preserved in the shortened curricula, while experiences deemed more extraneous (i.e., shadowing nursing assistants) were removed. The core clerkship content is more likely to appear on NBME and Step 2 CK exams. While we included Step 2 CS performance in our analysis, our curriculum underwent extensive changes with regard to Step 2 CS preparation between the two cohorts. Due to these curricular changes, we felt there were significant confounding variables that prevented attribution of the statistical findings to the changes in curricular length alone. The impact of curricular length on Step 2 CS performance was not assessed by Monrad et al., nor could the authors find other published studies examining this specific relationship. Given the recent announcement by the USMLE organization that Step 2 CS is being permanently discontinued, the relationship between clerkship length and Step 2 CS performance is of less interest to medical educators.

Our study has important limitations. While our curriculum underwent a change in the length of the clerkships, this was not a major curricular overhaul in terms of curricular design or content. Therefore, we cannot comment on whether more significant curricular structural changes would impact student performance. We focused our study on evaluating for change on subject exam scores, but readily acknowledge that there may be (and likely are) real educational effects that come with altering the length of clinical clerkships. We are not able to comment, for example, on changes in the clinical experience or clinical performance. Our clinical performance evaluation changed in the middle of the study period, as did the grading rubric for clerkships, so no consistent measurement comparing the two groups based on the clinical evaluation scores exists. The clinical performance evaluation is the most commonly utilized assessment tool for evaluating Entrustable Professional Activity (EPA) competency; therefore, we did not analyze changes to EPA-based assessment in this study. We could not find any existing literature on the effect of clerkship length on EPA competency. Like most other published studies, ours is a single-center study, given that most institutions do not invoke large curricular changes in tandem. Included metrics such as direct observation and duty hour violations rely on self-reporting by students and therefore may not be entirely accurate. We do not have the ability to correlate clerkship performance with performance on acting internships or in post-graduate training, so the long-term impact of shortened clerkships is unknown. Our research group is actively interested in this longer-term assessment of curricular change and is planning to study this impact in a future project. This project would examine student performance as interns and residents by utilizing data from our institutional program director survey, which is sent to all program directors that have a current resident from Wake Forest School of Medicine.

Conclusion

Based on our prior experience using a shortened clinical clerkship curriculum, we feel confident advising students that a shortened curriculum due to COVID-19 should not have a detrimental impact on their overall clerkship performance or Step 2 CK performance, and that their satisfaction with the curriculum should remain high.

Practice Points:

  1. A decrease of up to 25% in clinical rotation length did not impact NBME subject exam or Step 2 CK exam performance
  2. Student satisfaction in a shortened curriculum remained high
  3. Direct observation of students continues to occur in a shortened curriculum

Notes on Contributors

Lindsay C. Strowd MD: Dr. Strowd is an Associate Professor of Dermatology at Wake Forest School of Medicine. She teaches the dermatology course for pre-clinical students and is part of the Clinical Assessment Team for the School of Medicine. Nicholas Hartman MD MPH: Dr. Hartman is an Associate Professor of Emergency Medicine at Wake Forest School of Medicine. He is part of the Clinical Assessment Team for the School of Medicine and Associate Program Director for the emergency medicine residency. Kim Askew MD: Dr. Askew is an Associate Professor of Emergency Medicine and serves as the Associate Dean for Clinical Education at Wake Forest School of Medicine. Andrea Vallevand PhD: Dr. Vallevand is a Medical Education Evaluations Specialist at Wake Forest School of Medicine in Winston-Salem, North Carolina. Kim McDonough MSN: Kim serves as an Academic Curriculum Coordinator for Wake Forest School of Medicine and member of the Clinical Assessment Team. Jon Goforth MBA: Jon is an Academic Curriculum Coordinator for Wake Forest School of Medicine and member of the Clinical Assessment Team. David Manthey MD: Dr. Manthey is a Professor of Emergency Medicine at Wake Forest School of Medicine. He is the thread director for the Medical Decision Making Thread and a member of the Clinical Assessment Team.