The United States Medical Licensing Examination (USMLE) Step 1 was designed to be a benchmark measure of knowledge and has been used heavily in the residency application process. Step 1 has moved from 3-digit scoring to a pass/fail scoring system, in part to decrease the stress associated with the exam. Emerging literature suggests that this transition has led to other stresses for students. Our study compared student stress levels, both overall and in relation to Step 1, leading up to the exam between a score cohort and a pass/fail cohort. We administered to each cohort a 14-item survey that included demographics, the PSS-4 stress scale, and 6 other potential stressors. Data were analyzed using a two-tailed t test for independent means and analysis of variance. We found that while there was no difference in general overall stress between the students who took Step 1 for a score and the students who took Step 1 pass/fail, there were differences in stress related to the Step 1 exam. Step 1 stress was significantly lower for the pass/fail cohort than for the score cohort during the second year of medical education leading up to the exam. However, this difference in Step 1 stress between the cohorts disappeared by the dedicated study period immediately before the exam. The change in scoring appears to have decreased stress specifically related to Step 1, but this reduction was not sustained as students entered their study period to prepare for Step 1.
The United States Medical Licensing Examination (USMLE) Step 1 is a high-stakes exam that was designed to measure competency in clinical basic sciences knowledge and provides a basis for medical licensing eligibility. Most medical schools require it for graduation, and residency programs require it for entry. Though Step 1 was designed to be a benchmark and measure of content knowledge, it was not designed to be a predictor of success in residency [2, 3]. However, the Step 1 score has been used heavily in the residency application process for filtering and ranking large numbers of applicants, particularly in certain competitive specialties. The consequences of Step 1 performance for residency selection have drawn considerable student stress, focus, and resources, including time and money, to the exam. Students devote their time both to participating in the medical school curriculum and to separately studying for the “parallel curriculum” of optimizing their Step 1 performance. This “Step 1 climate” has resulted in increased stress and in medical students reducing their involvement in curricular activities in favor of Step 1 studying throughout their pre-clerkship education. Because of these factors, the use of USMLE Step 1 in residency selection has had detrimental effects on student mental health, well-being, and education quality [7, 8].
Over the past decade, the medical education community has made many calls to alter how the Step 1 exam is used in resident selection, including shifting the exam from three-digit scoring to providing only pass/fail scoring. On September 15th, 2021, the USMLE announced that Step 1 would move to a pass/fail scoring system, done in part to decrease the stress associated with the exam. However, emerging literature suggests that this transition has led to other stresses and concerns for students. In focus groups, students have voiced concerns about how residency programs will react to the change to pass/fail and “almost panic” about how they themselves will be impacted. Some of these stresses relate to increased emphasis on the USMLE Step 2 score, which is a similar exam more focused on clinical applications, as well as clerkship grades in an opaque grading environment and pressure to engage in extracurricular activities. Kogan and Hauer have suggested that a singular change in Step 1 scoring without other adjustments to the residency selection process will result in increased emphasis on and stress over the aforementioned items.
Given the anticipated concerns from thought leaders, educators, and students suggesting that the Step 1 change to pass/fail may not have had the expected effects of decreasing overall medical student stress, it is important to determine whether there is an effect of the change on student stress specifically related to the Step 1 exam. Our study aims to compare student stress levels, overall and in relation to Step 1, in the period leading up to the Step 1 exam in students who took the exam pass/fail compared with students who took the exam for a score.
Materials and Methods
Study Design and Setting
This single-institution longitudinal survey study was conducted at the Georgetown University School of Medicine and was approved by the institutional review board. Study participants included two cohorts of second-year medical students: (1) the graduating class of 2023 who took the Step 1 exam for a score in 2021 (score cohort) and (2) the graduating class of 2024 who took the Step 1 exam pass/fail in 2022 (pass/fail cohort).
The medical school curriculum at our institution is a 1.5-year pre-clerkship curriculum that ends in December of the second year and is followed by an 8-week break dedicated to studying for the USMLE Step 1 exam. Students must take and pass the USMLE Step 1 exam before advancing to their third-year clinical clerkships which begin in early March.
Procedures and Instruments
All second-year medical students in the score and pass/fail cohorts were invited via email to complete a voluntary series of online surveys via Qualtrics (Provo, Utah) about their perceived stress. The surveys were administered four times to each cohort: at the beginning of the second year (M2) of medical school (time point 1), halfway through the M2 year (time point 2), beginning of the dedicated study period for Step 1 (time point 3), and middle of the dedicated study period for Step 1 (time point 4) (Fig. 1).
The 14-item survey included four demographics questions, four questions from the Perceived Stress Scale (PSS-4), and six additional questions about stress levels pertaining to the potential stress items identified in the literature (Step 1, Step 2, research experience, extracurricular activities, pre-clerkship coursework, and clerkship coursework). The full list of questions can be found in Table 1. The Perceived Stress Scale (PSS) is a popular tool for measuring psychological stress, intended to compare subjects’ stress related to objective events. Higher scores are associated with higher levels of stress [14, 15]. The 5-point PSS-4 rating scale is traditionally scored 0–4. The scores from each of the four items are then summed into a total PSS-4 score. Because our intent was not to compare participant stress scores with population norms, but rather to look at changes in stress in the participants over time, we opted to use 1–5 scoring for the scale and in determining the total PSS-4 score. Therefore, our PSS-4 scores cannot be compared directly with published population scores and guidelines on what constitutes average vs high stress. For consistency, we also used the PSS-4 rating scale for the six questions about stress related to specific items.
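The effect of the 1–5 scoring choice can be illustrated with a minimal Python sketch. The function names and the example responses are hypothetical. The sketch also assumes the standard PSS-4 convention of reverse-scoring the two positively worded items (items 2 and 3); the text above does not state whether that step was applied here.

```python
# Zero-based positions of PSS-4 items 2 and 3, which are positively worded.
# Reverse-scoring them follows the standard instrument and is an assumption
# about this study, not something the survey description confirms.
REVERSED_ITEMS = {1, 2}


def pss4_total(responses, low=0):
    """Sum four PSS-4 responses given on a 5-point scale that starts at `low`."""
    if len(responses) != 4:
        raise ValueError("PSS-4 needs exactly four responses")
    high = low + 4
    total = 0
    for position, value in enumerate(responses):
        if not low <= value <= high:
            raise ValueError(f"response {value} is outside {low}-{high}")
        # A reversed item maps low -> high, high -> low.
        total += (low + high - value) if position in REVERSED_ITEMS else value
    return total


def shift_to_zero_based(responses_1to5):
    """Convert 1-5 responses to the traditional 0-4 coding."""
    return [value - 1 for value in responses_1to5]


answers_1to5 = [4, 2, 3, 5]  # hypothetical responses, not study data
total_1to5 = pss4_total(answers_1to5, low=1)
total_0to4 = pss4_total(shift_to_zero_based(answers_1to5), low=0)
print(total_1to5, total_0to4, total_1to5 - total_0to4)
```

Under either coding the ordering of participants is unchanged; each item contributes one extra point, so a 1–5 total sits 4 points above the corresponding 0–4 total (range 4–20 rather than 0–16), which is why such totals cannot be read against norms published for 0–4 scoring.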
We calculated descriptive statistics for all participants on the survey responses at each time point. We compared results from the two cohorts using a two-tailed t test for independent means. Each time point was evaluated separately as a cross-sectional data point that averaged the responses of all participants who completed the survey at that particular time point. Because survey responses were anonymous, we were not able to follow or link participant responses over time. We performed a single-factor analysis of variance (ANOVA) to compare the results in each individual cohort across the four time points. P values less than 0.05 were considered significant. Effect sizes for statistically significant results were measured using Hedges’ g, as the sample sizes were different for each group.
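The three statistics named above can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the analysis code used in the study: the function names and all data values are hypothetical, and the t statistic uses the equal-variance (pooled) form because the text does not say which variant was applied. P values are omitted because the standard library has no t or F distribution; a statistics package such as SciPy (`scipy.stats.ttest_ind`, `scipy.stats.f_oneway`) would supply them.

```python
from math import sqrt
from statistics import mean, variance


def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples."""
    n1, n2 = len(a), len(b)
    return sqrt(((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2))


def student_t(a, b):
    """Equal-variance t statistic and its degrees of freedom."""
    n1, n2 = len(a), len(b)
    t = (mean(a) - mean(b)) / (pooled_sd(a, b) * sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def hedges_g(a, b):
    """Cohen's d scaled by the usual small-sample correction factor."""
    d = (mean(a) - mean(b)) / pooled_sd(a, b)
    correction = 1 - 3 / (4 * (len(a) + len(b)) - 9)
    return d * correction


def one_way_anova_f(groups):
    """F statistic with between-group and within-group degrees of freedom."""
    values = [v for g in groups for v in g]
    grand_mean = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical 1-5 ratings for one survey item at one time point.
score_cohort = [4, 5, 3, 4, 4]
pass_fail_cohort = [2, 3, 2, 3]
t_stat, t_df = student_t(score_cohort, pass_fail_cohort)
g = hedges_g(score_cohort, pass_fail_cohort)

# Hypothetical ratings from one cohort at the four time points.
time_points = [[2, 3, 2], [3, 3, 4], [4, 4, 3], [5, 4, 5]]
f_stat, df_b, df_w = one_way_anova_f(time_points)

print(f"t = {t_stat:.3f} (df = {t_df}), Hedges' g = {g:.3f}")
print(f"F = {f_stat:.3f} (df = {df_b}, {df_w})")
```

Because responses were not linked across time points, the ANOVA here treats each time point as an independent group, matching the cross-sectional handling described above rather than a repeated-measures design.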
Descriptive Statistics of the Cohort
A total of 411 students were surveyed for the study across four time points and data comparing stress levels are summarized in Table 2. Response rates varied across both cohorts; however, the average response rate across both cohorts and all time points was 18.1%. The minimum and maximum response rates were 9.9% (N = 20) and 33.7% (N = 70), respectively. All response rates are shown in Table 3. Most participants identified as white in both the score (78.9%) and pass/fail cohorts (67.9%), an overrepresentation when compared with the compositions of each student body. See Table 4. There were slightly more students identifying as male in the score cohort (55.1%) and slightly more students identifying as female in the pass/fail cohort (57.1%).
Overall Stress (PSS-4) and Stress over Time
For both the score cohort and the pass/fail cohort, there was no significant difference from time point 1 to time point 4 in reported PSS-4 stress levels (p = 0.23 and p = 0.78, respectively) or in stress surrounding Step 2 (p = 0.19 and p = 0.26, respectively). However, both the score cohort and the pass/fail cohort saw a significant difference in stress levels surrounding Step 1 from time point 1 to time point 4 (p < 0.005 for both).
There was also no significant difference in reported PSS-4 stress levels between the two cohorts at any given time point (Fig. 2). Stress related to Step 1 was significantly lower in the pass/fail cohort initially, but over time, stress levels related to Step 1 became similar between the cohorts (Fig. 3). Stress related to Step 2 varied over time for both cohorts but was higher in the pass/fail cohort (Fig. 4).
Stress Levels at Each Time Point
Stress related to the Step 1 exam was significantly lower for the pass/fail cohort than for the score cohort at the beginning of the M2 year (time point 1) (2.33 vs 3.75, p < 0.001). These results corresponded to a 28.4% decrease in stress related to Step 1. Stress related to the Step 2 exam at time point 1 was greater for the pass/fail cohort than for the score cohort (2.30 vs 1.44, p < 0.001). These results corresponded to a 17.24% increase in stress related to Step 2. There were no significant differences in stress levels related to research experience, extracurricular activities, pre-clerkship coursework, or clerkship coursework. See Table 2.
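The percentages reported for time point 1 are not relative changes between the two means. One reading that reproduces them, inferred here from the arithmetic and not stated in the text, is that each is the gap between cohort means expressed as a share of the 5-point scale maximum. The short Python sketch below works through that assumption; the function names are hypothetical.

```python
SCALE_MAX = 5  # top of the 1-5 rating scale described in the methods


def gap_as_share_of_scale(mean_a, mean_b, scale_max=SCALE_MAX):
    """Difference between two means as a percentage of the scale maximum."""
    return (mean_a - mean_b) / scale_max * 100


def relative_change(mean_new, mean_old):
    """Percent change of mean_new relative to mean_old, shown for contrast."""
    return (mean_new - mean_old) / mean_old * 100


step1_gap = gap_as_share_of_scale(3.75, 2.33)  # score cohort minus pass/fail
step2_gap = gap_as_share_of_scale(2.30, 1.44)  # pass/fail minus score cohort
step1_relative = relative_change(2.33, 3.75)   # pass/fail relative to score

print(f"Step 1 gap: {step1_gap:.1f}% of the scale")
print(f"Step 2 gap: {step2_gap:.1f}% of the scale")
print(f"Step 1 relative change: {step1_relative:.1f}%")
```

Under this reading the Step 1 gap comes to 28.4% of the scale and the Step 2 gap to about 17.2%, close to the reported 17.24% (a small difference of that size would be consistent with the reported means being rounded). Measured relative to the score cohort's mean, the Step 1 difference would instead be about 37.9%.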
Midway through the M2 year (time point 2), stress related to the Step 1 exam was significantly lower for the pass/fail cohort compared to the score cohort (3.22 vs 4.04, p < 0.001). There was a trend toward higher stress related to research experiences in the pass/fail cohort that was not statistically significant (3.81 vs 3.33, p = 0.078). There were no differences in stress levels between the cohorts for any other items. See Table 2.
At the start of the dedicated study period (time point 3), stress related to Step 1 no longer showed a significant difference between the pass/fail and score cohorts (4.00 vs 3.54, p = 0.104). Stress related to clerkship coursework was higher in the pass/fail cohort (3.23 vs 2.53, p = 0.033). See Table 2.
Halfway through the dedicated study period (time point 4), there was no significant difference in Step 1 stress levels between the pass/fail and score cohorts (4.55 vs 4.53, p = 0.92). Stress related to pre-clerkship coursework was higher in the pass/fail cohort (2.75 vs 1.51, p < 0.001). See Table 2.
We found that while there was no difference in general overall stress as measured by the PSS-4 between the students who took Step 1 for a score (M2023) and students who took Step 1 pass/fail (M2024), we did see differences in stress specific to the Step 1 exam. The score cohort reported significantly more stress related to Step 1 at the beginning of and halfway through their second year than the pass/fail cohort. Their stress over Step 1 outweighed their stress over any other element measured, such as Step 2, the curriculum, research, or extracurricular activities. However, by the time the dedicated study period commenced, the pass/fail cohort’s stress related to Step 1 reached the same level as the score cohort and remained the same through midway into the study period. Additionally, the pass/fail cohort reported more stress than the score cohort related to specific items at varying points: Step 2 at the beginning of their second year, clerkships at the start of the dedicated study period for Step 1, and pre-clerkship grades when midway through the dedicated study period for Step 1.
Our results showed that, at our institution, the change to pass/fail scoring on USMLE Step 1 did ameliorate student stress related to the exam during part of the pre-clerkship period, suggesting that the change succeeded in reducing the “Step 1 climate” during the pre-clerkship curriculum, as stakeholders had intended [9, 10]. While the pass/fail cohort started with less stress surrounding Step 1, we found that both cohorts saw a significant increase in stress related to Step 1 during the dedicated study period. Though the reported stress levels in students taking Step 1 pass/fail eventually rose to the same level as in students who had taken Step 1 for a score, this did not occur until after they had completed the pre-clerkship curriculum and entered the dedicated study period. Thus, pass/fail scoring did indeed lead to decreased stress related to Step 1 during the pre-clerkship curriculum. Regarding the similar stress seen in both student cohorts during the dedicated study period, one might postulate that stress during dedicated study may be natural, given the high-stakes nature of passing the exam. This could suggest that a rise in stress during the dedicated study period should be expected and might need to be accepted, with medical schools continuing to offer educational and emotional support and resources.
Ideally, the drop in Step 1 stress would have resulted in a lowering of student stress overall. However, the PSS-4 scores, which measure general stress, were similar between the two cohorts across all time points. It is possible that changes in Step 1 stress were not large enough to impact the overall stress that typically accompanies medical school and/or the PSS-4 was not sensitive enough to pick up the relatively smaller changes in stress levels. Another possibility is the one suggested by Kogan, Hauer, and others, where students’ stress may be shifted to other elements without a change in overall stress [12, 13]. For instance, we found stress related to Step 2 and clerkship coursework appeared higher in the pass/fail group at certain time points which could support their concerns. However, we did not find any significant trends in increased stress surrounding research or extracurricular activities.
Of note, the elevated stress related to Step 2, clerkship coursework, and even pre-clerkship coursework seen in the pass/fail cohort occurred only at single time points. For instance, the pass/fail cohort had more stress related to Step 2 at the start of their second year of medical school, but this difference was not sustained throughout all time points, as one would expect it to be if Step 1 stress were displaced toward Step 2. It is possible that in the Step 1 pass/fail reporting environment, students were able to use some of their stress bandwidth to think about other concerns, with different concerns occupying their attention at different time points: Step 2 at the beginning of the second year, clerkships at the end of the pre-clerkship curriculum and start of the dedicated study period, and the pre-clerkship curriculum when pre-clerkship performance information is released midway through the dedicated study period. This could potentially be taken as a positive sign of students’ ability to allocate attention to something other than Step 1 performance.
There are limitations to our study. This was a single-institution study involving only two medical school classes. Our response rates were low, particularly in the pass/fail cohort. It is possible that students who were the most stressed were the least likely to complete the survey, resulting in an underassessment of student stress, particularly in the pass/fail cohort. Also, because we were not able to link participant responses over time, our results from cross-sectional observations may be due to differences in who completed the survey at each time point. While we attempted to measure student stress longitudinally, we measured it only during the second year of medical school and in the period leading up to the Step 1 exam. It is possible that stress related to other items, such as clerkships, research, extracurricular activities, and Step 2, would have increased after the Step 1 exam and closer to residency application time. Furthermore, the PSS-4 may not have been a sensitive enough tool to assess overall stress. Finally, this study was performed during a period of curricular adjustments due to the COVID-19 pandemic, which might have influenced student perceptions of stress. We recommend additional investigations by others to confirm our findings related to the change in stress in students prior to taking their Step 1 exam and to further examine the impact on potential later stress related to residency application.
The change in USMLE Step 1 scoring from a 3-digit score to pass/fail appears to have decreased the stress specifically related to this examination experienced by second-year medical students at one institution. However, this reduction in stress was not sustained as students entered their study period to prepare for Step 1.
Availability of Data and Material
Data are available upon request.
References
1. Step 1. USMLE. https://www.usmle.org/step-exams/step-1. Accessed 1 June 2020.
2. Prober CG, Kolars JC, First LR, Melnick DE. In reply to Mehta et al and to London et al. Acad Med. 2016;91(5):610. https://doi.org/10.1097/acm.0000000000001158.
3. Salari S, Deng F. A stepping stone toward necessary change: how the new USMLE Step 1 scoring system could affect the residency application process. Acad Med. 2020;95(9):1312–4. https://doi.org/10.1097/acm.0000000000003501.
4. Gauer JL, Jackson JB. The association of USMLE Step 1 and Step 2 CK scores with residency match specialty and location. Med Educ Online. 2017;22(1):1358579. https://doi.org/10.1080/10872981.2017.1358579.
5. Burk-Rafel J, Santen SA, Purkiss J. Study behaviors and USMLE Step 1 performance. Acad Med. 2017;92. https://doi.org/10.1097/acm.0000000000001916.
6. Gupta A, Saks NS. Exploring medical student decisions regarding attending live lectures and using recorded lectures. Med Teach. 2013;35(9):767–71. https://doi.org/10.3109/0142159x.2013.801940.
7. Moynahan KF. The current use of United States Medical Licensing Examination Step 1 scores. Acad Med. 2018;93(7):963–5. https://doi.org/10.1097/acm.0000000000002101.
8. Chen DR, Priest KC, Batten JN, Fragoso LE, Reinfeld BI, Laitman BM. Student perspectives on the “Step 1 climate” in preclinical medical education. Acad Med. 2019;94(3):302–4. https://doi.org/10.1097/acm.0000000000002565.
9. Mehta NB, Hull A, Young J. More on how USMLE Step 1 scores are challenging academic medicine. Acad Med. 2016;91(5):609. https://doi.org/10.1097/acm.0000000000001153.
10. USMLE Step 1 transition to pass/fail only score reporting. USMLE. https://www.usmle.org/usmle-step-1-transition-passfail-only-score-reporting. Accessed 1 June 2020.
11. Mott NM, Kercheval JB, Daniel M. Exploring students’ perspectives on well-being and the change of United States Medical Licensing Examination Step 1 to pass/fail. Teach Learn Med. 2021;33(4):355–65. https://doi.org/10.1080/10401334.2021.1899929.
12. Rajesh A, Asaad M, Sridhar M. Binary reporting of USMLE Step 1 scores: resident perspectives. J Surg Educ. 2021;78(1):304–7. https://doi.org/10.1016/j.jsurg.2020.06.013.
13. Kogan JR, Hauer KE. Sparking change: how a shift to Step 1 pass/fail scoring could promote the educational and catalytic effects of assessment in medical education. Acad Med. 2020;95(9):1315–7. https://doi.org/10.1097/acm.0000000000003515.
14. Cohen S, Kamarck T, Mermelstein R. A global measure of perceived stress. J Health Soc Behav. 1983;24(4):385. https://doi.org/10.2307/2136404.
15. Warttig SL, Forshaw MJ, South J, White AK. New, normative, English-sample data for the short form perceived stress scale (PSS-4). J Health Psychol. 2013;18(12):1617–28. https://doi.org/10.1177/1359105313508346.
16. Tagher CG, Robinson EM. Critical aspects of stress in a high-stakes testing environment: a phenomenographical approach. J Nurs Educ. 2016;55(3):160–3. https://doi.org/10.3928/01484834-20160216-07.
Ethics Approval and Consent to Participate
This study was approved as exempt by the Georgetown-Howard Universities Center for Clinical and Translational Science IRB (IRB ID: STUDY00002795).
Consent for Publication
All authors have consented to submission and publication.
The authors declare no competing interests.
Cite this article
Baniadam, K., Elkadi, S., Towfighi, P. et al. The Impact on Medical Student Stress in Relation to a Change in USMLE Step 1 Examination Score Reporting to Pass/Fail. Med.Sci.Educ. 33, 401–407 (2023). https://doi.org/10.1007/s40670-023-01749-4