Introduction

Letters of recommendation (LORs) are a critical part of a residency candidate’s application. They provide an opportunity for a mentor or teacher in the field to speak to an applicant’s clinical aptitude, personality, communication skills, and demonstrated passion for a specialty, which is especially beneficial in small, apprentice-style fields such as radiation oncology (RO). Program director survey studies underscore the importance of LORs in interview selection and National Resident Matching Program (NRMP) rankings [1]. As historically objective measures used to compare applicants, such as medical school grading and Step 1 scores, are shifting to a pass/fail binary system, LORs will likely become even more important for distinguishing candidates from one another.

Although LORs can provide critical information regarding an applicant’s strengths and weaknesses not described elsewhere in an application, their subjective nature can lend itself to potential implicit and/or explicit bias on the basis of gender, race, or ethnicity. For example, social science research has demonstrated that not only do LORs tend to be longer for men than for women, but they also contain more standout adjectives such as “superb,” “outstanding,” and “remarkable” as well as more research-related descriptors [2]. Recently published studies in general surgery [3, 4], transplant surgery [5], otolaryngology [6, 7], and emergency medicine [8] show differences in both the length of LORs and the terms used to describe men and women. This trend does not end after residency acceptance; LORs submitted as part of the application package to join a medical school faculty have also shown gender biases [9]. Racial inequities are also present in the field of medicine; students who are underrepresented in medicine (UIM) were described as less “agentic” than White and Asian students in diagnostic radiology residency LORs [10]. To increase objectivity and circumvent potential bias in unstructured prose, some specialties such as emergency medicine have transitioned to a standardized LOR form [11]. However, many specialties, including RO, still use the free-text LOR format.

Even when individuals strive to behave and judge others in an egalitarian way, they may still exhibit implicit or unconscious bias [12]. These biases may exist independently of one’s consciously held beliefs and impact one’s nonverbal behaviors and social judgments [13, 14]. Understanding subtle discrepancies in how candidates are described may allow residency selection committees to become more cognizant of gender, racial, and ethnic inequities and thereby conduct more equitable searches for candidates. In this study, we evaluated LORs written on behalf of applicants invited to interview for an RO residency position at a large, academic tertiary care cancer center. We sought to assess evidence of bias on the basis of applicant gender and race. Additionally, we aimed to analyze these LORs based on characteristics of the letter writer to evaluate whether gender, profession, or academic rank correlated with a higher incidence of writing letters containing potential bias.

Methods

Study Population

This study was conducted following approval from the institutional review board as well as the NRMP. Electronic Residency Application Service (ERAS) files and LORs for applicants invited to interview at the University of Texas MD Anderson Cancer Center RO residency program from the 2015–2019 application cycles were pulled from internal institutional archival records by program coordinator staff. Applicants were assigned a unique study number by program coordinator staff, and details including self-reported gender, race, ethnicity, and PhD status were recorded. All identifying information was redacted from the LORs before analysis. The letter writer’s gender was determined by name using the https://genderize.io/ website and secondarily by review of their professional photograph on a departmental website, where available. The letter writer’s institution, professional field (RO vs. other), and academic rank (assistant professor, associate professor, professor, or unknown) were recorded from the heading and/or signature line.

Letter of Recommendation Analysis

For the purposes of obtaining a standardized word count, the LORs were truncated to exclude the heading, salutation, and signature components of the text. Deidentified LORs were then analyzed using the latest (2015) version of Linguistic Inquiry and Word Count (LIWC) software. LIWC has been used and validated in several other published analyses of gender bias [15,16,17,18]. For this investigation, three authors (BVC, EBL, and EBH) reviewed the LORs and predetermined 11 themes of interest through an iterative process to create a customized data dictionary within LIWC. These themes included grindstone adjectives, standout adjectives, expressions of desirability, mentions of research, patient care, skills and knowledge, indications of efficiency or organization, agentic personality traits, communal or friendly nature, social or familial descriptors, and introverted personality traits. The details of the data dictionary used for this analysis are included in Supplemental Table 1.
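As a rough illustration of dictionary-based scoring of this kind, the sketch below counts the share of words in a letter that fall into each theme. It is not the LIWC software itself. The function names are hypothetical, the "standout" words are those quoted in this article, and the "grindstone" entries are placeholders assumed for illustration; the actual dictionary is the one given in Supplemental Table 1.

```python
import re

# Illustrative categories only; the study's dictionary is in Supplemental Table 1.
# "standout" uses words quoted in the article; "grindstone" entries are hypothetical.
CATEGORIES = {
    "standout": ["superb", "outstanding", "remarkable", "best", "exceptional", "leader*"],
    "grindstone": ["hardworking", "diligen*"],
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def matches(word, pattern):
    """A trailing '*' marks a stem and matches any word starting with it."""
    if pattern.endswith("*"):
        return word.startswith(pattern[:-1])
    return word == pattern


def category_percentages(text, categories=CATEGORIES):
    """Return the word count and, per category, the percentage of matching words."""
    words = tokenize(text)
    total = len(words)
    scores = {"word_count": total}
    for name, patterns in categories.items():
        hits = sum(1 for w in words if any(matches(w, p) for p in patterns))
        scores[name] = 100.0 * hits / total if total else 0.0
    return scores


if __name__ == "__main__":
    print(category_percentages("She is an outstanding and diligent leader."))
```

Expressing each theme as a percentage of total words, rather than a raw count, keeps scores comparable between short and long letters.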

Next, the individual LOR text documents were evaluated for gender bias using a publicly available gender bias calculator (http://slowe.github.io/genderbias/). This calculator’s output includes a percentage bias in the male or female direction based on the number of “female-associated” and “male-associated” words and was derived from previously published data collected from letters of recommendation for male and female chemistry and biochemistry job applicants [17].
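To make the idea of a signed percentage bias concrete, the sketch below scores a letter from counts of gender-associated words. Both the word lists and the formula are assumptions made for illustration; they are not taken from the calculator cited above, whose own lists and scoring should be consulted. The sign convention (negative values indicate male bias) follows the figure captions in this article.

```python
import re

# Hypothetical word lists for illustration; the cited calculator has its own lists.
MALE_ASSOCIATED = {"outstanding", "exceptional", "leader", "analytical"}
FEMALE_ASSOCIATED = {"caring", "compassionate", "helpful", "warm"}


def gender_bias_percent(text):
    """Signed bias in percent: negative = male-associated, positive = female-associated.

    Assumed formula: 100 * (female - male) / (female + male).
    Returns 0.0 when the letter contains no gender-associated words.
    """
    words = re.findall(r"[a-z]+", text.lower())
    male = sum(w in MALE_ASSOCIATED for w in words)
    female = sum(w in FEMALE_ASSOCIATED for w in words)
    if male + female == 0:
        return 0.0
    return 100.0 * (female - male) / (male + female)


if __name__ == "__main__":
    print(gender_bias_percent("An exceptional leader who is caring."))
```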

Statistical Analysis

Statistical comparisons of categorical applicant and letter writer characteristics by gender were conducted using Chi-square testing. Non-parametric tests were used to compare linguistic domain scores on the basis of applicant gender and race/ethnicity as well as letter writer gender, professional field, and academic rank. Post hoc Nemenyi testing was applied to significant Kruskal-Wallis test results. All analyses were conducted using R 3.6.1 (R Core Team), and p values < 0.05 were considered statistically significant. Figures were generated using the ggplot2 package.
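For readers unfamiliar with these tests, the following standard-library Python sketch shows how the two main test statistics are computed. It is not the R code used in the study. The function names are hypothetical and the numbers in the usage lines are toy data, not study data. The Chi-square function omits the continuity correction that some software applies to 2x2 tables, the Kruskal-Wallis p-value uses a closed form that is valid only for exactly three groups, and the Nemenyi post hoc step is not shown.

```python
import math


def chi_square_2x2(table):
    """Pearson Chi-square test (no continuity correction) for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # Survival function of the Chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def kruskal_wallis_3(groups):
    """Kruskal-Wallis H with tie correction for exactly three groups (df = 2).

    Assumes the pooled values are not all identical.
    """
    if len(groups) != 3:
        raise ValueError("this sketch only handles three groups")
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    ranks = {}      # value -> average rank among the pooled observations
    tie_term = 0    # sum of t^3 - t over runs of tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i
        ranks[pooled[i]] = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j
    rank_sum_term = sum(sum(ranks[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    h /= 1.0 - tie_term / (n ** 3 - n)
    # Survival function of the Chi-square distribution with 2 degrees of freedom.
    p_value = math.exp(-h / 2.0)
    return h, p_value


if __name__ == "__main__":
    # Toy data for illustration only.
    print(chi_square_2x2([[10, 20], [20, 10]]))
    print(kruskal_wallis_3([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

In practice an established statistics package is preferable to hand-written routines; the sketch is meant only to show what the reported P values summarize.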

Results

Four hundred and eighty-seven LORs from 125 applicants (60 female and 65 male) were included in the study. The median number of LORs per applicant was 4 (range 3–5). Applicant and letter characteristics are described in Table 1, stratified by applicant gender. There were no significant differences in number of letters per applicant, PhD degree, or self-identification as UIM based on applicant gender. Similarly, there were no differences in academic rank, professional field, or institutional affiliation (home or away) of letter authors according to applicant gender.

Table 1 Characteristics of LORs and letter writer by applicant gender

LIWC score domains and linguistic gender bias scores are summarized in Table 2 by applicant gender and UIM status. There were no significant differences in these parameters between male and female applicant groups. However, letters written on behalf of UIM applicants were significantly less likely to include standout descriptors (P = 0.008), and there was a trend for these letters to more often include statements describing patient care (P = 0.06).

Table 2 Word count, frequency of linguistic domain characteristics, and gender bias calculation of letters of recommendation by applicant gender and race/ethnicity

The association of letter writer attributes with LIWC score domains is summarized in Table 3. Male writers were less likely to describe applicant characteristics related to patient care (P < 0.0001) and agentic personality (P = 0.006). Letters written by radiation oncologists were shorter (P < 0.0001) and less likely to include standout descriptors (P = 0.014). However, these LORs were also more likely to include statements regarding applicant desirability (P = 0.045) and research (P = 0.008). Author academic rank was also significantly associated with several LIWC domains. LORs written by associate professors were significantly longer than letters written by both assistant professors (P = 0.032) and full professors (P < 0.001). Associate professors were more likely than full professors to use grindstone descriptors (P = 0.038). Additionally, letters from assistant professors more often included statements of applicant desirability compared with letters from full professors (P = 0.037).

Table 3 Word count, frequency of linguistic domain characteristics and gender bias of letters of recommendation by letter writer gender, professional field, and academic rank

Patterns of applicant-writer dyad characteristics are shown in Fig. 1, stratified according to applicant gender, writer gender, and writer academic rank. The majority of letters were written by men (71.9%); however, female applicants were significantly less likely than male applicants to have letters written by male authors (65.9 vs 77.3% for female and male applicants, respectively, P = 0.0058). There were no differences in proportions of letters written by authors of varying rank based on applicant gender. By contrast, there were significant differences in academic rank according to writer gender, with female writers being less likely than male writers to have more advanced academic positions (P = 0.0002).

Fig. 1 The number of letters of recommendation displayed by gender dyad. The first column shows letters for female applicants written by female writers. The second column shows letters for female applicants written by male writers. The third column shows letters for male applicants written by female writers. The fourth column shows letters for male applicants written by male writers. Each column is separated by the academic rank of the letter writer

According to the gender bias calculator, language across all LORs was male-biased (P < 0.001). Waterfall plots describing gender bias calculator scores according to applicant and letter writer gender are provided in Figs. 2 and 3, respectively. There were no differences in gender bias based on either applicant or writer gender. However, there were differences in the use of gendered language according to the academic rank of authors, with assistant professors less often using male-biased language than both associate (P = 0.0064) and full professors (P = 0.023).

Fig. 2 The degree of gender bias as calculated by the bias calculator: http://slowe.github.io/genderbias/. Negative values on the y-axis represent male bias in language used. Higher absolute value corresponds with the strength of the bias. Each bar represents a letter written either for a female (red) or male (blue) applicant. Each letter is considered a single unit on the x-axis

Fig. 3 The degree of gender bias as calculated by the bias calculator: http://slowe.github.io/genderbias/. Negative values on the y-axis represent male bias in language used. Higher absolute value corresponds with the strength of the bias. Each bar represents a letter written either by a female (red) or male (blue) letter writer. Each letter is considered a single unit on the x-axis

Discussion

To our knowledge, this is the first study to evaluate evidence of bias in LORs written on behalf of RO residency applicants. Within this cohort of applicants selected for interview, we did not observe linguistic differences in the composition of LORs for male versus female applicants. However, LORs for UIM applicants contained fewer standout descriptors like “best,” “leader,” and “exceptional” than those for their White and Asian counterparts, suggesting that implicit biases may be present in the selection process. In regard to letter writer demographics, we noted that LORs written by ROs were shorter, included fewer standout descriptors, and emphasized an applicant’s desirability more than LORs written by mentors from other fields. Although language was overall more male-biased, assistant professors used more gender-balanced language, which may reflect generational change. Taken together, this exploratory analysis suggests that implicit biases may be disproportionately operating in letters for UIM students and among certain groups of letter writers.

Our analysis showed that letter writers used fewer standout descriptors in the LORs for UIM applicants for RO residency positions. While there are many published works evaluating gender bias, racial and ethnic biases in the residency selection process have not been well studied [10, 18]. Black trainees are disproportionately underrepresented in RO, and practicing Black ROs represented only 3.3% of the workforce in 2019 [19, 20]. There are even fewer Black and UIM ROs in the upper echelons of academia, such as at the program director and chairperson level. The reasons for this striking disparity are certainly multifactorial and may be related to broader structural issues. There has been considerable attention paid recently to disparities in the pipeline of students studying science, technology, engineering, and medicine (STEM) [21, 22], but more recent studies have also shown that Black and Latinx students who choose to pursue a STEM degree are more likely to drop out or switch disciplines [23]. Racism, discrimination, and explicit bias are, unfortunately, still endemic within the medical profession [24, 25]. UIM candidates may face additional barriers, as admission committees have been shown to exhibit a pro-white bias as evidenced by the Implicit Association Test [26, 27]. As physicians from UIM backgrounds are less likely to demonstrate pro-white bias, this strengthens calls for increased diversity among selection committees [28].

Studies of general surgery residency applications indicate that the reputation of the letter writer is heavily weighted [29]. Although it has not been formally explored, this may also impact the radiation oncology residency application process given the relatively small size of the field. A recent study by Sidiqi et al. noted that the most common reasons to do a radiation oncology “away” rotation were an interest in a specific program (44%) and to acquire LORs for residency application (31%) [30]. Consequences of COVID-19, including the cancelation of away rotations, will undoubtedly complicate the 2020–2021 application cycle. Many students will request LORs from radiation oncologists at their “home” institution or from mentors outside of radiation oncology. However, this may further amplify disadvantages among students without a “home” radiation oncology department or access to “well-known” academic ROs.

This work is important in that it starts the conversation regarding potential bias in LORs for RO residency applicants. This is particularly timely given the inevitable increase in scrutiny of LORs for future applicant classes as more objective components of the ERAS application are phased out, most notably the numeric score of USMLE Step 1. However, limitations of this study must be addressed. Although we sincerely hope that we did not observe measurable gender bias within the LORs for our sample because the playing field between men and women is becoming more level, we acknowledge that our analysis was restricted to a highly select group of applicants. Archival records were only available for applicants who were invited to interview after initial screening of their ERAS application by the program director and residency selection committee. As such, we were not able to assess the broader, unselected pool of applicants to see if bias exists in the LORs for those who did not receive an invitation. This is the first major limitation of our analysis and suggests that further work is necessary to validate our findings. Secondly, we had limited information on the demographics of the letter writers regarding race and ethnicity. Gender identity of letter writers was inferred from the letter writer’s photograph and name, which does not capture the letter writer’s self-identified gender, which may in fact be non-binary. Some writers submitted multiple letters for various applicants over the span of our study. Other factors that may impact the application package, including scores, reputation of the medical school, and publication record, were not accounted for. Lastly, investigating a subjective topic in an objective manner has inherent flaws, but our method was systematic and consistent across all 487 letters and studied groups.

Overall, this study provides valuable insight into linguistic biases in letters of recommendation for radiation oncology applicants that disproportionately affect students from UIM backgrounds. We posit that increasing awareness of racial disparities in medicine will provide an opportunity to address these biases, specifically at the level of residency selection committees. Some opportunities for improvement include the implementation of implicit bias training for both members of residency selection committees and faculty who are asked to write LORs. Standardized LORs, which have been adopted by emergency medicine [11], orthopedic surgery [31], and otolaryngology [32], are another means to increase objectivity in the application but carry the caveats of a limited narrative and fewer opportunities to disclose specific skills or anecdotes. Others in RO have been outspoken about deemphasizing “away” rotations and faculty LORs to create a more even playing field across students of varying socioeconomic backgrounds [33]. Virtual medical student clerkships in the 2020–2021 cycle, prompted by concerns about COVID-19, may mitigate some discrepancies and increase access to RO mentors. Much remains to be studied to address gender and racial inequities in medicine, specifically in RO. This exploratory study lays the groundwork for future studies that will examine a broader pool of LORs and additional applicant and letter writer parameters to evaluate the presence of potential linguistic bias in the language used to describe residency applicants.