Personality, which arises from an individual’s characteristic pattern of thinking, feeling, and behaving (Murthy et al., 2013), is a central concept in psychological research and a commonly used way of categorizing others. Informal evaluations such as “Jane is sweet” or “Joe is headstrong” fulfill people’s need to have coherent images of those around them and to be able to predict their reactions. In contrast, psychologists use objective measures of known validity and reliability to assess people’s personalities (Archer, 2014). Since the 1990s, most researchers have embraced the idea that there are five basic personality traits, known as the Big Five, namely, neuroticism, extraversion, openness to experience, conscientiousness, and agreeableness. Several tools have been developed to measure these traits and thereby quantify differences in people’s personalities (McCrae & Costa, 2003; Widiger, 2017). The NEO Five Factor Inventory (NEO-FFI) is the best known and most used of these tools.

Personality traits play both direct and mediating roles in medical students’ well-being, capacity for empathy, and career intentions (Eley et al., 2019; Haight et al., 2012; Hojat & Zuckerman, 2008; Lo et al., 2018; Mehmood et al., 2013; Mullola et al., 2019; Murthy et al., 2013; Prins et al., 2019; Toto et al., 2015). Moreover, four literature reviews conducted during the last two decades have found personality traits, especially extraversion and conscientiousness, to be the individual characteristics that most consistently associate with students’ academic performance (Chisholm-Burns et al., 2021; Doherty & Nugent, 2011; Ferguson et al., 2002; Hojat et al., 2013). This finding has led medical schools to use scores on personality traits associated with academic success, often measured indirectly via tools such as Multiple Mini Interviews and Situational Judgement Tests, as selection criteria for new students (Albanese et al., 2003; MacKenzie et al., 2017; Musson, 2009; Patterson et al., 2016; Powis, 2009; Powis et al., 2007).

Nevertheless, the research described above assumes that personality traits are stable characteristics (Caspi et al., 2005). This traditional view is now being challenged by research into maturational processes and recent theories according to which personality traits may vary as a result of psychosocial processes triggered by certain life situations and events (Ferguson & Lievens, 2017). One such event is the passage from school to university (Atherton et al., 2021; Bleidorn et al., 2018; Specht et al., 2011). The potential for people’s personalities to change has important implications for using personality evaluations as a selection criterion for medical students. Indeed, if students’ personalities evolve, pre-admission measures of personality traits may not correlate with future academic performance.

Two studies have investigated the predictive validity of personality assessments across medical training, but they both measured personality traits only on admission to medical school. Lievens and colleagues reported increases over time in the operational validity of extraversion, openness, and conscientiousness scores for predicting performance (Lievens et al., 2009), whereas Ferguson et al. found that high levels of conscientiousness were linked to better knowledge-based performance during preclinical training but to poorer clinical knowledge during clinical training. This result suggests that a trait can have both a “bright” side and a “dark” side (Ferguson et al., 2014).

The present longitudinal analysis is, to our knowledge, the first study to investigate changes in personality traits across initial medical training and to determine whether any such changes affect the relation between personality measures and examination performance. Changes in personality traits are most commonly assessed by determining their rank-order stability (consistency in the relative ordering of individuals on a given trait over time), mean-level changes (absolute mean-level change in scores for a given trait over time), and, to a lesser extent, individual changes (the magnitude of any change over time in an individual’s score on a trait) (Edmonds & Hill, 2020). To evaluate the validity of the personality trait measures, we used Messick’s unified validity framework; specifically, we sought evidence regarding relations to other variables (predictive validity) (Boateng et al., 2018; Cook & Beckman, 2006).

The present study used all of these methods to determine (1) whether personality traits change across medical school and (2) whether personality traits assessed at the beginning of medical school correlate with final examination performance, controlling for gender and examination format.



Participants were students who entered Geneva Medical School in 2012 and 2013. To be included in the study, a student had to have completed the personality questionnaire at the beginning (Year 1) and end (Year 6) of medical school and to have taken the Swiss Federal Licensing Examination (FLE) at the end of Year 6.

Data collection

The present study was part of a larger longitudinal research project conducted between 2011 and 2019 in which, each year, participants completed a questionnaire at the beginning of a compulsory class. The full method is described in previous papers (Abbiati et al., 2016; Piumatti et al., 2020). Before agreeing to participate in the study (by signing a consent form), students had been informed by email, ten days before the signing session, of the content of the research project, their rights and commitments as volunteer participants, and the terms of confidentiality and privacy. The chair of the Cantonal Commission for Ethical Research exempted the present study from formal review.

Participants were asked to complete a personality questionnaire on three occasions during the 6-year medical training program: at the beginning of Year 1 (baseline), at the end of pre-clinical training (Year 3), and at the end of clinical training (Year 6). The measures of participants’ academic performance on entering and on finishing medical school were, respectively, their scores on the Swiss medical studies aptitude test (Eignungstest für das Medizinstudium in der Schweiz, EMS) and on the FLE. These scores were provided by Geneva Faculty of Medicine and the Institute for Medical Education.


Demographic data

Demographic data included in the study were age, gender, nationality (Swiss, European countries, Non-European countries), and type of high school diploma (scientific vs. other). We used parents’ level of education (primary, secondary, tertiary) as an indirect measure of a student’s socioeconomic level (UNESCO, 2011).


Participants completed the French version of the NEO Five Factor Inventory (NEO-FFI) (Aluja et al., 2005; Costa & McCrae, 1992), whose 60 items measure the Big Five personality traits (12 items per trait). Items are scored on 5-point Likert scales from 0 (strongly disagree) to 4 (strongly agree), giving a maximum score of 48 per trait. All five factors had acceptable internal reliability (Cronbach’s alphas were between 0.61 and 0.82), although extraversion and agreeableness had weaker internal reliability than neuroticism, openness to experience, and conscientiousness (Abbiati et al., 2014).
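As an illustration of the scoring and reliability figures above, the following minimal sketch sums 12 item responses (each scored 0 to 4) into a trait score and computes Cronbach's alpha from an item-response matrix. The function names and the toy data are hypothetical, and the sketch assumes that any reverse-keyed items have already been recoded; it is not the scoring procedure used in the study.

```python
from statistics import variance

ITEMS_PER_TRAIT = 12   # NEO-FFI: 12 items per trait
MAX_ITEM_SCORE = 4     # 5-point scale scored 0..4


def trait_score(responses):
    """Sum one respondent's 12 item responses for a trait (range 0..48)."""
    if len(responses) != ITEMS_PER_TRAIT:
        raise ValueError("expected 12 item responses")
    if any(not 0 <= r <= MAX_ITEM_SCORE for r in responses):
        raise ValueError("item responses must lie between 0 and 4")
    return sum(responses)


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical toy data: 4 respondents x 3 items (not study data).
toy = [[0, 1, 0], [1, 1, 2], [3, 2, 3], [4, 4, 3]]
alpha = cronbach_alpha(toy)
```

Alpha approaches 1 when items rise and fall together across respondents, which is why it is read as an index of internal consistency.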

Performance on the EMS

As a measure of academic performance before medical school, we used participants’ scores on the EMS, which assesses students’ reasoning and problem-solving abilities, but not their scientific knowledge, communication skills, or social skills. Test scores are given as percentage rankings from 0 (low) to 100 (high). Medical schools in the German-speaking part of Switzerland use the EMS as an admissions test (Hänsgen & Spicher, 2011). Geneva University Medical School trialed the EMS between 2010 and 2012, during which period all applicants had to complete the test, although the results were not incorporated into the selection process (Cerutti et al., 2013).

Learning outcomes

Medical training at Geneva University Medical School is a 6-year program. At the time of the study, students took a competitive examination at the end of Year 1 (average pass rate = 65%) (Abbiati et al., 2016) and a final examination (the FLE) at the end of Year 6 (average pass rate = 99.5%). The FLE combines multiple-choice questions (MCQ) with objective structured clinical examinations (OSCE).

Statistical analysis

We carried out supporting data and wave analyses to assess non-response biases and used attrition analysis to compare baseline personality measures for participants who did and who did not complete the personality measure at the end of medical school (Phillips et al., 2016). We then performed Pearson’s chi-squared tests to investigate dependence between the categorical variables and conducted analyses of variance to determine differences between groups on continuous variables.
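To make the chi-squared step concrete, here is a minimal sketch of Pearson's chi-squared test of independence for a 2 x 2 table, such as gender by progression status. The counts are hypothetical, the function name is ours, and no continuity correction is applied (some software, such as R's chisq.test, applies one by default for 2 x 2 tables), so results may differ slightly from those of a statistics package.

```python
import math


def chi2_independence_2x2(table):
    """Pearson chi-squared statistic and p-value (df = 1) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts (not study data): rows = group A / group B,
# columns = progressed / did not progress.
stat, p_value = chi2_independence_2x2([[20, 10], [10, 20]])
```

A small p-value indicates that the two categorical variables are unlikely to be independent in the population sampled.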

We used Spearman correlation coefficients (r), Cohen’s d effect sizes, and Reliable Change Indices (RCI) to assess, respectively, personality rank-order stability, mean changes in personality traits between time-points, and changes in individuals’ personality traits (Christensen & Mendoza, 1986; Damian et al., 2019). Critical values were defined as follows: for Spearman’s r: > 0.50 = high, 0.30 to 0.50 = moderate, and 0.25 to 0.30 = low; for Cohen’s d: > 0.80 = large, 0.30 to 0.80 = medium, and 0.20 to 0.30 = small (Cohen, 1988). We also used Pearson’s chi-squared tests to assess whether the observed distribution of individual changes deviated reliably from a stable situation (defined as 2.5% of individuals with a decrease, 95% stable, and 2.5% with an increase).
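The mean-level and individual-level measures described above can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: Cohen's d is standardized here by the pooled standard deviation of the two time points, the RCI uses the common formulation in which the difference score is divided by sqrt(2) times the standard error of measurement, the reliability coefficient is supplied by the user, and |RCI| > 1.96 marks a reliable change. All function names and example values are hypothetical.

```python
import math
from statistics import mean, stdev


def cohens_d(time1, time2):
    """Standardized mean-level change, using the pooled SD of both time points."""
    pooled_sd = math.sqrt((stdev(time1) ** 2 + stdev(time2) ** 2) / 2)
    return (mean(time2) - mean(time1)) / pooled_sd


def reliable_change_index(score1, score2, sd_baseline, reliability):
    """RCI = (score2 - score1) / (sqrt(2) * SEM), SEM = SD * sqrt(1 - reliability)."""
    sem = sd_baseline * math.sqrt(1 - reliability)
    return (score2 - score1) / (math.sqrt(2) * sem)


def classify_change(rci, critical=1.96):
    """Label an individual's change as a reliable decrease, increase, or stable."""
    if rci < -critical:
        return "decrease"
    if rci > critical:
        return "increase"
    return "stable"


def chi2_against_stability(n_decrease, n_stable, n_increase):
    """Goodness-of-fit test against 2.5% decrease / 95% stable / 2.5% increase."""
    observed = [n_decrease, n_stable, n_increase]
    n = sum(observed)
    expected = [0.025 * n, 0.95 * n, 0.025 * n]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 2 degrees of freedom, the upper-tail probability is exp(-x / 2).
    return stat, math.exp(-stat / 2)


# Hypothetical example (not study data): a score moving from 20 to 30,
# with a baseline SD of 7.0 and an assumed reliability of 0.80.
rci = reliable_change_index(20, 30, sd_baseline=7.0, reliability=0.80)
label = classify_change(rci)
```

The 2.5% / 95% / 2.5% split is what a two-sided 1.96 cut-off would produce by chance alone if no one truly changed, so a significant chi-squared result suggests more reliable change than measurement error would explain.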

We performed multiple linear regression analyses to determine whether MCQ and OSCE final examination scores correlate with personality traits, EMS, and gender. Results are given as percentages of variability explained (R² and adjusted R²) and estimated coefficients with 95% confidence intervals.
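For readers less familiar with the two fit statistics, the following minimal sketch computes R² from observed and fitted scores and then applies the usual adjustment for the number of predictors. It does not fit the regression itself; the function names and example numbers are hypothetical and are not taken from the study.

```python
from statistics import mean


def r_squared(observed, fitted):
    """Proportion of variance in the observed scores explained by the fitted values."""
    grand_mean = mean(observed)
    ss_residual = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    ss_total = sum((y - grand_mean) ** 2 for y in observed)
    return 1 - ss_residual / ss_total


def adjusted_r_squared(r2, n_observations, n_predictors):
    """Penalize R² for the number of predictors (intercept not counted)."""
    return 1 - (1 - r2) * (n_observations - 1) / (n_observations - n_predictors - 1)


# Hypothetical example: R² = 0.50 from 11 observations and 5 predictors.
adj = adjusted_r_squared(0.50, n_observations=11, n_predictors=5)
```

Adjusted R² is always at most R² and drops when predictors are added without improving fit, which makes it the more conservative figure when several traits, EMS scores, and gender enter the same model.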

Type I error rates were set at 0.05. All analyses were performed using R version 4.1.1 (R Foundation for Statistical Computing, Vienna, Austria) and SPSS version 24 (IBM Corp., Armonk, NY, USA).


Main sample

Of the 419 students who completed the baseline personality questionnaire (78.5% response rate), 272 (156 females) were admitted to Year 2. Mean age at Year 1 was 21 years (standard deviation 2.0). Gender and age distributions were similar to those for medical students within the Geneva area and within Switzerland (Office fédéral de statistique, 2021).

Table 1 Demographic Data and EMS Scores for the 419 Students Initially Enrolled in the Study

Table 1 compares students who did and who did not progress to Year 2 of medical school in 2012 and 2013. A wave analysis showed no differences in the two groups’ gender and age distributions. Most Year 1 students were Swiss, held a scientific high school diploma, and had college-educated parents. Males and students whose parents had had a college education were more likely to progress to Year 2 of medical school. Students who progressed (vs. did not progress) had higher EMS scores. All the students admitted to Year 2 sat the FLE at the end of Year 6 and therefore constituted our analytic sample for predictive validity.

Personality traits longitudinal sample

Of the 272 students in our sample, 191 (70.2%) completed the personality questionnaire on all three occasions (Years 1, 3, and 6) and 53 (19.5%) completed the personality questionnaire in just Years 1 and 6. Applying the eligibility criteria (having completed the personality questionnaire in Year 1 and in Year 6 of medical school and having taken the Swiss Federal Licensing Examination (FLE) at the end of Year 6) led us to exclude 28 participants (10.3%) from the personality traits stability analyses. Of these, 12 were excluded after the Year 1 data collection point and 16 after the Year 3 data collection point. There were no significant differences in personality traits between the included and the excluded students or between the 2012 and 2013 cohorts. As a result, our longitudinal personality traits sample (n = 244, 138 females) included 65% of the 376 students (226 females) who were admitted to Year 2 of medical school in 2012 or 2013, that is, 90% of the 272 students (156 females) who were admitted to Year 2 and who also completed the baseline questionnaire.

Table 2 Descriptive Statistics for the Big Five Personality Traits at Years 1 and 6

Table 2 shows gender-stratified means and standard deviations for the five personality traits, measured in Year 1 and in Year 6. In both Years 1 and 6, females had higher scores than males for neuroticism and agreeableness. This was also the case, but to a lesser extent, for conscientiousness and extraversion.

Rank-order stability

Table 3 shows personality rank-order stabilities (r) between Years 1 and 6, between Years 1 and 3, and between Years 3 and 6. Rank-order stability over six years (Years 1 to 6) ranged from 0.55 for extraversion to 0.62 for openness, with a mean of 0.56 for all five personality traits. Rank-order stabilities for most of the traits were above 0.50. The mean rank-order stabilities of the five personality traits were 0.68 for Years 1 to 3 and 0.69 for Years 3 to 6. Rank-order stabilities for most of the traits were above 0.60 for both 3-year periods.

Table 3 Test-Retest Correlations and Standardized Mean-Level Change Effect Sizes (Cohen’s d) for Each Personality Trait

Mean-level changes

Table 3 shows the Cohen’s d effect sizes for standardized mean-level changes in personality traits for males and females between Years 1 and 6, between Years 1 and 3, and between Years 3 and 6. Between Years 1 and 6 (mean ages 21.0 and 26.0 years, respectively), students’ mean agreeableness scores increased (d = +0.72), whereas their mean neuroticism (d = -0.29) and conscientiousness (d = -0.25) scores decreased. Overall, changes in personality traits were greater between Years 1 and 3 than they were between Years 3 and 6: there was a moderate increase in agreeableness (d = +0.56) and a moderate decrease in conscientiousness (d = -0.34) between Years 1 and 3, but only a small increase in agreeableness (d = +0.24) and a small decrease in neuroticism (d = -0.21) between Years 3 and 6. Extraversion increased between Years 1 and 3 (d = +0.30) but decreased between Years 3 and 6 (d = -0.22).

Individual changes

Reliable Change Indices (RCIs) showed no change in personality traits between Years 1 and 6 for between 74.2% and 95.5% of the students, depending on the trait considered (see Table 4). However, neuroticism, conscientiousness, extraversion, and agreeableness decreased significantly for 18.8%, 13.1%, 4.9%, and 1.2% of the students, respectively. Conversely, agreeableness, conscientiousness, neuroticism, and extraversion increased significantly for 14.3%, 7.4%, 7.0%, and 4.1% of the students, respectively. The only significant difference between genders was for neuroticism, which increased for 13.2% of males but for only 2.2% of females.

Table 4 Year-1 to Year-6 Reliable Change Indices for Individuals’ Personality Traits Stratified by Gender

Multiple linear regression analysis

The linear regression analysis revealed links between FLE scores and EMS scores, personality traits, and gender (Table 5). EMS scores (MD = 0.25, p < .001) and conscientiousness (MD = 0.20, p = .029) correlated with higher FLE MCQ scores. Neuroticism (MD = 1.57, p < .001), extraversion (MD = 1.48, p = .024), being female (MD = -17.34, p = .025), and EMS scores (MD = 0.71, p = .025) correlated with higher FLE OSCE scores.

Table 5 Linear Regression Analysis of Gender, EMS Scores, and Personality Traits on Final Examination Performance


Before deciding to integrate personality assessments into medical school selection processes, it is essential to determine whether students’ personalities change during the 6-year course. To this end, we conducted the first longitudinal study of rank-order stability, mean-level changes, and individual changes in the personality traits of two cohorts of medical students. We also analyzed the predictive validity of personality traits, EMS scores, and gender for performance on the MCQ and OSCE components of Switzerland’s medical licensing examination.

Results showed classic gender differences in personality traits and that students’ personality traits generally remained stable across medical school (Costa & McCrae, 1992). The rank-order stability correlations we obtained were similar to those reported by previous studies showing age-related changes in stability (e.g., rank-order correlations of between 0.40 and 0.50 at age 12 years and of around 0.70 at age 70 years; Edmonds & Hill, 2020). Also in line with previous studies, the correlations we obtained decreased as the test-retest interval lengthened (Costa & McCrae, 1992). Overall, the average mean-level change was just a quarter of a standard deviation and therefore too small to engender observable differences in everyday behavior (Hojat et al., 2013; Hojat & Zuckerman, 2008).

Individual-level analyses of personality traits revealed the proportion of students for whom changes in personality across the study period were reliable. These changes were similar to the mean-level changes, that is, they were mostly decreases in neuroticism and conscientiousness and increases in agreeableness. The changes in neuroticism and agreeableness were lower than but consistent with those reported by studies in developmental psychology (Borghuis et al., 2017; Costa et al., 2019; Damian et al., 2019; Hampson & Goldberg, 2020). This was not the case for the decrease in conscientiousness. Moreover, most of these changes occurred during preclinical training (Years 1 to 3), which suggests that they were due to a combination of students adapting to the new educational environment and their increasing maturity (Atherton et al., 2021; Specht et al., 2011). First-year medical students tend to be more conscientious than first-year psychology and science students, probably because students must be very conscientious, organized, efficient, and rational in order to successfully complete the highly competitive first year of medical school (Abbiati et al., 2016; Abbiati, Vaudraz, et al., 2015). However, they may get away with being less conscientious during the rest of their training, as drop-out rates are extremely low and the pass rate on the final examination is very high. Thus, the small decrease in conscientiousness during preclinical training suggests that some students are able to regulate this “bright” trait, within limits, according to their perceptions of situational requirements (Ferguson & Lievens, 2017). Similarly, students seem to regulate their neuroticism, a trait often perceived as “dark” due to its links with anxiety and depression, but which can also help people be more sensitive to emotions (Costa et al., 2019; Ferguson et al., 2014).

Finally, our results confirm the predictive validity of personality traits. In line with the findings of a preliminary 3-year study (Abbiati, Horcik, et al., 2015), conscientiousness and extraversion correlated significantly with examination performance, with only small differences in operational validity for both. Furthermore, although EMS scores correlated with OSCE and MCQ scores, gender and personality traits appear to be linked more strongly with OSCE performance than they are with MCQ performance. In other words, the enthusiasm, precision, and sensitivity to people’s feelings associated with higher levels of neuroticism and extraversion may be assets in examinations testing clinical skills, an aspect of medicine that cannot be tested using multiple-choice questions.

These insights into personality changes both in medical students as a group and in individual students are, nevertheless, subject to certain limitations. First, we explored the stability of personality in a medical school where students take a competitive examination at the end of Year 1 but face no further selection process. Although studies on different populations, in different cultures, and with different test-retest periods have reported similar findings regarding the overall stability of personality traits and the direction of any changes, our results may not be generalizable to settings involving selection processes at several stages during medical training (e.g., competitive final examination to enter certain residencies). Second, the self-report NEO-FFI questionnaire we used to assess personality traits is a reliable tool for measuring personality traits, but other methods, such as contextual measures and third-party evaluations, are also available. Finally, we assessed the stability of single personality traits (variable-centered approach), rather than the stability of overall personality profiles (person-centered approach).


Medical students’ personality traits remain more-or-less stable across their training program and are well-established correlates of examination performance that medical schools can incorporate into their student-selection process. Changes in some students’ personalities may be due, at least partly, to students adapting to their new learning context. Differences in the correlations between personality traits and different examination formats suggest that (medical) schools could adapt examination formats to favor certain personality traits.

Our results could help medical schools recruit students with a wider range of profiles. In fact, selection begins long before students apply to medical school, as some potential candidates may rule themselves out due to preconceived ideas about the types of student medical schools accept. Determining which traits students must have if they are to meet the medical profession’s and society’s changing needs and incorporating personality assessments into screening processes could enable medical schools to expand recruitment to students with different backgrounds and personal characteristics.