Do personality traits assessed on medical school admission predict exit performance? A UK-wide longitudinal cohort study
Traditional methods of assessing personality traits in medical school selection have been heavily criticised. To address this at the point of selection, “non-cognitive” tests were included in the UK Clinical Aptitude Test, the most widely used aptitude test in UK medical education (UKCAT: http://www.ukcat.ac.uk/). We examined the predictive validity of these non-cognitive traits for performance during and on exit from medical school. We sampled all students graduating in 2013 from the 30 UKCAT consortium medical schools. Analysis included: candidate demographics, UKCAT non-cognitive scores, and medical school performance data, namely the Educational Performance Measure (EPM) and national exit situational judgement test (SJT) outcomes. We examined the relationships between these variables and SJT and EPM scores. Multilevel modelling was used to assess the relationships adjusting for confounders. The 3343 students who had taken the UKCAT non-cognitive tests and had both EPM and SJT data were entered into the analysis. There were four types of non-cognitive test: (1) libertarian-communitarian, (2) NACE: narcissism, aloofness, confidence and empathy, (3) MEARS: self-esteem, optimism, control, self-discipline, emotional non-defensiveness (END) and faking, (4) an abridged version of 1 and 2 combined. Multilevel regression showed that, after correcting for demographic factors, END predicted SJT and EPM decile. Aloofness and empathy in NACE were predictive of SJT score. This is the first large-scale study examining the relationship between performance on non-cognitive selection tests and medical school exit assessments. The predictive validity of these tests was limited, and the relationships revealed do not fit neatly with theoretical expectations. This study does not support their use in selection.
Keywords: Medical school admissions; Medical school selection; Non-cognitive testing; Psychometric testing; Situational judgement tests; United Kingdom clinical aptitude test (UKCAT)
END: Emotional non-defensiveness (a domain within MEARS)
EPM: Educational performance measure (a measure of examination performance during medical school)
IMD: Index of multiple deprivation, a socio-economic indicator, based on postcode (quintiles)
ITQ: Interpersonal traits questionnaire (see also NACE)
IVQ: Interpersonal values questionnaire
MEARS: Managing Emotions and Resiliency Scale
NACE: A psychometric test with the domains of narcissism, aloofness, confidence and empathy (see also ITQ)
NS-SEC: National statistics socio-economic classification, based on parental occupation (quintiles)
PQA: Personal qualities assessment (includes the IVQ and ITQ)
SJT: Situational judgement test
UCAS: Universities and colleges admissions service, an organisation whose main role is to operate the application process to British universities
UKCAT: United Kingdom clinical aptitude test
UKFPO: United Kingdom foundation programme office
There are a number of issues of importance in selection for admission to medical school (Prideaux et al. 2011; Girotti et al. 2015). One of these is assessing the predictive validity and reliability of any selection tool, to ensure it measures what it claims to measure, does so fairly and consistently and can be employed rationally (e.g., Cleland et al. 2012; Norman 2015). A second is ensuring that selection tools assess the range of attributes considered important by key stakeholders. Medical schools must select applicants who will not only excel academically but also possess personality traits befitting a career in medicine such as compassion, team working skills and integrity (e.g., Albanese et al. 2003; General Medical Council 2009; Frank and Snell 2015; Accreditation Council for Graduate Medical Education 2014).
This increasing recognition that there is more to being a capable medical student or doctor than academic performance follows on from a similar direction of travel in education where, according to a large body of research, a number of non-cognitive skills are associated with positive academic and work-related outcomes for young people (see Gutman and Schoon 2013, for a recent review). Given this, the assessment of non-academic factors, or personality traits is of increasing importance in medical school selection (Patterson 2013). However, “traditional” methods of assessing such personality traits, including unstructured interviews, using personal references and autobiographical statements are now known to have weak predictive validity (Cleland et al. 2012; Patterson et al. 2016). There is a drive to identify better ways to assess non-academic attributes such as values and personality traits in medical school selection.
Various ways to do so have been proposed. These can be grouped into “paper and pencil” assessments of personality traits (e.g., Adams et al. 2012, 2015; Bore et al. 2005a, b; Dowell et al. 2011; Fukui et al. 2014; James et al. 2013; Lumsden et al. 2005; Manuel et al. 2005; Nedjat et al. 2013), structured multiple interview approaches (Dore et al. 2010; Eva et al. 2004a, b, 2009; Hofmeister et al. 2008, 2009; O’Brien et al. 2011; Reiter et al. 2007; Roberts et al. 2008; Rosenfeld et al. 2008), selection centres (Gafni et al. 2012; ten Cate and Smal 2002; Ziv et al. 2008; Gale et al. 2010; Randall et al. 2006a, b) and, as the “new kid on the block”, situational judgement tests (Christian et al. 2010; Koczwara et al. 2012; Lievens 2013; Lievens et al. 2008; Patterson et al. 2009).
However, there are relatively few studies examining the predictive validity of the “paper and pencil” tests which aim to assess personality traits in medical school applicants. Those which have been published are often concerned with feasibility of use across cultural settings (e.g., Fukui et al. 2014; Nedjat et al. 2013) and/or are descriptive in terms of cross-sectional comparisons across different groups of students (e.g., graduate entrants versus school-leavers: Bore et al. 2005a, b; James et al. 2013; Lumsden et al. 2005; Nedjat et al. 2013). The few studies of predictive validity to date tend to be small scale, usually single site (Adams et al. 2012, 2015; Manuel et al. 2005) and/or use local assessments as outcome measures (Adams et al. 2012, 2015; Dowell et al. 2011; Manuel et al. 2005), limiting their generalizable messages. Large-scale, independent studies of the predictive validity of approaches to assessing personality traits, or non-academic factors in medical selection are lacking, partly because appropriate non-academic outcome markers are not easily available.
Moreover, there is much debate about the promise of personality traits for predicting success generally, the different approaches being advocated to measure these, and a clear need for more evidence (Norman 2015; Powis 2015). Drawing on the wider educational literature again, it is clear that personality traits include a very broad range of characteristics. These can be separated into those considered to be modifiable, such as motivation, resilience, perseverance, and social and communication skills, and those considered more stable or personality traits, which include Openness to Experience, Conscientiousness, Extraversion, Agreeableness, and Neuroticism (also called Emotional Stability) (Gutman and Schoon 2013). There is a wealth of evidence indicating that the latter, the “Big Five” personality traits, correlate highly with job performance over a range of occupational groups (e.g., Barrick and Mount 1991; Rothmann and Coetzer 2003; Salgado 1997; Dudley et al. 2006) and with performance at medical school (e.g., Lievens et al. 2002). It is this apparently stable group of traits which has been used as the theoretical basis of most “paper and pencil” assessments of personality traits designed specifically for use in medical school selection (see earlier for references). However, the more recent approaches to measuring personal characteristics in medical school selection have a slightly different conceptual basis. For example, rather than being based directly on the “Big Five” theory of personality traits, SJTs are based on implicit trait policy (ITP) theory and, depending on the job level, specific job knowledge (e.g., Motowidlo and Beier 2010a, b; Patterson et al. 2015a, b). SJTs measure the expression of personality traits in hypothetical situations which are designed on the basis of what is expected in the job for which the individual is being assessed (Motowidlo et al. 2006). 
They encompass measurement of personal choice (e.g., what is the best way to respond in this particular situation?) rather than just unfiltered (our word) trait expression, which is arguably what is measured in traditional personality tests. There is also a pragmatic difference between “paper and pencil” tests and the SJTs. The latter are based on thorough job analysis (Patterson et al. 2012b; Motowidlo et al. 1990) of what is expected of doctors in particular roles (e.g., a junior doctor (resident) or a doctor working in a particular specialty) and take the stance that “one size does not fit all”, whereas the former are typically more general measures of traits which are considered generically important to being a doctor. We return to the implications of these different positions and theoretical underpinnings for assessing personality traits in medical school selection in the discussion section of this paper.
Several major changes in selection for medical school and medical training after graduation in the UK now enable large-scale multi-site studies examining the predictive validity of selection processes, including those proposing to measure personality traits. The first of these is greater consistency across UK medical schools in terms of their selection approaches (Cleland et al. 2014), with, for example, the vast majority of UK medical schools using the same aptitude test, the UK Clinical Aptitude Test (UKCAT), as part of their selection matrix. While the focus of the UKCAT is assessment of cognitive ability, “non-cognitive” or personality trait tests were included, on a trial basis, in 2007–2009. The second is the introduction of a standardised, national process for selection into the next stage of medical training after medical school in the UK, via the Foundation Programme Office (UKFPO). Those entering the selection process for the UKFPO obtain two indicators of performance: an Educational Performance Measure (EPM) and the score they achieve for a Situational Judgement Test (SJT). We present details of these indicators later in this paper. Finally, there is a move within the UK for organisations such as UKCAT and the UKFPO to work together in terms of data linkage, to enable large-scale, high-quality, longitudinal research projects.
Together, these innovations finally provide the opportunity to address a gap in the literature highlighted many years ago (see Schuwirth and Cantillon 2005). Our aim in this paper is therefore to examine the predictive power of tests purporting to assess personality traits in relation to two national performance indicators on exit from medical school: an academic progress measure and a measurement of personality traits determined, through a job analysis, to be associated with successful performance as a Foundation Programme doctor. We do so with data from a large number of medical schools.
This was a quantitative study grounded in post-positivist research philosophy (Savin Baden and Major 2013). We examined the predictive validity of the personality traits, or “non-cognitive”, component of the UKCAT admissions test (http://www.ukcat.ac.uk/) compared to the UK Foundation Programme (UKFPO: http://www.foundationprogramme.nhs.uk/pages/home) performance indicators in one graduating student cohort.
Our sample was the 2013 graduating cohort of UK medical students from the 30 UKCAT medical schools. This was the first cohort for whom both UKCAT and UKFPO indicators were available.
With appropriate permissions in place, working within a data safe haven (to ensure adherence to the highest standards of security, governance, and confidentiality when storing, handling and analysing identifiable data), routine data held by UKCAT and UKFPO were matched and linked.
The Interpersonal Values Questionnaire (IVQ) measures the extent to which the respondent favours individual freedoms (versus societal rules) as a basis for making moral decisions (Bore et al. 2005a, b; Powis et al. 2005). The rationale is that it captures one dimension of moral orientation: the extent to which the individual will ‘act in own best interests’ (Libertarian) versus ‘act in interests of society’ (Communitarian). It has one domain, running from libertarian (low score) to communitarian (high score). Candidates are presented with a number of situations in which people have to decide what to do according to their opinions or values, and respond on a 4-point Likert scale to indicate where their own values best sit.
The Interpersonal Traits Questionnaire (ITQ), or NACE, measures narcissism, aloofness, confidence (in dealing with people) and empathy (Munro et al. 2005; Powis et al. 2005). It claims to assess specific aspects of the wider domain of empathy; a high degree of empathy is linked to convivial interpersonal relationships and is generally seen as a positive thing in care-givers, although it is argued that too high a degree of empathy could lead to over-involvement and burnout. The ITQ produces a summary score for INVOLVEMENT, calculated as (C + E) − (N + A), so some totals may be negative overall, representing ‘detachment’. Overall, confidence and empathy are deemed positive, and narcissism and aloofness negative. The candidates who receive this test are presented with 100 statements about people and the way in which they might think or behave in certain situations. They are then given a 4-point Likert scale and asked to decide which statements most relate to them.
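The INVOLVEMENT summary score described above can be illustrated with a minimal Python sketch. The function names and the example domain scores are hypothetical and are not taken from the ITQ scoring materials or the study data; labelling non-negative totals as 'involvement' is an assumption made only for this illustration.

```python
def involvement_score(narcissism, aloofness, confidence, empathy):
    """Return the ITQ summary score: (C + E) - (N + A)."""
    return (confidence + empathy) - (narcissism + aloofness)


def describe(score):
    # The text labels negative totals as 'detachment'; treating all
    # non-negative totals as 'involvement' is an assumption of this sketch.
    return "detachment" if score < 0 else "involvement"


# Hypothetical domain scores, not taken from the study data.
engaged = involvement_score(narcissism=20, aloofness=18, confidence=35, empathy=38)
detached = involvement_score(narcissism=40, aloofness=42, confidence=25, empathy=22)

print(engaged, describe(engaged))
print(detached, describe(detached))
```

Because the two "positive" domains are added and the two "negative" domains subtracted, a candidate with high narcissism and aloofness can end up with a negative total even when confidence and empathy are moderate.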
The Managing Emotions and Resiliency Scale (MEARS) (Childs et al. 2008) was designed to reflect the cognitive, behavioural and emotional elements of resilience and to describe coping styles in terms of attitudes, beliefs and typical behaviour, in six domains: self-esteem, optimism, self-discipline, faking, emotional non-defensiveness, and control (Childs 2012). In each, a high score reflects a high perceived self-value in that domain. It is reported as three scores: cognitive (the self-esteem and optimism scales), behavioural (control and self-discipline) and emotional non-defensiveness. Candidates receive a set of paired statements that represent opposing viewpoints. They must decide their level of agreement within a six-point range.
A fourth test combined the IVQ and the ITQ (tests 1 and 2), both in an abridged format.
Medical school performance by decile (presented as 34–43 points).
Additional degrees, including intercalation (up to 5 points).
Publications (up to 2 points).
We chose this as an outcome measure as, given that there is emerging consensus that the SJT is essentially a measurement technique that targets non-cognitive attributes (Motowidlo and Beier 2010a, b), this offers a meaningful interim outcome marker for non-academic measures used within medical school selection processes.
The EPM and SJT are summed to give the UKFPO score out of 100.
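The scoring arithmetic can be sketched in Python as follows. The EPM component ranges come from the list above; the SJT maximum of 50 is inferred here from the stated total of 100 (the EPM components sum to at most 43 + 5 + 2 = 50) rather than quoted from the source, and the function names are hypothetical.

```python
EPM_DECILE_RANGE = (34, 43)  # points awarded for medical school performance decile
MAX_DEGREE_POINTS = 5        # additional degrees, including intercalation
MAX_PUBLICATION_POINTS = 2   # publications
SJT_MAX = 50                 # assumption: inferred as 100 minus the EPM maximum


def epm_total(decile_points, degree_points=0, publication_points=0):
    """Sum the three EPM components after checking each against its range."""
    low, high = EPM_DECILE_RANGE
    if not low <= decile_points <= high:
        raise ValueError("decile points must lie between 34 and 43")
    if not 0 <= degree_points <= MAX_DEGREE_POINTS:
        raise ValueError("degree points must lie between 0 and 5")
    if not 0 <= publication_points <= MAX_PUBLICATION_POINTS:
        raise ValueError("publication points must lie between 0 and 2")
    return decile_points + degree_points + publication_points


def ukfpo_total(epm, sjt):
    """Return the combined UKFPO score out of 100."""
    if not 0 <= sjt <= SJT_MAX:
        raise ValueError("SJT score must lie between 0 and 50")
    return epm + sjt


# Hypothetical applicant: 41 decile points, 3 degree points, 1 publication point.
epm = epm_total(41, degree_points=3, publication_points=1)
print(epm, ukfpo_total(epm, sjt=40.0))
```

Keeping the components separate makes it clear why the paper treats EPM decile and EPM total as distinct outcomes: the total also reflects prior degrees and publications.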
All data were analysed using SPSS 22.0. Pearson or Spearman’s rank correlation coefficients were used to examine the linear relationship between each of SJT score and EPM and continuous factors such as UKCAT scores and pre-admission academic scores and age. In terms of practical interpretation of the magnitude of a correlation coefficient, we have a priori defined low/weak correlation as r = 0.10–0.29, moderate correlation as r = 0.30–0.49 and strong correlation as r ≥ 0.50. Two-sample t-tests, ANOVA, Kruskal–Wallis or Mann–Whitney U tests were used to compare UKFPO indices across levels of categorical factors as appropriate.
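The a priori interpretation bands for correlation coefficients can be written as a small Python helper. This is only a sketch: the label "negligible" for |r| below 0.10 and the handling of values between the quoted bands (for example 0.295) are assumptions, since the text defines only the three named bands.

```python
def correlation_strength(r):
    """Classify a correlation coefficient using the bands quoted in the text."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    magnitude = abs(r)  # the sign gives direction, not strength
    if magnitude >= 0.50:
        return "strong"
    if magnitude >= 0.30:
        return "moderate"
    if magnitude >= 0.10:
        return "weak"
    return "negligible"  # assumption: the text gives no label below 0.10


# Coefficients of the kind reported in the results section.
for r in (0.570, 0.449, 0.255, 0.085):
    print(r, correlation_strength(r))
```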
Multilevel linear models were constructed to assess the relationship between the independent variables of interest (UKCAT non-cognitive test totals and individual domains) and each of the four outcomes (SJT, EPM decile, EPM total and UKFPO total). Fixed effects models were fitted first and then random intercepts and slopes were introduced using maximum likelihood methods. Intercepts and slopes for the medical schools were allowed to vary for the non-cognitive test variables only. Models were adjusted for identified confounders (based on preliminary testing showing a correlation coefficient of >0.2 or <−0.2) such as gender, age at admission, IMD quintile, year the UKCAT exam was taken and whether or not the student attended a fee-paying school (NS-SEC and ethnicity had to be dropped from the models because of non-convergence). Interactions between our primary variables and year of UKCAT exam were tested using Wald statistics and were dropped from the models if not significant at the 5 % level. Nested models were compared using the log-likelihood statistic and information criteria (Akaike’s information criterion and Schwarz’s Bayesian information criterion). The best fitting models are presented.
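The information criteria used for model comparison follow standard formulas, AIC = 2k − 2 ln L and BIC = k ln(n) − 2 ln L, where k is the number of estimated parameters, n the number of observations and ln L the maximised log-likelihood. The Python sketch below is illustrative only: the log-likelihoods and parameter counts are invented, and the choice of n for multilevel data (students rather than schools) is an assumption, not the authors' stated procedure.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike's information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Schwarz's Bayesian information criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def preferred_model(models, n_obs):
    """Return the name of the model with the lowest BIC."""
    return min(models, key=lambda name: bic(models[name]["loglik"],
                                            models[name]["k"], n_obs))


# Hypothetical nested models; these numbers are not from the study.
models = {
    "fixed_effects_only": {"loglik": -1510.0, "k": 7},
    "random_intercept": {"loglik": -1498.5, "k": 8},
}
n_students = 721

for name, m in models.items():
    print(name, aic(m["loglik"], m["k"]), round(bic(m["loglik"], m["k"], n_students), 1))
print("preferred:", preferred_model(models, n_students))
```

Both criteria penalise extra parameters, so a random intercept or slope is retained only when the gain in log-likelihood outweighs the added complexity.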
There were 6294 students from 30 medical schools in the graduating 2013 cohort. UKCAT non-cognitive and UKFPO results were available for the 3343 students who sat the UKCAT in 2007 (n = 2714) or 2008 (n = 629). They were not available for those who sat the test in 2006, as the non-cognitive tests were not part of the UKCAT that year.
Descriptive statistics of the demographic variables of the sample
Age at admission (years), median (IQR): 19 (18, 22)
Rows for which no values are available: female, n (%); ethnic group, n (%); type of secondary school, n (%); IMD quintile, n (%); NS-SEC score, n (%); domicile, n (%)
In terms of outcome measures, as would be expected in a decile system such as the EPM, the percentage of graduating students within each decile per school was relatively constant (varying between 9.7 and 11.2), with only the lowest decile as an outlier (7.6). EPM, SJT and total UKFPO scores are shown in Table 1. Almost one half (47.8 %) of the sample had no additional EPM points, while 34.9 % (n = 1168) gained three or more further degree points, which indicates they had either intercalated or entered medicine as an Honours graduate. Most (75.3 %) did not gain any points for publications, while 18.4 % gained 1 point and 6.3 % gained 2 points.
UKCAT non-cognitive domain scores: points available and (actual results range)
Test 1, n = 879: libertarian-communitarian, mean (SD) (values not available)
Test 2 NACE, n = 973 (values not available)
Test 3 MEARS, n = 600, median (IQR); six domain scores whose row labels are not available: 74 (70, 81); 82 (78, 103); 63 (59, 71.5); 74 (70, 90); 86 (82, 92); 77 (74, 84)
Test 4, n = 891 (values not available)
Univariate analysis: relationship between demographic variables and outcomes
Each row lists four outcome summaries; judging by the score ranges, the order is EPM decile; EPM total; SJT; UKFPO total. Row (category) labels are not available except where shown.
Age at entry; rank correlation (r): values not available
5 (3, 8); 38 (41, 44); 40.4 (38.3, 42.3); 81.1 (76.9, 85.2)
6 (4, 8); 38 (41, 44); 41.3 (39.1, 43.3); 82.4 (78.3, 86.4)
6 (4, 9); 41 (38, 44); 41.3 (39.2, 43.3); 82.8 (78.8, 86.6)
5 (2, 7); 40 (37, 43); 39.9 (37.7, 41.9); 79.9 (75.8, 83.9)
Type of secondary school attended
6 (3, 8); 41 (38, 44); 41.1 (39, 43.2); 82.1 (78, 86.1)
6 (3, 8); 41 (38, 44); 41 (38.9, 43); 82.1 (77.7, 86.1)
6 (4, 8); 41 (38, 44); 41.3 (38, 44); 82.4 (78.4, 86.3)
6 (3, 8); 41 (38, 44); 41.3 (38, 44); 82.4 (78.3, 86.5)
6 (3, 8); 41 (38, 44); 41.0 (38, 44); 81.8 (77.5, 85.6)
6 (3, 8); 41 (38, 43); 40.6 (38, 43); 81.0 (77.1, 85.3)
5 (2, 8); 40 (38, 43); 40.2 (37, 43); 80.3 (76.5, 84.1)
6 (3, 8); 41 (38, 44); 41.1 (39.1, 43.1); 82.3 (78.4, 86.2)
6 (3, 8); 40 (37, 43); 41.1 (38.9, 42.9); 80.8 (78.3, 84.1)
5 (3, 8); 41 (38, 44); 40.6 (38.3, 42.8); 81.5 (77.5, 84.9)
5 (3, 8); 40 (37, 43); 39.9 (37.8, 42.2); 80.8 (77.1, 83.6)
5 (3, 7); 39 (37, 42); 40.0 (37.8, 42.0); 78.7 (76.5, 83.6)
6 (4, 8); 42 (40, 45); 40.2 (38.2, 42.5); 82.5 (79.2, 85.9)
4 (2, 7); 39 (36, 43); 39.3 (36.5, 41.3); 78.8 (73.9, 83.3)
6 (3, 8); 41 (38.8, 43.1); 41 (38.8, 43.1); 82.0 (77.9, 86)
Total UCAS Score (r): values not available
Linear regression showed that there was no significant association between EPM decile or total EPM and any of the individual domains in the non-cognitive tests 1, 2 and 4. In test 3, however, there was modest correlation between total EPM and each of the individual MEARS domains (r = 0.255–0.449, p < 0.001) and weak correlation between the MEARS domains and EPM decile (r = 0.085–0.211). There was no significant correlation between any of the non-cognitive tests and the SJT score. Total UKFPO had weak correlation with the MEARS domains (r = 0.209–0.318). Of note, there was a strong correlation between student age and MEARS total (Spearman’s r = 0.570, p < 0.001). These results are not shown in tabular form.
A large number of multivariate tests were performed and, where significant results were obtained, the effects were quite small.
Multilevel analysis: non-cognitive test coefficients adjusted for year of UKCAT exam, gender, age at admission, school type attended and IMD quintile for the four outcomes
Entries are estimate (95 % CI), given in the order SJT; EPM decile; EPM total; UKFPO total.
Libertarian-communitarian (TEST 1), n = 721: 0.0055 (−0.015, 0.026); 0.0073 (−0.008, 0.023); 0.007 (−0.013, 0.027); 0.012 (−0.025, 0.05)
NACE total (TEST 2), n = 774: 0.009 (−0.005, 0.023); 0.001 (−0.012, 0.014); −0.001 (−0.016, 0.014); 0.009 (−0.011, 0.03)
NACE domains, n = 774 (row labels not available; the effect sizes quoted in the text identify the second row as aloofness and the fourth as empathy):
−0.022 (−0.058, 0.014); −0.003 (−0.031, 0.025); 0.002 (−0.03, 0.034); −0.019 (−0.073, 0.034)
−0.066 (−0.122, −0.01); −0.014 (−0.058, 0.03); −0.013 (−0.062, 0.037); −0.079 (−0.161, 0.004)
−0.002 (−0.042, 0.037); −0.018 (−0.049, 0.013); −0.013 (−0.048, 0.022); −0.014 (−0.072, 0.045)
−0.071 (−0.116, −0.026); −0.003 (−0.038, 0.032); −0.003 (−0.043, 0.037); −0.074 (−0.14, −0.007)
Test 3, n = 472 (row labels not available; the effect size quoted in the text identifies the third row as END):
0.013 (−0.017, 0.043); −0.008 (−0.018, 0.003); −0.005 (−0.017, 0.007); −0.006 (−0.022, 0.01)
−0.016 (−0.056, 0.024); −0.019 (−0.056, 0.017); −0.022 (−0.062, 0.019); −0.031 (−0.093, 0.032)
0.055 (0.006, 0.104); 0.053 (0.009, 0.097); 0.074 (0.025, 0.123); 0.133 (0.056, 0.209)
0.001 (−0.056, 0.059); −0.041 (−0.093, 0.01); −0.051 (−0.109, 0.006); −0.05 (−0.14, 0.041)
−0.03 (−0.077, 0.017); −0.019 (−0.062, 0.024); −0.035 (−0.083, 0.012); −0.069 (−0.143, 0.006)
−0.025 (−0.072, 0.021); 0.039 (−0.003, 0.081); 0.058 (0.011, 0.105); 0.025 (−0.048, 0.098)
0.002 (−0.05, 0.055); −0.075 (−0.157, 0.008) (only two values available for this row)
Libertarian-communitarian (TEST 4), n = 708: −0.022 (−0.054, 0.01); −0.011 (−0.033, 0.011); −0.006 (−0.031, 0.019); −0.028 (−0.073, 0.016)
NACE total (Test 4), n = 705: 0.022 (−0.013, 0.057); 0.011 (−0.010, 0.032); 0.006 (−0.018, 0.029); 0.025 (−0.016, 0.067)
In the MEARS domains, the emotional non-defensiveness (END: how one feels and reacts to people and situations) sub-test stood out as predicting all measures positively, with a cumulative effect such that a modest and achievable 7.5 extra marks (out of a valid range of 24–144) would improve total UKFPO score by 1 mark out of 100. Interestingly, increased self-esteem (out of 126) was related to a decrease in EPM decile and this filtered through to EPM total. One extra mark in aloofness (out of 50) was associated with a decrease in SJT score of 0.066 points; in other words, 15 extra aloofness marks predicted a decrease in SJT of one point. Similarly, 14 extra points in empathy (out of 50) on average predicted one less SJT point.
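The conversions quoted above follow from taking the reciprocal of each regression coefficient: if one extra test mark changes the outcome by β points, then 1/|β| marks correspond to a one-point change. A minimal Python sketch, using the coefficients reported in the multilevel table (the function name and dictionary labels are hypothetical):

```python
def marks_per_outcome_point(coefficient):
    """Number of test marks associated with a one-point change in the outcome."""
    if coefficient == 0:
        raise ValueError("a zero coefficient implies no change in the outcome")
    return 1 / abs(coefficient)


# Coefficients (change in outcome per extra test mark) as reported in the text.
coefficients = {
    "END -> UKFPO total": 0.133,
    "aloofness -> SJT": -0.066,
    "empathy -> SJT": -0.071,
}

for label, beta in coefficients.items():
    direction = "increase" if beta > 0 else "decrease"
    marks = marks_per_outcome_point(beta)
    print(f"{label}: {marks:.1f} marks per one-point {direction}")
```

This reproduces the figures in the text: about 7.5 END marks per UKFPO point, and roughly 15 aloofness marks or 14 empathy marks per SJT point lost.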
This is the first study examining the predictive validity of paper and pencil tests of personality traits on admission to medical school against academic and non-academic outcomes on exit, in relation to both school-based and national performance indicators. We found some significant correlations but all with low effect sizes and an overall inconsistent picture. For example, aloofness and empathy scores on the NACE negatively predicted performance on the SJT but not the EPM decile or EPM total. Moreover, the actual patterns seem conflicting: higher empathy (representing emotional involvement) and higher aloofness (representing emotional detachment) both predicted performance in the same direction. Similarly, scores on the MEARS instrument generally lacked correlation although, first, it seemed the most sensitive test in that modest differences in scores could influence performance on the outcome measures, and, second, two scales appeared of interest. The emotional non-defensiveness (END: how one feels and reacts to people and situations) sub-test stood out as predicting all outcome measures positively, while higher self-esteem was associated with lower EPM decile and EPM total scores. EPM is an indicator of academic achievement, mostly test performance, both written and clinical, but this does fit with the wider, non-medical literature which highlights that non-cognitive attributes can influence cognitive test performance (e.g., Gutman and Schoon 2013). However, these tests are not primarily being employed to predict academic performance and the small effect size with the SJT does not, on its own, seem sufficient to justify the use of such a test (although there may be an argument to explore the utility of the END sub-test further).
Where do our findings sit in comparison to previous literature? Powis and colleagues developed the Personal Qualities Assessment (PQA: which includes the IVQ and ITQ) and tested it in a number of centres. However, few of the reported studies have examined the predictive validity of the PQA, and those which have been carried out are limited in their methodology (e.g., small scale, local outcome measures e.g., Adams et al. 2012, 2015; Dowell et al. 2011; Manuel et al. 2005) and—at best—find only modest correlations (Adams et al. 2012, 2015). We would argue that, given the evidence to date as to the utility of SJTs in a variety of professional groups (see earlier, and Patterson et al. 2012a, b for a review) the use of a validated SJT as an outcome measure is more robust than the comparators used by other authors, and hence the weak relationship we found is probably a more accurate assessment of the power of the IVQ and ITQ to predict outcomes at the end of medical school.
It has been argued that the non-academic attributes the PQA measures are desirable in clinicians until the extremes are reached, as too much or little of any may be problematic. Indeed, Powis (2015) has gone as far as suggesting that the minority at these extremes might be excluded from the selection process. This view is not widely supported (e.g., Norman 2015) and indeed, given the low effect sizes and inconsistent picture we found with the NACE, we elected not to assess the ‘extremes’ as advocated by Powis and colleagues (e.g., Bore et al. 2009; Munro et al. 2008) as there seemed no justification for doing so. Certainly, on our evidence, the PQA cannot be justified as a tool or filter for excluding individual candidates.
Should we have expected there to be an association between performance on the various non-cognitive tests included in the UKCAT, and the EPM and the SJT? It could be argued that we compared apples and pears by expecting tests of personality traits to predict academic performance and the expression of job-specific personality traits in hypothetical situations. On the other hand, there is evidence that the “Big Five” personality factors correlate with academic performance at medical school (e.g., Lievens et al. 2002) and with implicit trait policies (ITPs) (Motowidlo et al. 2006a, b). However, what about the additional influence of other personality traits such as motivation, resilience, perseverance, and social and communication skills (Gutman and Schoon 2013)? It was made clear to applicants that the non-cognitive tests within the UKCAT would not be used in selection decisions, so it would not be unreasonable to assume that those sitting this part of the UKCAT were less motivated to do well on these tests compared to the “high stakes” cognitive UKCAT tests. Conversely, the Foundation Programme application process is competitive so motivation to do one’s best will be high.
There is also the issue of beliefs about the costs and benefits associated with expressing certain traits in particular situations. While ITPs are proposed to be related to individuals’ inherent tendencies or traits, individuals must make judgements about how and when to express certain traits. Thus, SJTs are designed to draw on an applicant’s knowledge of how they should respond in a given situation, rather than how they would respond. Although this seems a conceptual gap to us, there is some evidence that SJTs predict performance in one medical training context, that of UK general practice training (Lievens and Patterson 2011), and the wider literature also suggests that the way an individual responds to an SJT question does predict actual behaviour and performance once in a role (e.g., McDaniel et al. 2001). Validity studies have also shown that SJTs add incremental validity when used in combination with other predictors of job performance such as structured interviews, tests of IQ and personality questionnaires (O’Connell et al. 2007; McDaniel et al. 2007; Koczwara et al. 2012). While the focus of this paper is not to analyse the conceptual and theoretical frameworks of personality tools, it is essential that these are critically examined in order to develop, evaluate and compare medical selection tools and how these are used in admissions/selection processes.
This study is unusual in its scale, allowing for accurate estimates of correlations, subgroup analysis and multilevel modelling to more accurately estimate effect sizes. However, the range of outcome markers available was limited. The EPM is an indication of overall course academic achievement as judged against peers within each medical school: without a common exit exam it is not clear how much variation there is between schools and we are unable to estimate this effect, or correct for it. It is also a complex and varied measure as it includes other degrees and publications that will be confounded by age and other factors such as previous degrees. However, there are currently no comprehensive, standardized assessments across the UK akin to, say, the Canadian or US licensing examination, and so we had to be pragmatic and use what outcome measures were available to us. The SJT predictive validity remains to be determined but there is good reason to expect this, based on related previous work (McManus et al. 2013; Patterson et al. 2012a, 2015a). However, although we did not have access to the full dataset of test-takers (i.e. including those who either were not admitted to medical school or who did not graduate in 2013), mean scores and ranges across the non-cognitive tests were very similar between the full dataset summary (UKCAT technical reports 2007 and 2008) and the results from this graduating cohort (data not shown). In other words, those who graduated did not have significantly different non-cognitive scores from those who did not, implying no range restriction due to subset selection. The non-cognitive tests were included in the 2007 and 2008 UKCAT test battery on a trial basis, and it was made clear that these data would not be used in decision making: this “low stakes” situation may have influenced candidate test behaviour, as discussed earlier (e.g., Abdelfattah 2010).
Norman (2015) argues that, without a clear negative relationship between academic achievement and desirable non-academic attributes, selection for medical school can and should seek students with attributes in both domains. To do so requires valid, reliable and affordable measurement techniques if we are to avoid an overly large initial filter on purely academic grounds. We must conclude that none of the non-cognitive tests evaluated in this study has been shown to have sufficient utility to be used in medical student selection in its current form. Newer non-cognitive tests, such as the UKCAT entry level SJT (http://www.ukcat.ac.uk/about-the-test/situational-judgement/), will hopefully prove to be more useful in our context when scrutinised in due course. We intend to follow up this cohort of doctors to examine the predictive validity of the cognitive and non-cognitive tests used at admission to medical school against post-graduate outcome measures.
The Chair of the local ethics committee ruled that formal ethical approval was not required for this study, given that the fully anonymised data were held in a safe haven and that all students who sit the UKCAT are informed that their data and results will be used in educational research. All students applying for the UKFPO also sign a statement confirming that their data may be used anonymously for research purposes.
We thank the UKCAT Research Group for funding this independent evaluation and thank Rachel Greatrix and Sandra Nicholson of the UKCAT Consortium for their support throughout this project, and their feedback on the draft paper. We also thank Professor Amanda Lee and Ms Katie Wilde for their input into the application for funding, and ongoing support.
This study addressed a research question posed by a funding committee, of which JD was a member. JC and RMcK wrote the funding bid. JD advised on the nature of the non-cognitive data. RMcK managed the data and carried out the preliminary data analysis under the supervision of DA. DA advised on all the statistical analysis and carried out the multi-variate analysis. JC wrote the first draft of the introduction and methods sections of this paper. RMcK and DA wrote the first draft of the methods and results section, and JD the first draft of the discussion. JC and RMcK revised the paper following review by all authors.
- Accreditation Council for Graduate Medical Education (ACGME). (2014). ACGME mission, vision and values. https://www.acgme.org/acgmeweb/tabid/121/About/Misson,VisionandValues.aspx.
- Adams, J., Bore, M., McKendree, J., Munro, D., & Powis, D. (2012). Can personal attributes of medical students predict in-course examination success and professional behaviour? An exploratory prospective cohort study. BMC Medical Education, 12, 69. http://www.biomedcentral.com/1472-6920/12/69/abstract.
- Childs, R. (2012). Accessed 17th February 2015. http://www.teamfocus.co.uk/tests-and-questionnaires/understanding-motivation/resilience-scales.php.
- Childs, R., Gosling, J., & Parkinson, M. (2008). Resilience Scales User’s Guide Version 1. Accessed 18th February 2015. http://www.teamfocus.co.uk/user_files/file/Career%20Interests%20Inventory%20Users%20Guide%202013.pdf.
- Cleland, J. A., Dowell, J., McLachlan, J., Nicholson, S., & Patterson, F. (2012). Identifying best practice in the selection of medical students. http://www.gmc-uk.org/Identifying_best_practice_in_the_selection_of_medical_students.pdf_51119804.pdf.
- Cleland, J. A., Patterson, F., Dowell, J., & Nicholson, S. (2014). How can greater consistency in selection between medical schools be encouraged? A mixed-methods programme of research that examines and develops the evidence base. http://www.medschools.ac.uk/SiteCollectionDocuments/Selecting-for-Excellence-research-Professor-Jen-Cleland-etal.pdf.
- Dowell, J., Lumsden, M. A., Powis, D., Munro, D., Bore, M., Makubate, B., & Kumwenda, B. (2011). Predictive validity of the personal attributes assessment for selection of medical students in Scotland. Medical Teacher, 33, e485–e488. http://informahealthcare.com/doi/abs/10.3109/0142159X.2011.599448?prevSearch=allfield%253A%2528jon%2Bdowell%2529&searchHistoryKey.
- Dudley, N. M., Orvis, K. A., Lebiecki, J. E., & Cortina, J. M. (2006). A meta-analytic investigation of conscientiousness in the prediction of job performance: Examining the intercorrelations and the incremental validity of narrow traits. Journal of Applied Psychology, 91, 40–57.
- Frank, J. R., & Snell, L. (2015). The draft CanMEDS 2015: physician competency framework. http://www.royalcollege.ca/portal/page/portal/rc/common/documents/canmeds/framework/framework_series_1_e.pdf.
- Fukui, Y., Noda, S., Okada, M., Mihara, N., Kawakami, Y., Bore, M., et al. (2014). Trial use of the personal qualities assessment (PQA) in the entrance examination of a Japanese Medical University: Similarities to the results in western countries. Teaching and Learning in Medicine, 26, 357–363.
- General Medical Council. (2009). Tomorrow’s doctors: Outcomes and standards for undergraduate medical education. GMC, London. http://www.gmc-uk.org/Tomorrow_s_Doctors_1214.pdf_48905759.pdf.
- Gutman, L. M., & Schoon, I. (2013). The impact of non-cognitive skills on outcomes for young people. Institute of Education, London. https://educationendowmentfoundation.org.uk/uploads/pdf/Non-cognitive_skills_literature_review.pdf.
- Hofmeister, M., Lockyer, J., & Crutcher, R. (2008). The acceptability of the multiple mini interview for resident selection. Family Medicine, 40, 734–740.
- McManus, I. C., Woolf, K., Dacre, J., Paice, E., & Dewberry, C. (2013). The academic backbone: Longitudinal continuities in educational achievement from secondary school and medical school to MRCP(UK) and the specialist register in UK medical students and doctors. BMC Medicine, 11(1), 242. http://www.biomedcentral.com/content/pdf/1741-7015-11-242.pdf.
- Munro, M., Bore, M., & Powis, D. (2008). Personality determinants of success in medical school and beyond: “steady, sane and nice”. In S. Boag (Ed.), Personality down under perspectives from Australia (pp. 103–112). New York: Nova Science Publishers Inc.
- Nedjat, S., Bore, M., Majdzadeh, R., Rashidian, A., Munro, D., Powis, D., et al. (2013). Comparing the cognitive, personality and moral characteristics of high school and graduate medical entrants to the Tehran University of Medical Sciences in Iran. Medical Teacher, 35, e1632–e1637.
- Patterson, F. (2013). Selection for medical education and training: Research, theory and practice. In K. Walsh (Ed.), Oxford Textbook for Medical Education (pp. 385–397). Oxford: Oxford University Press.
- Patterson, F., Aitkenhead, A., Edwards, H., Flaxman, C., Shaw, R., & Rosselli, A. (2015a). Analysis of the situational judgement test for selection to the foundation programme 2015: Technical Report.
- Patterson, F., & Ashworth, V. (2011). Situational judgement tests; the future for medical selection? British Medical Journal. http://careers.bmj.com/careers/advice/view-article.html?id=20005183.
- Patterson, F. P., Kerrin, M., Edwards, H., Ashworth, V., & Baron, H. (2015). Validation of the F1 selection tools. Leeds: Health Education England. www.foundationprogramme.nhs.uk/download.asp?file=Validation_of_the_F1_selection_tools_report_FINAL_for_publication.pdf. Accessed 9 November 2015.
- Powis, D. A., Bore, M. R., Munro, D., & Lumsden, M. A. (2005). Development of the personal qualities assessment as a tool for selecting medical students. Journal of Adult & Continuing Education, 11, 3–14.
- Rothmann, S., & Coetzer, E. P. (2003). The big five personality dimensions and job performance. Journal of Industrial Psychology, 29, 68–74.
- Savin-Baden, M., & Major, C. H. (2013). Qualitative research: The essential guide to theory and practice. Routledge, London. Cited in Cleland, J. A., & Durning, S. J. (2015). Researching medical education. Wiley, London.
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.