Background

Medicine is a highly popular career choice internationally. For example, each year in the UK alone there are over 19,000 applicants to medicine and approximately 42,000 in the USA [1]. Likewise, selection to internship and residency training programmes is very competitive and these high-stakes assessments determine which graduates ultimately work in the various specialities. As attrition rates in medical education are very low and most students graduate, the composition and calibre of the future medical workforce are significantly dependent on the methods used to select medical students [2,3,4]. Hence medical student selection is a topic of considerable public interest with numerous stakeholder groups. These include applicants and potential applicants; selectors such as medical school admissions committees; medical students; the medical profession; school career guidance teachers; and society. Arguably, the most important stakeholders are patients. Best practice in the design, development and continued use of selection methods should be an iterative process informed by regular feedback from stakeholders [5]. The term political validity captures the centrality of stakeholder views and is defined as “the extent to which various stakeholders and stakeholder groups consider the tool(s) to be appropriate and acceptable for use in selection” [6, 7]. Political validity is recognised as an important consideration in widening access to medical schools [8, 9]. Elsewhere it has been argued that political validity is to some extent informed and influenced by evidence for the construct validity of selection tools [10]. According to Kane, construct validity “is a property of the proposed interpretations and uses of the test scores” [11]. Five sources of evidence to support test interpretation are recommended: test content, relationship to other variables, response process, internal structure and consequences of testing [12].
It is beyond the scope of this paper to provide an in-depth definition of these and the reader is directed to the most recent edition of the Standards for Educational and Psychological Testing for further information [13]. However, it is likely that different sources of evidence exert varying degrees of influence on stakeholders’ opinions, and this may differ depending on the stakeholder group in question.

Understanding stakeholder perceptions is important for a number of other reasons. Selection methods that are perceived as unfair may deter potential medical students from applying, which would be considered a profoundly negative consequential effect [9]. Under-representation of lower socio-economic and minority groups in medicine is multifactorial, but arguably these groups are particularly vulnerable to the consequences of negative perceptions regarding selection [14]. Additionally, in some situations, there appears to be a trade-off between stakeholder views and other criteria used to evaluate the appropriateness of selection tools, such as predictive validity and reliability. For example, personal statements, letters of reference and traditional interviews continue to enjoy widespread use, despite evidence of limited predictive validity and susceptibility to bias [15,16,17,18,19]. It has been argued that this can in part be explained by these tools serving some other political agenda for which they achieve stakeholder acceptance and approval [9]. It is crucial, therefore, that stakeholder views are explored, understood and communicated effectively, in order to increase the likelihood that selection tools can be developed that meet with stakeholder approval whilst also satisfying the other important psychometric criteria. Finally, a thorough understanding of the basis for stakeholders’ views will better enable selectors to explain the rationale supporting some selection tools that are perhaps less popular but more psychometrically robust.

Stakeholder views: a theoretical framework

Over the past fifty years, organisational justice theories have been developed to describe perceptions of fairness in organisational processes, including selection [20,21,22]. Patterson et al. and Kelly have established that organisational justice theories are relevant to selection in medicine and that they can be used to provide deeper insights into and appreciation of the views of stakeholders [7, 23].

These justice theories can be categorised as distributive, procedural and interactional (see glossary) [24]. In the context of selection, distributive justice relates to the fairness of selection outcomes, such as medical school places, in terms of equal opportunity and equity [7]. From a distributive justice perspective, selection is viewed as fair when everyone receives the same opportunities [25]. Procedural justice in selection is concerned with the perceived fairness of the selection tool in terms of job relevance and characteristics of the test [7]. From a procedural justice perspective, selection is viewed more positively when the methods are connected with the job and when the purpose of the method is explained [22, 26]. The interactional justice of selection methods refers to how applicants are treated during the selection process and includes the information applicants are given as well as the manner in which it is conveyed [27, 28]. The fairness of the communication is a very influential determinant of how interactional justice is perceived [28, 29].

Despite their significance, to our knowledge, there has been no review that draws together the views of stakeholders when considering the appropriateness of various selection methodologies. Therefore this review is necessary and timely, as important questions remain to be answered. This study aims to (i) systematically review the literature with respect to stakeholder views of selection methods for medical school admissions; (ii) relate the findings to organisational justice theories and (iii) identify priority areas for future research.

Methods

There was no published review protocol.

Search strategy

Data searching and subsequent critical review of identified articles were informed by best evidence medical education guidelines [30,31,32,33]. The search strategy was developed in collaboration with a research librarian (JM). Nine electronic databases were searched: PubMed, EMBASE, SCOPUS, OVID Medline, PsycINFO, Web of Science, ERIC, British Education Index and Australian Education Index. Relevant papers were identified using search terms (including synonyms) for each of the four concepts “stakeholder”, “views”, “selection” and “medical school”. Terms were mapped to MeSH terms or the appropriate term from the controlled thesaurus of the various databases. In addition, text word searches were used for key words. See Additional file 1 for a sample search.

For the purposes of this review “Stakeholders” were defined as those who are affected by or can affect recruitment processes [34]. The search terms for stakeholder were deliberately cast widely to encompass as many stakeholder groups as possible. “View” was defined as an opinion or attitude. “Selection” was taken to mean any admission test or entrance assessment process that a medical school applicant would have to go through in order to be offered a place. “Medical school” was taken to include both graduate and undergraduate schools. Additionally, as there is significant overlap between some methods used for selection to medical school and selection to higher professional training (for example Multiple Mini Interviews (MMIs) and Situational Judgement Tests (SJTs) are increasingly used in both settings) this search was widened to include internship and residency. Within each concept, terms were joined using the Boolean operator “OR”. The four searches were then combined with the operator “AND”. Language or type of publication restrictions were not applied during the searching phase. The reference lists of papers included in the review were hand searched for additional relevant publications. Two experts in the field were contacted for any additional records or unpublished work. Further grey literature searching was facilitated by searching for conference publications and networking with researchers in the field which provided access to unpublished reports, doctoral theses work and abstracts.
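The combination rule described above (terms joined with “OR” within each concept, and the four concept blocks joined with “AND”) can be illustrated with a minimal sketch. The synonym lists below are hypothetical placeholders chosen for illustration only; the terms actually used are those in Additional file 1, and the function name `build_query` is likewise an assumption.

```python
# Minimal sketch of the Boolean search structure described in the text.
# NOTE: the synonym lists are hypothetical examples, not the review's
# actual search terms (see Additional file 1 for those).

def build_query(concepts):
    """Join terms with OR inside each concept, then join concepts with AND."""
    blocks = []
    for terms in concepts.values():
        # Quote multi-word terms so they are treated as phrases.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        blocks.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(blocks)


concepts = {
    "stakeholder": ["stakeholder", "applicant", "interviewer"],
    "views": ["view", "opinion", "attitude"],
    "selection": ["selection", "admission"],
    "medical school": ["medical school", "residency", "internship"],
}

query = build_query(concepts)
print(query)
```

Because the concept blocks are combined with “AND”, a record is retrieved only if it matches at least one term from every one of the four concepts, while widening the synonym list within a concept can only broaden the result set.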

The inclusion criteria were: (a) studies published between January 2000 and July 2014 (this time frame was chosen as many of the selection methods in current use were neither available nor widely used prior to 2000); (b) studies evaluating selection to medical school, or studies evaluating selection to residency and internship programmes which described selection processes relevant to selection to medical school (for example, studies focussing on the residency match rank process were not included); (c) studies which reported the views of at least one stakeholder group established by means of quantitative, qualitative or mixed methods research. The exclusion criteria were: (a) reviews or articles which were not original studies; (b) papers for which an English language translation was not available on contacting the authors. As this was a systematic review which did not involve any original stakeholder data, ethical approval was not required.

Results

Study selection and data extraction

Figure 1 illustrates the steps from initial identification of records to identification of those included and excluded. All records identified in the electronic database search (total n = 2686) and by the additional means described above (n = 26) were transferred to an EndNote database. Duplicates were removed (by automatic deduplication and manual check) and the remaining records were inspected (n = 1017).

Fig. 1 Study Search Strategy and Review Process

Two reviewers (MK and AWM) independently assessed all titles for relevance (n = 1017). Where disagreement arose the record was included for review of abstract. These reviewers also independently screened abstracts of all retained records (n = 233) to identify those to be assessed on full text, with 95.71% agreement. This left a total of 108 records which were read in full by three reviewers (MK, AWM, SO’F) and independently assessed for eligibility to be included in the full review. Disagreement was managed by consensus in consultation with another author (FP). Subsequently 71 records were included for full review and 37 excluded. Figure 1 indicates the reasons for exclusion.
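The screening counts reported above can be checked with simple arithmetic, and the reported inter-reviewer figure corresponds to a plain percent agreement. The sketch below is illustrative only: the number of duplicates removed is derived by subtraction rather than stated in the text, and the reviewer decision lists are synthetic, constructed on the assumption that 223 of the 233 abstract decisions were concordant (the count that reproduces the reported 95.71%).

```python
# Counts reported in the text.
identified_db = 2686       # records from electronic databases
identified_other = 26      # records from additional sources
titles_screened = 1017     # records remaining after deduplication
abstracts_screened = 233   # records retained after title screening
full_text = 108            # records read in full
included = 71
excluded_full_text = 37

total_identified = identified_db + identified_other
# Derived by subtraction; this figure is not stated in the source.
duplicates_removed = total_identified - titles_screened


def percent_agreement(decisions_a, decisions_b):
    """Percentage of items on which two reviewers made the same decision."""
    if len(decisions_a) != len(decisions_b) or not decisions_a:
        raise ValueError("decision lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(decisions_a, decisions_b))
    return 100 * matches / len(decisions_a)


# Synthetic decisions (assumption): reviewer A retains 108 abstracts,
# and reviewer B disagrees on 10 of the 233.
reviewer_a = [True] * full_text + [False] * (abstracts_screened - full_text)
reviewer_b = list(reviewer_a)
for i in range(10):
    reviewer_b[i] = not reviewer_b[i]

agreement = percent_agreement(reviewer_a, reviewer_b)
print(total_identified, duplicates_removed, round(agreement, 2))
```

Percent agreement does not correct for agreement expected by chance; a chance-corrected statistic such as Cohen's kappa would need the full cross-tabulation of decisions, which is not reported here.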

The following data were collected from each eligible record and collated in a data extraction form: author, publication year, type of publication, principal study aim, location and setting, study design, medical selection tool used, stakeholder characteristics (including identification of stakeholder group, sample size, response rate, gender, age, socioeconomic group, background if provided), data collection method and overall findings.

Quality assessment strategy

The quality of quantitative records was assessed using the Medical Education Research Study Quality Instrument (MERSQI), a commonly used validated ten-item checklist for rating the methodological quality of medical education research papers [35, 36]. This instrument has six domains (study design, sampling, type of data, validity of evaluation instrument, data analysis and outcomes). As a quality assurance step, at the outset, a sample of five records was independently scored using the MERSQI instrument by MK and AWM. The scoring was discussed and debated and consensus was reached as to the interpretation of the scoring grid. For consistency, one author (MK) then applied the MERSQI instrument to the retained records. For studies with multiple aims, for example assessing predictive validity and stakeholders’ views, the MERSQI rating was applied to the portion of the study that assessed stakeholder views, as this is the subject of this systematic review. It was not appropriate to use MERSQI to assess qualitative studies, and with respect to mixed methods studies, the score refers to the quantitative strand only.

Quality assessment and evidence synthesis

The MERSQI ratings for the included records ranged from 3 to 10.5, out of a total possible score of 18. The mean MERSQI score was 7.2 and the median 7.5. Total MERSQI scores for all included records are presented in Table 1, and Additional file 2 presents the completed MERSQI scoring matrix for all records. By comparison, a review of over 200 published peer-reviewed medical education papers determined that the mean MERSQI of published papers was 9.95 (range 5–16) [36]. This indicates that the overall quality of the retained records was generally low, reflecting the standard of currently available literature on stakeholder views (see Study Limitations in the Discussion section). MERSQI scores were used to compare quality and were not used for the purpose of excluding records from this review. Due to the heterogeneity of studies, and the wide variety of selection methods and evaluative measures used, it was not possible to pool results statistically. Therefore the evidence is synthesised into a narrative review.

Table 1 Summary of the Research Evidence of Stakeholder Views of Selection to Medicine

Risk of Bias

The included studies ranged from qualitative to quantitative and mixed methods and were predominantly descriptive in design. Therefore, performing a risk-of-bias assessment across studies was not possible, and we focused instead on assessing the quality of the reporting of data and outcomes of the studies using the MERSQI tool.

Study designs

A data display matrix summarising the main research findings and MERSQI scores of the studies included in this review is presented in alphabetical order (see Table 1).

Included records comprised eight qualitative studies [37,38,39,40,41,42,43,44]. Seven were mixed methods studies [45,46,47,48,49,50,51]. The remaining records were quantitative [52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106].

Twelve records were abstracts [37, 54, 68, 73, 84, 88, 91, 93, 95,96,97, 102]. Two were peer-reviewed short research reports [75, 85], one was a PhD thesis [44], one a commissioned report [47] and one a letter describing original research [76]. The remaining records were peer-reviewed original research papers.

Twenty-two records were from studies conducted in the UK, 12 in Canada, 12 in the USA, 6 in Australia, 5 in Ireland, 2 in New Zealand, 2 in Belgium, 1 in Australia/Canada, 1 in USA/Canada, and 1 each in Israel, Pakistan, the Netherlands, Singapore, Thailand and Saudi Arabia.

The sample size ranged from a minimum of 14 to a maximum of 9067 (mean 397, median 91) excluding qualitative studies. Twenty-nine records included the views of more than one stakeholder group, most commonly applicants and assessors.

Synthesis of results

The research largely explored the views of three main stakeholder groups: a) applicants; b) selectors and c) medical students.

The views of applicants

Applicants constituted the most researched stakeholder group (45 records).

Interviews including multiple mini interviews (MMIs)

Applicants’ views of MMIs, both at medical school and residency training levels, have been extensively surveyed internationally, most likely reflecting their novelty within the timeframe of this review. The research was generally of good quality (10 records with a MERSQI score over 8), achieving high response rates (9 records with response rates over 75%) and a reasonable sample size (9 records where n ranged from 69 to 324).

Applicants are on the whole supportive of MMIs. They perceive that MMIs are generally fair, are relatively free of gender or cultural bias and provide adequate opportunity to present their abilities and strengths, and that the quality of advance information and clarity of instructions are good [45, 48, 50, 55, 61, 63, 72, 99]. Applicants indicate a preference for MMIs over traditional interviews [45, 48, 60, 72, 89, 96]. One paper included the views of a small number of unsuccessful applicants, and found that the majority still commented positively on MMIs [50]. Applicants value the perceived independence of interviewers and the authenticity of MMIs [39, 45, 96]. In particular, the multiple opportunities for applicants to demonstrate abilities appear to influence positive reactions [39, 55, 61, 63, 99]. The chance provided by MMIs to “redeem” oneself has been positively noted [39].

Applicants reported some misgivings with respect to MMIs. Some applicants found MMIs more difficult [89] and more stressful [61] than standardised interviews, while others were concerned that MMIs favour highly communicative applicants [39]. When compared to ratings of other aspects of the MMIs, applicant satisfaction with allotted time was slightly lower [39, 55, 60, 63, 99].

Applicants’ views of other interview techniques were also positive, with one small study reporting that the majority of participants (93%, n = 53) believed that any selection process which did not include interviews would be unacceptable [85]. Standardised interviews have been positively received by applicants in one Canadian medical school [68]. Technological advances have made web-based interviewing a possibility and two studies report positive applicant reactions to this approach [57, 98]. Applicants also perceive interviews as an opportunity to glean valuable information about the values and ethos of the school or programme to which they are applying [56, 78, 85]. One study reported favourable levels of applicant satisfaction with group interviews; however, international applicants felt they would struggle to impress interviewers by comparison with local candidates (n = 77, response rate 37.8%, p = 0.004) [91]. Only one paper was identified that reported negative applicant reaction to panel interviews, and in this paper criticisms related mostly to inadequate levels of post-interview feedback [74].

Situational judgement tests

A small number of studies (n = 6) have explored applicant perceptions of Situational Judgement Tests (SJTs). In terms of quality, the MERSQI ratings range from 3 to 10.5, with 4 records scoring between 8 and 10.5; response rates were provided by five records and ranged from 36 to 96%, and sample sizes, where indicated, ranged from 200 to 9067. Two national studies in Belgium found that medical school applicants rated SJTs as having significantly better face validity than aptitude tests [82, 83]. Studies of medical school applicants, foundation year doctors and two studies of applicants to UK general practice training confirmed these positive applicant reactions to the relevance and job relatedness of SJTs [7, 73, 79, 93].

Selection centres

Likewise, a small number of studies reported positive reactions to Selection Centres (SCs). Applicants consistently consider SCs to be fair, appropriate and to offer adequate opportunity to demonstrate skills and abilities [66, 94, 97, 102]. SCs rate very positively in terms of job relevance overall, with simulated patient stations being viewed most positively [7, 79]. SCs are not often used for medical student selection; however, two examples were located (Singapore and Israel) and both studies report high levels of applicant acceptability [97, 106].

Aptitude tests

Applicants’ acceptance of aptitude tests was somewhat less positive. Under-represented and minority applicants view the Medical College Admission Test (MCAT) as a barrier to their chances of admission [70, 92]. Medical school applicants considered the UK Clinical Aptitude Test (UKCAT) difficult and were generally unconvinced of its relevance [81, 100, 101]. Conversely, in one study over half of respondents (55%, n = 787) thought that the test was fair [81].

Other selection methods

Only one paper was identified that explored medical school applicants’ views of the biographical essay [43]. Applicants described approaching the essays as a way to “show themselves” and “tell their own story” in a subjective way which they felt was missing from other parts of the admission process.

No article in the timeframe 2000–2014 specifically evaluated applicants’ approval of the use of academic records, perhaps reflecting the long-established practice and evidence supporting their use. Likewise, we did not identify any records of applicants’ views of personality assessment or references.

In summary, applicants’ views of specific selection methods have been widely surveyed, with the preponderance of evidence relating to applicants’ opinions of newly introduced tools. Applicants appear to be consistently supportive of interviews and MMIs in particular. There is reasonable emerging evidence that both SJTs and SCs are also well regarded. Conversely aptitude tests were not as well supported by applicants. There were significant gaps with respect to applicant views of other selection tools.

The views of selectors: interviewers, faculty and admissions committee members

Thirty-seven records included the views of selectors (mean and median MERSQI scores of these records = 7). Selectors comprised interviewers, faculty and admission committees constituting persons from a wide variety of backgrounds, both clinical and non-clinical, who share a responsibility for particular aspects of medical selection. These individuals variously appraise written applications, letters of reference and personal statements; serve on interview panels or assess MMI stations; or develop and assess performance on SJTs and SCs. In this fashion they contribute to either shortlisting applicants or a final selection decision. Fairness, validity and comprehensiveness are viewed as crucial aspects of the selection process [49]. A strong sense of social accountability motivates community members and lay persons to become involved in the selection process [41].

Interviews including MMIs

Interviews are considered a stalwart by selectors to medical schools [87]. Similarly, in a large study examining stakeholders’ views of selection methods to Australian medical schools, interviews were viewed as the most valid selection method overall [54].

A large number of studies have evaluated selectors’ opinions with respect to MMIs. Interviewers ranked MMIs highly in terms of perceptions of fairness [45, 48, 50, 61, 72, 99]. Importantly, interviewers felt that MMIs allowed them to accurately evaluate applicants and that the scoring mechanisms allowed them to adequately differentiate between candidates [45, 48, 55, 60, 63, 77, 99]. The multiple assessment opportunities afforded to candidates and the multidimensional assessor view meant that interviewers felt much less anxious about their own decision making [39]. Razack et al. report that interviewers found MMIs appropriate for use with home and international applicants [50]. There is some evidence that interviewers may favour MMIs over traditional interviews [45, 72].

Interviewers’ concerns regarding MMIs include: a fear that they might primarily measure communication skills [39]; a concern that issues including applicants’ culture, personality or language may negatively impact on performance [38, 50]; the lack of opportunity for interviewers to benchmark the scores they assign against their peers [39]; insufficient time for calibration [48]; the requirement for additional training [61]; and a perception that MMIs can be a somewhat impersonal process [84].

Situational judgement tests

This review did not identify any records reporting selectors’ views of SJTs.

Selection centres

Emerging evidence suggests that selectors, in both medical school and postgraduate residency settings, are supportive of SCs. Overall, assessors rate SCs highly for relevance, fairness, opportunity for candidates to demonstrate their ability and appropriateness to selection [66, 86, 106]. When compared to stations comprising structured interviews, portfolio review and a presentation station, simulated stations were rated significantly higher (p < 0.001) with respect to relevance to selection, opportunity to demonstrate ability and appropriateness to selection [66]. Negative findings were few but included complaints about the inflexibility of the structured approach.

Academic records and aptitude tests

The Medical College Admission Test (MCAT) and undergraduate grade point average are widely considered by Admissions Deans in North America as the two most important selection methods in deciding whom to call to interview for a medical school place [87]. However, when deciding on offers of places, interview and letters of recommendation were more influential. Cognitive ability tests and academic record were viewed by selectors, in one study, as a significant barrier to under-represented and minority applicants [53].

Other selection methods

Letters of reference were viewed as helpful when they were factual, descriptive and cited examples of specific behaviours [75]. In the case of postgraduate selection, they were considered more valuable when they were written by a clinician known to the selector [76]. Perceived shortcomings of letters of reference include difficulty in ascertaining the true strength of recommendation, leading to guesswork and reading “between the lines”, not least because of a reluctance on the part of the writer to give an honest account of candidates’ weaknesses [42, 76].

One study found that personal statements and a description of work experience were also deemed useful by selectors in terms of revealing an applicant’s depth of understanding of a medical career but considered highly subjective [42]. No records were found describing selectors’ views of personality measures.

In summary, there is reasonable evidence that selectors endorse the use of interviews in general and in particular MMIs, judging this latter tool to be fair, relevant and appropriate for selection, with emerging evidence for similarly positive reactions to SCs. Aptitude tests and academic record were viewed as most useful in the decision of who to call to interview, however they are sometimes viewed as lacking validity and acting as barriers to certain groups of applicants. The usefulness of letters of reference seems mostly to be for ruling applicants out rather than in.

The views of medical students

Twelve studies were identified that explored the views of medical students, distinct from those where students were directly involved in the selection process as in the group above.

Interviews including multiple mini interviews

Two records suggested that students prefer interviews to cognitive testing [51, 77]. International students are even more likely to support interviews (p < 0.01) [51]. Students appreciate the same aspects of MMIs as applicants do, describing them as relevant and suitable for use in selection [77]. One small study examined the views of students admitted through a widening access route on the role of interviews for selection [95]. Interestingly, students in their early clinical years supported traditional interviews while students in the senior years felt that MMIs were more appropriate. Elsewhere, mature students highlighted the importance of interviews to their sense of identity and fit with prospective medical schools [40].

Aptitude tests

Medical students have mixed to poor reactions to aptitude tests for selection. A good quality mixed methods study (MERSQI rating 10.8) of first year medical students in five Scottish medical schools revealed that, overall, the UK Clinical Aptitude Test (UKCAT) was poorly viewed [46]. Focus group interviews showed that students felt it lacked face validity, had poor predictive validity, was coachable and potentially discriminated against less affluent applicants, and that there was a lack of certainty about how the test was applied by medical schools. Similarly, in a survey of two medical schools in New Zealand (n = 1325, response rate 65%) the majority of students were unconvinced of the importance of the Undergraduate Medical and Health Professions Admission Test (UMAT), with over two thirds believing it was not fair [59]. This contrasts with findings evaluating a similar selection tool, the Health Professions Admission Test (HPAT)-Ireland, which had a much more positive student reaction; in one study 76% of medical students thought it was fair and 70% felt the questions were well designed and relevant [51]. Elsewhere, however, when it was compared to MMIs, only 38% found HPAT-Ireland relevant [77]. One of the objections medical students have to cognitive aptitude tests is their perceived susceptibility to coaching. Stevens et al. reported that the vast majority (79%) of those who had accessed commercial coaching (for HPAT-Ireland) felt it improved their performance [51]. Elsewhere, students who had undertaken a commercial course in preparation for UMAT reported higher confidence levels and expected to do well, despite the evidence that coaching does not lead to significant differences in overall performance [105].

Other selection methods

Kumwenda et al. report that two thirds of medical students suspect peers “stretch the truth” in their personal statement as part of their written application to medical school and over 13% believe that, although dishonest, this is a necessary part of the medical school admission “game” [80]. Being from a medical family was seen as a significant advantage in gaining access to relevant work experience for inclusion in the personal statement [44]. Mature medical students indicated that they perceive the written application form to be inflexible and that there was a lack of transparency about what would constitute a good mature application [40]. No records were found exploring medical students’ views of personality assessments.

In summary, medical students appear to prefer interview-based selection methods to cognitive aptitude tests, highlighting perceived relevance as an important influencing factor. By contrast, they view the latter as being less relevant, prone to bias and susceptible to coaching. They are also unconvinced about the transparency of written applications, where they believe exaggeration is both common practice and a necessary part of the selection game, with mature students perceiving them as inflexible.

The views of other stakeholders

We identified very few studies which sought to explore the views of other stakeholders i.e. those who are not applicants, selectors or medical students. Four such studies were identified.

One Australian study which included applicants, medical students, patients and doctors (n = 938) evaluated the face validity of tools used for selection to Australian medical schools and noted that medical professionals had lower confidence in the tools used than others surveyed [54]. Aptitude tests were viewed as the least valid selection method.

Three related studies were conducted following the introduction of substantial changes to national selection to medical school in Ireland, which included the introduction of an aptitude test. In a national survey of career guidance counsellors, over half supported the introduction of HPAT-Ireland [90]. Elsewhere, Dennehy et al. surveyed Irish General Practitioners who were not directly involved in selection and report that the majority (97%) strongly support academic record as a selection tool while 70% supported the use of aptitude tests [58]. Kelly et al. qualitatively explored the views of doctors from a variety of clinical backgrounds on the same test [38]. On the whole, they considered the test to have a moderately good degree of job-relatedness. However, a non-verbal reasoning section was criticised by all participants for lacking clinical relevance.

Discussion

This review and synthesis of the evidence identifies a growing body of research into the views of stakeholders. It identified that the research largely explores the views of three main stakeholder groups: a) applicants; b) selectors and c) medical students. The emerging evidence suggests a reasonably high level of concordance of views between these stakeholder groups. Applicants support interviews, and multiple mini interviews (MMIs) in particular. There is emerging evidence that situational judgement tests (SJTs) and selection centres (SCs) are also well regarded by applicants, but aptitude tests less so. Selectors endorse the use of interviews in general and MMIs in particular, judging them to be fair, relevant and appropriate, with emerging evidence of similarly positive reactions to SCs. Aptitude tests and academic records were valued in decisions of whom to call to interview. Medical students prefer interview-based selection to cognitive aptitude tests. They are unconvinced about the transparency and veracity of written applications.

The findings of this review resonate with the constructs of organisational justice theories, in particular with both procedural and distributive justice. On the whole, stakeholders are supportive of interviews (in particular MMIs), SCs and SJTs in selection. Procedural justice is one of the most influential determinants of perceived fairness of selection tools and it can be argued that these methods are acceptable to stakeholders because they are viewed as procedurally just. Prior research has shown that the extent to which a selection tool is viewed as job related exerts the greatest influence on perceptions of procedural justice [9, 21]. This review establishes that MMIs are considered by applicants, selectors and students to be highly authentic, with immediate relevance to clinical practice. SCs and SJTs represent high to medium fidelity assessments and the job relatedness of these methods is similarly highly rated by applicants and selectors.

Another aspect of procedural justice is the concept of “voice” [20,21]. “Voice” describes adequate opportunity for the applicant to perform and to make a case for themselves, as well as sufficient time to do so [107]. The fact that applicants and selectors view MMIs, SCs and SJTs as providing adequate opportunity for candidates to demonstrate their ability, and as allowing for differentiation between candidates, is likely to be a key factor in acceptability. In addition, MMIs, SJTs and SCs involve selectors directly in selection judgements and by extension provide them with an opportunity for voice in selection decisions. By contrast, selectors are somewhat removed from decisions made by aptitude tests, which may contribute to relatively poorer ratings of this tool.

Aptitude tests generally receive mixed stakeholder acceptability. Underrepresented and minority medical school applicants view them as barriers, while other applicants and medical students question their fairness, face validity and predictive validity. The job relatedness of some of the item formats is questioned, in particular abstract reasoning test items such as non-verbal reasoning questions. This reflects the experience outside of medicine with similar tools, and it is recommended that one way to incorporate procedural justice into the design of cognitive tests is to use comparatively concrete item types [108].

Of concern to students and selectors is the perception that aptitude tests may be susceptible to coaching, and the associated fear that this may lead to economic bias. Concerns regarding the possibility that commercial coaching could lead to unfair advantage represent a breach of the distributive justice principle of equal opportunity. Research has shown that justice rules can be more influential, and weigh more heavily on overall estimation of acceptability, when they are violated rather than when they are satisfied [21, 24]. In practical terms, this could mean that even where a selection tool is considered to perform well across a number of other justice domains, it may still prove unacceptable to stakeholders if it is perceived to fall short in one regard.

There were very few records that explored stakeholder views of other methods such as letters of recommendation, essays and personal statements, but those that did expressed some reservation about the veracity of content. The predictive validity of these methods is also known to be limited, and this, coupled with poor stakeholder acceptability, challenges their role in the selection process [6, 109].

Study limitations

One of the biggest limitations of this review is that the overall quality of the evidence was low based on the average MERSQI score. The low MERSQI scores are principally due to the majority of studies being conducted in single institutions, with single groups of stakeholders surveyed once, often immediately after exposure to one selection tool in a new or pilot setting, with limited evidence for the validity of the evaluation instrument. Furthermore, due to the heterogeneity of study designs, a formal assessment of risk of bias that may affect the cumulative evidence (e.g. attrition bias, reporting bias, publication bias) was not performed. For both of these reasons, readers are advised to note that the quality of evidence in this review is relatively low and there may be potential for bias within and across studies.

A second major limitation of this review relates to the heterogeneity of the selection methods themselves. For instance, not all MMIs are the same; in fact, no two MMIs are the same [110]. Similarly, there are important differences between aptitude tests, such as the degree to which they are designed to measure crystallised versus fluid intelligence [6]. A systematic review such as this one, which seeks to summarise and generalise overall stakeholders’ views, will inevitably mask these important contextual differences.

This review emphasises gaps and shortcomings in the research evidence of stakeholder views of selection to medicine. For example, no studies exploring participants’ views of personality assessments were identified during the time frame of this review, and there was only scant exploration of academic record, which confirms that there are gaps in our understanding of stakeholder views. However, the authors acknowledge that the time frame of this review may have excluded research on some of the longer established methods, such as academic record.

With respect to methodology, this review revealed a predominance of quantitative research. Qualitative research, on the other hand, is ideally suited to understanding the meaning of selection for the respective stakeholder groups and can greatly add to our understanding of the views and attitudes of stakeholders. The use of theoretical models, to conceptualise and interpret stakeholder views, was rare in this review, but again can help us to better appreciate and compare the nuances of stakeholder acceptability.

Within the quantitative paradigm, the use of standardised methods would better facilitate higher standards of reporting of the content and internal validity of the evaluation instrument, and would accommodate comparisons between different stakeholder evaluations. Transparently transmitting this information in an interpretable manner to stakeholders should assume more importance. Equally, there is a need for better prioritisation of stakeholder views as a legitimate aim for selection research. Future research should consider this, given the centrality of stakeholder views to the political validity of selection methods and the potential for negative perceptions to deter already marginalised applicants and negatively influence their opinion of the medical profession.

In addition, future research should aim to follow up the views of unsuccessful applicants and to seek the views of a wider pool of stakeholders. For example, no studies exploring the views of medical school applicants’ parents were found, yet they are likely to be substantially invested in the application process. Also, there was limited research on the views of patients, the general public, or members of the medical profession outside of those directly involved in the admission process or clinical teaching. Similarly, only one study of career guidance officers was identified, yet this group has been noted to be potentially very influential on applicants’ preparation for medical school admission [44, 81, 90]. Finally, while there have been many studies of stakeholders’ views, for the most part each group is treated as if it is homogeneous. Future research should be mindful of these issues and seek to sensitively explore views in a manner that accommodates both differences and similarities within stakeholder groups.

Conclusions

Stakeholders in medical student selection are a collection of diverse groups with potentially differing views. It is critical to the operation of fair and defensible selection processes that we understand and appreciate the range and depth of views that they hold. It is incumbent upon all involved in the selection process to ensure that accurate information is available to all stakeholders and that there is clarity regarding the objectives and purpose of each selection method used to allocate a place in medicine. This review demonstrates that there is important work being done in this field, especially in respect to applicants. However, it highlights the need for better standards and more appropriate methodologies, and for broadening the scope of the stakeholder groups included in future research. Finally, we hope this review reinforces recognition that stakeholders, even from the same group, are not necessarily homogeneous. Their perceptions are significantly influenced by a range of cultural and environmental factors as well as information disseminated by those responsible for selection.