Background

An increasing number of International Medical Graduates (IMGs), defined as physicians who graduated from medical schools outside the countries where they intend to practice, are migrating to economically advanced countries. IMGs are a vital part of the international physician workforce, and many countries depend heavily on them. In the UK in 2018, around 33% of registered doctors had graduated outside the UK [1]. IMGs represent 24% of Canadian physicians [2] and represented 25% of physicians in the United States in 2010 [3]; in Australia, only 47% of medical practitioners were born in the country [4]. Employing IMGs yields substantial cost savings for the accepting country. In 2013, Australia had saved approximately US$1.7 billion in medical education costs through the arrival of foreign-born medical practitioners over the preceding five years [4].

Competition among IMGs for postgraduate training positions is fierce in Western countries, and lawsuits initiated by unsuccessful IMGs have challenged the system, creating a need to legally defend selection decisions [5]. This raises the need for an evidence-based, defensible, and transparent system for selecting and recruiting IMGs to postgraduate training positions. Moreover, postgraduate training of an IMG in their new country is resource intensive and expensive (£485,380 per general practitioner trainee in the UK) [1]. Furthermore, there is evidence from several countries, including the UK [6], Australia [7,8,9,10,11] and Canada [12, 13], that certain IMGs are more likely to face complaints, suspensions, censure, and lawsuits. Therefore, it is crucial to identify predictors of success in training, certification exams, and professional practice, as well as the risk of disciplinary actions, during the selection process [14,15,16,17,18,19,20,21]. The lead author (IS) of this review conducted two observational studies [14, 15] to investigate the determinants of success among IMGs in Canada. This exposed the need for an international systematic review to comprehensively understand the factors associated with success.

The main objective of this systematic review is to investigate the predictors of success and failure of IMGs in postgraduate training or practice in their new country.

Methods

We followed the Synthesis without meta-analysis (SWiM) in systematic reviews reporting guideline [22] and registered our review (PROSPERO Identifier: CRD42021252678). This systematic review did not require ethics approval.

Data sources and searches

An academic librarian (NW) developed and implemented the search strategy by using controlled vocabulary and keywords representing the concepts [International medical graduates], [success and failure] and [predictors and risk factors] on Medline (OvidSP)[1946-present], Cochrane Central Register of Controlled Trials (Cochrane Library, Wiley)[Issue 1 of 12, January 2022], BIOSIS Citation Index (Web of Science)[1969-present], CINAHL (EBSCOHost), Embase (OvidSP), ERIC (EBSCOHost), Global Health (OvidSP), LILACS (https://pesquisa.bvsalud.org/portal/), Science Citation Index (Web of Science)[1900-present], PsycINFO (OvidSP)[1806-present] and Scielo (https://scielo.org/en) from inception to February 28, 2022, without any language or date restriction (eTable 1). We reviewed reference lists of eligible studies and related reviews for additional potentially eligible articles. We have set up an automatic alert for our search strategies to inform us of recent publications.

Eligibility criteria and study selection

We included prospective and retrospective observational studies in any language that: (1) enrolled IMGs, defined as physicians working or in postgraduate training in a country other than their country of training, and (2) investigated predictors of success or failure in IMGs during training or practice using adjusted analyses (any type of regression, ANCOVA, or MANCOVA) to demonstrate the association between predictors and outcomes.

Pairs of reviewers independently screened titles and abstracts identified through our literature searches for relevance to the research question. Before the formal screening process, we performed multiple rounds of pilot screening to achieve agreement. Each round used 50 titles and abstracts and 10 full texts. Pairs from the same set of reviewers independently assessed the full texts of all potentially eligible articles based on the predetermined selection criteria. All conflicts were resolved through discussion to reach consensus, and if needed, a senior reviewer (IS) was involved. We used the online Covidence systematic review software (Veritas Health Innovation, Melbourne, Australia; www.covidence.org) to facilitate literature screening.

Outcomes

After data extraction, we decided to enhance our reporting by grouping outcomes into six categories, to provide a clearer and more organized presentation of the results: (1) success in qualifying exams, e.g., the Medical Council of Canada Qualifying Examination (MCCQE), the Educational Commission for Foreign Medical Graduates (ECFMG) clinical skills assessment and integrated clinical encounter exams, and the parts of the United States Medical Licensing Examination (USMLE) in the USA; (2) success in matching for residency; (3) success in certification exams for getting licensed and practicing medicine, e.g., the certification examinations of the College of Family Physicians of Canada (CFPC) and the Royal College of Physicians and Surgeons of Canada (RCPSC), and American Board of Family Medicine (ABFM) certification; (4) retention of IMGs to practice in the new country where the IMG had completed additional postgraduate training; (5) being disciplined or receiving complaints; and (6) clinical outcomes of patients managed by IMGs, e.g., mortality of patients treated by IMGs vs. local medical graduates, or quality of practice issues identified as per best practice guidelines (e.g., prescription considerations such as opioid prescription).

We defined success as a binary outcome, where a candidate passes a qualifying or certification exam, matches for residency, or continues to practice in the new country. Conversely, failure was defined as not meeting any of these criteria. Similarly, achieving a higher score on a scale was considered a success for continuous outcomes. Additionally, we classified lawsuits, complaints, suspensions, and censure as types of failure. Regarding clinical outcomes, higher mortality or increased rates of opioid prescriptions were also classified as failures.

Data extraction

We used a modified checklist for critical appraisal and data extraction for systematic reviews of prediction modeling studies (CHARMS-PF) for predictors [23]. The CHARMS checklist provides detailed guidance on the key items across 11 domains grouped into nine main categories: (1) study design, (2) study population features, (3) outcome measurement methods and their validity, (4) predictors and adjusted measures of association with outcomes, (5) sample size, (6) missing data, (7) analysis, (8) results, and (9) interpretation and discussion (eTable 2).

Pairs of reviewers extracted data independently. Reviewers resolved disagreements by discussion or by consultation with an adjudicator when required. When a study reported more than one regression model, we used the model with the largest number of predictors.
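
The model-selection rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the class name, function name, and example models are hypothetical, and the tie-breaking behavior (the first model listed wins) is an assumption, since the rule as stated does not address ties.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class RegressionModel:
    """A regression model reported by a primary study (hypothetical structure)."""
    label: str
    predictors: Tuple[str, ...]


def select_model(models: Sequence[RegressionModel]) -> RegressionModel:
    """Return the model with the largest number of predictors.

    Assumption: when several models tie, the first one listed is kept,
    because max() returns the first maximal element it encounters.
    """
    if not models:
        raise ValueError("at least one model is required")
    return max(models, key=lambda m: len(m.predictors))


# Hypothetical example: a study reporting two adjusted models
reported = [
    RegressionModel("model A", ("age", "sex")),
    RegressionModel("model B", ("age", "sex", "graduation recency")),
]
print(select_model(reported).label)
```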

Risk of bias assessment

The same pairs of reviewers assessed the risk of bias (RoB) independently and in duplicate using the QUIPS-PF (Quality in prognostic factor studies) tool. We used the following criteria to assess the risk of bias among observational studies (eFigure 1): (1) representativeness of the study population; (2) proportion of missing data (≥ 20% was considered high risk of bias); (3) predictor measurement; (4) validity of outcome assessment; (5) statistical analysis and reporting; and (6) whether predictive models were adequately adjusted, including, at minimum, age and sex [24].

Data synthesis

Due to the diverse methods used in assessing success and failure, conducting a meta-analysis was not feasible. Instead, we organized studies by outcome and, when available, reported the baseline probability for each outcome and a measure of association [relative risk (RR), odds ratio (OR), or hazard ratio (HR)], along with the corresponding 95% confidence interval and the absolute probability change for each predictor.
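
To illustrate how a relative measure relates to an absolute probability change, the following minimal Python sketch converts an odds ratio and a baseline probability into the implied absolute change, using the standard odds-to-probability conversion. The function name and the example values (an OR of 2.0 at an assumed baseline of 0.50) are hypothetical and chosen only for illustration; they are not taken from the included studies or from the review's own calculations.

```python
def absolute_change_from_or(odds_ratio: float, baseline_prob: float) -> float:
    """Absolute probability change implied by an odds ratio at a given baseline."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must be strictly between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    # Convert the baseline probability to odds, apply the odds ratio,
    # then convert the resulting odds back to a probability.
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    exposed_odds = odds_ratio * baseline_odds
    exposed_prob = exposed_odds / (1.0 + exposed_odds)
    return exposed_prob - baseline_prob


# Hypothetical example values, not data from the review
change = absolute_change_from_or(2.0, 0.50)
print(f"Absolute probability change: {change:+.3f}")
```

The same odds ratio implies a different absolute change at a different baseline probability, which is why the baseline is reported alongside the relative measure.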

Results

Study selection

We identified 1,955 unique records, of which a total of 25 studies met our eligibility criteria (Fig. 1). We received one citation [25] from the automatic alert system, designed to notify us of newly published references. We included three studies [12, 26, 27] that reported predictors of success and failure; however, they did not use adjusted analysis. Two eligible studies [14, 15] had population overlap, and we reported the results for predictors supported with larger populations.

Fig. 1 Flow diagram of study selection

Study characteristics

We included 23 retrospective observational studies, one prospective cohort study [28], and one case-control study [29], for a total of 25 included studies. Seven studies were from Canada [12, 14, 15, 27, 30,31,32], eight from the USA [25, 26, 29, 33,34,35,36,37], seven from the United Kingdom [1, 38,39,40,41,42,43], and one each from Australia [7], Sweden [28] and Finland [44]. Four were single-institution studies. Wherever funding was acknowledged, it was from an official educational grant or government grant (Table 1).

Table 1 Baseline characteristics of included studies

Risk of bias

Eleven of twenty-five (44%) studies were rated as having a low risk of bias, two (8%) as having a moderate risk of bias, and twelve (48%) as having a high risk of bias. Five studies did not enroll a representative study population, six studies reported high losses to follow-up, seven studies did not measure predictors, five studies did not use valid tools to measure outcomes, and the regression models in nine studies did not adjust for at least one of age and sex or did not adjust for all predictors (eTable 3).

Predictors and outcomes

Twenty-five studies (375,549 participants) reported the association of 93 independent variables with six outcome groups in IMGs. To optimize the reporting and interpretation of the results of this systematic review, we identified six groups of outcomes based on their similarity.

Success in qualifying exams

We identified two studies [30, 34] that explored predictors of success in qualifying exams in the USA and Canada. One study [34] showed that female candidates were more likely to pass the Integrated Clinical Encounter (ICE) [OR: 2.64 (95%CI: 2.31, 3.03)] and Doctor-Patient Communication (COM) components of the clinical skills assessment (CSA) [OR: 2 (95%CI: 1.71, 2.39)] for ECFMG certification.

Mathews (2017) [30] demonstrated that sex was not associated with success in the MCCQE2 in Canada (Table 2, eFigure 2). Van Zanten (2003) [34] showed that candidates with higher TOEFL scores were more likely to pass the ICE and COM. Additionally, native English-speaking candidates were much more likely to pass the COM [OR: 6.85 (95%CI: 3.81, 12.29)].

Table 2 Predictors of success and failure in certification exams

Van Zanten (2003) [34] showed that candidates with higher scores on the USMLE step 2 were more likely to pass the CSA. Mathews (2017) [30] showed that IMGs who participated in a skills assessment program were much more likely to pass the MCCQE2, although the estimate was highly imprecise [OR: 9.60 (95%CI: 1.29, 71.63)] (Table 2).
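
The width of the interval around the MCCQE2 estimate above can be quantified on the log-odds scale. The sketch below back-calculates an approximate standard error of the log odds ratio from the reported confidence limits. It assumes the interval is a Wald-type 95% interval that is symmetric on the log scale (a common convention, but an assumption here, since the primary study's method is not described in this review), and the function name is hypothetical.

```python
import math


def log_or_standard_error(lower: float, upper: float, z: float = 1.96) -> float:
    """Approximate SE of log(OR) from confidence limits.

    Assumes a Wald interval symmetric on the log scale:
    log(upper) - log(lower) = 2 * z * SE.
    """
    if not 0.0 < lower < upper:
        raise ValueError("limits must satisfy 0 < lower < upper")
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


# Confidence limits reported for skills assessment program participation [30]
se_mccqe2 = log_or_standard_error(1.29, 71.63)
# Confidence limits reported for graduation within 5 years and the COM exam [34]
se_com = log_or_standard_error(1.32, 1.81)
print(f"SE of log(OR), MCCQE2 estimate: {se_mccqe2:.2f}")
print(f"SE of log(OR), COM estimate: {se_com:.2f}")
```

A larger standard error on the log scale corresponds to a wider interval and therefore a less precise estimate of the association.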

Regarding graduation recency, candidates who graduated ≤ 5 years ago had more success on the COM exam in the USA [OR: 1.54 (95%CI: 1.32, 1.81)] [34]. Conversely, candidates who graduated ≥ 6 years ago were more likely to pass the MCCQE2 in Canada [OR: 3.45 (95%CI: 1.52, 7.69)] [30].

Matching into residency (postgraduate training position)

We identified one study [29] that investigated predictors of IMGs matching into an ophthalmology residency in the USA. Having three letters of recommendation from US ophthalmologists [OR: 6.2 (95%CI: 2.54, 15.16)], a USMLE step 1 score ≥ 236 [OR: 3.22 (95%CI: 1.38, 7.49)], having received an academic award [OR: 1.12 (95%CI: 1.03, 1.22)], having high-impact journal publications [OR: 2.99 (95%CI: 1.51, 5.72)], and having US research experience [OR: 2.95 (95%CI: 1.31, 6.67)] were associated with successful matching. Furthermore, completing ≥ 3 years of postgraduate clinical training in the USA, including a surgical internship, was associated with lower odds of matching into an ophthalmology residency [OR: 0.26 (95%CI: 0.12, 0.58)] [29] (eTable 3).

Success/failure in certification exams

Nine studies reported predictors of success/failure in certification exams.

Age

Five studies reported conflicting evidence on the association between age and success in certification exams. One study demonstrated that younger age was associated with greater success in the College of Family Physicians of Canada [OR = 1.76 (1.32, 2.33)] and the Royal College of Physicians and Surgeons of Canada certification exams [OR = 1.54 (1.08, 2.18)] in Canada [15], and a similar study indicated that younger candidates were more successful in the licensing exam in Sweden [28]. However, another study showed that older age was associated with more success in the Membership of the Royal College of Paediatrics and Child Health part 1B examination [OR = 0.71 (0.53, 0.97)] in the UK [41] (Table 3) (eFigure 3). The remaining two studies [39, 44] showed a statistically non-significant association between age and success in certification exams.

Table 3 Predictors of success and failure in qualifying exams

Female gender

Six studies [15, 30, 39, 41, 42, 44] explored the associations between female gender and success in certification exams, with five studies [15, 30, 39, 41, 44] indicating that female candidates were more likely to succeed in various certification exams [1] (Table 3)(Fig. 2).

Fig. 2 Association of female gender and success in certificate exams

English fluency

Two studies showed that English fluency was associated with more success in both components of the College of Family Physicians examination, the Royal College of Physicians and Surgeons certification examination in Canada and the Clinical Skills Assessment (CSA) component of the Royal College of General Practitioners Membership in the UK [1, 15] (eTable 5).

Race and ethnicity

We identified two studies [39, 43] with conflicting results regarding the relationship between ethnicity and success in certification exams. Bessant (2006) [43] reported that White graduates were more likely to pass the practical assessment of clinical examination skills (PACES) examination of the MRCP in the UK [OR: 2.04 (95%CI: 1.42, 2.94)], while Tiffin (2014) [39] demonstrated that being White was not associated with Annual Review of Competence Progression outcomes in the UK (eTable 5).

Country of graduation

Our search identified three studies that explored the association between the place of graduation and success in certification exams. Two studies showed that graduates from the UK vs. IMGs were more likely to pass the Membership of the Royal College of Paediatrics and Child Health (MRCPCH) examination [OR: 3.17 (95%CI: 2.41, 4.17)] and the PACES (practical assessment of clinical examination skills) examination of the MRCP (Membership of the Royal Colleges of Physicians of the United Kingdom) [42] [OR: 4.87 (95%CI: 3.86, 5.72)]. We also identified a study showing that graduates from European vs. non-European universities were much more likely to pass the Clinical Skills Assessment in the 3rd year of residency in the UK [41] [OR: 21.3 (95%CI: 5.6, 91.3)] (Table 3).

Previous experience

In terms of previous experience, Schabort (2014) demonstrated that candidates with a prior internship were more likely to pass the Royal College of Physicians and Surgeons of Canada (RCPSC) examination on their first attempt [14] [OR:4.09 (95%CI: 1.24, 13.5)] (Table 3).

Result of qualifying exams

To explore the association between qualification exams and success in licensing exams in the USA, we identified one study showing that lower USMLE Step 2 CK scores [OR: 0.99 (95%CI: 0.98, 0.99)] and lower in-training examination scores through years 1–3 of residency [OR: 0.99 (95%CI: 0.99, 0.99)] were marginally associated with failing to obtain ABFM certification. Conversely, higher scores on the spoken English proficiency component of the USMLE Step 2 were associated with failing ABFM certification [OR: 1.04 (95%CI: 1.02, 1.06)] [33] (Table 3).

Regarding the association between qualification exams and success in certification exams in the UK, one study revealed that higher scores in the clinical problem-solving test [linear regression coefficient: 0.05 (95%CI: 0.04, 0.07)] and the Situational Judgment Test [linear regression coefficient: 0.07 (95%CI: 0.05, 0.09)] were associated with higher scores in the clinical skills assessment component of the Membership of the Royal College of General Practitioners (MRCGP) exam [1]. Bessant (2006) also showed that candidates who passed the part 2 written test on the first attempt were more likely to pass the PACES (Practical Assessment of Clinical Examination Skills) component of the MRCP in the UK [OR: 3.64 (95%CI: 2.31, 5.73)] [43] (Table 3).

Predictors of IMG retention for practice and academic career pursuits in a new country

We identified a study showing that recent graduates who received their MD degree ≤ 5 years ago were more likely to work in Canada within two years after postgraduate training [OR: 1.36 (95%CI: 1.03, 1.79)], and IMGs who were eligible for full licensure were more likely to work in Canada [OR: 3.72 (95%CI: 2.30, 5.99)] [30]. Furthermore, residency vs. fellowship candidates were more likely to work in Canada [OR: 2.63 (95%CI: 1.59, 4.35)] [31] (eTable 5).

One study showed that when comparing family medicine and specialty IMGs, family medicine candidates were more likely to work in rural communities [OR: 2.32 (95%CI: 1.33, 4.17)], and male IMGs were more likely to work in rural communities [OR: 1.77 (95%CI: 1.16, 2.70)] [30].

One study showed that completing a post-residency clinical fellowship was associated with following an academic career amongst IMGs [OR: 1.73 (95%CI: 1.01, 2.96)] [37] (eTable 5).

Being disciplined and receiving complaints from the medical board

One study demonstrated that males [HR: 2.73 (95%CI: 1.90, 3.93)], IMGs with higher International English Language Testing System (IELTS) speaking scores [HR: 1.39 (95%CI: 1.13, 1.72)], and IMGs who needed ≥ 4 attempts at part 1 of the Professional and Linguistic Assessments Board test (PLAB 1) [HR: 2.30 (95%CI: 1.26, 3.59)] or ≥ 3 attempts at PLAB 2 [HR: 2.45 (95%CI: 1.44, 4.18)] were more likely to be censured by the General Medical Council (GMC) in the UK [38]. Furthermore, one study conducted in Australia showed that being an IMG was associated with attracting complaints [OR: 1.24 (95%CI: 1.13, 1.36)] and being adversely disciplined [OR: 1.41 (95%CI: 1.07, 1.85)] [7] (eTable 6).

We identified a study with unadjusted results showing that IMGs had a higher risk of being disciplined than North American medical graduates [12] [OR = 1.58 (95%CI: 1.38, 1.82)] (eTable 8).

Clinical outcomes of patients managed by IMGs

One study showed that mortality rates in patients were lower when treated by non-US trained IMGs vs. US medical graduates or US citizens trained abroad [OR: 0.91 (95%CI: 0.86, 0.96)] [36]. Each additional point on the physician’s USMLE Step 2 CK examination score was associated with a lower likelihood of mortality in patients treated by that physician [OR: 0.998 (95%CI: 0.996, 0.999)] [35], and patients of non-US trained IMGs had a 20% lower likelihood of mortality [OR: 0.82 (95%CI: 0.62, 0.99)] than patients treated by US citizens trained abroad [35] (eTable 7).

Another study [32] showed that male IMG physicians in the US were more likely to prescribe opioids [31] [OR:1.11 (95%CI: 1.03, 1.19)] (eTable 9) and US IMG specialists, including internists, medical specialists, surgeons, and emergency medicine specialists, were less likely to prescribe opioids vs. primary care physicians. IMG physicians practicing in the north [OR: 0.63 (95%CI: 0.58, 0.69)] and west regions [OR: 0.88 (95%CI: 0.80, 0.96)] were less likely to prescribe opioids vs. physicians from the southern US [31] (eTable 7).

Discussion

IMGs apply in large numbers for postgraduate training in their new country, competing for a limited number of positions. For example, in the 2023 Canadian Residency Service Match, 2105 IMGs registered, of which 555 (26.3%) were matched into a postgraduate residency position [45]. Furthermore, Canadian and US citizens who are unsuccessful at securing a medical school position in their country are leaving their country of citizenship to complete medical school training elsewhere and then returning to compete as IMGs with immigrant IMGs for postgraduate training positions [46]. Medical schools often have less background information about IMG applicants than about local graduates, for whom the Medical School Performance Record (MSPR) or other customary variables are available to aid in the file review, selection to interviews, and ranking process. This demonstrates the importance of identifying predictors of IMG success and failure in residency and practice employment from the data available at the time of application, and highlights the need for a transparent, evidence-based, and defensible selection process for IMGs.

Although a pooled association measure would be more informative for decision-makers, we were unable to pool data statistically due to diversity in outcome assessment methods. Instead, we identified six groups of outcomes and summarized the evidence in a narrative format, highlighting the association between various predictors and each outcome while acknowledging the uncertainty inherent in this evidence. Despite these limitations, it remains the most comprehensive evidence available to inform decisions regarding the selection of IMGs.

We found that female sex, English fluency and higher scores in previous qualification exams were associated with more success in the CSA components of the Educational Commission for Foreign Medical Graduates (ECFMG) exams [34]. The higher exam success of female candidates compared to male candidates in our systematic review may be attributed to their superior performance in topics like Ob/Gyn and stronger skills in data gathering, communication, clinical skills and note-writing [34, 47]. Additionally, factors such as cultural integration, exemplified by Finnish immigrants [44], and the global imbalance in healthcare opportunities for women may provide more opportunities for highly qualified female IMGs to succeed [15]. However, conflicting findings suggest a need for further study to clarify these associations conclusively. Qualitative evidence synthesis is increasingly prioritized in decision-making processes for complex areas like this.

Three letters of reference from US ophthalmologists, a USMLE step 1 score ≥ 236, high-impact journal publications, and US research experience were associated with more success in ophthalmology residency matching [29]. The association between the number of recommendation letters from US ophthalmologists and matching success suggests a preference over non-US letters for several reasons: residency committees may value letters from US physicians with whom they have professional relationships; these letters demonstrate the applicant’s ability to form productive mentoring relationships; they signify approval from a physician trained in the US system; and letters from non-US physicians unfamiliar with the US match process may lack sufficient commentary on valued applicant characteristics. It should be noted that the number of letters of support as a predictor of matching in residency reflects several complex factors, such as work ethic, professional collaboration, and interpersonal relationships [29].

The evidence also showed that younger age was associated with more success in certification exams in Canada [15]. The findings of European studies showed notable inconsistencies regarding the association between age and success in certification exams [28, 39, 42, 44]. All studies [15, 39, 41, 44] except one [42] showed that females were more successful in passing licensing exams or had more satisfactory progress in competence. We found inconsistent results on the relationship between White ethnicity and certification assessment success [39, 43]. The results showed that candidates who graduated from UK and European medical schools were more likely to pass the MRCPCH written test [42] and clinical skills assessment (CSA) [41, 43] in the UK. Possible reasons for these findings include familiarity with the culture and reduced language barriers. However, none of these factors can conclusively explain the findings [43]. Further investigation is needed to determine whether these differences reflect real differences in skills.

Among all previous experiences of internship, residency and research, Schabort [14] showed that previous internship was associated with success in the RCPSC certification exams in Canada.

Considering previous exams and qualifications, USMLE step 2 CK scores and in-training exam performance in the 1st to 3rd years of residency were associated with residents’ performance in the ABFM certification exam [33]. Higher scores in the clinical problem-solving and situational judgment tests were associated with success in the Membership of the Royal College of General Practitioners (MRCGP) exam in the UK [1].

One study showed that candidates who graduated ≤ 5 years ago, were eligible for a full license, and completed residency rather than fellowship training were more likely to remain and work in Canada [31].

We identified a study reporting that being male, having higher scores in the speaking module of the IELTS, and attempting the PLAB1 ≥ 4 times and the PLAB2 ≥ 3 times were associated with a higher likelihood of being disciplined by the GMC in the UK [38]. The authors of this study suggested that making the PLAB or a replacement assessment more stringent, raising the required standards of language reading and listening and clinical skills competency, and capping the number of PLAB resits permitted may result in fewer fitness to practice events in IMGs.

Regarding clinical performance, one study showed that relative mortality risks were 20% lower for patients treated by non-US trained IMGs than for patients treated by US citizens trained abroad [35].

Since the current evidence is derived from observational studies, the certainty of evidence is low. The observed associations between predictors and outcomes indicate a significant relationship, yet they do not establish causation. Given the multifactorial nature of each association, these findings should be regarded as indicative rather than conclusive, and caution should be exercised when interpreting this evidence.

Strengths and limitations

This review is the first and only systematic review on this topic using rigorous Cochrane-endorsed methodological tools (CHARMS, CHARMS-PF) to explore the predictors for success and failure in IMGs using variables available at the time of selection. Strengths of our review include explicit eligibility criteria and a comprehensive search without language restriction that identified 25 studies exploring predictors of success and failure for IMGs through adjusted analysis. We also assessed the risk of bias for each study using the QUIPS tool [24]. Whenever possible, we reported baseline probability for the outcome and presented the association as both relative and absolute measures transparently and explicitly to optimize interpretation.

The main limitation was the excessive diversity in outcome measurement, which made meta-analysis impossible and limits the information available to decision-makers. The results of this review are also limited by the quality of the primary studies available for inclusion, as more than half of the studies have a moderate to high risk of bias.

Conclusions

The main objective of this systematic review was to investigate the predictors of success and failure of IMGs in postgraduate training or practice in their new country. The studies included in this systematic review span Australia, Canada, Finland, the United Kingdom, the United States of America, and Sweden, making it the only study of its kind internationally. These studies identified predictors of success in qualifying or certification examinations, success in matching to a postgraduate residency position, and retention of an IMG to practice in their new country after the country had invested in training them. Moreover, these studies identified predictive factors for instances where IMGs were disciplined or faced complaints at medical boards in three countries. These findings are notably serious and warrant close attention. Other significant findings were predictors of mortality in patients treated by IMGs and predictors of the clinical competence of IMGs in practice.

These predictors are worthy of the attention of all organizations and international policymakers involved in IMG selection, and the results of this review could assist in exploring predictors of success when selecting IMGs for postgraduate training or employment.