Introduction

Postgraduate medical education has undergone considerable transformation in an attempt to address ongoing concerns regarding the readiness of medical graduates for independent practice [1,2,3,4,5,6,7]. To increase the transparency and accountability of medical training, many countries have sought to define the competencies physicians need to effectively meet the healthcare needs of the public they serve [8,9,10]. Competency-based educational frameworks were then gradually integrated into many postgraduate medical education programs internationally [11]. Following this adoption, medical education accreditation and regulatory bodies were faced with how to operationalize such overarching competencies to create effective and reliable outcomes-based assessments of residents by Faculty raters.

One solution introduced the concept of an entrustable professional activity (EPA) which was defined as a specific task or responsibility representative of clinical activities required of professional practice, which can be used to monitor and assess a resident’s progression of entrustment until a required level of competence is achieved to allow for independent practice [12]. By performing an EPA, residents are required to use a combination of competencies and as such, assessment of competencies can be inferred through the observation of a resident’s performance of an EPA [13]. Ten Cate has further clarified that, “competencies are person descriptors, as they signify what individuals are able to do, whereas EPAs are work descriptors and only reflect the work, tasks and activities that are to be carried out in healthcare, irrespective of who does that work” [14]. EPAs therefore coincide with the normal operations of clinical practice, where Faculty supervisors are responsible for assigning an appropriate level of responsibility or “entrustment” to a resident when performing an activity. Importantly, EPAs formalize this process of entrustment and enable the tracking of resident competence attainment across defined sets of professional tasks and activities, promoting accountability and transparency of graduating residents.

Furthermore, resident education is moving from time-based to competency-based medical education (CBME). As valid and reliable assessments for designated EPAs that provide trainees with valuable feedback are the cornerstone of CBME, the entrustment scale plays an important role in determining the success of CBME. In Canada, the Ottawa Surgical Competency Operating Room Evaluation (O-SCORE), based on 5 levels, is recommended by the Royal College of Physicians and Surgeons of Canada (RCPSC) for use in resident EPA assessment [15, 16]. The Accreditation Council for Graduate Medical Education (ACGME) in the United States developed “Milestones,” which are performance levels used in resident/fellow assessment of each subcompetency. Milestones have five levels, proceeding from lower to higher competency [9, 17]. Performance at each ACGME milestone refers to the expected performance at each stage of training, and each specialty details the procedures that trainees can perform, and how well they perform them, at each ACGME milestone level. Level 4 is considered the graduation target [9].

The Intercollegiate Surgical Curriculum Programme (ISCP) of the United Kingdom (UK) developed “Supervision Level” which assesses resident performance on how much supervision is required in each task [10].

Most of the currently used entrustment scales depend on the supervisor’s degree of involvement in the case or level of supervision. These levels do not take into account the complexities that different clinical scenarios may present to the graduating resident surgeon. For example, the Royal College of Emergency Medicine (RCEM) of the UK uses an entrustment scale with four main levels (1–4) that focus on the extent of supervisor involvement in residents performing the activity. In level 2, the supervisor is “off-site” but still within the hospital in case the trainee needs assistance, whereas in level 3, the supervisor is on standby at home [18]. This could raise concern regarding patient safety and may increase liability [19, 20].

In practice, entrustment decisions given by Faculty raters can be influenced by many different factors, including a supervisor’s clinical and teaching experience, the Faculty member’s case mix (whether they are operating on routine cases or more complex cases related to an EPA), personality, propensity to trust, their own expectations, activities they are engaged with outside the operating room, and issues such as existing rapport with residents [21,22,23,24,25,26,27]. As such, entrustment scales (ES) that highlight a supervisor’s level of involvement in a case, rather than focusing on explicit markers of resident performance, minimize transparency about Faculty decisions regarding resident entrustment and about what a resident needs to improve to gain the desired level of entrustment. Should a graduating resident later face legal challenges based on competency, Faculty raters who certified competence achievement based on their judgment “when they left the resident alone” may face questions such as how they actually decided that performance was competent.

We posit that assessment of resident competence should be focused on observation of performances that take into account the resident’s ability to think and act in a way that responds to new challenges they face as a case evolves. This approach moves away from a supervision-based scale to one that values an ability to respond to the contextual demands of a case and that provides trainees with effective feedback to help guide the development of competency in each EPA. The objectives of our study were to: (1) identify important resident performance markers for demonstrating competence attainment of an EPA; (2) identify the standard of performance expected of graduating residents; (3) collect evidence for the validity of our proposed entrustment scale; and (4) identify necessary components required to provide feedback to residents in guiding the development of competent performance of an EPA.

Methods

We developed a questionnaire with questions regarding years of practice, the number of staff physicians and residents working in the respondent’s specialty at their hospital, and the number of residents the respondent supervises per day. It also asked about the important resident performance markers for demonstrating competence attainment of an EPA, evidence of validity of our novel entrustment scale, the standard expected for an EPA with respect to our proposed entrustment scale, important components of EPA assessment forms, the importance of accessing all of a resident’s previous EPA evaluations, and respondents’ perspectives on whether certifying competence based on level of resident supervision carries additional risk of liability.

For questions regarding the expected standard of competent performance, we asked the respondents to choose the level they thought should be the graduation target for residents from the five levels of the proposed entrustment scale, ranging from level A to E and representing lower to higher competency. Levels A-D included aspects of the levels in the O-SCORE, but we excluded the aspects of that scale relating to supervision and purposefully added a fifth level, which explored the ability of the resident to respond competently to novel complexities that might present to the practicing surgeon and to do the procedure competently even when complexities arise.

Contextual complexities that are referred to in the scale include not only common complications, but also rare complications and other events/environments that deviate from the normal situation. Examples include not having the instrument that the surgeon needs, so that other available instruments must be adapted to accomplish the procedure successfully; dealing with newly identified unexpected events or findings during a case, such as an inadvertently injured blood vessel; or working within difficult team dynamics, where the trainee needs to use a variety of competencies, e.g., leadership and communication, to handle the situation and provide excellent patient care.

The details of our proposed entrustment scale and how it compares to other ES are outlined in Table 1.

Table 1 Comparison of our proposed entrustment scale with other currently used entrustment scales

Survey

An online questionnaire was sent via email to all Canadian neurosurgery Faculty with publicly available email addresses using the SurveyMonkey platform. Emails inviting neurosurgical Faculty to participate were sent out three times over a four-month period. Participation was completely voluntary. To ensure participant confidentiality, no identifiable information was collected. Each respondent provided informed consent for participation.

This study reports on the questions related to our proposed entrustment scale. Approval for the study was obtained from the Research Ethics Board of Unity Health-St. Michael’s Hospital.

Statistical analysis

Only complete responses to questions regarding important performance markers, necessary components of an EPA assessment form, face validity of our proposed entrustment scale and standard of performance expected of graduating residents were analyzed.

Descriptive data on respondents’ demographics regarding working experience were calculated (proportions for categorical responses, medians and ranges for numerical responses), along with descriptive data about programs, including the number of physicians and the number of residents at the respondent’s hospital and the number of residents under the respondent’s supervision per day.

The responses to 5-point Likert scale questions were summarized using proportions, means (strongly agree/extremely important = 5, agree/very important = 4, neither agree nor disagree/somewhat important = 3, disagree/not so important = 2, strongly disagree/not at all important = 1), and the combined proportion of “strongly agree” and “agree” or of “extremely important” and “very important.”

For the question regarding Faculty’s perspectives on the degree of importance of resident performance markers for demonstrating competence attainment of an EPA, we ranked the performance markers by their calculated mean perceived importance.
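The Likert summaries and mean-based ranking described above can be sketched as follows. This is a minimal Python illustration rather than the authors' R code; the function name `summarize_likert`, the dictionary names, and the response data are hypothetical and are not study data.

```python
from collections import Counter

# Numeric coding of the 5-point agreement scale, as described in the text.
AGREEMENT_SCORES = {
    "strongly agree": 5,
    "agree": 4,
    "neither agree nor disagree": 3,
    "disagree": 2,
    "strongly disagree": 1,
}


def summarize_likert(responses):
    """Return per-category proportions, the mean score, and the combined
    proportion of 'strongly agree' and 'agree' for one question."""
    n = len(responses)
    counts = Counter(responses)
    proportions = {label: counts.get(label, 0) / n for label in AGREEMENT_SCORES}
    mean_score = sum(AGREEMENT_SCORES[r] for r in responses) / n
    top_two = proportions["strongly agree"] + proportions["agree"]
    return {"proportions": proportions, "mean": mean_score, "top_two": top_two}


# Hypothetical responses for two markers (illustrative only, not study data).
hypothetical = {
    "perform safely": ["strongly agree"] * 8 + ["agree"] * 2,
    "perform without supervision": (
        ["agree"] * 4 + ["neither agree nor disagree"] * 3 + ["disagree"] * 3
    ),
}

summaries = {marker: summarize_likert(r) for marker, r in hypothetical.items()}

# Rank markers from highest to lowest mean score.
ranking = sorted(summaries, key=lambda m: summaries[m]["mean"], reverse=True)
for marker in ranking:
    s = summaries[marker]
    print(f"{marker}: mean={s['mean']:.2f}, agree or strongly agree={s['top_two']:.0%}")
```

Ranking on the mean uses the full 5-point coding, while the combined "agree or strongly agree" proportion gives a figure that is easier to read alongside it.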

We noted the proportion of responses for all questions that required categorical responses and used a probability level of 0.8 for tests of proportionality. Medians, minimums, maximums and/or IQRs were calculated for numerical responses. For sub-group analyses involving the respondent’s years of Faculty experience (up to 10 years and more than 10 years), categorical responses were analyzed using Fisher’s exact test and numerical responses were analyzed using the Wilcoxon rank-sum test. A p value of < 0.05 was considered statistically significant. All analyses were carried out in R [33].
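To make the categorical sub-group comparison concrete, the sketch below implements a two-sided Fisher's exact test for a 2x2 table in plain Python. It is an illustration of the general technique, not the authors' R analysis; the function name `fisher_exact_two_sided` and the example table are hypothetical, and the Wilcoxon rank-sum test is not covered here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # A small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )


# Hypothetical table (not study data):
# rows = up to 10 years vs. more than 10 years of Faculty experience,
# columns = selected level E vs. selected another level.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p = {p_value:.4f}")  # prints "p = 0.4857"
```

With this hypothetical table, the p value is well above 0.05, so the two experience groups would not be considered significantly different.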

Latent class analysis (using the package poLCA) [34] was conducted using the agreeability (coded as agreed, neutral, and disagreed) of a respondent in considering each of the eight performance markers as important for evaluating competency. Latent refers to an unobserved variable consisting of different categories/subgroups/classes. A set of observed categorical variables (in our case, the eight performance markers) is analyzed through latent class analysis. Such analysis allows one to characterize a latent (unobserved) variable such that the parameters of some of the observed variables vary across the classes of the latent variable [35, 36]. Finally, open-ended comments were thematically analyzed to help understand the reasoning behind Faculty’s responses.
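As a rough illustration of what latent class analysis does, the sketch below fits a two-class model to coded categorical responses with a basic expectation-maximization loop. It is a simplified standard-library Python sketch, not the poLCA analysis used in the study; the function `fit_latent_classes`, its parameters, and the response patterns are hypothetical.

```python
import random


def fit_latent_classes(data, n_classes, n_categories, n_iter=200, seed=0):
    """Fit a latent class model by expectation-maximization (EM).

    data: list of respondents, each a list of category codes
          (integers from 0 to n_categories - 1), one per item.
    Returns (class_weights, item_probs, posteriors).
    """
    rng = random.Random(seed)
    n_items = len(data[0])
    weights = [1.0 / n_classes] * n_classes
    # item_probs[c][j][k] = P(item j takes category k | class c); random start.
    item_probs = []
    for _ in range(n_classes):
        per_class = []
        for _ in range(n_items):
            raw = [rng.random() + 0.5 for _ in range(n_categories)]
            total = sum(raw)
            per_class.append([v / total for v in raw])
        item_probs.append(per_class)

    posteriors = []
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each respondent.
        posteriors = []
        for row in data:
            joint = []
            for c in range(n_classes):
                p = weights[c]
                for j, k in enumerate(row):
                    p *= item_probs[c][j][k]
                joint.append(p)
            norm = sum(joint)
            posteriors.append([p / norm for p in joint])
        # M-step: update class weights and item-response probabilities.
        for c in range(n_classes):
            weights[c] = sum(post[c] for post in posteriors) / len(data)
            for j in range(n_items):
                counts = [1e-6] * n_categories  # tiny pseudo-count avoids zeros
                for row, post in zip(data, posteriors):
                    counts[row[j]] += post[c]
                total = sum(counts)
                item_probs[c][j] = [v / total for v in counts]
    return weights, item_probs, posteriors


# Hypothetical coded responses (0 = disagreed, 1 = neutral, 2 = agreed) on
# eight markers; illustrative only, not study data.
uniform_agreers = [[2] * 8 for _ in range(6)]
mixed_raters = [[2, 2, 1, 1, 0, 1, 2, 0] for _ in range(6)]
data = uniform_agreers + mixed_raters

weights, item_probs, posteriors = fit_latent_classes(data, n_classes=2, n_categories=3)
# Assign each respondent to the class with the highest posterior probability.
labels = [max(range(2), key=lambda c: post[c]) for post in posteriors]
print(labels)
```

Class labels from such a fit are arbitrary (which group is called class 0 or class 1 depends on the random start), so only the grouping of respondents is meaningful.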

Results

We received a total of 67 questionnaire responses, of which 52 (usable response rate of 77.6%) were considered complete responses with fully usable data regarding our objectives and were included in the analysis for this study. Of the respondents, 40% had been in a neurosurgical Faculty position for 10 years or less and 60% for 11 years or longer. Demographics of Faculty are shown in Table 2. By applying latent class analysis (LCA) to the responses related to the importance of each performance marker, we found two classes of Faculty that described their perspectives on necessary performance markers for demonstrating competence attainment of an EPA (Fig. 1). Table 3 shows the characteristics of each LCA class. The first class included Faculty who had a mixture of agreeability on every performance marker. In contrast, Faculty in class 2 had 100% agreeability for being able to “perform safely,” “perform effectively,” “adapt performance or decisions in response to unexpected events,” “adapt performance or decisions in response to contextual complexities of the case,” and “perform independently,” which correspond to the five performance markers with the highest average degree of importance for demonstrating competence attainment of an EPA.

Table 2 Demographics of respondents (N = 52)
Fig. 1 Findings from latent class analysis

Table 3 Characteristics of faculty in class 1 and 2 from latent class analysis

No significant differences between these two classes were found with respect to Faculty’s working experience or whether they selected level E as a graduation target.

Aim 1: Resident performance markers of EPA competence attainment

When Faculty were asked to rate the degree of importance of each performance marker for demonstrating competence attainment of an EPA, the most important resident performance marker was to perform the procedures safely (Fig. 2). Moreover, being able to “perform safely” was the marker with the highest proportion of strongly agree or agree responses (96.1%). The other four performance markers with the highest average degree of importance were: being able to “perform effectively,” “adapt their performance or decisions in response to unexpected events,” “adapt their performance or decisions in response to contextual complexities of the case,” and “perform independently.” On the other hand, the least important resident performance marker, on average, was to “perform without supervision.” In addition, being able to “perform without supervision” was the marker with the greatest proportion of strongly disagree or disagree responses (19.3%).

Fig. 2 Faculty’s perspectives on degree of importance of resident performance markers for demonstrating competence attainment of an EPA

Figure 3 shows the performance markers Faculty indicated to be important for demonstrating resident competence attainment of an EPA from their perspective as a Faculty member (blue bars) and the performance markers that would be important to them if they were a patient (green bars). Being able to “perform safely” and “adapt performance or decisions in response to contextual complexities of the case” were the two performance markers most frequently chosen by Faculty from both perspectives. Performing without supervision was considered far less important from either perspective.

Fig. 3 Perspectives of faculty as a patient versus faculty as a faculty member

The finding that being able to “perform without supervision” was seen as less important than other resident performance markers in demonstrating resident competence attainment of an EPA was supported by the following Faculty comments:

A neurosurgeon who strongly disagreed with being able to “perform without supervision” said:

“No student pilot transports passengers without supervision….an important analogy.”

Another neurosurgeon who disagreed with this performance marker stated:

“It cannot be a marker, if it is not observed. Who marks? The resident will say: me. However, that has a clear observer-bias built in.... The only control tool then is the postsurgical outcome. One may argue: if the patient is fine, surgery was probably fine. However - that does not assess the quality of care delivered.... That makes such an evaluation system unreliable.”

Another neurosurgeon who neither agreed nor disagreed with this marker remarked:

“Should be ‘could perform’ instead of ‘can perform’ because I will always be present during surgeries involving patients under my care.”

Liability with documentation of faculty level of supervision

The issue regarding liability was raised as many currently used entrustment scales assess resident competency with the degree of supervision. Nearly half of the Faculty surveyed (44.2%, 23/52) strongly agreed or agreed that, “certifying competence based on decreased levels of resident supervision (e.g., documenting level of supervision) places Faculty at additional risk of liability should a patient later complain about their quality of treatment.”

In contrast, 28.8% of Faculty strongly disagreed/disagreed, and 19.2% neither agreed nor disagreed with this sentiment.

No significant difference of opinion was appreciated based on Faculty’s working experience.

A neurosurgeon who agreed that, “certifying competence attainment using entrustment scales based on decreasing levels of resident supervision placed additional risk of liability on Faculty” said:

“Agree based on experience with a College complaint (that was dismissed).”

Aim 2: The standard of performance expected of graduating residents

Over two-thirds of Faculty (67.3%, 35/52) believed that level E represents the standard expected of competent performance of graduating residents. Level E is defined as residents being capable of adapting performance or decisions in response to contextual complexities of the activity independently and safely, as opposed to solely being able to perform a task without complexities. Twelve respondents (23.1%) considered being able to simply perform a procedure without complexities independently to be representative of an appropriate level of competence for graduating residents. Level C, defined as being able to perform the activity with occasional guidance, was selected by only 3.8% (2/52).

No respondent indicated that the lowest two levels of competence (A and B) were sufficient for graduating residents. Three Faculty selected “other” and provided comments. No significant differences in responses were found as a result of Faculty’s working experience.
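The reported percentages can be checked directly from the counts given above. The short Python sketch below uses only figures stated in the text (N = 52; 35 for level E, 12 for level D, 2 for level C, none for levels A and B, and 3 “other”); the variable names are illustrative.

```python
# Counts reported in the text (N = 52 complete responses).
N = 52
graduation_target_counts = {"E": 35, "D": 12, "C": 2, "B": 0, "A": 0, "other": 3}

# The counts should account for every complete response.
total_responses = sum(graduation_target_counts.values())

# Percentage of respondents choosing each option, to one decimal place.
percentages = {
    level: round(100 * count / N, 1)
    for level, count in graduation_target_counts.items()
}

# Combined share choosing one of the two highest levels.
e_or_d = round(
    100 * (graduation_target_counts["E"] + graduation_target_counts["D"]) / N, 1
)

print(total_responses)
print(percentages)
print(f"E or D: {e_or_d}%")
```

The combined share for levels E and D (47 of 52) is what underlies the statement that over 90% of Faculty chose one of the two highest levels as the graduation target.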

Aim 3: Evidence of validity of proposed entrustment scale

The results provide good evidence of validity for our proposed entrustment scale, as 80.8% of Faculty agreed or strongly agreed that the proposed progression of entrustment was appropriate for ensuring resident competence in performing a given EPA. In addition, 78.9% believed that five different levels of entrustment would be ideal for such a scale, and 76.9% agreed that the descriptions provided for each progression of entrustment were easy to understand (Table 4).

Table 4 Faculty perspectives on evidence for validity of our proposed entrustment scale (N = 52)

Our scale can discriminate between the different levels of performance expected at graduation, as over 90% of Faculty believed that level E or D should be a graduation target, and no Faculty selected the two lowest levels of the scale as sufficient.

Aim 4: Necessary components required to provide effective feedback to residents in guiding the development of competent performance of an EPA

Table 5 shows that 71.1% of Faculty believed that “documenting a few weaknesses” was an extremely important or very important component, followed by “providing contextual comments of the case” (67.3%), “providing suggestions for future learning” (57.7%), and “providing a global assessment for an EPA with one-rating” (50%). “Documenting 2–3 strengths” was only perceived as very or extremely important by 40.4%. Finally, “providing an evaluation for each checklist item of an EPA” was deemed important by only 27% of Faculty, and when compared to the other components, a significantly higher percentage of Faculty indicated that it was not so or not at all important (9.6%) (p < 0.0002).

Table 5 Faculty perspectives on necessary components required to provide feedback to residents in guiding the development of competent performance of an EPA (N = 52)

A neurosurgery Faculty member with less than 5 years’ experience provided the following comment:

“The most important feedback and evaluation is the personal, immediate and oral format. At the end of each rotation, one should give a global written assessment. That would be my preferred format.”

Another neurosurgeon with 11–20 years of working experience indicated an important element of providing resident feedback:

“Knowledge of management principles for the case.”

Discussion

To support CBME’s goals of increasing transparency and accountability of its residency graduates, this study has identified performance markers for standardizing Faculty expectations for guiding entrustment decisions and for documenting and tracking residents’ successful attainment of a required level of competence for an EPA. Our scale is compared to others in use in Table 1. Most of the other scales assess resident performance based on some indication of their supervisor’s degree of involvement in the case. The results of our study revealed that being able to “perform without supervision” was rated as the least important performance marker for demonstrating resident competence attainment of an EPA. Additionally, 44.2% of Faculty believed that certifying competence based on decreased levels of resident supervision (e.g., documenting that the resident is competent enough that the supervisor does not need to come on-site to supervise) would place Faculty at additional risk of liability. Our entrustment scale addresses the need to avoid the level of supervision as a performance marker.

We found that being able to “adapt performance in response to contextual complexities of the case” was highly important from the perspectives of Faculty both as patients and as Faculty members rating graduating residents. This supports the inclusion of Level E in any ES. While the ISCP’s competence scale also assesses the ability of the resident to manage without assistance, including potential common complications, our scale is not limited to solely considering complications [10, 28,29,30,31,32]. We focus on contextual complexities and the demands that these complexities place on the resident to perform safely. The scale also includes environments or events that deviate from a “normal” situation. Our Level E indicates that the graduating resident needs to be able to handle new events that occur during a case, and not only to perform the sort of case that presents no challenge beyond the ordinary case or situation. Being able to handle new events that arise, to “think on one’s feet,” and to adapt a solution that meets the needs of the patient and gets them through surgery safely is considered an important feature of a competent graduating resident. Over 90% of Faculty selected entrustment level E (67.3%) or D (23.1%) as a graduation target and no Faculty chose the lowest two levels, indicating that our scale discriminates between levels of resident performance and so provides evidence for the validity of our proposed entrustment scale.

Being able to “perform efficiently” ranked eighth among the necessary performance markers selected by Faculty, both from their perspective as Faculty and from their perspective as a patient (Fig. 3). These findings challenge entrustment scales based on residents performing without supervision or other anecdotal definitions that highlight the need to perform efficiently [37], and instead agree with some studies suggesting that efficiency should be reserved for experts [38]. They also reveal that though it is important to operate independently, this does not necessitate a lack of supervision.

Moreover, 76.9% agreed that our scale’s descriptions were easy to understand and 78.9% agreed that the five different levels of entrustment (A-E) represented an ideal number, which contrasts with the use of some existing 9-point or 4-point entrustment scales [39, 40]. The highest level of entrustment, “E”, was designed to include important features of performance as identified in the literature (performing independently and safely, while also being able to respond to contextual complexities that could arise) [41,42,43,44]. Though a majority selected “E” as the level of entrustment meant to represent a resident’s competence attainment, about one quarter selected lower entrustment levels of “D” or “C”. This could be due to variations in the personal comfort levels or attributes of supervisors with entrusting residents with professional activities [25, 45,46,47].

It is important to note that the standard of performance expected of graduating residents using our proposed entrustment scale is EPA-dependent. For example, 79% of neurosurgery Faculty indicated that level E should be the graduation target for performing burr hole drainage of a chronic subdural hematoma while only 57% of them believed that level E should be graduation target for performing peripheral nerve decompression procedures [48].

We found that “documenting a few weaknesses,” “providing contextual comments of the case,” “providing suggestions for future learning,” and “providing a global assessment for an EPA with one-rating” were the most necessary components in providing effective feedback. Our scale helps guide residents to develop competent performance of an EPA, as it helps identify weaknesses, includes “adapting their performance or decisions in response to contextual complexities of the case” as a performance marker, and is a global rating scale. These features of the scale would support Faculty in providing residents with feedback involving contextual comments, which are considered necessary components of effective feedback. Moreover, our scale is learner-centred, as it focuses solely on the trainee’s performance rather than primarily highlighting the level of supervision required. It could be used on index cases, defined by the committee tasked with defining the competency levels expected of residents at the time of graduation. These cases would differ based on the specialty and on the expectations of patients and regulatory bodies at the time that the competency level is defined.

Limitations and future studies

Our study shares the limitations of other survey-based methods. We recognize that our results are representative of academic neurosurgeons in Canada, and as such, may not represent the perspectives of neurosurgeons elsewhere or of Faculty from other specialties. Given the electronic nature of the survey, and the difficulty of ascertaining whether all emails sent were actually received, our reported response rate is a lower-bound estimate of the true response rate. In addition to having a vast scope of practice, the specialty of neurosurgery has many procedures that overlap with other surgical specialties, regularly requires both medical and surgical management of unstable and critically ill patients, and places significant emotional and interpersonal demands on its trainees [49]. Thus, this study’s results likely represent reasonable insights and targets that could be generalized to other specialties looking to integrate CBME frameworks into their curriculum.

We acknowledge that we are in an early phase of the collection of evidence for the validity of our measure. Future studies will need to be conducted across a number of institutions and different levels of training to determine the validity and reliability of this study’s proposed entrustment scale. In addition, given that the importance of providing contextual comments and of responding to contextual complexities of a case was emphasized, future studies should be performed to see how the different contexts of the same EPA affect Faculty expectations for the standard of performance expected of graduating residents.

Conclusion

To facilitate transparency and accountability of resident training within CBME curricula, this study has defined Faculty expectations of resident performance markers deemed important for guiding entrustment decisions and acknowledging competence attainment of an EPA. We have provided an alternative entrustment scale that is learner- and feedback-centred and focuses on the development of competency in residents who are being supervised and evaluated. The entrustment scale is a global rating scale that is supervision independent. It has good evidence of validity, as it has appropriate progression levels, is easy to understand, and can discriminate between the levels of training.