Background

In patient care, both Technical Skills (TS) and Non-Technical Skills (NTS) are necessary to maintain best practice as well as reach a high level of expertise [1]. TS are routinely taught in trainee programs. However, evaluation and assessment of NTS have been missing for a long time [2, 3]. NTS are defined as “the cognitive, social and personal resource skills that complement technical skills and contribute to safe and efficient task performance” [4].

Adverse events in high-risk settings often take place due to deficiencies in NTS [5], which has been shown in various fields such as aviation and nuclear energy [6,7,8] and has also been confirmed for medical care: up to 70% of adverse events are due to human errors [9,10,11]. In order to reduce medical errors, good NTS and improved teamwork are essential [12].

NTS interventions, and feedback on NTS in particular, have been shown to have positive effects on team performance. Good patient care therefore requires both TS and NTS, which have been found to correlate in crew resource management [13, 14] and to foster improved clinical performance, such as quicker problem solving in a simulated operating theatre environment [15]. The positive effects of NTS also encompass enhanced patient safety. In a meta-analysis, Salas et al. showed positive effects of NTS on team members’ reactions and attitudes related to teamwork safety [16]. Other studies pointed out positive effects of NTS on clinical performance (TS) and on patient outcomes such as surgical complications and morbidity [17].

Knowledge of the necessity and benefits of NTS has led many institutions to emphasize the importance of interpersonal skills [13, 16,17,18,19,20,21], and these departments implemented crew-resource training programs in their curricula, primarily focussing on NTS during postgraduate training [22].

The training of NTS should not be confined to postgraduate training: undergraduate curricula and education should already integrate the concept of “patient safety”, directly addressing, teaching and assessing NTS [23, 24]. Only a few studies have investigated the effect of teaching NTS in undergraduates. Hagemann et al. showed that even one brief seminar had positive effects on undergraduates’ NTS [21].

The German Association for Medical Education has acknowledged the repeatedly expressed need for NTS implementation in undergraduate education by publishing a “Learning Objective Catalogue for Patient Safety in Undergraduate Medical Education”, which aims to unify the curricular targets in German medical faculties [25]. However, a concrete curriculum, implementation or teaching strategy for NTS in undergraduate education is not yet available. In addition, because a conventional rating tool is missing, a structured assessment of NTS in undergraduates is often lacking.

To create and realize an implementation and teaching strategy for NTS in undergraduates, a structured assessment of NTS with a robust method is first necessary [4], in order to provide specific and formative feedback and to monitor the learning progress in undergraduates.

Several rating tools for assessing NTS in medical professionals are available. They help to provide feedback that is not based on “gut feeling” and to speak “the same language” during the feedback process [26,27,28,29,30,31]. However, the existing rating tools, such as the Anaesthetists’ Non-Technical Skills (ANTS) [32], are very complex and not designed for undergraduates or junior residents. ANTS was developed for experienced anaesthesiologists to rate trainees who have reached a certain level of TS, which limits its broad application in undergraduate education. Its feasibility is further limited because raters require a two-day training with the rating scheme. The use of ANTS also delays the feedback loop, as the NTS ratings are based on video clips of the training sessions, which are evaluated after the training [5].

The goal of this study was to develop a rating tool to assess NTS in undergraduate education in emergency medicine and anaesthesiology: Anaesthesiology Students’ Non-Technical Skills (AS-NTS). The tool is intended to be feasible and easy to handle, without the need for video recording or extended instruction and training for the user.

Methods

Study design

This study was performed at the Department of Anaesthesiology of the University Medical Center Hamburg-Eppendorf, Germany, from January 2017 to December 2017. Undergraduates and residents in anaesthesiology participated in this study, which used a stepwise design to develop and validate a rating tool for NTS in undergraduates in anaesthesiology. The development took place in four steps (Table 1), empirically grounded on qualitative and quantitative research methods:

  1. Review of published literature (expert group)

  2. Focus group and semi-structured interviews

  3. Field observation

  4. Implementation and validation

Table 1 Development steps of AS-NTS

Table 1 shows a scheme of the conducted developmental steps and underlying research methods.

A detailed explanation of the development is given in Additional file 1.

Study setting: assessment of NTS during emergency and anaesthesiology training sessions

The undergraduate curriculum of the Medical Faculty of Hamburg includes emergency training sessions in nearly every semester so that students gain experience in emergency medicine. We use high-fidelity simulators (Rescue Anne Laerdal), which are suitable for training technical skills such as endotracheal intubation, defibrillation or drug administration.

NTS were assessed in four different training sessions (Advanced cardiac life support I, II, III and operation room simulation) of four different semesters. In each training session a pre-existing set of standardized simulation scenarios was used (13 in total; a detailed description of the simulation scenarios is provided in Additional file 1).

The simulation scenarios are standardised and specific to each type of training session. For example, the training session “Advanced cardiac life support II (ACLS II)”, which is held in the 3rd year of undergraduate education, includes the scenarios “Hyperkalaemia”, “Hypothermia” and “Aspiration”.

In each training session every student is assigned to a small group which rotates through each simulation scenario.

With each following semester, the simulation scenarios require more advanced TS and NTS. To rule out that low NTS are due to technical skills not yet being proceduralized, we decided to test our rating system in students who had already passed the basic life support training, in the following training sessions:

  • Advanced cardiac life support I (ACLS I: 2nd or 3rd year undergraduates, pre-existing simulation scenarios: 2; number of rated simulation scenarios for interrater agreement analysis: 20)

  • Advanced cardiac life support II (ACLS II: 3rd year undergraduates, pre-existing simulation scenarios: 3; number of rated simulation scenarios for interrater agreement analysis: 24)

  • Advanced cardiac life support III (ACLS III: 4th year undergraduates, pre-existing simulation scenarios: 5; number of rated simulation scenarios for interrater agreement analysis: 23)

  • Operation room (OR) simulation (3rd or 4th year undergraduates, pre-existing simulation scenarios: 3; number of rated simulation scenarios for interrater agreement analysis: 31)

In each training session, the undergraduates are divided into groups of three. Each of these groups rotates through the simulation scenarios of the training session. In each simulation scenario one student takes the role of the physician, the other two that of paramedics or anaesthetic co-workers. The student in the role of the physician leads the team and delegates basic tasks such as establishing the monitoring, preparing defibrillation and other required medical procedures to the other team members. Therefore, only this student was evaluated by the two or three supervising anaesthesiologists, using the AS-NTS.

Raters and interrater reliability

Twenty-one anaesthesiologists (Table 2) conducted the training sessions during the study period. In 67 emergency simulation scenarios two of them rated the students independently; in the 31 operating room simulation scenarios three raters were involved. Raters who rated the same simulation scenario did not discuss their results while rating, in order to rule out cognitive bias.
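The counts reported for the four training sessions and the rater assignment can be cross-checked with a short tally. The sketch below only transcribes the numbers given in the text; the variable names are hypothetical, and the number of rating sheets is an upper bound that assumes every assigned rater completed a rating for every scenario.

```python
# Counts transcribed from the session descriptions in the text:
# (pre-existing scenarios, rated scenarios, raters per rated scenario)
sessions = {
    "ACLS I": (2, 20, 2),
    "ACLS II": (3, 24, 2),
    "ACLS III": (5, 23, 2),
    "OR simulation": (3, 31, 3),
}

total_scenarios = sum(pre for pre, _, _ in sessions.values())
total_rated = sum(rated for _, rated, _ in sessions.values())
emergency_rated = sum(
    rated for name, (_, rated, _) in sessions.items() if name.startswith("ACLS")
)
# Assumption: every assigned rater rated every scenario (upper bound).
max_rating_sheets = sum(rated * raters for _, rated, raters in sessions.values())

print(total_scenarios, total_rated, emergency_rated, max_rating_sheets)
```

The first three totals reproduce the 13 scenarios, 98 rated scenarios and 67 emergency scenarios stated in the text.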

Table 2 Characteristics of the twenty-one raters

The rater teams changed frequently based on the teaching schedule. The raters all received a five-minute introduction to the AS-NTS.

The interrater reliability was investigated using a two-step approach.

In the first step a classical analysis of interrater reliability was conducted, analysing data from rating pairs. To rule out agreement by chance, the intraclass correlation (ICC) was calculated for six pairs of raters, each of which had rated at least six simulation scenarios together. The first analysis included 67 of the total of 98 simulation scenarios. In five of the six pairs the first author (R1) took part (Table 3).

Table 3 Characteristics of the six pairs of raters
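The selection of rater pairs for the first step can be illustrated with a small sketch: count how many scenarios each pair of raters rated together and keep the pairs that reach the minimum of six. The function name, the rater labels other than R1 and the assignment data are hypothetical and do not reproduce the study data.

```python
from collections import Counter
from itertools import combinations


def eligible_pairs(scenario_raters, min_shared=6):
    """Return rater pairs that rated at least `min_shared` scenarios together."""
    shared = Counter()
    for raters in scenario_raters.values():
        # Every unordered pair of raters present in this scenario counts once.
        for pair in combinations(sorted(set(raters)), 2):
            shared[pair] += 1
    return {pair: n for pair, n in shared.items() if n >= min_shared}


# Hypothetical assignment: scenario id -> raters present (not study data).
scenario_raters = {i: ["R1", "R2"] for i in range(7)}
scenario_raters.update({i: ["R1", "R3"] for i in range(7, 11)})
scenario_raters.update({i: ["R2", "R3", "R4"] for i in range(11, 17)})

pairs = eligible_pairs(scenario_raters)
print(pairs)
```

In this made-up example the pair R1 and R3 shares only four scenarios and is therefore dropped, while scenarios with three raters contribute to three pairs each.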

In the second step of the interrater reliability analysis, the whole data set from the 98 simulation scenarios was analysed. To rule out that either the strong involvement of R1 in the development process of AS-NTS or the medical training had an effect on the interrater reliability, data were aggregated across raters in the same year of training. This allowed us to investigate the relationship between medical expertise and rating agreement (Table 4).

Table 4 Ratings and comparisons by anaesthesiology training after data aggregation

Statistical analysis

Statistical analysis was performed using IBM SPSS Statistics Version 23.0. Intraclass correlation (ICC) was used for ordinally scaled data and Cohen’s kappa for nominally scaled data to calculate interrater reliability. We used the one-way random effects model to calculate the ICCs [34]. Values of ICC and kappa below 0.40 are interpreted as poor correlation, between 0.40 and 0.59 as fair correlation, between 0.60 and 0.74 as good correlation and between 0.75 and 1.00 as excellent correlation [35].
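For readers who want to reproduce this kind of analysis outside SPSS, the following is a minimal sketch of the named statistics and not the procedure used in the study. It assumes the single-measure form of the one-way random effects ICC, often written ICC(1,1), because the text does not state whether single or average measures were reported. It also assumes unweighted Cohen’s kappa for two raters. All function names and the example ratings are hypothetical.

```python
from collections import Counter


def icc_oneway(ratings):
    """Single-measure ICC(1,1) from a one-way random-effects ANOVA.

    `ratings` holds one row per rated scenario; each row contains the
    scores given by the k raters of that scenario.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def cohens_kappa(a, b):
    """Unweighted Cohen's kappa for two raters' categorical labels."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    counts_a, counts_b = Counter(a), Counter(b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)


def interpret(value):
    """Map an ICC or kappa value to the bands quoted in the text.

    The bands are implemented as half-open intervals, so a value such
    as 0.595 falls into "fair".
    """
    if value < 0.40:
        return "poor"
    if value < 0.60:
        return "fair"
    if value < 0.75:
        return "good"
    return "excellent"


# Hypothetical five-point ratings of six scenarios by two raters.
example = [[4, 4], [2, 3], [5, 5], [3, 3], [1, 2], [4, 5]]
icc = icc_oneway(example)
print(round(icc, 3), interpret(icc))
```

The one-way model treats raters as interchangeable, which matches a setting in which rater teams change between scenarios.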

Results

Development of the AS-NTS assessment tool

The literature search resulted in 12 different NTS important in anaesthesiology (Table 5). The discussions in the focus and expert groups revealed that not all of these NTS are highly important for undergraduates. During the field observations some NTS were difficult to observe. Using the results of the focus group discussions, we defined new dimensions specifically for undergraduates, merging some pre-defined NTS:

  • Planning tasks, prioritising and conducting

  • Teamwork: exchanging information and leading the team

  • Team orientation

Table 5 Hierarchical mapping of Non-Technical skills and multi-step development of AS-NTS

Table 5 displays the list of NTS and the subsequent steps that were decisive for the inclusion of each skill. The last column shows which NTS are part of ANTS and AS-NTS. Figure 1 displays the definitions of the NTS.

Fig. 1

Definition of the NTS. The definitions were extracted from the cited taxonomies in Table 5, mostly the ANTS system

The first dimension of AS-NTS, “Planning tasks, prioritizing and problem solving”, resulted as a compound mainly formed by the pre-defined NTS dimensions “Decision making” and “Task management” (Fig. 2). The elements that were considered important in undergraduates, and therefore formed the basis for defining the first dimension of AS-NTS, are highlighted.

Fig. 2

Underlying NTS for dimension one of AS-NTS

“Coordinating team members”, “Communication” and “Leadership” were regarded as highly important in the focus group, and performance could be observed at different levels during the field observation. These elements therefore formed the basis for dimension two of AS-NTS: “Teamwork and leadership”.

Leadership, defined as the skill of directing others, coordinating, managing workload and motivating others [37], is often separated into two independent dimensions that allow for the assessment of different leadership styles [60], distinguishing between task orientation and team orientation. In this leadership model, “Task orientation” is closely related to our first two AS-NTS dimensions; we therefore decided to add “Team orientation” as the third and final dimension of the AS-NTS. “Teamwork and leadership” emphasizes the collaborative processes to perform a task, whereas “Team orientation” focuses on the collaborative processes to build a team.

In contrast to the ANTS, performance is rated in the AS-NTS on the three dimensions and not on the level of skills. However, the underlying skill structure was used to give behaviorally anchored rating examples to clarify what a “good” or “poor” performance on each dimension might look like. In the final AS-NTS assessment tool (Fig. 3), a five-point Likert scale was used for each dimension, although the ANTS system has a four-point scale [32]. Cook et al. showed that, with regard to reliability and interrater reliability, there are no differences between 5- and 9-point scales in the mini-clinical evaluation exercise [61].

Fig. 3

AS-NTS assessment tool (English version; the original German version has been added to Additional file 1)
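The structure of a single AS-NTS rating, three dimensions each scored on a five-point scale, could be represented as follows. This is an illustrative sketch only: the class and field names are hypothetical, the numeric coding 1 to 5 is an assumption, and the code deliberately makes no assumption about which end of the scale denotes good performance.

```python
from dataclasses import dataclass

# Dimension names as given in the text.
DIMENSIONS = (
    "Planning tasks, prioritizing and problem solving",
    "Teamwork and leadership",
    "Team orientation",
)


@dataclass(frozen=True)
class ASNTSRating:
    """One rater's scores for one student in one simulation scenario."""

    rater_id: str
    scenario_id: str
    scores: tuple  # one integer from 1 to 5 per entry in DIMENSIONS

    def __post_init__(self):
        if len(self.scores) != len(DIMENSIONS):
            raise ValueError("expected one score per dimension")
        if any(not isinstance(s, int) or not 1 <= s <= 5 for s in self.scores):
            raise ValueError("scores must be integers on the five-point scale")

    def as_dict(self):
        """Return the scores keyed by dimension name."""
        return dict(zip(DIMENSIONS, self.scores))


# Hypothetical rating, not study data.
rating = ASNTSRating(rater_id="R1", scenario_id="ACLS II / Hypothermia", scores=(4, 3, 5))
print(rating.as_dict())
```

Validating the scores at construction time keeps out-of-range values from reaching a later reliability analysis.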

Feasibility and content validity of the scoring system

The interviews with eight anaesthesiologists in their first year of residency, who used both the AS-NTS and ANTS in simulation training (including video recordings), showed that no further dimension had to be added to the AS-NTS rating tool (step 4). Furthermore, they confirmed the feasibility of AS-NTS and concluded that in undergraduates, as well as in the first 2 years of residency in anaesthesiology, the ANTS system is too complex.

Without video recordings it is nearly impossible to complete ANTS because of time constraints. This was already pointed out by the developers [5, 32]. The eight anaesthesiologists found rating the videos very time-consuming and noted that it delayed the feedback loop.

These anaesthesiology trainees decided to continue their postgraduate training curriculum using AS-NTS, rather than ANTS, for their first 2 years of residency.

The results from an additional evaluation questionnaire, completed by 21 anaesthetists who had used the rating tool at least three times in undergraduate medical education, confirmed that the AS-NTS was feasible and practical (Additional file 1: Table S1). Additionally, they rated the importance of each dimension of AS-NTS.

The content validity index for each dimension was calculated, reflecting the proportion of relevance [62]. The calculated content validity index for the first dimension of AS-NTS was 0.9, for the second dimension 0.95 and for the third dimension 0.8. A content validity index of 0.75 or higher is considered “excellent” [33].
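A content validity index of this kind is commonly computed as the proportion of experts who rate an item as relevant. The sketch below illustrates that calculation under assumptions not stated in the text: a four-point relevance scale on which ratings of 3 or 4 count as relevant. The function name and the ratings are hypothetical; only the 0.75 threshold is taken from the text.

```python
def content_validity_index(relevance_ratings, relevant_from=3):
    """Proportion of experts whose rating counts as 'relevant'.

    Assumption: ratings are on a four-point relevance scale and values
    of `relevant_from` or higher are treated as relevant.
    """
    relevant = sum(1 for r in relevance_ratings if r >= relevant_from)
    return relevant / len(relevance_ratings)


# Hypothetical ratings from ten experts (not study data).
ratings = [4, 4, 3, 4, 2, 3, 4, 4, 1, 3]
cvi = content_validity_index(ratings)
is_excellent = cvi >= 0.75  # threshold quoted in the text
print(cvi, is_excellent)
```

With small expert panels a single rating shifts the index noticeably, which is worth keeping in mind when comparing values across dimensions.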

Interrater reliability

The interrater reliability reached high levels of agreement (Table 6), except for dimension two, in the group of 3rd vs. 5th year residents (fair correlation). The ICC indicated a high rater agreement regardless of educational experience, training in anaesthesiology or familiarity with the AS-NTS rating tool.

Table 6 Interrater reliability of the six pairs of raters and of all data (98 rated simulation scenarios)

Discussion

The development of the AS-NTS was performed in a stepwise approach, beginning with a review of pre-existing literature, continuing with focus group analysis and field observation, and ending with implementation and validation.

The steps were processed by means of empirical and qualitative research methods, which have gained a broad application in medical research [63,64,65,66,67,68,69,70,71].

During the field observations some skills proved to be difficult to observe and were excluded from AS-NTS, based on the developmental guidelines for assessment tools described by Abell et al., who recommend that items be excluded if they are not observable in at least 50% of field observations [72].
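The exclusion rule can be expressed as a simple filter over the field observation records. In this sketch the function name, the placeholder skill labels and the observation flags are hypothetical; only the 50% threshold comes from the text.

```python
def observable_skills(observed, min_share=0.5):
    """Keep skills that were observable in at least `min_share` of observations.

    `observed` maps a skill name to a list of booleans, one per field
    observation, where True means the skill could be observed.
    """
    kept = []
    for skill, flags in observed.items():
        share = sum(flags) / len(flags)
        if share >= min_share:
            kept.append(skill)
    return kept


# Hypothetical observation records for three placeholder skills.
observed = {
    "skill A": [True, True, False, True],    # observable in 75%
    "skill B": [True, False, False, False],  # observable in 25%
    "skill C": [True, False, True, False],   # observable in exactly 50%
}
kept = observable_skills(observed)
print(kept)
```

A skill observed in exactly half of the observations is retained, in line with the "at least 50%" wording.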

Nonetheless, the excluded skills are part of most existing NTS taxonomies and regarding the importance of these skills, one might argue that they should still be taught and addressed in undergraduate education.

However, acquiring and refining NTS is an individual and ongoing process [73]. Therefore, in undergraduate training precursors of some NTS should be assessed and evaluated. Further, most of the taxonomies from which the NTS list was extracted were developed for postgraduate training, focussing on specialist level, which means these skills are not directly transferable to undergraduates.

Those skills should be addressed at more advanced educational levels, mostly in postgraduate training. Nevertheless, the aim of the study was to include as many NTS as possible in the AS-NTS, in order to assess them in undergraduates and to provide accurate feedback that enhances the learning process [74]. To this end, skills were redefined during the development of AS-NTS, merging some pre-defined NTS and focusing more on precursors and underlying elements of skills. This adaptation process was not based solely on the expert and focus groups but was supported by the literature, and it resulted in the new dimensions of AS-NTS, specifically designed for undergraduates. The adaptation step was necessary, as some NTS are highly important but not fully developed in undergraduates.

Transferred to the first dimension of AS-NTS, two main dimensions of described NTS (“Decision making” and “Task management”) were merged into the first AS-NTS dimension “Planning tasks, prioritizing and problem solving”.

This might lead to the assumption that a specific assessment of these skills is not possible, as they are assessed in the same dimension of performance, whereas pre-existing rating tools assess them separately.

This objection can be countered by considering the developmental rationale and the existing literature:

“Decision making” is a complex skill which is divided into subskills and rated separately by some behavioral taxonomies [28, 29, 32]. Flowerdew et al. pointed out that it is not only making the decision which is of great importance, but also following up its effects, such as planning and prioritizing tasks to carry out the decision [38]. Here, “Decision making” is directly linked to the dimension “Task management”, which includes the elements “Planning and preparing, prioritizing, providing and maintaining standard, identifying and utilizing resources”. Carrying out the elements of “Task management” follows once “a decision is made”.

The comprehensiveness of “Decision making” and “Task management” regarding the training level of undergraduates was pointed out repeatedly, leading to the exclusion of these dimensions and focusing on some elements of these skills.

The elements of “Decision making” are: Identifying options, balancing risks, selecting options and re-evaluating [7, 20, 32, 56, 75].

Identifying options is necessary to solve a problem. In the decision-making loop, the risks and benefits of the solving strategy are re-evaluated. This concept of decision making applies to scenarios more complex than those in undergraduate simulation training. Therefore, the focus group discussed and agreed to include “problem solving” as a less complex proxy of decision making in the first dimension of AS-NTS.

A strength of this study is that interrater reliability was scrutinized from different viewpoints. It was not defined only by a few designated raters, as in classical approaches. A two-step approach was chosen to analyse rater agreement, simultaneously examining whether personal background (e.g. year of anaesthesiology training or experience in medical education) might influence ratings. First, agreement of rater pairs with a sufficient number of ratings was analysed, excluding agreement by chance; then the full sample was aggregated by anaesthesiology training to calculate the interrater reliability. AS-NTS achieved excellent interrater reliability; only within the group of 5th year vs. 3rd year anaesthesiology residents were the ICC and Cohen’s kappa “good”, and only “fair” for dimension two of AS-NTS.

Data aggregation in the full sample supports the result that rating agreement is independent of anaesthesiology training and experience in medical education, fostering the usability of AS-NTS.

The strong involvement of Rater 1 in the assessment of the interrater reliability might lead to the assumption that one rater could influence all the other raters. According to our results, this is not the case, as data aggregation across all raters, in which Rater 1 is not represented predominantly, showed high agreement on ratings as well.

The dimensions of the final version of AS-NTS achieved excellent content validity indexes according to a guideline for evaluating standardized assessment instruments [35]. However, one weakness of the study is that the content validity index was calculated only from the evaluation of twenty-one anaesthesiologists. Although there is no predefined sample size required to establish content validity [76], the effect of agreement by chance is higher in a small sample.

The AS-NTS has high potential to improve NTS assessment in undergraduate education and ultimately patient safety, because a lack of NTS leads to adverse events in high-risk settings [5]. A recent study by Hagemann et al. showed that NTS in undergraduate students are improved after only one seminar [21]. Due to its good feasibility, the AS-NTS could be applied to all students as a standardised assessment and feedback tool.

Limitations

The AS-NTS has only been tested in German at one institution with a limited number of teachers. Further studies should be conducted to establish the validity, reliability and feasibility of the English version.

Conclusion

AS-NTS provides a structured approach to the assessment of NTS in undergraduates, providing accurate feedback. The findings of usability, validity and reliability indicate that the AS-NTS can be used by anaesthesiologists in different years of postgraduate training, even with little experience in medical education.