General anesthesia (GA) remains a choice for Cesarean delivery (CD) if there is an absolute or relative contraindication to neuraxial anesthesia or when expedited delivery is required for the safety of the mother or the neonate. Emergency CD requiring GA is a unique and challenging situation for the anesthesiologist as it involves two patients at the same time. Even the healthy mother undergoes anatomical and physiological changes of pregnancy, and the fetus may be stressed by innumerable issues. Some of the challenges while managing GA in a parturient include inadequate time for pre-oxygenation, difficult ventilation/intubation and airway management, rapid desaturation, aspiration risk, risk of intraoperative awareness, and uterine atony.1

According to the American Society of Anesthesiologists’ closed claims analysis (1990-2003), the medicolegal liability claims in obstetric anesthesia for maternal death or permanent brain damage remain high, with GA for CD accounting for 28% of total claims. The most common causes for such claims were found to be difficult intubation (25%), inadequate oxygenation and ventilation (4%), aspiration of gastric contents (4%), and airway obstruction (4%).2

In current obstetric practice, however, GA has been largely replaced by regional anesthesia, which has led to a significant decrease in the practical exposure of trainee anesthesiologists to this important procedure and threatens their ability to achieve and maintain optimum skill level and clinical competency.3-5 Factors specific to the obstetric arena, such as extreme time pressure, lack of skilled assistance, anxiety about outcome, and poor team communication, add to the common stressors of an emergency situation.6,7 Complications during such events are often multicausal, and human errors and organizational factors contribute to 50-70% of cases.8

Trainees are required to learn how to provide optimal treatment while ensuring patient safety, and balancing these two needs is an important part of residency training. Simulation-based learning can help mitigate this tension by developing trainees’ knowledge, skills, and attitudes while protecting patients from unnecessary risk. Recent studies in medicine have suggested simulation as one of the most effective methods for formative and summative assessment of participants.6,9,10 Furthermore, the use of simulation training in obstetric anesthesia has been endorsed in the 2003-2005 and 2006-2008 triennial reports of the Confidential Enquiry into Maternal and Child Health (CEMCH)11,12 as a method of improving performance in the management of life-threatening emergencies, especially for junior staff. The reports also suggested that it should be incorporated as part of a nationally accredited scheme.

The objective of this study was to assess the influence of a teaching plan consisting of didactic teaching and repeated simulations on the performance of anesthesia residents in the management of GA for emergency CD. Further, we also aimed to identify areas of consistently poor performance so that more emphasis can be placed on these areas during teaching in our curriculum. We hypothesized that structured teaching and at least one high-fidelity simulation session after two months would be required to see improvement in the skills of the residents in the management of GA for emergent CD.

Methods

Design and recruitment of participants

Institutional Research Ethics Board approval (09-0108E) was obtained for this quasi-experimental study, which was conducted at Mount Sinai Hospital, Toronto from October 2009 to January 2011. Recruitment criteria included anesthesia residents in postgraduate years 2 (PGY2) and 3 (PGY3) at the University of Toronto who had already completed their dedicated three months of obstetric anesthesia training. All residents signed a written informed consent for participation in the study and for the videotaping and assessment of their performance. They also signed a confidentiality agreement not to disclose study details to anyone outside of the study personnel. The study structure consisted of a didactic teaching session followed by a simulation session after one week and a repeat simulation session two months later.

Didactic teaching

An interactive didactic PowerPoint©-based teaching session was held for all participants by the principal investigator. The presentation included information from standard anesthesia textbooks,13 literature review,14-16 and current guidelines (Electronic Supplementary Material).17,18 The teaching included a comprehensive review of the physiology of pregnancy and management of GA for emergency CD with focus on the following points: preoperative assessment, including airway examination; checking equipment; preparation before induction; seeking assistance; aspiration prophylaxis; appropriate medications for induction and maintenance of anesthesia; monitoring standards; and postoperative recovery. The two-hour session generated extensive discussions on practical aspects of case management.

Simulation sessions

The simulation sessions were conducted at the SimSinai Centre at Mount Sinai Hospital. The simulation centre was set up as a virtual obstetric operating room equipped with a Laerdal SimMan® simulator mannequin (Laerdal Medical Canada Ltd, Toronto, ON, Canada) with programmable monitors (Laerdal Medical Canada Ltd., Toronto, ON, Canada), anesthesia machine (Datex Corporation, St. Laurent, QC, Canada), anesthesia drug cart and airway equipment, along with the instruments needed to perform a CD. An abdominal flap was fitted over the mannequin’s abdomen to mimic pregnancy. The simulation programmer was seated outside the simulation room, with a view into the room, to input commands and manage the mechanical interfaces. The mannequin’s voice was delivered through a speaker inside the mannequin, fed by a microphone at the programmer’s console. Two-way communication was possible between the simulated operating room and the programmer’s console. A video camera in the simulated operating room recorded the events with the vital signs superimposed on the image, and these recordings were used for subsequent evaluation by the raters. The participants were unaware of the nature of the simulation, the details of the scenario, or the assessment criteria.

The simulations were performed with a common team of actors in the roles of obstetrician, nurse, and respiratory therapist. The programmer and actors were given guidelines on specific verbal responses to provide during the simulations, either voluntarily or in response to a participant’s question. Participants were specifically instructed to perform all actions using the appropriate equipment as in a real-life case and to verbalize their thoughts to enable appropriate assessment. Before the simulation sessions, each resident underwent an orientation to the mock labour and delivery operating room, mannequin and monitors, as well as the method of debriefing and evaluation.

Two identical simulation sessions requiring GA for emergency CD were conducted two months apart. The first simulation session was held one week after the teaching session.

The simulation scenario involved a woman at 38 wk of gestation presenting to the operating room with a prolapsed umbilical cord. The obstetrician was unable to elevate the presenting part on vaginal examination. There was an associated sustained fetal bradycardia (50 beats·min⁻¹), and a stat CD was required.

Five minutes prior to case presentation, the anesthesia resident was asked to set up the operating room, similar to the practice at the change of shift at our hospital. During simulation, the programmer was instructed to alter the vital signs according to a flow sheet that indicated the suggested changes based on the type and doses of drugs administered.

After the first simulation session, the study investigator gave participants focused feedback highlighting the critical errors. The participants were asked not to participate in any other simulation sessions between the two study sessions. The debriefing after the second session involved interactive oral feedback, a constructive critique, and the opportunity for participants to provide the study investigators with feedback on their experience.

Development of technical skills checklist

To facilitate the evaluation process, the study investigators developed a checklist that was based on the literature review, previous guidelines, and common practice at our institution (Appendix)1,13-19 and designed to reflect the specific tasks necessary for the appropriate performance of GA for emergency CD. The proposed checklist was delivered electronically to eight members of the Canadian Anesthesiologists’ Society who are experts in obstetric anesthesia and represent several geographical areas of Canada. The experts were asked to validate the checklist using the modified Delphi technique.20 The experts, six of whom participated in the entire process, were blinded to each other’s identities. They weighted each checklist item on a scale of 0-4 according to its importance (where 0 = not important and 4 = extremely important). The items with a “0” rating were removed from the checklist. After each round, each expert was given the opportunity to change their rating (based on the median score for that item), to suggest the elimination or addition of items on the list, and to make pertinent comments. Consensus on the list of items was reached in the second round, and the median weights of the items were finalized for scoring (Appendix).
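The weighting step of the Delphi process described above can be illustrated with a minimal sketch. The ratings below are invented for illustration, and the rule of dropping an item when its median rating is 0 is an assumption about how the “0” rating criterion was applied; the actual items and weights are those listed in the Appendix.

```python
from statistics import median

# Hypothetical second-round ratings (0-4) from six experts. The item names
# echo tasks mentioned in the text, but every number here is invented.
ratings = {
    "airway assessment": [4, 4, 3, 4, 4, 4],
    "left uterine displacement": [4, 3, 4, 4, 4, 3],
    "hypothetical low-value item": [0, 0, 1, 0, 0, 0],
}

def finalize_weights(ratings):
    """Return the median weight per item, dropping items whose median is 0.

    Dropping on a median of 0 is an assumption made for this sketch.
    """
    weights = {item: median(scores) for item, scores in ratings.items()}
    return {item: w for item, w in weights.items() if w > 0}

weights = finalize_weights(ratings)
```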

Evaluation of technical skills

Two raters, experts in obstetric anesthesia and simulation assessment, independently evaluated the technical and non-technical skills of the participants using video recordings. The raters received videos in a random order and were blinded to the residents’ training level, simulation session, and each other’s scores. The technical skills were assessed using our checklist generated via the Delphi technique. The participants scored “1” if the task was completed as per expectations and “0” if not done at all, done after prompting, not done timely, or done with error. Each task score was multiplied by its weighting factor, and the weighted scores were summed to provide a total score. The final score used for the analysis was the percent of the maximum possible score.
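The scoring rule described above (binary task score, multiplied by its weight, summed, and expressed as a percentage of the maximum) can be sketched as follows. The task names and weights are hypothetical placeholders, not the values from the Appendix.

```python
def weighted_percent_score(completed, weights):
    """Percent of the maximum weighted checklist score.

    completed: task -> 1 if done as expected, else 0 (not done, prompted,
               late, or done with error).
    weights:   task -> median expert weight for that task.
    """
    earned = sum(weights[task] * completed.get(task, 0) for task in weights)
    maximum = sum(weights.values())
    return 100.0 * earned / maximum

# Hypothetical weights and one hypothetical resident's performance.
weights = {
    "airway assessment": 4,
    "left uterine displacement": 4,
    "auscultation for bilateral breath sounds": 4,
    "hypothetical lower-weight task": 2,
}
completed = {
    "airway assessment": 1,
    "left uterine displacement": 0,
    "auscultation for bilateral breath sounds": 1,
    "hypothetical lower-weight task": 1,
}
score = weighted_percent_score(completed, weights)
```

In this invented example the resident earns 10 of 14 weighted points, so missing a single item weighted 4 costs more than missing one weighted 2.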

Evaluation of non-technical skills

The non-technical skills were assessed using a previously validated Anaesthetists’ Non-Technical Skills (ANTS) checklist21 that included such items as task management, teamwork, situation awareness, and decision-making, which were each rated on an anchored ordinal scale (0 = not observed; 1 = poor; 2 = marginal; 3 = acceptable; and 4 = good). The raters had received prior instruction in the use of ANTS and were provided with the ANTS background literature with the User Manual (Footnote 1). Before independently rating the study videos, the reviewers viewed videos that did not include the study participants and discussed their individual ratings to produce an agreed rating. This exercise provided the opportunity for developing a clearer definition of the scoring rubric for each task.

Our primary objective was to assess the influence of a teaching plan on residents’ performance in managing GA for CD over time, as measured by the checklist and ANTS scores. The secondary objectives were to compare performance based on level of residency, to identify common critical errors, and to receive participants’ feedback.

Statistical analysis

The checklist scores were reported as mean (SD) in percentages, while ANTS scores were presented on the four-point scale described above. Paired Student’s t tests were used to identify significant differences in scores over time. To check whether completion of individual tasks changed over time, the proportion of participants completing each task in session 1 vs session 2 was compared using McNemar’s test. All reported P values are two-sided. For each simulation session, we calculated Spearman’s rank correlation between the checklist and ANTS total scores. Inter-rater reliability was summarized with the intraclass correlation coefficient (ICC(1,2)) for consistency determined from a two-way random effects model. For the purposes of calculating inter-rater reliability, session 1 and 2 scores were averaged. All analyses were completed using SAS® version 9.2 (Cary, NC, USA).
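As an illustration of the task-level comparison, the sketch below computes a two-sided exact (binomial) McNemar P value from paired session 1 and session 2 outcomes. The text does not state whether the exact or the chi-square form of the test was used, so the exact form is an assumption, and the paired data are invented.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar P value from the discordant pair counts.

    b = task done in session 1 only, c = task done in session 2 only.
    Concordant pairs do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical (session 1, session 2) outcomes for one task in 19 residents.
pairs = [(0, 1)] * 9 + [(1, 1)] * 6 + [(0, 0)] * 3 + [(1, 0)] * 1
b = sum(1 for s1, s2 in pairs if s1 == 1 and s2 == 0)
c = sum(1 for s1, s2 in pairs if s1 == 0 and s2 == 1)
p_value = mcnemar_exact(b, c)
```

Only the residents whose performance changed between sessions carry information here, which is why the test suits a small paired cohort.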

Sample size for Student’s t test for paired samples was calculated using the methods outlined by O’Brien and Muller22 which incorporate the within-subject correlation. Assuming a two-sided type I error of 0.05, a type II error of 0.20, and a conservative estimate of within-subject correlation (i.e., correlation between pre and post measures) of 0.60, we determined that 20 participants with pre and post measurements would be required to identify a moderate effect size (i.e., Cohen’s d = 0.60).
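The sample size reasoning can be approximated with a short calculation. This sketch uses a normal approximation with a small-sample correction term rather than the noncentral t method of O’Brien and Muller, and it assumes that Cohen’s d is defined on the raw score SD with equal SDs before and after, so that the within-subject correlation shrinks the SD of the paired differences. Under those assumptions it yields 20 participants.

```python
from math import ceil, sqrt
from statistics import NormalDist

def paired_t_sample_size(d, rho, alpha=0.05, power=0.80):
    """Approximate n for a two-sided paired t test.

    d:   Cohen's d on the raw SD scale (assumed equal SD pre and post).
    rho: within-subject correlation between pre and post measures.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Effect size on the scale of the paired differences.
    d_z = d / sqrt(2 * (1 - rho))
    # Normal approximation plus a correction for using t instead of z.
    n = ((z_alpha + z_beta) / d_z) ** 2 + z_alpha ** 2 / 2
    return ceil(n)

n_required = paired_t_sample_size(d=0.60, rho=0.60)
```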

Results

Twenty-one of the 40 anesthesia residents invited to enrol agreed to participate in the study. All participating residents attended the didactic teaching and both simulation sessions, except for two who did not attend the second simulation. Of the 19 residents who completed the entire study, 12 were PGY2 and 7 were PGY3 residents. The mean (SD) age of the participants was 29 (3) yr. They had completed a mean (SD) of 5 (1.5) months of rotation in obstetric anesthesia prior to their participation in the study. At our university, all residents complete anesthesia rotations in hospitals with an obstetric specialty: a two-month rotation in their first year and four to eight months during their second year. Their posting to obstetric anesthesia typically consists of 12-16 shifts (9-10 hr/shift) during their first year and an average of 40-58 shifts (9 hr/day and 15 hr/night) during their second year of residency. All residents had received some previous simulator training using the SimMan mannequin during the residency program. Sixteen participants had attended at least one case of GA for CD with the attending staff present throughout the case, and the median (range) exposure was 2 (0-5) cases.

Checklist scores

The mean (SD) weighted checklist score of the participants was 64.5% (7.1%) in session 1 and increased to 76.7% (6.7%) in session 2 (difference = 12.2%; 95% confidence interval [CI] 8.5 to 16.0; P < 0.001) (Table 1, Fig. 1). The differences in scores between PGY levels were small for both sessions, and the improvement between sessions was similar (Table 2).

Fig. 1 Box plots showing the distribution of technical skills scores of the residents. PGY = postgraduate year. The Y axis represents average scores in percentages

Table 1 Checklist scores in both simulation sessions
Table 2 Checklist and Anaesthetists’ Non-Technical Skills scores by residency level

Anaesthetists’ Non-Technical Skills scores

The overall mean (SD) ANTS score was 3.0 (0.4). The score improved significantly over time from 2.8 (0.5) in session 1 to 3.3 (0.4) in session 2 (P = 0.001), suggesting improvement from the acceptable to the good range (Table 3). There was no difference in the ANTS scores between PGY2 and PGY3 residents, with both groups demonstrating significant improvement in the second simulation session (Table 2, Fig. 2).

Table 3 Anaesthetists’ Non-Technical Skills scores in all participants
Fig. 2 Box plots showing the distribution of Anaesthetists’ Non-Technical Skills scores of the residents. PGY = postgraduate year. ANTS = Anaesthetists’ Non-Technical Skills. The Y axis represents the Anaesthetists’ Non-Technical Skills rating of 1-4, where 1 = poor; 2 = marginal; 3 = acceptable; 4 = good

Common performance errors

Many of the tasks that the experts considered highly important (i.e., given a score of 4) were completed by fewer than 50% of the participants, especially in the first session. These tasks included: airway assessment, checking availability of airway equipment and emergency drugs, confirming left uterine displacement, positioning the patient’s head properly for tracheal intubation, auscultation of the chest for bilateral breath sounds, asking the assistant to release the cricoid pressure after intubation, and providing adequate preemptive analgesia to ablate intraoperative sympathetic responses and prepare the patient for postoperative pain (Table 4). More participants performed these tasks in the second simulation session.

Table 4 Number of individual tasks performed by the participants

Correlations and reliability

There was a moderately high correlation between the overall checklist and ANTS scores (correlation coefficient, r = 0.7; 95% CI 0.37 to 0.87; P < 0.001) and in the individual simulation sessions (session 1: r = 0.7; 95% CI 0.38 to 0.87; P < 0.001; session 2: r = 0.65; 95% CI 0.26 to 0.85; P = 0.003). The correlations between the subscales of ANTS were also high and statistically significant in both sessions. For example, in session 1, those with high decision-making scores also had high scores in task management (r = 0.89), teamwork (r = 0.78), and situation awareness (r = 0.90); all correlations were statistically significant (P < 0.001). The inter-rater reliability was high, with an overall ICC for checklist scores of 0.72 (95% CI 0.62 to 0.81) and an overall ICC for ANTS of 0.74 (95% CI 0.49 to 0.89).
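For readers who want to see how confidence intervals of this kind arise, the sketch below applies the Fisher z transformation to a correlation coefficient. The method actually used to obtain the reported intervals is not stated, so this is an assumption. With r = 0.7 and n = 19 it gives an interval of roughly 0.36 to 0.88, close to the reported 0.37 to 0.87; small differences are expected because r is rounded and other interval methods exist for Spearman’s coefficient.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist

def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for a correlation via Fisher's z.

    Uses the plain 1/sqrt(n - 3) standard error, which is an approximation
    when applied to Spearman's rank correlation.
    """
    z = atanh(r)
    se = 1 / sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return tanh(z - crit * se), tanh(z + crit * se)

# Rounded coefficient from the text; n = 19 residents completed both sessions.
lower, upper = fisher_ci(0.7, 19)
```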

Participants’ feedback

On average, the participants rated the setup and realism of the scenarios as 3.4 (0.6) and debriefing as 3.9 (0.2) on a four-point scale (1-4; 1 = poor and 4 = excellent). All participants mentioned that all components of the scenario were easy to understand and strongly indicated the need for simulation training. All stated that this experience would help them in clinical management. Prior to their first simulation session, the comfort level of performing the case in a clinical setting was rated as poor, marginal, satisfactory, and good by 33%, 25%, 42%, and 0% of participants, respectively; however, after the second session, the comfort level was rated good by 100% of participants.

Discussion

Our study shows that participants’ technical and non-technical skills improved with successive simulation sessions, suggesting that didactic teaching alone may not be sufficient. The addition of hands-on practice with simulation and feedback during debriefing is beneficial in reinforcing the skills and enhancing performance. This also suggests a role for experiential learning with repetition of similar scenarios at short intervals.

In our study, despite prior intensive interactive teaching, the scores in the first simulation session were not remarkable. Although the participants were unaware of the assessment criteria, all the items on the checklist were extensively discussed during the teaching session. This implies that trainees may not be able to retain all the knowledge gained during didactic teaching and/or may not be able to put that knowledge into practice. Poor scores could also be related to residents’ limited clinical exposure to such cases and their awareness of being videotaped and evaluated in the simulation environment.

Significant improvement in the second session suggests that the skills in the management of this scenario do not deteriorate after an interval of two months. Practical experience gained through simulation and focused debriefing during the first session may have reinforced the management decisions of the residents and led to an improvement in performance in the second simulation session. There was a potential for the third-year trainees to outperform the second-year trainees; however, we found no difference in their technical or non-technical skills during any of the simulation sessions. This could be because the residents in their third year rotate through internal medicine with no exposure to anesthesia or obstetric specialties.

Many variables have been shown to affect retention of skills and could be difficult to isolate in real-life or simulated scenarios.23 Factors such as hands-on practice, simplicity of instruction, multimedia presentations, and feedback from instructors have been shown to have a positive effect on skills retention. We have observed that a structured teaching plan that includes more than one or all of these factors is required to achieve an effective outcome.

It is necessary to define protocols for uniformity and standardization of practice, especially in critical scenarios that are not routinely encountered. Nevertheless, adherence to standard protocols or guidelines while managing crisis situations varies with the type of emergency and the time interval after training.24,25 It has been shown that about 50% of anesthesiologists deviate from in-house protocols when dealing with failed intubation in obstetrics.24 Studies in both basic and advanced cardiac life support show short retention times and linear degradation of skills with time.23,26 Any skill, including one learned in a simulation setting, is likely to follow a similar course, and the optimal interval for reinforcement of such training remains to be determined.

In their simulation study on failed intubation in obstetrics, Goodwin et al. 27 demonstrated several deficits in the performance of anesthesia trainees. These, however, improved significantly after practice and formal teaching, with a greater adherence to the protocol on the second occasion. Scavone et al.28 studied the performance of trainees in the simulated management of GA for emergency CD and found higher scores in those trained on a patient simulator with this scenario as compared with those trained on GA for a non-obstetric scenario. Our study further reinforces the importance of hands-on practice in enhancing not only the procedural skills but also the non-technical skills, which the participants would possibly have learned in an implicit way during the first simulation session. There was a higher adherence to the institutional protocol for GA for CD in the second simulation session. This could be related to the shorter intervals between the teaching session and first simulation session, and the first and second simulation sessions, in addition to the feedback from the first simulation session. More simulation sessions at various intervals may be necessary until the participants demonstrate the best possible performance.

Several areas of weakness were identified in the first session. Lack of appropriate preoperative assessment was observed; this has been shown to be a contributory factor in anesthesia-related mortality.29 Despite being given enough time to check the availability of equipment and emergency drugs, many residents failed to accomplish this task. A previous study published by our group on unanticipated difficult airway management in obstetrics revealed a concerning observation: study participants did not call for help or request a difficult airway cart, and they made infrequent use of a laryngeal mask airway device.30 Although the present scenario did not involve difficult intubation or ventilation, the importance of having equipment available for an unanticipated difficult intubation was strongly emphasized in the teaching session. Very few residents ensured left uterine displacement of the patient. It is likely that residents assumed that the patient was already positioned with left uterine displacement, considering that this task is performed most often by the nurses. Asking for timely release of cricoid pressure was often forgotten. After intubation, it is crucial to confirm endotracheal tube placement by auscultation of the chest for bilateral breath sounds. We noticed a threefold improvement in this task in the second session. Frequent findings in our study were inadequate anesthesia prior to delivery and inadequate preemptive analgesia after delivery. These were likely due to fear of neonatal depression and the short length of the procedure, respectively. Focusing on these common errors in classroom teaching could lead to further improvement in case management.

The American Joint Commission root cause information (2004-2012) recently revealed that communication errors were responsible for greater than 50% of overall maternal sentinel events.31 To avoid such major adverse events, it is important to train the residents in their non-technical skills, such as communication, cooperation, leadership, etc., which are promising tools to foster teamwork. Although we did not provide any explicit training on non-technical skills during the teaching session, we noticed a significant improvement in their ANTS scores in the second simulation. A good correlation between our objective checklist and the ANTS scale indicates that both explicit and implicit processes play a complementary role for better overall performance. Furthermore, a strong correlation between the individual subscales of the non-technical skills indicates a high interdependency of these items for overall improvement.

Our procedural checklist was predominantly tailored to the practice at our hospital and was validated by experts from Canada. Compared with the checklist by Scavone et al.,19 our list has some additional items, including a section on recovery, and the experts have given different weight to the items. Bould et al.32 suggested that a different group of experts may not agree on each point on the checklist, thus necessitating the development and validation of additional checklists for assessing skills for similar procedures. An additional advantage of this process is that the checklist will have an intrinsic validity of content if it is well constructed, ensuring that it incorporates and examines what is taught in that institution.32 Our checklist does show good validity and reliability of content, and the scores of our participants are comparable with those in the study by Scavone et al.

All participants strongly indicated the need for high-fidelity simulation training on GA for CD and agreed that it is a useful method of testing and enhancing their knowledge. Given our results, we look forward to incorporating formal teaching followed by a simulation session into our own residents’ program of study in the future.

There are some limitations to our study, including a small sample size, as only about half of the invited residents agreed to participate. However, in our view, they represent the typical cohort of our residents. In the absence of previous research on the effect of successive simulations and with our observation of improvement in performance, especially of non-technical skills, we think this study could form a foundation for further investigations in this area. A further limitation was that our raters were recruited from faculty at the institutions attended by the residents; hence, it is possible that the raters could identify some of the subjects. Nevertheless, they were blinded to the residents’ training level and simulation session (first or second), and all videos were allocated in a random order. Since the participants performed similar scenarios in both sessions with the same actors, it is unlikely that the reviewers had any potential bias in the assessment.

Another limitation of our study is the lack of a control group without didactic teaching or a group with didactic teaching but without a first simulation session. Since our study included junior residents with limited exposure to such cases, it was decided to provide them with uniform baseline information to ensure a better learning experience. Furthermore, teaching is a standard practice in most institutions and not teaching some residents would have deprived them of the unique learning experience. Skipping the first simulation session would perhaps have led to even worse scores in the simulation at two months due to a longer interval between teaching and simulation. Nevertheless, the residents served as their own controls for tracking their performance over time through repeated simulations. The findings of our study may have clinical significance since our specific teaching plan improved scoring as measured by our checklist.

For any form of simulation, realism or authenticity of the experience is important for participants. Although the mannequin used in our study was human-like and had physiologic responses consistent with the clinical situation, it cannot be mistaken for a live person. One of the common concerns is that simulation is obviously an “artificial environment”, and therefore, the amount of stress created in this setting could be far less than that in the real situation, as the performer knows well that life is not at risk. This may have been the reason for some of the common failures and errors seen in our study. It may be argued, however, that this provides learners with an opportunity to learn from their own mistakes.

As with any study using high-fidelity simulation as an evaluative tool, any findings should ideally be confirmed by evaluating performance in real situations. Unfortunately, the infrequent and unpredictable nature of emergency CD makes it challenging to evaluate such events in anesthesiology. Periodic reinforcement of skills in a simulation environment may help with their translation to real-life scenarios.

We suggest that use of simulation for obstetric anesthesia has the potential to decrease the number and effects of medical errors, facilitate open discussions in training situations, and enhance patient safety. By accepting simulation as a standard of training and certification, health systems will be viewed as more accountable and ethical by the populations they serve. Such simulations can facilitate the objective assessment of the residents, identify gaps in teaching, and ultimately lead to improved clinical practice.