The Fundamentals of Endoscopic Surgery (FES) program, created by the Society of American Gastrointestinal and Endoscopic Surgeons (SAGES), is an educational and assessment tool of knowledge and skill in flexible gastrointestinal endoscopy. The exam is a validated assessment tool that consists of both a cognitive knowledge and manual skills assessment, the latter of which involves five tasks assessing the following skills: scope navigation, loop reduction, mucosal inspection, retroflexion, and targeting [1]. The skills portion of the exam is hosted on the 3D Systems (formerly Simbionix) GI Mentor endoscopic simulator, a virtual reality (VR) endoscopy simulator that has been the subject of multiple studies assessing construct validity [2,3,4,5]. Virtual reality simulators allow for the utilization of proficiency-based curricula, which require the identification of construct valid tasks and performance-based rather than time- or repetition-based criteria for completion [6]. A curriculum for attainment of proficient performance accounts for different rates of learning in individuals and ensures that trainees are truly acquiring an acceptable level of skill prior to performing procedures [6].

Beginning in 2018, all surgical residents are required to pass the FES examination to graduate; thus, there is a need for an effective curriculum to prepare trainees to pass this exam. The urgency for an effective endoscopy curriculum has been heightened by work that demonstrated poor baseline pass rates among surgical trainees, suggesting that standard endoscopy experience is insufficient [7]. Fortunately, a few proficiency-based curricula have been developed to help trainees prepare for the FES manual skills examination [8,9,10,11]. Despite improved FES exam pass rates, no studies have assessed the impact of a VR training curriculum on trainees’ endoscopic skills in a live animal model or in patients.

Simulation-based endoscopy training has been shown to improve performance on a porcine model and studies have shown a correlation between live colonoscopy and FES manual skills assessment performance [12, 13]. The ultimate goal of the FES curriculum and endoscopic training is to prepare surgeons to perform endoscopy on real patients, and thus, understanding the optimal VR training preparation for colonoscopy in a live animal model is relevant for all general surgery training programs. The purpose of this study is to evaluate the transfer of VR endoscopy training to live porcine colonoscopy and to compare the relative effectiveness of two different VR training paradigms in preparing for live colonoscopy: proficiency-based training vs repetition-based training.

Materials and methods

Participants

Novice endoscopists, defined as trainees who had completed fewer than 10 colonoscopies, were recruited via email from a single large academic surgical residency between January 2020 and June 2021. At our institution, residents complete a dedicated endoscopy rotation during their second postgraduate year; therefore, recruitment was limited to surgical interns (categorical general surgery and general surgery preliminary interns) whose rotations do not generally include colonoscopies. Participants were excluded if they had previously participated in a simulation-based endoscopy training curriculum or had previously taken the FES examination.

Instrumentation

The GI Mentor (3D Systems, Rock Hill, SC) is a virtual reality endoscopic simulator that is designed to assist in the teaching and practice of both upper and lower gastrointestinal endoscopy. This platform was chosen by SAGES for the administration of the FES examination and as a result, was selected for use in this study [1].

Study design

Our Institutional Review Board (IRB) exempted this protocol from further review with regard to human subject protections (IRB Protocol: 2015P000522-AME4) as an amendment and extension of our research team’s prior work on FES VR curricula [10]. The Partners Healthcare Institutional Animal Care and Use Committee (IACUC) approved the porcine endoscopy protocol for use of swine in surgical training (IACUC Protocol #2019N000140).

This study employed a randomized controlled design. Participants completed baseline demographics surveys and were provided with a one-on-one familiarization session with the simulator by an experienced operator, including a full demonstration and an opportunity to ask questions. All participants underwent pre-testing by completing the FES manual skills examination and performing a porcine endoscopy and colonoscopy assessed with the Global Assessment of GI Endoscopic Skills (GAGES) scoring system, a tool that has been shown to have validity evidence for technical skills in flexible endoscopy (Appendix A and B) [14]. Healthy male Yorkshire pigs (30–40 kg) underwent a bowel preparation consisting of a clear liquid diet and polyethylene glycol starting the day prior to the procedure. Endoscopy was performed in the left lateral decubitus position under general anesthesia, and animals were euthanized at the conclusion of the experiment. A single channel colonoscope (KARL STORZ Endoscopy-America, USA) was used for both the upper and lower endoscopy.

Participants were then randomized in modified matched pairs to one of two VR curricula: proficiency-based training or repetition-practice based training. The proficiency group completed practice GI Mentor VR tasks with expert level benchmarks as described in Hashimoto et al. (Appendix C) [10]. Each participant in the proficiency group was required to meet the benchmarks for each task, as determined by expert performance, on two consecutive occasions. In comparison, the repetition group completed the same tasks as the proficiency group but for a set number of repetitions (10), regardless of performance quality or benchmark achieved. For each group, a FES-certified instructor was available to provide coaching during the first repetition of each task. Subsequent repetitions in the two groups were self-directed by the participant with feedback provided by the simulator (e.g., total time, proficiency achievement) after each task completion.
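The two completion rules can be made concrete with a short sketch. It is illustrative only: it assumes each task has a single time-based benchmark (the actual benchmarks in Appendix C may involve other metrics), and the function names, parameter names, and sample times are hypothetical.

```python
from typing import Optional, Sequence


def attempts_to_proficiency(
    times_s: Sequence[float],
    benchmark_s: float,
    consecutive: int = 2,
) -> Optional[int]:
    """Return the 1-based attempt at which the benchmark has been met on
    `consecutive` attempts in a row, or None if that never happens.

    Assumption: an attempt "meets the benchmark" when its completion time
    is at or below `benchmark_s`.
    """
    streak = 0
    for attempt, elapsed in enumerate(times_s, start=1):
        # A miss resets the streak; the criterion requires back-to-back passes.
        streak = streak + 1 if elapsed <= benchmark_s else 0
        if streak >= consecutive:
            return attempt
    return None


def repetition_complete(times_s: Sequence[float], required: int = 10) -> bool:
    """Repetition rule: done after a fixed number of attempts, whatever the quality."""
    return len(times_s) >= required


# Hypothetical completion times (seconds) for one task with a 100 s benchmark.
times = [120, 95, 110, 90, 88]
done_at = attempts_to_proficiency(times, benchmark_s=100)
```

Under the proficiency rule, this hypothetical trainee would finish the task on the fifth attempt, whereas the repetition rule would require five further attempts.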

All participants underwent a post-test consisting of a repeat FES manual skills exam and porcine endoscopy and colonoscopy with GAGES assessment. Faculty performing the GAGES assessment were expert endoscopists and were blinded to the study arm to which each participant had been assigned. Participants completed repeat demographic surveys indicating how many endoscopies and colonoscopies they performed in clinical training during the study period.

Assessment

Performance on porcine endoscopy and colonoscopy was measured with the GAGES scoring system, a checklist with high inter-rater reliability adopted by the American Board of Surgery to assess endoscopic skills in both upper and lower endoscopy [14]. The endoscopy and colonoscopy GAGES checklists each comprise 5 items with 5-point behaviorally anchored ratings, assessing skills such as scope navigation and instrumentation. The instrumentation portion of the GAGES checklists was excluded from this study, so participants were evaluated on the remaining four items with a maximum possible score of 20. Porcine endoscopy and colonoscopy were assessed by FES-certified surgeons who were blinded to each participant’s assigned study arm.
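As a rough illustration of the scoring arithmetic, the sketch below sums the item ratings after dropping instrumentation. It assumes each item is rated as an integer from 1 to 5; the item keys are placeholders rather than the official GAGES wording, and the function name is hypothetical.

```python
from typing import FrozenSet, Mapping


def gages_total(
    ratings: Mapping[str, int],
    excluded: FrozenSet[str] = frozenset({"instrumentation"}),
) -> int:
    """Sum the ratings of all items that are not excluded.

    Assumption: every scored item is an integer rating from 1 to 5.
    """
    total = 0
    for item, rating in ratings.items():
        if item in excluded:
            continue  # the instrumentation item was not scored in this study
        if not 1 <= rating <= 5:
            raise ValueError(f"rating for {item!r} must be between 1 and 5")
        total += rating
    return total


# Placeholder item names; only "instrumentation" is named in the text above.
example = {
    "item_1": 4,
    "item_2": 3,
    "item_3": 5,
    "item_4": 4,
    "instrumentation": 2,
}
score = gages_total(example)
```

With four scored items, the highest attainable total is 4 × 5 = 20, which matches the maximum stated above.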

Data analysis

We assessed for balanced randomization based on pre-test performance and baseline participant characteristics. Participant demographics, FES-scaled scores, and pre-test GAGES scores were compared between the two training arms using unpaired Student’s t-tests, chi-square tests, and Wilcoxon rank-sum tests. The two curriculum cohorts were compared on VR simulator use data using unpaired Student’s t-tests. To determine the potential transfer of skills attained during VR training to live animal colonoscopy, pre- to post-test differences within the two cohorts on both FES-scaled scores and GAGES performance on porcine endoscopy and colonoscopy were assessed using paired Student’s t-tests. After confirming the parallel slopes assumption, the post-test scores of the proficiency- and repetition-trained participants were then compared using analysis of covariance (ANCOVA). Statistical analysis was performed using Stata (StataCorp. 2019. Stata Statistical Software: Release 16. College Station, TX: StataCorp LLC), and statistical significance was set at p < 0.05 for all tests.
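For readers less familiar with ANCOVA, the following sketch shows how a baseline-adjusted group difference is obtained under the parallel slopes assumption: a common within-group slope of post-test on pre-test is estimated, and the raw difference in post-test means is corrected for the difference in pre-test means. It is not the Stata analysis used in this study; the function name and the toy data are hypothetical, and no significance test is computed.

```python
from statistics import fmean
from typing import Sequence, Tuple


def _within_group(pre: Sequence[float], post: Sequence[float]) -> Tuple[float, float, float, float]:
    """Return (mean pre, mean post, Sxy, Sxx) for one group."""
    mean_pre, mean_post = fmean(pre), fmean(post)
    sxy = sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post))
    sxx = sum((x - mean_pre) ** 2 for x in pre)
    return mean_pre, mean_post, sxy, sxx


def adjusted_difference(
    pre_a: Sequence[float],
    post_a: Sequence[float],
    pre_b: Sequence[float],
    post_b: Sequence[float],
) -> float:
    """Baseline-adjusted difference in post-test means (group A minus group B).

    Assumes a single slope shared by both groups (parallel slopes).
    """
    mean_pre_a, mean_post_a, sxy_a, sxx_a = _within_group(pre_a, post_a)
    mean_pre_b, mean_post_b, sxy_b, sxx_b = _within_group(pre_b, post_b)
    # Pooled within-group slope of post-test on pre-test.
    slope = (sxy_a + sxy_b) / (sxx_a + sxx_b)
    # Remove the part of the raw difference explained by baseline imbalance.
    return (mean_post_a - mean_post_b) - slope * (mean_pre_a - mean_pre_b)


# Toy data: both groups gain 2 points per baseline point; group B sits 1 point higher.
diff = adjusted_difference([1, 2, 3], [2, 4, 6], [2, 3, 4], [5, 7, 9])
```

In the toy data the raw post-test means differ by 3 points, but two of those points are explained by group B's higher baseline, leaving an adjusted difference of 1 point in favor of group B.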

Results

Participants

Twenty-four (n = 24) general surgery interns were recruited to participate in the study across two academic years (2019–2020, 2020–2021). There were two dropouts during the study, one from the repetition group and one from the proficiency group. In total, eleven (n = 11) inexperienced endoscopists were randomized into the repetition group and eleven (n = 11) into the proficiency group. There were no significant differences between the repetition and proficiency groups in sex, age, glove size, handedness, or clinical experience with endoscopy or colonoscopy at time of enrollment (Table 1). There was no difference in mean baseline scaled performance on FES manual skills pre-testing (proficiency = 253.5, SD 115.7, repetition = 288.2, SD 96.5; p = 0.45). There was also no difference in porcine endoscopy and colonoscopy GAGES scores between the two groups (Endoscopy: proficiency = 10.2, SD 2.7, repetition = 11.1, SD 2.8, p = 0.92; Colonoscopy: proficiency = 10.5, SD 2.9, repetition = 10.3, SD 2.6, p = 0.96).

Table 1 Comparison of the demographics of novice endoscopists participating in the proficiency vs. repetition curricula

Performance

The repetition group spent an average of 242.2 (SD 48.6) minutes on the VR simulator compared to the proficiency group’s average of 170.0 (SD 66.3) minutes (p = 0.013; Table 2). The proficiency group spent significantly less time on the Endoscopic Navigation task (proficiency = 43.1 min, SD 25.0, repetition = 70.6 min, SD 17.0; p = 0.010) and the Advanced Mucosal Evaluation task (proficiency = 43.5 min, SD 20.9, repetition = 68.5 min, SD 22.4; p = 0.019). There was no difference in time spent on completion of the upper endoscopy bleeding modules or the colonoscopy modules.

Table 2 Comparison of time spent (in minutes) on VR simulator tasks between the proficiency and the repetition groups

There was a significant difference in live porcine endoscopy (pre-endoscopy mean 10.6, SD 2.8, post-endoscopy mean 16.6, SD 3.4; p < 0.001) and colonoscopy (pre-colonoscopy mean 10.4, SD 2.7, post-colonoscopy 16.4, SD 4.2; p < 0.001) performance as measured by the GAGES score before and after curriculum completion (Fig. 1). There was no difference in post-test performance between the proficiency and repetition groups, after accounting for prior pre-test performance (Table 3).

Fig. 1 Comparison of GAGES performance in pre- and post-curriculum porcine endoscopy and colonoscopy testing

Table 3 GAGES scores presented as means plus or minus standard deviations. Scores are reported before and after completing the assigned curricula. After confirming the parallel slopes assumption, the scores between groups were compared with analysis of covariance (ANCOVA). There was no significant difference between the two groups

There was a significant difference in FES manual skills performance (pre-curriculum mean 270.9, SD 105.5, post-curriculum mean 477.4, SD 68.9; p < 0.001) before and after curriculum completion (Fig. 2). However, no difference was seen in performance on the post-curriculum exam between the proficiency and repetition groups (proficiency = 465.3, SD 58.6, repetition = 489.6, SD 78.7; p = 0.658, Table 4). Likewise, there was no difference in performance between the two curricula on subset analysis of each component of the FES exam. All participants (n = 22), in both the proficiency and repetition groups, passed the FES manual skills exam (100% pass rate).

Fig. 2 Comparison of FES manual skills exam performance in pre- and post-curriculum testing between repetition- and proficiency-based curriculum groups

Table 4 FES-scaled scores presented as means plus or minus standard deviations

Discussion

This study demonstrates the utility of using a VR curriculum in training novice endoscopists prior to their clinical exposure to endoscopy. A VR curriculum for endoscopy results in significantly improved performance in both upper and lower endoscopy as measured by the GAGES scoring system in a live animal model. Following curriculum completion, novice endoscopists demonstrated significant improvement in all four of the GAGES parameters assessed for both endoscopy and colonoscopy as well as significant overall improvement.

Skills transfer following completion of simulation-based training is critical for novices to make the transition from skill development to safe clinical practice. Trainees who reached simulation-based skill proficiency before undergoing patient-based assessments have previously demonstrated improved performance in both an animal model and in the operating room for laparoscopic procedures, with higher global assessment scores and fewer errors than their counterparts [15, 16]. Trainees completing simulator-based training for endoscopy perform at a similar level in a clinical setting compared to trainees who participated in patient-based training, suggesting that the skills learned on a simulator are comparable to clinical learning [17]. Our study suggests that the endoscopic skills obtained through completion of a VR endoscopy curriculum transfer to a live animal model, an important pre-clinical model representative of clinical practice.

Recommended procedure numbers for general surgery residents are 35 and 50 for upper endoscopy and colonoscopy, respectively [18]. Prior studies have suggested that these recommendations may not represent the experience needed to achieve proficiency, as the learning curve for upper endoscopy and colonoscopy, based on GAGES scoring, begins to plateau around 50 and 75 cases, respectively [19]. Given the nuanced balance between trainee skill acquisition and patient safety, it is critical trainees reach a baseline level of proficiency prior to clinical exposure to maximize within-case learning. With trainee time limits and competing clinical demands, efficient training modalities are essential for residents to reach proficiency prior to graduation. Completion of our VR curriculum demonstrated high FES pass rates and significant clinical improvement, serving as a sufficient preparation for a clinical endoscopy rotation.

Similar to our group’s prior work, completion of the VR curriculum resulted in a high pass rate (100% for both proficiency and repetition curricula) and high scores (465.3 for proficiency curriculum, 489.6 for repetition curriculum) on the FES skills exam [10]. Several other groups have adopted their own endoscopy curricula using either the GI Mentor II or an alternative model to improve institutional FES scores [8, 9, 11, 20,21,22]. These findings continue to support the use of simulation-based training for endoscopy training and preparation for the FES examination. The results of our study, in conjunction with our prior work, suggest that our institutional VR curriculum and performance benchmarks can be utilized by programs with access to the GI Mentor II to prepare their residents for the FES manual skills examination [10].

Our study sought to compare the relative effectiveness of two different VR training paradigms, proficiency-based and repetition-based training, in preparing trainees for live colonoscopy. Between the two curricula, there was no difference in performance on the post-training FES manual skills examination or live colonoscopy. Despite the similar performance, there were significant differences in time spent completing the curriculum. The participants in the proficiency-based group spent on average 72 fewer minutes completing the curriculum (170.0 min vs. 242.2 min; p = 0.013) and less time completing the Endoscopic Navigation and Advanced Mucosal Evaluation tasks. All participants in the repetition group met proficiency standards in both tasks, suggesting that the additional repetitions in this group were unnecessary and carried an additional time cost. This is similar to laparoscopic skill acquisition in novices, where criterion-based training reduces overall training time without impacting training outcome, and overtraining, despite a faster learning curve, has no long-term effect on skill retention and no additional time benefit [23, 24]. Given the time saved and the comparable outcomes with a proficiency-based curriculum for skill acquisition, there does not appear to be any benefit to the use of a repetition-based curriculum.

Interestingly, 2 participants (18.2%) in the repetition group did not meet proficiency standards in Task 9 and 4 participants (36.4%) did not meet proficiency standards in Task 10. Both tasks are designed to practice the skill of loop reduction, notoriously the most difficult task for the FES examination [25, 26]. Despite this, there was not a difference in performance on the Loop Reduction task between groups in the FES manual skills examination (proficiency = 65.5, SD 21.7, repetition = 59.6, SD 34.8; p = 0.594), suggesting that proficiency benchmarks of expert performance for these tasks may be too strict.

Trainees begin with different levels of fundamental ability, experience, and skill. If a standard number of hours or number of tasks is prescribed to all trainees, the outcome will be variable performance levels based on individuals’ manual skill learning curves [27]. One of the benefits of a proficiency-based curriculum is that all trained individuals perform at a pre-determined benchmark level of competence. This allows for flexibility in training and, as we found, a reduction in overall time on the simulator to reach proficiency standards with similar outcomes to repetition-based training. This is critical given there are currently 73 institutions within the USA that are FES test centers with GI Mentor II access [28]. Given there was no difference in clinical or examination performance between a proficiency-based and a repetition-based curriculum, the recommendation to use a proficiency-based curriculum stems from the problem of limited simulator access. We want to help ensure residents can complete an appropriate curriculum in a timely and efficient manner.

This study has several limitations that must be considered. First, this study was conducted at a single academic institution with residents from a single general surgery residency. Due to costs and equipment availability, including both the animal laboratory and the GI Mentor, it was difficult to expand the sample size outside of a single institution. As a consequence of these limitations, the study was not powered to detect a difference between the two curricula. Power calculations indicate that a sample size of n = 59 would be necessary to detect a difference in FES performance between the proficiency-based and repetition-based curricula and a sample size of n = 1870 to detect a difference in GAGES score, the latter suggesting essentially no difference between the curricula in clinical endoscopy performance.
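To illustrate why a very small between-group difference implies a very large required sample, the sketch below uses the standard normal-approximation formula for comparing two means, in which the required number per group is proportional to one over the squared standardized effect size. The inputs behind the n = 59 and n = 1870 figures are not stated here, so the sketch does not attempt to reproduce them; the function name, the default alpha and power, and the example effect sizes are assumptions.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate participants needed per group for a two-sided, two-sample
    comparison of means.

    Normal approximation: n = 2 * (z_(1 - alpha/2) + z_(power))**2 / d**2,
    where d is the standardized effect size (difference in means / SD).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Hypothetical effect sizes: a moderate difference versus a very small one.
moderate = n_per_group(0.5)
very_small = n_per_group(0.1)
```

Because the required sample grows with the inverse square of the effect size, shrinking the effect fivefold raises the requirement roughly twenty-five-fold, which is the pattern behind the interpretation that a very large required n signals a negligible difference.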

The pig model also has its own anatomical considerations that differ significantly from human anatomy. For the upper endoscopy portion, successful endoscopic intubation is more difficult given the presence of a pharyngeal diverticulum. Normal pig anatomy has a pharyngeal diverticulum similar to the pathological Zenker’s diverticulum that can be seen in a human. In the pig, the pharyngeal diverticulum is located posteriorly, at the level of the upper sphincter, and evident upon passing the pharyngeal sinus on either side of the glottis [29]. After entry into the diverticulum, slow withdrawal of the scope is required to allow viewing of both the septum and the esophageal lumen, facilitating safe passage. For the colonoscopy portion, normal pig anatomy includes a proximal spiral colon arranged in a series of centrifugal and centripetal coils [30]. This consists of the cecum, ascending colon, and transverse colon, whereas the anatomy of the left colon and rectum is similar to that of humans. Given these findings, participants were only required to reach the spiral colon but not completely traverse it to reach the cecum.

Finally, it is important to recognize that the original study design was interrupted by the COVID-19 pandemic during the first year of the study. Residents during the 2019–2020 academic year were pre-tested prior to the pandemic; however, due to a suspension in all research and simulation activities to maintain social distancing guidelines, the VR curriculum was completed during the beginning of their second clinical year. As a result, this group may have gained more clinical endoscopy exposure; however, no participant completed their dedicated endoscopy rotation prior to GAGES post-testing, likely minimizing this clinical education confounder.

Conclusion

Participation in a VR curriculum leads to improved FES performance as well as improved clinical endoscopy performance for novice endoscopists. A VR curriculum and training program can provide a structured approach for residents to acquire endoscopic skills. Completion of a VR curriculum leads to improved clinical performance in an animal model, maximizes the benefits of a clinical endoscopic experience, and optimally prepares residents for performing patient endoscopy.