The impact of surgical simulation on patient outcomes: a systematic review and meta-analysis

The use of simulation in surgical training is growing steadily. Evidence suggests such training may have beneficial, clinically relevant effects. The objective of this research is to investigate the effects of surgical simulation training on clinically relevant patient outcomes by evaluating randomized controlled trials (RCTs). PubMed was searched following PRISMA guidelines with the query: “surgery” [All Fields] AND “simulation” [All Fields] AND “patient outcome” [All Fields]. Of 119 papers identified, 100 were excluded for various reasons. Meta-analyses were conducted using the inverse-variance random-effects method. Nineteen papers were reviewed using the CASP RCT Checklist. Sixteen studies looked at surgical training, two studies assessed patient-specific simulator practice, and one paper focused on warming up on a simulator before performing surgery. Median study population size was 22 (range 3–73). Most articles reported outcome measures such as post-intervention Global Rating Scale (GRS) score and/or operative time. On average, the intervention group scored 0.42 (95% confidence interval 0.12 to 0.71, P = 0.005) points higher on a standardized GRS scale of 1–10. On average, the intervention group was 44% (95% confidence interval 1% to 87%, P = 0.04) faster than the control group. Four papers assessed the impact of simulation training on patient outcomes, with only one finding a significant effect. We found a significant, albeit small, effect of simulation training on operative performance as assessed by GRS, as well as a significant reduction in operative time. However, there is to date scant evidence from RCTs to suggest a significant effect of surgical simulation training on patient outcomes.

Electronic supplementary material: The online version of this article (10.1007/s10143-020-01314-2) contains supplementary material, which is available to authorized users.
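The pooling step named in the abstract (inverse-variance random-effects meta-analysis) can be illustrated with a minimal Python sketch. The abstract does not say which between-study variance estimator was used; the DerSimonian-Laird estimator below is an assumption, as are the function names `pool_random_effects` and `p_from_ci` and the normal approximation for confidence intervals and P values. It is not the authors' analysis code.

```python
import math


def pool_random_effects(effects, std_errors):
    """Pool study effects with inverse-variance random-effects weights.

    Assumption: tau^2 is estimated with the DerSimonian-Laird method,
    which the source does not confirm.
    """
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need at least two studies with matching lengths")

    # Fixed-effect inverse-variance weights and pooled estimate.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the DerSimonian-Laird between-study variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))

    # Two-sided P value and 95% interval under a normal approximation.
    z = pooled / se_pooled
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return {"estimate": pooled, "se": se_pooled, "ci": ci,
            "tau2": tau2, "p": p_value}


def p_from_ci(estimate, lower, upper):
    """Recover an approximate two-sided P value from a 95% interval."""
    se = (upper - lower) / (2 * 1.96)
    z = estimate / se
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical per-study mean differences and standard errors,
    # for illustration only; these are not data from the review.
    result = pool_random_effects([0.2, 0.6, 0.5], [0.15, 0.20, 0.25])
    print(result)
```

As a consistency check under the same normal approximation, `p_from_ci(0.42, 0.12, 0.71)` gives roughly 0.005, in line with the P value reported for the GRS difference.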


4
1 | Wooster M, et al. | 27 | Carotid endovascular stenting; patient's anatomy uploaded to simulator and practised on before surgery | 15 (c=9, i=6) | Yes: does rehearsal using patient-specific anatomy uploaded to a simulator improve procedural efficiency and outcomes? | Yes
8
2 | Maertens H, et al. | 41 | Endovascular interventions in lower extremity | 32, 3 dropouts (c=10, i1=10, i2=9) | Yes: how does the PROSPECT training program compare to e-learning and traditional training with respect to acquisition of endovascular skills and the transferability of these skills to real patient scenarios? | Can't tell
11
3 | Zevin B, et al. | 56 | Bariatric surgery; practice and assessment on pigs, intervention group rated on live patient after trial | 20 (c=10, i=10), plus 9 chief residents | Yes: to develop and provide evidence of validity for a simulation-enhanced training curriculum for an advanced minimally invasive procedure | Can't tell
14
4 | Desender L, et al. | 29 | Aneurysm repair; anatomy of half of the patients uploaded to simulator and practised on before actual operation | 100 (c=50, i=50) | Yes: to evaluate the effect of patient-specific rehearsal prior to endovascular aneurysm repair on patient safety and procedural efficiency

Yes, but the control group was assessed 2 times compared to the intervention group, which was assessed 4 times
Can't tell
Supervising surgeon was blinded to the number of VR simulated procedures the trainee had completed at the time of patient-based assessment
Yes, all subjects were at the start of their training in gastroenterology with no previous endoscopic experience
Yes, but intervention group i completed 50 total VR colonoscopies and group ii completed 100 total VR colonoscopies
Yes
Supervising staff surgeon was blinded to the status of the resident surgeon; observer was not blinded (staff GOALS score was used, both reviewed the video recording to assess intraoperative complications). Retrospective assessment of patient medical records was done by a blinded member of staff.
Yes, baseline TEP repair was similar; groups were similar with respect to a host of other variables (post-grad year, sex, handedness, video game experience, TEP comfort + experience)
Yes, but the control group could cross over to the intervention protocol after TEP2; 10 participants elected to do this
Can't tell
OSCE checklist assessors were not blinded, but 60% of the tests were assessed post hoc (video recording) by a blinded author.

Control group training | Which patient outcomes were recorded? | Further notes
None | None | 15 patients over 3 years, 2 hospitals. No real patient outcomes other than duration of procedure and contrast volume used. Unclear how similar the groups were. Unclear who was assessing the surgeries. Complete and utter lack of statistical power.
Continued conventional training | Recorded patient outcomes: peri-operative complications, major + minor adverse events in hospital and 30 days after treatment | 2 live surgeries after intervention. Statistically significant improvement in GRS, Examiner Checklist and number of takeovers. No differences aside from these. Authors argue consultants taking over may have prevented differences in patient outcomes. Unclear if groups were balanced.
Continued conventional training | None | Intervention with training on box trainer with porcine cadaver led to better psychomotor performance on anesthetized porcine model than peers who received standard training. Psychomotor performance of the tested third-year residents was equivalent to that of Chief Residents (8/10 in intervention group allowed, only part of surgery, only 1 surgery). No patient outcomes recorded.
Rehearsal after procedure | Recorded patient outcomes: peri-operative errors, technical + clinical success rates, in-hospital + 30 day mortality | Rehearsing led to changes in the pre-op plan (88% changed, 92% implemented) and led to fewer perioperative mistakes. However, there was no statistically significant improvement in technical skills (already proficient teams) or patient mortality (equally low for both groups).
None | None | Control, practice on camera navigation and practice on actual procedure. Camera navigation group performed better on the simulation test (which differed from the simulation training). No statistically significant differences between the groups upon transfer to the OR.
Continued conventional training | None | 1 hour of simulator practice led to better time, but otherwise no statistically significant improvements compared to regular training. Patients were not randomized. No patient outcomes recorded.
Continued conventional training | None | Seven 2-hour practice sessions on box trainer and VR simulator. Technical performance assessed with OSA-LS was better in intervention group. Intracorporeal knot tying was not statistically different. No patient outcomes measured.
Continued conventional training | None | Improvement in 2/9 items on OSAT and in overall OSAT after 1.5 hours of practice on porcine cadavers. Overall improvement from 26.7 to 29.9 (3 points), which the authors note is small. Both groups performed more poorly than would be expected (c=26.2 and i=29.9 out of 45 possible points).
Continued conventional training | None | Initial testing on patient, then 1 hour of sim practice for intervention group, then re-testing on patient, then re-testing on patients after 1 year. Intervention group had statistically significant improvements in ASSET score and time to completion from test 1 to 2. No other significant differences. Time increased between test 2 and 3 by 18% (mean), so improvements were not retained after 1 year.
Continued conventional training | Recorded patient outcomes: arterial puncture, hematoma, catheter malposition, catheter-associated infection, pneumothorax and death | Simulation training until mastery was achieved. Intervention group had significantly better adherence to procedural protocol. All other measured variables were similar between the groups. Problems with the study are the few observed CVC placements (87) and the hospital already having an excellent track record. Observers rated placements that they themselves supervised. Supervisor would step in if necessary (was noted in GRS).
Simulation training without feedback | None | 8 hours of practice on sim; sim test before intervention, right after, then 4-6 weeks after. Clinical colonoscopy 4-6 weeks after intervention. Intervention group was significantly better than control group with respect to JAG DOPS score at clinical colonoscopies. No patient outcomes measured. Highlights difference between running sim practice alone and getting feedback from an expert.
Continued conventional training | None | Intervention group received 1 day of skills training on plastic phantom and anesthetized pig. They then completed 20 hernia repairs over the course of 4-6 days. The control group had standard training, doing hernia repairs whenever the department deemed it to be appropriate. Both groups were compared at the end of the year (the first year of surgical specialty training). Intervention led to significantly better technical score and time to completion; these improvements were retained (but not improved on) at the end of the year. Intervention group had significantly better technical scores at the end of the year compared to control group, but not time to completion (better, but p=0.059). No clinically important outcomes measured.
Number of procedures** | None | Both groups received sim training, group i for a total of 50 VR cases and group ii for a total of 100 VR cases. Improvement on sim plateaued after 60 VR cases, improvement in patients plateaued after 50 VR cases. Plateau = no further statistically significant improvements. Highlights diminishing returns of VR sim practice.
 | | Intervention group trained on simulator until they achieved mastery. At TEP2 the intervention group was faster, completed more of the procedure themselves and had lower rates of complications. At TEP3 the residents who crossed over from control to intervention performed faster than their control counterparts. At TEP3/4/5 GOALS scores were not significantly different between the groups. Complications were similar between crossover and control groups at TEP3. With all TEPs after intervention combined, and excluding the crossover group, the intervention group was significantly better in all measured outcomes (aside from overnight stays). Seems like a slam dunk. Why are crossover and control similar at TEP3; does that not mean intervention and control are similar at TEP3? The difference in total measures is significant but the groups compared at TEP3/4/5 are not?
Continued conventional training | None | Intervention group trained on simulator until mastery. Thereafter self-reported success rate of first infant LP they performed. Intervention group was significantly more successful than the control group (95% vs 47%, P=0.005) and had fewer traumatic procedures (but not significantly fewer). Self-reported success, very small sample size; at 6 months 9 of the 20 who said they had not performed an LP admitted that they had.
Surgeons served as their own controls | None | 8 surgeons completed 2 surgeries each, 1 with warm-up before and 1 without. Surgeries with warm-up beforehand received significantly better OSATS scores. No other variables measured. Small sample size.
16 hours of practice on patients | None | Intervention group received 16 hours of sim practice, control group received 16 hours of practice on patients. At post-intervention assessment the intervention group outperformed control on the simulator. There were no statistically significant differences between groups upon patient-based assessment. Interesting because it compares patient and sim practice directly.
Continued conventional training | None | 7 trainees in the intervention group performed better in their first 10 cholecystectomies (assessed at procedures 1, 5 and 10). They were 58% faster (narrowly not significant) and the control group made 3x more mistakes. Interesting to note that neither group improved from surgery 1 to 10. There was substantially more variability in the control group (8x more variability in total errors in control group compared to intervention).
Continued conventional training | None | 23 subjects performed 10 hours of practice on a VR simulator. Unclear how similar the groups were at the start. Subjective discomfort was similar between the groups. Objective competence was significantly better in the intervention group, but this effect tapered off and was no longer significant after 80 procedures. Subjective competence was significantly better in the intervention group until 40 performed procedures. Both groups improved over time, more at the start than at the end. Both groups needed the same median number of cases to reach competency. Highlights that not all residents become competent at the same rate. The intervention group was "ahead on the learning curve" compared to control at several points in time.