Introduction

Increased needs for service provision [1], reduced working hours [2] and public intolerance of medical error have heightened the need for “in vitro” training methods such as simulation [1]. Arguably, cadavers are an ideal training medium for surgeons [3]. Cadaveric simulation has been practised (at least) since the foundation of the royal colleges, and relatively recent developments (such as new types of embalming) [4] are expected to increase its role in operative pedagogy.

This review of reviews aims to explore the current evidence on the advantages and disadvantages of cadaveric simulation compared with other simulation techniques for surgical training.

Methodology

This review of reviews was conducted according to the methodology reported by Smith et al. [5]. An electronic systematic search of Ovid MEDLINE (1946 to the first week of November 2016) and all Evidence-Based Medicine (EBM) review databases, including the Cochrane databases, specifically for systematic reviews, was conducted. The search terms were cadaver* AND training OR surg* OR simulat*.

Inclusion and exclusion criteria

Systematic reviews describing the use of human cadavers or human cadaveric body parts for the purposes of surgical education and training were included in this review. Randomised controlled studies, non-randomised studies, editorials, case series, case reports, opinion articles and conference proceedings were excluded. Reviews describing dissection techniques for anatomy teaching were also excluded. Only reviews in the English language were analysed.

Identification of relevant studies

Two authors (M.Y., G.D.) independently assessed the reviews identified by the literature search, initially by reading the title and abstract. For all potentially relevant studies, the full text was retrieved and assessed against the inclusion criteria.

Data extraction

Data extraction was performed using a customised data collection spreadsheet. This included the following categories: authors, country, year of publication, type of embalming (if any), type of surgery/procedure, comparators, outcomes and study design (i.e. systematic review or meta-analysis). Descriptive synthesis was performed according to the PICO (participants, intervention, comparators and outcome measures) approach.

The PICO approach was used because it provides a structured manner by which the authors can summarise and compare systematic reviews, and it is suggested by Smith et al. for conducting reviews of systematic reviews (Table 1) [5].

Table 1 PICO approach for included studies

Quality analysis

The quality and strength of evidence provided by the included systematic reviews were assessed using the AMSTAR tool, which has previously been tested favourably for reliability and validity [6].

Results

The literature search resulted in 33 records. Of these, three were duplicates, and 21 were either not relevant to the topic of the review or of inappropriate study design. Full papers were reviewed for the remaining nine studies, of which four were relevant systematic reviews and were included in this review of reviews. A study [7] not indexed in the databases searched was known to the authors and included in this review. Overall, five systematic reviews were included (Fig. 1).

Fig. 1 PRISMA flowchart for included studies

Participants and intervention

Both open and minimally invasive procedures were taught on cadavers, in more than eight different surgical specialties including trauma, general surgery, neurosurgery and GI surgery [7,8,9,10,11]. Basic (e.g. wound closure) and advanced skills (e.g. laparoscopic splenectomy) were included in the studies, and both consultant surgeons and trainees (or medical students) were recruited. Nevertheless, only one study [12] within one of the systematic reviews [9] involved medical students and not postgraduate trainees.

The participation of experts is vital for studies assessing realism, as is the participation of both experts and non-experts for evaluating predictive or construct validity. Surgeons of the appropriate training level were recruited in almost all studies: experts (i.e. consultant surgeons) when the outcome measure was fidelity, and both experts and non-experts (trainees and medical students) when the outcome measure was predictive or construct validity.

Whilst some authors believe that cadaveric simulation may not be appropriate for all levels of surgeons [7], others feel that cadaveric simulation benefits all surgeons irrespective of their experience [10]. The former opinion is based on concerns about cadaver availability, tissue compliance and cost [7, 9], as well as on some evidence of equivalence between cadavers and other forms of simulation [7]. Nevertheless, when a systematic review assesses simulation for a specific procedure that cannot be adequately replicated with other forms of simulation (e.g. temporal bone surgery), the authors tend to be more enthusiastic about cadaveric simulation [8, 9].

Comparators

Most systematic reviews performed a subgroup analysis based on the type of simulation. This yielded four to five groups: animal models (live or dead), bench-top models, human cadavers and robotic simulators [7, 8, 11]. The authors explore the advantages and disadvantages of each type of simulator, although studies directly comparing different types of simulators are rare. When cadavers were compared to other pedagogic tools, they were found to have the same impact as bench models and virtual reality simulators [7] but a higher impact than textbook study [7, 11]. However, these results should be interpreted with caution, as the underlying studies have methodological limitations such as a lack of power calculations, small sample sizes and the use of inappropriate outcome measures [7,8,9].

In the systematic review by Thomas et al., an improvement in skills both in the laboratory and in theatre is described in all but one of the included studies. This was the case for all types of simulators except animal models; studies assessing the didactic value of the latter did not explore the transferability of skills to the real operating theatre [7]. Bhutta et al. note that, according to expert opinion, cadavers are the optimal training medium [8]; they were, however, evaluating training methods for temporal bone surgery, which involves complex anatomy that is difficult to reproduce using synthetic models and virtual reality simulators. Further, significant anatomical differences between animals and humans in the temporal region make human cadavers the only option. For this type of surgery, physical validity is of the utmost importance, which leads the authors to advocate in favour of cadaveric training [8]. This may not be the case for other types of surgery, which according to Thomas et al. can be taught equally well on non-cadaveric simulators [7].

It should be highlighted that emerging technologies, such as three-dimensional reconstruction from medical imaging and 3D printing of bespoke models with added functional features (e.g. colour coding of vital structures that should not be injured), could revolutionise surgical training and help synthetic and virtual reality models approach parity with cadavers [7,8,9].

Gilbody et al. [10] present three studies using different types of cadavers. Giger et al. [13] used cadavers embalmed with the Thiel method for laparoscopic training in an array of procedures. The participants were pleased with this type of “soft embalming”, known for preserving the colour and texture of live tissue. Supe et al. [14] used fresh cadavers to teach six minimally invasive procedures. The laparoscopy novices recruited in the study were highly satisfied with this training experience. The authors highlight the lack of active bleeding and breathing movements, as well as the limited time frame within which the cadavers must be used, as two of the main disadvantages of fresh cadavers. Finally, Reed et al. [15] used fresh frozen cadavers to teach vascular procedures to first- and second-year residents. The teaching sessions met the participants’ expectations, but no skills were assessed.

Davies et al. [9] identified three studies involving cadaveric simulation. Gunst et al. [16] taught the exposure of 48 structures (trauma surgery) using fresh human cadavers. Self-perceived operating scores increased immediately after the course and remained at similar levels 7 months later. However, in this study cadaveric simulation was employed within a curriculum; it is therefore unclear whether the effect on trainees’ confidence was due to the hands-on practice or to the remaining aspects of the curriculum. Mitchell et al. [17] used fresh frozen cadavers to teach dissection of structures rarely identified in vascular surgery and showed increased post-simulation anatomical knowledge and confidence. Transferability of skills to the real operating theatre was not assessed. Anastakis et al. [18] compared cadavers to bench models and a surgical text, showing that cadavers had the same instructional value as bench models but were superior to the surgical textbook.

Sutherland et al. [11] identified 30 randomised controlled trials comparing different types of simulators, but only one included cadavers as a comparator. This was a study [18] also identified in other systematic reviews and discussed above.

Outcome measures

The outcome measures were highly diverse. In a number of studies they were subjective, including verbalised feedback or questionnaires aiming to establish realism or self-reported confidence in performing an operation [7,8,9,10]. Competency-based assessment was also common: expert surgeons were employed to assess the performance of trainees, usually immediately after the simulation session, and checklists and global rating scales were frequently used [7, 9,10,11]. As pointed out by Davies et al., who looked into cadaveric simulation for open surgical procedures, in-built objective feedback, such as that provided by a virtual reality (VR) laparoscopic simulator, is not always available or feasible; therefore, researchers have to rely on experts assessing surgical performance [9]. It is also suggested that global rating scales are preferable to checklists, as they take into account individual variability in the techniques used by surgeons, focusing on the final outcome of the operation rather than the steps leading to it, which may vary significantly between surgeons.

All the studies assessing validity reported good face validity (i.e. subjective judgement of usefulness) and content validity (i.e. whether a specific element adds to or detracts from the educational value) of cadavers [7,8,9,10]. Equally, studies showed improvement of performance after the cadaveric training session; however, assessments were performed in a simulated rather than a real surgical environment [7,8,9, 11]. In fact, transferability of skills from the cadaveric training sessions to a real Operating Room (OR) was rarely assessed and was not conclusively demonstrated in any of the studies. Self-reported confidence in the OR was assessed in one study [16], showing an increase immediately after the course that was maintained 7 months later. Otherwise, long-term retention of skills after cadaveric simulation was not assessed in these studies.

In addition to the quantitative outcomes, authors often described the advantages and disadvantages of cadaveric simulation. Unsurprisingly, high fidelity and accuracy of anatomy were the most commonly reported advantages [8, 9], whilst high cost, low availability and restricted time frame for cadavers to be used were the main concerns [8, 9].

Study design and quality assessment

All systematic reviews were conducted within the past 10 years. Some focused solely on cadaveric simulation [10], whilst others reported on surgical simulation training as a whole [7,8,9, 11]. The number of studies included within the reviews ranged from 1 to 13. Only two of these were randomised controlled trials.

The results of the AMSTAR score demonstrate that the reviews included in the current study are of moderate quality (Table 2). Sutherland et al. [11], with an AMSTAR score of 5/11, contributed only one study, which, however, was also included in other systematic reviews and discussed elsewhere (Table 1). Therefore, its contribution to the conclusions of this review of reviews is minimal.

Table 2 Quality assessment

Special mention should be made of the methodological restrictions of the studies composing the systematic reviews. As highlighted by several authors, there is a paucity of high-quality evidence comparing cadavers to other forms of hands-on surgical learning. Besides the limited number of participants and the absence of sample size calculations, some of the criteria used to assess realism are also questionable. For instance, Bhutta et al. [8] describe how a model with significant anatomical inaccuracies was scored 4/5 for face validity. They also highlight that experts are often less enthusiastic than non-experts regarding the realism of models; however, results from both groups are pooled together to demonstrate high validity. Further, the transferability of skills from the cadaveric laboratory to the real OR has not yet been established [10].

Discussion

Expert and trainee surgeons alike agree that cadaveric simulation is a valuable adjunct to their training. Cadavers are rated highly for their fidelity and realism [7,8,9,10,11], but there are often ethical concerns, as well as issues with low availability, tissue compliance and high cost [7, 10].

The use of human cadavers has historically been, and in many ways remains, controversial [19]. The right to a burial is a basic human right, and some may consider the use of cadavers for teaching purposes a deprivation of that right [20]. In fact, dissection of human cadavers has been characterised as an insult to the dead and the ultimate violation of a person’s privacy [20]. Currently, laws are in place to ensure that cadavers have been donated according to the wishes of the deceased and that “commercialisation” of human bodies is prevented. However, these vary from country to country and are inevitably liable to “loopholes”, which may allow for unethical practices in obtaining a human cadaver [20].

Nevertheless, none of the above has deterred the vast majority of medical schools from using cadavers for medical and surgical training. It was not until the wide use of computers and the introduction of virtual anatomical models that the use of cadavers was reduced [20]. However, the quality of the current evidence is not high enough to establish the superiority of cadavers over other training media (e.g. virtual anatomical models). Future research should be conducted to effectively and convincingly answer the question of how cadavers compare to other forms of simulation. This review of reviews can help shape future studies by providing suggestions that address the shortcomings of currently available studies.

Existing studies can serve as pilots, indicating the effect size of cadaveric training on surgical skills and thereby informing power calculations for future Randomised Controlled Trials (RCTs). If larger sample sizes are needed to conclusively assess cadaveric simulation, and the cost or low availability of cadavers is a limiting factor, a 2:1 control-to-cadaveric-simulation randomisation could be employed (i.e. two participants randomised to the control group for every one randomised to the cadaveric group), increasing the overall sample size without increasing the number of cadavers required [21]. Only experts should be recruited to assess the realism of a cadaveric model. A mixture of experts and non-experts should be avoided, as it may lead to inaccurate results: because of their more limited experience and potential unfamiliarity with the intricacies of a surgical procedure, surgeons in training may not be able to assess the face and content validity of a simulated model as competently as expert surgeons.
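As a purely illustrative sketch of the unequal-allocation point above (the standardised effect size, alpha and power used here are assumed values, not figures drawn from the included studies), the following Python snippet uses the statsmodels power module to compare a 1:1 with a 2:1 control-to-cadaveric allocation for a two-group comparison:

```python
# Illustrative only: assumed effect size (0.8), alpha (0.05) and power (0.80);
# these parameters are not derived from the studies discussed in this review.
from math import ceil
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# 1:1 allocation: solve for the cadaveric-arm size (nobs1); control arm = ratio * nobs1
n_equal = analysis.solve_power(effect_size=0.8, alpha=0.05, power=0.8, ratio=1.0)

# 2:1 control-to-cadaveric allocation (ratio = control / cadaveric = 2)
n_unequal = analysis.solve_power(effect_size=0.8, alpha=0.05, power=0.8, ratio=2.0)

print(f"1:1 design: {ceil(n_equal)} cadaveric vs {ceil(n_equal)} control participants")
print(f"2:1 design: {ceil(n_unequal)} cadaveric vs {ceil(2 * n_unequal)} control participants")
```

Under these assumed parameters the cadaveric arm shrinks by roughly a quarter while the total sample grows slightly, which is the practical benefit of unequal allocation when cadaver numbers, rather than recruitment, are the constraint.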

Due to the lack of objective measures for assessing operating skills on a cadaver, researchers will have to rely on evaluation by expert surgeons. The use of scoring tools that take into account variations in technique between surgeons and focus on the end result is preferable [22]. Blinded assessment using video recordings of the simulated procedures is easy to achieve, as high-definition (HD) cameras are now readily available and cost-effective. Studies linking simulation training to improved performance in the real OR, both immediately after the training sessions and in the long term, are desperately needed.

Furthermore, we need to explore ways to overcome the limitations of cadaveric training. Thiel-embalmed cadavers are a “re-usable” model presumed to provide “life-like” tissue texture and colour, on which trainees can perform several procedures [4, 23,24,25,26,27]. Thus, Thiel embalming may be more cost-effective than fresh or fresh frozen cadavers in the long run. The inability to customise cadavers and their lack of functional features can be addressed with hybrid models, such as a 3D-printed model of a patient’s organs placed inside a cadaver, offering the option of patient-specific pre-operative rehearsal. There is no reason for virtual and additive manufacturing technologies to compete with cadaveric simulation, considering that a combination of the two can open new horizons in surgical training and pre-operative preparation. Other emerging technologies will also allow for the accurate recreation of breathing movements and circulation [28, 29], the lack of which was mentioned in one of the reviews as a shortcoming of cadaveric simulation.

Conclusion

Whilst cadaveric dissection has stood the test of time [20] and the educational value of cadavers for surgical training is widely accepted by experienced surgeons [7,8,9,10,11], constraints on the application of cadaveric dissection in surgical training, combined with the introduction of new technologies that could provide equally good pedagogic tools [20], may limit the role of cadaveric simulation.

As demonstrated by this review of reviews, there is a lack of comparative trials regarding the ideal surgical simulation model, particularly trials comparing computerised anatomical models to cadavers. Whilst there may be a perception amongst surgeons that cadavers are costly, evidence about the long-term cost-effectiveness of cadaver-based training compared with computerised models is inconclusive.

Finally, it would be interesting to explore the future role of cadaveric dissection; will it co-exist alongside computerised models or will it be replaced by them?

We hope this review will assist in commencing new research efforts that can conclusively determine the role of cadaveric simulation in surgical training.