Introduction

Trainee surgeons are increasingly expected to complete a variety of competency-based surgical courses during specialty training. Courses built on high-fidelity simulation models provide students and junior trainees with key practical skills, placing them at the center of the educational experience rather than in the clinical setting [1]. Surgical simulation is increasingly presented as a means of improving patient safety despite the truncation of training schemes, and it has particular relevance to technically challenging sub-specialty areas, including microvascular flap reconstruction [2].

Microsurgical simulation training courses generally run over a number of consecutive days and aim to provide a bundle of skills within a patient-safe, non-threatening training environment [3, 4]. These educational and training interventions have been shown to produce significant technical and non-technical skill acquisition, both over the course period and in the long term [5,6,7]. Typically, microsurgical simulation training progresses from ex vivo to in vivo animal simulations of rising complexity [8, 9].

There is increasing clinical recognition that microvascular reconstructive flap outcomes depend at least as much on flap elevation as on the subsequent microvascular anastomoses, particularly because successful flap design relies on maintaining the structure and physiological function of smaller perforator vessels [10].

The live pig is a favorable model for flap elevation in reconstructive microsurgery compared with other models, owing to its large size and anatomical similarity to humans [11,12,13,14]. As a step toward validating the live pig as a training model in reconstructive microsurgery, the aim of this research was to investigate trainees’ perceptions (face and content validity) as well as skill acquisition (construct validity) on an in vivo porcine simulation training model for a classical flap, namely, the pedicled latissimus dorsi myocutaneous flap.

Methods

Study participants

Twenty-seven participants from two surgical simulation courses were involved in the study: 18 participants from the “Advanced Hands-on Course in Microsurgical Breast Reconstruction Workshop”, held at the Experimental Research Center of ELPEN, Athens, Greece, on 12–13 June 2015, and 9 participants from the “Free flap dissection workshop”, held on 15–17 March 2015 at the Pius Branzeu Center, Timisoara, Romania (Fig. 1).

Fig. 1 Participants (HST = higher surgical trainee, CST = core surgical trainee)

Face and content validity

We used questionnaires as the main assessment tool, as they have been the mainstay of subjective assessment for face and content validity [15,16,17,18]. All 27 participants completed a 15-item questionnaire answered on a 5-point Likert scale. The questionnaire included general questions about the usefulness and overall experience of the courses, as well as questions on skill acquisition and the similarity of the live pig model to the real-time setting. A further 20-item questionnaire, also answered on a 5-point Likert scale, was completed by 6 participants from the Timisoara group; it included additional questions specific to elevation of the latissimus dorsi myocutaneous flap on the live pig model.

Construct validity—objective assessments

Six participants from the “Free flap dissection workshop” in Timisoara also completed objective assessments on an in vivo porcine simulation of latissimus dorsi (LD) flap elevation. They were divided into two groups: 4 surgical trainees and an expert group comprising 2 plastic surgery consultants, each with at least 6 years of training in the UK (Fig. 1).

Objective assessments of skill and skill acquisition (first flap elevation = pre-test, second flap elevation = post-test) included the following: hand motion analysis (HMA) using the Trackstar™ electromagnetic tracking system [19, 20]; video recording [21] using a GoPro™ camera (Fig. 2); grading with a peer-reviewed procedure-specific rating scale (PSRS) [22] (Appendix B) and a performance checklist [23]; and assessment of flap viability using a skin scratch test (Fig. 3) and fluorescence imaging [24] (Fig. 4). The assessment methods are listed in Table 1.

Fig. 2 a, b, c HMA and GoPro camera attached during the procedure

Fig. 3 Positive skin scratch test showing non-congested bleeding

Fig. 4 Fluobeam® handheld fluorescence camera and image captured on screen

Table 1 Assessment methods

Analysis of data

Face and content validities were assessed by the Likert scale questionnaires, with answers to general questions about the courses serving as indicators of face validity and answers to specific questions directly relating to the LD flap procedure as indicators of content validity. The mean Likert data for each question are presented as percentages. Construct validity was assessed by Student’s t-test (parametric values: time and path length) and the Mann-Whitney U test (non-parametric values: number of hand movements, checklist, PSRS, and flap viability). In evaluating the effect of training on skill acquisition, the differences between pre-test and post-test were compared between skill levels; a paired-sample t-test was used for parametric values and a Wilcoxon rank sum test for non-parametric values. For all tests, a p value < 0.05 was considered statistically significant.
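For illustration, the group comparisons described above could be scripted along the following lines. This is a minimal sketch using SciPy; the arrays are invented placeholders standing in for the study measurements, not the actual data:

```python
from scipy import stats

# Placeholder pre-test values for illustration only (not the study data).
expert_time = [35.0, 40.5]                 # minutes; parametric
trainee_time = [90.0, 98.0, 101.0, 96.0]
t_stat, p_time = stats.ttest_ind(expert_time, trainee_time)

expert_psrs = [49, 50]                     # ordinal scores; non-parametric
trainee_psrs = [30, 35, 38, 41]
u_stat, p_psrs = stats.mannwhitneyu(expert_psrs, trainee_psrs)

# Pre-test vs post-test for the same participants (paired comparisons).
pre = [96.0, 90.0, 101.0, 98.0]
post = [55.0, 50.0, 60.0, 52.0]
t_stat2, p_paired = stats.ttest_rel(pre, post)   # parametric values
w_stat, p_rank = stats.wilcoxon(pre, post)       # non-parametric values
```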

Results

Face and content validity

Of the 27 participants canvassed across both courses, 100% agreed strongly or moderately that the live pig is a useful model for flap elevation training; 98.2% agreed strongly or moderately that the live pig is a useful model for skill acquisition with relevant clinical application, and 77.8% were interested in attending a similar course in the future. Only a very small minority dismissed inclusion at the undergraduate level (3.7%) or at CST level (7.4%) (Fig. 5).

Fig. 5 Combined results for questionnaire responses: the live pig model

Of the 6 participants who raised the latissimus dorsi flap, 100% agreed strongly or moderately that the live pig model is useful both for skill acquisition and potentially for skill maintenance in raising the flap. One hundred percent agreed that the live pig training model adequately simulates the gross dissection and microvascular skills involved in raising and following the flap. All participants recommended porcine flap elevation simulation for trainees before undertaking LD flap elevation clinically. Half of the participants recommended the live porcine simulation training model for general plastic surgical skill acquisition, and 66.7% agreed that training on the live pig model adequately resembles operating on human tissue or the real-time setting. Further review of questionnaire results is shown in Fig. 6.

Fig. 6 Combined results for LD flap questionnaire responses

Construct validity

HMA

On the pre-test attempt, the expert group scored lower values on all three HMA parameters: time, number of movements, and path length. The average total time taken by the expert group was 37.75 min and that for the trainee group was 96.28 min, a difference of 58.53 min (60.8%) (p = 0.036) (Fig. 7). The average total number of movements recorded for the expert group was 6185.75 and that for the trainee group was 12,285.75, a difference of 6100 movements (49.7%) (p = 0.165). The average total path length recorded for the expert group was 342.94 m and that for the trainee group was 837.64 m, a difference of 494.71 m (59.1%) (p = 0.146) (Table 2).
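These percentages are consistent with expressing each between-group difference relative to the trainee group’s mean; for total time, for example:

$$\frac{\bar{t}_{\text{trainee}} - \bar{t}_{\text{expert}}}{\bar{t}_{\text{trainee}}} \times 100\% = \frac{96.28 - 37.75}{96.28} \times 100\% \approx 60.8\%$$

The same calculation reproduces the 49.7% and 59.1% figures for the number of movements and the path length, respectively.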

Fig. 7 Average total time (min), pre-test (p = 0.036)

Table 2 Summary of results (values were rounded up for table presentation; however, p values were calculated according to original numbers)

Checklist

A 20% difference in pre-test checklist scores was noted between the control and trainee groups (p = 0.140) (Appendix A).

Flap viability

Skin scratch test for flap viability

The expert group achieved 100% viability on the pre-test attempt (2 of 2 flaps viable), compared with 25% (1 of 4 flaps viable) in the trainee group. This comparison was not statistically significant (p = 0.114).

Fluoptics® imaging

Three flaps were assessed using the Fluoptics® system on the pre-test attempt: those of both experts and of one trainee. All three flaps were viable; however, these data were insufficient for a comparison between the groups, although they lend themselves to further study.

PSRS

The average total score for the control group was 49.5 points (99%) and that of the trainee group was 36 points (72%), a mean difference of 13.5 points (27%) in favor of the control group. The difference, however, was not statistically significant (p = 0.064) (Appendix B).

Effect of a single repetition on skill acquisition

HMA

In the trainee group, average time to completion decreased by 44.2%, average number of movements by 52.3%, and average path length by 68.1% compared with the initial attempt. In the control group, average time to completion increased by 12.9%, while the average number of movements decreased by 52.1% and average path length by 22.5%. A summary of the differences between pre-test and post-test values in the control and trainee groups is shown in Table 2. Differences between attempts for all participants were not significant for average time (p = 0.206) or average path length (p = 0.099) but were statistically significant for the number of hand movements (p = 0.046).

Checklist

The mean pre-test checklist completion for both groups was 86.7%, against a mean post-test score of 87.5%. The trainees’ completion percentage was 11.3% higher post-test, whereas the expert group’s decreased by 20%; neither change was statistically significant (p = 0.917) (Appendix A).

Flap viability

There were 2 viable flaps in the expert control group (100%) on the pre-test attempt but only 1 (50%) on the post-test attempt. The trainee group had only 1 viable flap on the pre-test attempt (25%) and 4 viable flaps (100%) on the post-test attempt (p = 0.034). Flap failure was presumably due to direct pedicle damage, with or without damage to the base of the skin island perforator. Table 2 shows the number of viable and non-viable flaps on the pre-test and post-test attempts for both groups, as assessed by the skin scratch test (including those also assessed by fluorescence imaging).

PSRS

The lowest PSRS score achieved was 30 out of 50 and the highest was 50 out of 50. On average, the pre-test score was 49.5 for the control group and 36 for the trainee group; the post-test score was 40 for the control group and 40.75 for the trainee group (Table 2). The differences were not statistically significant (p = 0.078) (Appendix B).

Discussion

Surgical training courses vary in the content through which the necessary surgical skills are acquired. This variation in courses and models raises the question of how efficiently each model delivers the required surgical skills, and whether those skills transfer to the actual clinical setting. Validation of surgical training models is therefore an important stage in identifying, comparing, and ultimately selecting a reliable model whose use in surgical training is justified, ensuring proper acquisition of the necessary skills with minimal loss of resources and maximal avoidance of risk to patients. In this project, an attempt was made to validate the live pig model for surgical training in the field of reconstructive microsurgery.

Face validity

In this study, the evaluation form responses (n = 27) supported the live pig model: respondents agreed that it improved their skills, that these skills would be used in the real setting, and that the model is recommended for surgical trainees, in each case by 100% of responses ranging between strong and moderate agreement. This approval reflects the value of the model to participants’ hands-on experience, whether in the trainee or the control group.

Despite the high approval rate of the flap model, only 50% agreed with recommending courses using the live pig training model for learning other flaps or general plastic surgery skills. These responses may be explained by the varying seniority and operative experience of the trainees. Early exposure builds confidence and may foster a more profound clinical interest, a better learning experience, and an enhanced learning curve [25].

Responses on the timing of introducing the model into the surgical curriculum were divided between the undergraduate stage and early core surgical training. In contrast, most respondents agreed that it should be part of higher surgical training, an opinion consistent with the fact that these procedures are usually carried out at higher levels of training in any case, so the skills may be unnecessary to acquire at earlier stages.

Content validity

Most participants agreed that the live pig model is excellent for preparation and maintenance of the skills involved in the LD flap operation and that it adequately presents the tissue and pedicle dissection skills involved in raising the LD flap. Agreement, however, was understandably tempered by anatomy: the anatomical differences made the model less accepted as a representation of flap marking and design skills.

Publications vary in whose responses they consider when assessing content validity: some include both experts and trainees [26,27,28], while others consider only the opinions of experts [18, 29, 30]. In this validation, the questionnaire discriminated between expert and trainee for only 6 participants. As the total number of participants (n = 27), including both experts and trainees, responded in favor of the live pig model in terms of face and content validity, discrimination between levels of expertise was considered unnecessary.

The relatively small number of participants (n = 6) who were fully assessed at the “Free flap dissection workshop” in Timisoara was offset by adding face and content validity data, gathered with the same assessment tools, from the “Advanced Hands-on Course in Microsurgical Breast Reconstruction” in Athens. The combined results strengthened the validation and helped exclude factors related to the workshops themselves from the validation of the live pig training model. Overall, the model showed promising face and content validity.

Construct validity

Despite the small sample size of this study, the model allowed differential demonstration of competence between the expert and trainee groups, as measured by the procedure-specific checklist, physiological flap outcome (scratch test and fluorescence imaging), and objective hand motion analysis. This discrimination, however, was not statistically significant for all parameters measured.

The procedure-specific checklist and global rating scale were developed by the expert panel in this study and are likely to prove most valuable as training feedback tools. However, neither the checklist nor the global rating scale was subjected to validation research before being used to assess this procedure. Many surgical assessment tools have been designed and introduced in the field of reconstructive microsurgery for the same procedure by different research groups [31]. Standardization of surgical assessment tools by means of comparative performance and consensus methodologies [32] may help limit variables in data collection, allowing accurate compilation of skill acquisition data for future validation of training models.

Physiological flap perfusion parameters were used to evaluate performance between the two groups. Various factors may affect viability outcome in any flap surgery; the general condition of the animal and tissue handling during the procedure play an important part in the surgical outcome. Despite the high success rate of modern free flap surgery, reaching up to 95% [33], free flap failure remains a serious complication that should be diagnosed and addressed early, both intraoperatively and postoperatively. Of the two methods used to assess flap viability, fluorescence imaging provided direct visualization of flap perfusion and is therefore more objective than the scratch test. Although of limited statistical value here, it provided important feedback on individual and team performance.

Hand motion analysis (HMA) is an objective assessment tool for surgical skill that tracks the operator’s hand movements during a standardized task and uses measures derived from this tracking to assess competence and the acquisition of microsurgical skill [34, 35].
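As an illustration of how such measures can be derived, the sketch below computes total time, movement count, and path length from a stream of tracked 3-D positions. The sampling rate and the speed threshold used to segment discrete movements are assumptions for the example, not parameters reported in this study or by the tracking software:

```python
import numpy as np

def hma_metrics(positions, sample_rate_hz=100.0, speed_threshold=0.01):
    """Illustrative HMA metrics from sampled 3-D hand positions.

    positions: (n_samples, 3) array of sensor positions in metres.
    sample_rate_hz: tracker sampling frequency (assumed fixed).
    speed_threshold: speed in m/s above which the hand counts as
        'moving' (an assumed cut-off; real systems calibrate this).
    """
    positions = np.asarray(positions, dtype=float)
    steps = np.diff(positions, axis=0)            # displacement per sample
    step_lengths = np.linalg.norm(steps, axis=1)  # metres moved per sample
    path_length_m = step_lengths.sum()            # total path length
    total_time_min = len(positions) / sample_rate_hz / 60.0

    speeds = step_lengths * sample_rate_hz        # instantaneous speed (m/s)
    moving = speeds > speed_threshold
    # Count one 'movement' at each rest-to-motion transition.
    n_movements = int(np.count_nonzero(~moving[:-1] & moving[1:]))
    return total_time_min, n_movements, path_length_m
```

A lower time, fewer segmented movements, and a shorter path length would then correspond to the more economical hand motion expected of an expert.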

In this study, there was a notable difference between the expert and trainee groups on the pre-test attempt in all three HMA parameters, namely, time, hand movements, and path length. Although only average time reached statistical significance, this objective discrimination is a promising basis for establishing the training model’s construct validity in future research.

There is no consensus on the ideal placement of the digital sensors for HMA [36], and standardization is likely to provide more consistency. There was some limited electromagnetic interference with the HMA software when the transmitter was in close proximity to the electrocautery device.

Simultaneous filming of the procedures allows real-time or subsequent expert assessment with a rating score (PSRS). In this instance, the expert rater was present at the workshop, so ratings were not fully blinded. Nevertheless, the complexity of whole-procedure simulation provided by this model presents an extremely rich opportunity for assessment of skills. Procedure length, instrument handling, tissue handling, and pedicle handling all provide a spectrum of competencies that a blinded assessor can readily discern, although the procedure’s audio and video footage may introduce bias by making it easy to distinguish a trainee from an expert.

The eagerness of the trainee group to improve their performance both established a promising level of construct validity for the model and exposed a relative fall in performance in the expert control group. This could be explained by an element of overconfidence: the expert group comprised experienced specialists who had performed the procedure numerous times in the clinical setting, and any lack of investment in a perfect post-test attempt is the more striking after flawless pre-test attempts. Indeed, surgical training workshops also offer a platform for altering surgeons’ confidence and attitudes; through training, overconfidence can be reduced [37, 38].

Conclusions

The in vivo porcine simulation model of pedicled latissimus dorsi flap elevation demonstrated face and content validity, together with some evidence of construct validity and trainee skill acquisition. The model could be extended to an even more face-valid free flap model by dividing the pedicle and re-anastomosing it at a distant site.