Nowadays, simulation is used in a more structured way during surgical training. Objective assessment of performance, provided by virtual and augmented reality simulators, is fundamental for continuous skill refinement [1, 2]. Additionally, haptic feedback is important for adequate skills training in minimally invasive surgery, and in particular for laparoscopic suturing [3–6]. In general, it is assumed that realistic simulations with haptic feedback provide better training outcomes and better transfer of skills to the clinical setting [7]. A study by Aggarwal et al. showed that training with haptic feedback results in significantly improved skills transfer to the trainee, compared with training without haptic feedback [8]. However, realistic haptic feedback during laparoscopic training is lacking in most virtual reality simulators.

Professional organizations have recently recognized the need to assess surgical performance objectively. To be an effective tool, the simulator has to provide metrics that are meaningful and informative to the trainee. Time is a frequently used parameter, but appears not to be the best solution as a sole measurement [8]; for example, in laparoscopic suturing, a surgeon may be very fast but tie the worst knots imaginable, whereas a surgical resident may take three times as long but achieve qualitatively optimal knots. This is also observed in clinical laparoscopic procedures, where time as a surrogate parameter for proficiency is not sufficient [9]. There are other parameters, such as path length and smoothness, recorded by most simulators, but these are not informative either. Smoothness is defined as the recorded path length compared with a calculated optimal path length. This will give an indication of the global performance of the trainee, but does not provide any information on the performed procedure or sutures. Therefore, it is important that an assessment module is developed for specific skills, such as suturing.

Surgical skills training models should be reliable and valid to become incorporated into an objective structured clinical assessment, which could be used to assess individual development and allow progression through a training programme [10]. Currently, no such laparoscopic suturing and knot-tying modules with realistic haptic feedback exist. In this study we validated a new suturing module for the ProMIS v2.0 augmented reality simulator (Haptica, Dublin), using an assessment with meaningful measurements.

Methods

Subjects

Twenty-four participants were allotted to two groups based on their clinical laparoscopic suturing experience: experienced (n = 10), with >50 laparoscopic procedures and clinical laparoscopic suturing experience; and novices (n = 14), with no previous laparoscopic experience, who were pretrained in basic laparoscopic skills on the minimally invasive surgical trainer virtual reality (MIST-VR) to become acquainted with the fulcrum effect. All participants were tested from January to June 2008, at the Catharina Hospital Eindhoven, The Netherlands.

Equipment

In this study we used the ProMIS v2.0 augmented reality (AR) simulator (Haptica, Dublin, Ireland). The laparoscopic interface consists of a torso-shaped mannequin (29” long × 20” wide × 9” deep), with a skin-coloured cover, which is connected to a notebook (Dell, XPS M1710). The mannequin contains three separate camera tracking systems, arranged to identify any instrument inside the simulator from three different angles. The camera tracking systems capture instrument motion with Cartesian coordinates in the x, y and z planes at an average rate of 30 frames per second (fps). The distal end of the laparoscopic instrument shaft is covered with two pieces of yellow electrical tape to serve as a reference point for the camera tracking system; the simulator therefore accepts a broad range of instrument types. Instrument movement is recorded and stored in distinct sections, each running from the time the tips of the instrument are detected until they are removed from the mannequin. The notebook was positioned so that the participant had the screen placed just below eye level and the mannequin was placed at a standard ergonomic height for performing the laparoscopic tasks.

The simulator records time, path length and smoothness of movement (through changes in instrument velocity and changes in direction) during each separate task within the training module. After completion of the task, ProMIS provides statistics on the screen. In addition, a full video and virtual playback of the trainee’s performance are saved. Different trays may be placed in the mannequin for each task, such as suturing pads for the suture and knot-tying task. During training, 26173 KL and 26173 KAL KOH macro needle-holders (Karl Storz, Tuttlingen, Germany) with Syneture (Covidien) Polysorb 3-0 suturing needle and thread were used.
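As an illustration of how such motion metrics can be derived from tracked tip positions, the following minimal sketch computes path length, duration and a crude smoothness proxy from a list of (x, y, z) samples. It is not the ProMIS implementation: the function names, the sample data and the smoothness proxy (mean absolute change in speed between frames) are assumptions made for this example, and the coordinate units are whatever the tracking system reports.

```python
import math

FPS = 30  # average tracking rate stated in the text


def path_length(samples):
    """Sum of straight-line distances between consecutive (x, y, z) samples."""
    return sum(math.dist(a, b) for a, b in zip(samples, samples[1:]))


def duration_seconds(samples, fps=FPS):
    """Elapsed time covered by the samples at a fixed frame rate."""
    return (len(samples) - 1) / fps if samples else 0.0


def mean_speed_change(samples, fps=FPS):
    """Hypothetical smoothness proxy: mean absolute change in speed
    between consecutive frames (lower means smoother movement)."""
    speeds = [math.dist(a, b) * fps for a, b in zip(samples, samples[1:])]
    changes = [abs(v2 - v1) for v1, v2 in zip(speeds, speeds[1:])]
    return sum(changes) / len(changes) if changes else 0.0


# Illustrative track of three tip positions (made-up values).
track = [(0, 0, 0), (3, 4, 0), (3, 4, 12)]
print(path_length(track), duration_seconds(track), mean_speed_change(track))
```

Metrics of this kind summarise how much and how evenly the instrument moved, but say nothing about where it moved or about the resulting knot.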

Questionnaire

The questionnaire used to research the face validity in this study consisted of three parts. The first part was about the demographics and laparoscopic and simulator experience of the trainees. In the second part, questions were asked about the realism and didactic value of the suturing module of the ProMIS v2.0 laparoscopic simulator. These questions were answered on a five-point Likert scale. The final questions asked the opinion of the participants on the size of the dome in the module and preference of simulation technique for practising laparoscopic suturing skills.

Informed consent was signed by all participants, stating that they voluntarily participated in this study.

Evaluation form

Two independent expert observers rated the performances of the participants by means of a standard evaluation form, which consisted of seven items, scored on a five-point Likert scale. This was to research the concurrent validity of the model, as these standard evaluation forms are used in the thoroughly validated Fundamentals of Laparoscopic Surgery (FLS) to assess the suturing performance. The following items were used: position of the needle in the needle holder, running the needle through the suturing pad, taking proper bites of the suturing pad while doing the suture, throwing the thread around the needle holder, pulling tight the thread in the proper direction, tying a correct surgeon’s knot and global evaluation of the performance. Both observers were experienced with laparoscopic suturing and knot-tying using the same technique as in the module.

Protocol

All participants (both experienced and novice) started the suturing module at the beginner level and performed two runs of the task. Only the scorings of the second run were recorded for construct validity, to avoid bias in the scorings because of unfamiliarity with the simulator and module. The novice participants practised their suturing skills more extensively on this module as part of a training, from which the baseline knot and the knots at both the individual and average performance curve were also used for this study. The scorings of the assessment method were compared with the scores of two independent expert observers, who observed the video recordings of the performances and scored them by means of the evaluation form.

After finishing the session, all participants filled out the questionnaire regarding their opinion on this adapted suturing module and the assessment method to evaluate the face validity of the module.

Statistics

The data were processed and analyzed with the Statistical Package for the Social Sciences (SPSS) version 13.0. Differences of opinion between the groups were analyzed with the independent t-test, as were the performance scores of the two experience levels. The correlation between the performance scores of the assessment method and the scorings of the objective observers was assessed with Spearman’s rho. The interobserver reliability was calculated with Cronbach’s alpha. A p-value ≤0.05 was considered statistically significant.
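For readers who wish to reproduce this type of analysis outside SPSS, the sketch below shows one way to compute Spearman’s rho and Cronbach’s alpha with the Python standard library only. It is an illustrative reimplementation under stated assumptions, not the SPSS procedure used in the study: ties receive average ranks, the two observers are treated as the "items" for Cronbach’s alpha, and all function names are chosen for this example.

```python
from statistics import mean, variance


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(ranks(x), ranks(y))


def cronbach_alpha(raters):
    """Cronbach's alpha; `raters` holds one equal-length score list per rater."""
    k = len(raters)
    totals = [sum(scores) for scores in zip(*raters)]
    item_variance = sum(variance(r) for r in raters)
    return k / (k - 1) * (1 - item_variance / variance(totals))
```

Here `spearman_rho` would be applied to the simulator scores and the observers’ ratings of the same performances, and `cronbach_alpha` to the two observers’ ratings.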

Results

Suturing module

The standardised suturing technique for the surgeon’s knot was used as previously described by Hanna et al. [11]. This suturing and knot-tying technique is divided into several steps. The step-by-step approach of this suturing module was built with guidance, by means of a dome and an arrow, to pull the knot tight in the proper direction. The assessment method of this module is based on the placing of the instruments. When throwing the thread around the needle-holder, the instruments have to be inside the dome; when pulling the knot tight, the pulling instrument can move outside the dome (following the direction of the guiding arrow), but the instrument holding the tail end has to stay inside the dome (Figs. 1, 2, 3, 4). The outcome of this assessment method is presented at the end of the performance as a calculation of the percentage of the time spent in the correct area for each step and the strength of the knot. If an error is made (e.g. taking an instrument out of the dome during knot tying) the dome will turn bright blue until the error is corrected. This error percentage is shown in the assessment parameter: time spent in the correct area. The second assessment parameter used in the assessment score is the strength (quality) of the knot, which was tested by cutting the suture out of the suturing pad and pulling at the cut ends with a tension meter. This showed whether the knot would slip or break when pulling at it with at least 25 N, which a correct surgeon’s knot should be able to endure [11, 12].
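The rule described above can be sketched in code as follows. This is a minimal illustration and not the module’s actual algorithm: the dome is assumed here to be a hemisphere of a given radius resting on the suturing pad, each frame is assumed to be labelled with its step ("throw" or "pull"), the check on the pulling direction is omitted, and all names are hypothetical. How the module combines the two parameters into one score is not stated in the text, so no combined score is computed.

```python
import math

MIN_KNOT_STRENGTH_N = 25  # threshold a correct surgeon's knot should endure


def inside_dome(point, centre, radius):
    """True if the point lies in a hemisphere resting on the plane z = centre z."""
    return point[2] >= centre[2] and math.dist(point, centre) <= radius


def percent_time_correct(frames, centre, radius):
    """Percentage of frames in which the instruments are where they should be.

    frames: list of (step, pulling_tip, tail_tip), with step "throw" or "pull".
    During "throw" both tips must be inside the dome; during "pull" only the
    tip holding the tail end must stay inside.
    """
    if not frames:
        return 0.0
    correct = 0
    for step, pulling_tip, tail_tip in frames:
        tail_ok = inside_dome(tail_tip, centre, radius)
        if step == "throw":
            correct += tail_ok and inside_dome(pulling_tip, centre, radius)
        else:
            correct += tail_ok
    return 100.0 * correct / len(frames)


def knot_holds(measured_force_n):
    """True if the knot neither slipped nor broke below the threshold."""
    return measured_force_n >= MIN_KNOT_STRENGTH_N
```

A frame counts as an error exactly when the module would colour the dome blue under these assumptions, so the returned percentage corresponds to the parameter "time spent in the correct area".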

Fig. 1

The dome is a simulated area in which the trainee has to stay during knot-tying. When pulling the knot tight, there is a guidance arrow indicating the correct direction to tie a surgeon’s knot. The proper instrument can come out of the dome in the guided direction during this step

Fig. 2

When the knot is pulled tight in the wrong direction, the dome will turn bright blue until the error is corrected

Fig. 3

The guidance arrow indicates the correct direction in which to pull the second knot tight

Fig. 4

When the instrument holding the tail end comes out of the dome, the dome will turn bright blue until the error is corrected

The suturing module is divided into three difficulty levels, in which the dome is the largest in the beginner level and the smallest in the advanced level. The size of the dome in the middle level is comparable to the area available for suturing the crura or common bile duct in the clinical setting.

Validity of suturing module

According to the assessment method, the experienced participants scored significantly better in the beginner-level mode than the novice participants (mean 95.73 vs. 60.89) (Table 1). For the separate assessment parameters of time spent in correct area and strength of knot, the experienced participants also scored significantly higher.

Table 1 Differences in performance scores

When the participants were asked about the properties of this suturing simulator, the haptic sensations were rated good to excellent by the majority (Table 2). The demonstration videos before the task were considered good for training (mean 4.35), while the videos during performance were rated as less useful (mean 3.21). The experienced participants even rated the step-by-step videos with a mean of 2.60, which is significantly worse than the novice participants (p = 0.001). The size of the dome in the beginner-level mode was rated as good for training by 16 participants, while six were of the opinion that it was too small and two did not have an opinion on this matter. When asked about the representation of the performance by the assessment scores, 18 were of the opinion that it was a good representation, two thought it was too high, one that it was too low and three had no opinion. The suturing module was rated as a good to excellent training tool for training of laparoscopic suturing for surgical residents (mean 4.50).

Table 2 Opinion of the participants on the suturing module and assessment method

Assessment method

The performance scores of the assessment method were compared with the scorings of the same performances (n = 43) rated by the objective observers (on the standard evaluation form) and showed a significant correlation (Spearman’s rho 0.672), with an interobserver reliability of 0.96 (Cronbach’s alpha). The scoring of the observers also correlated significantly with both the time spent in the correct area and the strength of the knot (Table 3).

Table 3 Correlation between the scores of the assessment method and the performance scores graded by the objective observers, for all baseline knots and the knots at the top of the performance curve of the novice participants (n = 43)

When comparing the separate assessment parameters (Table 4), there were strong correlations between the total assessment scores and both the time spent in the correct area and the strength of the knot (n = 229, Spearman’s rho 0.719 and 0.830, respectively), which is expected because the assessment score is made up of these parameters. The parameter time spent in correct area had some correlation with the strength of the knot (Spearman’s rho 0.257), but, as seen in Fig. 5, this calculated correlation has no practical relevance; a rho value <0.4 was not regarded as a relevant correlation. The secondary parameter, time, also showed a significant calculated correlation with the assessment score and the separate assessment parameters, but, as is clear from Figs. 6, 7, 8, time to complete the suture does not give a good impression of the primary assessment score.

Table 4 Correlation between assessment parameters for all performances of the beginner level of the suturing module (n = 229)
Fig. 5

Scatterplot of the correlation between the strength of the knot and the time spent in the correct area during the suturing training (n = 229), showing a calculated significant correlation, although this figure shows no clinically relevant correlation

Fig. 6

Scatterplot of the correlation between the assessment scores and the time to complete the task (n = 229), showing a calculated significant correlation, although this figure shows no clinically relevant correlation

Fig. 7

Scatterplot of the correlation between the time to complete the task and the time spent in the correct area during the performance (n = 229), showing a calculated significant correlation, although this figure shows no clinically relevant correlation

Fig. 8

Scatterplot of the correlation between the time to complete the task and the strength of the knot (n = 229), showing a calculated significant correlation, although this figure shows no clinically relevant correlation

Discussion

Augmented reality

Augmented reality is the combination of physical and virtual reality in one system. Real instruments, which are modified by means of coloured tags on the tips, are video-tracked by the system to measure the performance of the tasks. This results in the objective assessment of the real physical tasks performed by the trainees, and thus in an objective scoring of that performance.

A major advantage of the ProMIS AR laparoscopic simulator over computer-based VR simulators is that it allows the trainee to use the same instruments that are currently used in the operating room. The simulator provides realistic haptic feedback because of the hybrid mannequin environment in which the trainee is working, which is absent in virtual reality systems. This simulator offers a physically realistic training environment that is based on real instruments interacting with real objects. This physical character is regarded as very important to learn laparoscopic suturing and knot-tying skills. The participants of the current study also appreciated the realistic haptic feedback of the augmented reality, as shown in Table 2.

Meaningful feedback

When learning laparoscopic skills, which are distinctive motor skills acquisitions, it is essential to provide feedback to stimulate the learning process [13]. A previous study by Porte et al. [14] demonstrated that information about motion efficiency in the form of number of movements made during the learning of knot-tying skills, with or without expert-derived criterion, was not as valuable to the learning process as expert feedback. Presumably, the feedback given by the experts was more understandable to the trainees than the feedback of time, path length and economy of movement. This type of feedback is referred to as extrinsic feedback, which should guide and motivate trainees to reach their performance goals [14, 15]. To motivate trainees to practise their skills, this extrinsic feedback has to be meaningful and informative, which can be in the form of expert feedback. However, intense extrinsic feedback can hinder learning during the early stages of skills acquisition by inhibiting intrinsic learning strategies [15]. Therefore, feedback at the end of each task, in the form of time spent in the correct area per step and knot quality, is more meaningful than motion efficiency and should hinder the trainee less during the training than would expert feedback. Other extrinsic feedback that could guide trainees is demonstration videos before and during the training, which are also provided in this suturing module.

To provide informative feedback it is important that meaningful measurements, such as time spent in correct area and the strength of the knot, are combined, because focusing on only one is not sufficient to improve the skills. There are some correlations between these measurements, but it is clear that in individual cases the separate measurements do not give a proper assessment of performance. The statistical significance of these correlations could be a consequence of the large number of knots used for the calculations. The correlation between the assessment measurements has to be clinically relevant, and a correlation coefficient of <0.4 cannot be considered a clinically relevant correlation. Time does not show a clinically relevant correlation with any of the assessment measurements, nor do time spent in the correct area and strength of knot with each other (Table 4), whereas a calculation from these two measurements gives a good impression of the suturing skills when compared with the ratings of the objective observers.
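The effect of sample size on significance can be made concrete with a short calculation. The sketch below uses the common t-approximation for testing a correlation coefficient against zero, t = rho · sqrt((n − 2) / (1 − rho²)) with n − 2 degrees of freedom; the text does not state how the reported p-values were derived, so this approximation and the function names are assumptions for illustration only.

```python
import math


def rho_t_statistic(rho, n):
    """t-approximation for testing a correlation coefficient against zero."""
    return rho * math.sqrt((n - 2) / (1 - rho ** 2))


def shared_rank_variance(rho):
    """Proportion of variance in the ranks shared by the two measurements."""
    return rho ** 2


# Time spent in the correct area vs. strength of knot (values from the text).
t = rho_t_statistic(0.257, 229)
r2 = shared_rank_variance(0.257)
print(f"t = {t:.2f}, rho^2 = {r2:.3f}")
```

For rho = 0.257 and n = 229 this gives a t value of roughly 4.0 on 227 degrees of freedom, well beyond the two-sided 5% critical value of about 1.97, while rho² is only about 0.07. A weak association can therefore be statistically significant with this many knots and still explain very little of the variation in individual performances.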

Assessment method

To determine the end point of the training of suturing skills it is important to know what the trainee is doing and which path has been travelled to get to the final knot. Parameters such as time, path length and smoothness do not tell you anything about the exact movements that are made within the mannequin to get to that final knot. Therefore it is important to create a three-dimensional space in which the trainee has to stay while throwing the thread around the needle-holder. This space is imagined as a dome (a cage on the suturing ground), based on the average suturing path travelled by experienced laparoscopic surgeons. The physical dimensions of the dome are derived from measurements of experienced laparoscopic surgeons when suturing crura or common bile duct. This is also the ideal space within which to stay during suturing, which makes the handling of the instruments and throwing the thread around the needle the easiest. Additionally, it is important that the surgical resident learns to suture in a confined space, because in the clinical setting there is always the chance of puncturing the liver or spleen during suturing.

One of the measurements from which the assessment score is calculated is the quality of the tied knot (i.e., strength of knot), which provides a reliable assessment of the security of the knot [11, 12] and is considered the most important factor in tying a knot.

Another major advantage of the dome and guidance arrows during training is the fact that the suturing procedure is divided into steps to show the trainee precisely and unequivocally how to perform the suture correctly [2]. The dome (Figs. 1, 2, 3, 4) itself is only for the path of throwing the thread over the needle-holder. When the thread has to be pulled tight, the trainee has to come out of the dome in the proper direction, which is calculated from the ideal path of experts. It is important that the trainee only pulls on the needle end of the thread and only with the proper hand, in the correct direction, to create a surgeon’s knot. With this dome, the length of the tail end can also be assessed, as the instrument holding the tail end has to stay inside the dome while pulling the knot tight.

The performance scores of the assessment method showed significant differences between the two experience groups and therefore demonstrate construct validity [16]. As shown in the tables, there is a significant correlation between the scorings of the assessment method and the scorings of the objective observers, which demonstrates the concurrent validity [16] of the developed assessment method for laparoscopic suturing training.

Limitations

The group of experienced participants is smaller than the novice group, which can be attributed to the fact that fewer experienced surgeons were available to take part in this study. Because the novice participants tied 16 knots each (of which three were assessed by the objective observers), there were enough runs of the task on the suturing module to calculate correlations between the assessment parameters and the scorings of the objective observers.

The quality of the visual feedback on the screen was not optimal and, in combination with the projection of the dome over the instruments on the screen, the participants (both experienced and novice) regarded the dome sometimes as a nuisance, because of the lack of vision of the needle and thread during the performance. Therefore the quality of the camera should be improved, as should the way of visualizing the dome on the screen, instead of an overlay over the instruments and suturing material. There is also room for improvement in the step-by-step demonstration videos and spoken guidance, as these were properties of the module that could not be adapted. The demonstration video shown at the beginning of the training was constructed for this study and was not part of the suturing module. The step-by-step videos were also rated as less useful during the training, with a significantly worse rating by the experienced participants. This can be explained by the fact that they are useful in the beginning of the learning process, but not when the steps of the procedures are clear to the trainee. The demonstration video with the proper steps, shown before the training, was rated better by the experienced participants of the study than the step-by-step videos (mean 4.22 vs. 2.60).

Conclusions

The current study shows the construct, concurrent and face validity of the suturing module, with the adapted assessment method on the ProMIS laparoscopic simulator. This assessment method is a valid tool for assessing laparoscopic suturing skills objectively. Although assessment parameters can correlate, to provide informative feedback it is important to combine meaningful measurements, e.g. strength of knot, in the assessment of suturing skills. We recommend incorporating simulator systems with an informative assessment method for laparoscopic suturing training, as described and validated in this study, into the training curricula for surgical residents.