Surgeons require a specific set of advanced technical skills to safely perform minimally invasive surgery (MIS). These technical skills include hand–eye coordination, depth perception and handling long instruments with reduced tactile feedback [1,2,3,4]. The surgical residency program has followed an apprenticeship model of “see one, do one, teach one” for more than a century [5]. However, after the introduction of the relatively complex MIS, this approach of training in the operating room resulted in increased patient injuries and associated health care costs [6,7,8]. Over the last decades, reports on improvement of patient safety by simulation training have been published [9,10,11]. This has started a paradigm shift in the way surgeons are trained, following a new model of “see one, simulate many, do one” [5, 12].

Traditionally, laparoscopic skills have been assessed subjectively using forms such as OSATS, GOALS, and OPRS [13,14,15]. However, it is important to include objective assessment in skills training to provide supervisors with a consistent tool to assess the skills of surgical residents [1]. A combination of performance parameters has been classified, representing tissue manipulation and instrument handling skills, which enables objective assessment of laparoscopic skills [16]. Our research group previously reported the successful implementation of objective performance parameters in basic laparoscopic skills training of first-year surgical residents, enabling objective assessment of learning curves [17].

Objective assessment of the learning curve is essential to determine when proficiency levels have been acquired [17,18,19,20]. It has been demonstrated that baseline performances of psychomotor ability uniquely predict the learning curve during laparoscopic skills training with virtual reality simulators [21, 22]. Predicting learning curves at an early stage of training would allow the creation of individually adjusted skills training programs in the near future [21, 23]. The aim of this study was to analyze and predict the learning curve of basic laparoscopic technical skills.

Methods

Participants

First-year surgical residents who completed a basic laparoscopy course between April 2020 and June 2021 were included for prospective data analysis. Surgical residents from multiple teaching hospitals in the Netherlands were included. The Basic Laparoscopy Course is part of the surgical residency program, and participation in the study was voluntary and without consequences. The study was exempt from Ethical Board review.

Protocol and the basic laparoscopy course

The basic laparoscopy course consisted of a 3-week at-home laparoscopic box training course, followed by a hands-on training day at the Amsterdam Skills Centre, during which trainees performed a laparoscopic appendectomy and cholecystectomy on fix for life cadaver models [17]. Trainees received a laparoscopic box trainer and were instructed to train a minimum of five sessions a week, performing six different validated laparoscopic tasks [13, 24,25,26] (Supplemental File A). Measurements were compared to predefined proficiency levels, which were equal to mean parameter outcomes of 7 surgeons [13]. The scoring system consists of a scale of 1–10, with 8 being the proficiency level (pre-set competency based on experts). The score is the average of the force, motion and time scores, each of which is scored individually.
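The scoring rule described above can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, the text does not specify how each individual sub-score is derived from the raw measurements, and treating a score of exactly 8 as proficient is an assumption.

```python
from statistics import mean

PROFICIENCY_SCORE = 8.0  # proficiency level on the 1-10 scale, as stated in the text


def composite_score(force_score: float, motion_score: float, time_score: float) -> float:
    """Average the three individually scored parameters (each on a 1-10 scale)."""
    for sub_score in (force_score, motion_score, time_score):
        if not 1.0 <= sub_score <= 10.0:
            raise ValueError("each sub-score must lie on the 1-10 scale")
    return mean([force_score, motion_score, time_score])


def is_proficient(score: float) -> bool:
    """Assumption: a composite score at or above 8 counts as proficient."""
    return score >= PROFICIENCY_SCORE


# Hypothetical sub-scores for one trial (not data from the study).
example_score = composite_score(9.0, 7.0, 8.0)
print(example_score, is_proficient(example_score))
```

Because the composite is a plain average, a strong force score can compensate for a weaker motion score, which is one reason the parameters are also analyzed separately in the sections that follow.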

At the end of this course, the trainees performed the six tasks once again as a post-course assessment. Overall progression was measured by comparing baseline and post-course assessment. Objective force, motion and time parameters were measured, representing tissue manipulation and instrument handling skills [16, 27].

System and materials

The Lapron box trainers (Amsterdam Skills Centre, Amsterdam, The Netherlands) [28] were utilized during the basic laparoscopy course. The box trainers were equipped with the ForceSense objective measuring system (MediShield B.V., Delft, the Netherlands) [29], which uploaded all measurements and recordings to an online database. Six previously validated laparoscopic tasks were included: Post and Sleeve, Loops and Wire, Flap task, Wire chaser, Pattern cut and Zigzag loop [17] (Supplemental File A). Furthermore, the Lapron box trainer was equipped with two curved Maryland grasping forceps, one laparoscopic scissor and a laparoscopic axial needle holder (Aesculap, B. Braun, Melsungen, Germany). All statistical analyses were performed using the 26th version of IBM SPSS Statistics. Graphs were created using GraphPad (Prism 9.0.0, San Diego, California USA).

Statistical analyses

Learning curves that show maximum force (N), path length (mm) and time (s) were created for the six tasks, displaying the group mean and proficiency levels. The path length was defined as the total distance travelled by the laparoscopic instruments and the maximum force was defined as the maximum absolute force applied on the laparoscopic tasks [13].
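The two definitions above can be made concrete with a short sketch. It assumes that instrument positions are available as sampled (x, y, z) coordinates in mm and forces as sampled 3D vectors in N; the actual data format of the measuring system is not described here, and the function names and sample values are hypothetical.

```python
import math


def path_length(positions):
    """Total distance travelled: sum of Euclidean distances between consecutive samples (mm)."""
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))


def max_abs_force(forces):
    """Largest force magnitude among the sampled force vectors (N)."""
    return max(math.hypot(*f) for f in forces)


# Hypothetical samples for a single instrument (not data from the study).
tip_positions = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
task_forces = [(0.0, 0.0, 1.0), (-3.0, 0.0, 4.0), (1.0, 1.0, 1.0)]

print(path_length(tip_positions))  # 5 mm + 12 mm
print(max_abs_force(task_forces))  # magnitude of (-3, 0, 4)
```

With two instruments in use, the total path length under this sketch would be the sum of `path_length` over both instruments.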

Linear regression tests were performed to predict the learning curve at an early stage of training using IBM SPSS statistics 28 (SPSS Inc., Chicago, Illinois USA). Baseline performances of the parameters maximum force (N), path length (mm) and time (s) were included as independent variables. These baseline performances were defined as the average scores of the first three measurements. The number of sessions needed to reach the proficiency level was included as the dependent variable. Linear regression tests were performed separately for each parameter of the six tasks. All sessions that were not successfully completed due to unforeseen circumstances and tasks that were performed fewer than three times were excluded from analysis. Trainees who did not reach the proficiency level for one of the three parameters were excluded from the analysis of that specific parameter. Post hoc power analyses were performed using G*Power (Supplemental File A, Table A1).
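The setup described above (baseline as the mean of the first three measurements, sessions to proficiency as the outcome, and the two exclusion rules) can be sketched in plain Python. This is an illustration, not the SPSS procedure used in the study. The function names and toy measurements are hypothetical, and counting the first session at or below the benchmark as "reaching proficiency" is an assumption.

```python
from statistics import mean


def baseline(measurements):
    """Baseline performance: average of the first three measurements."""
    return mean(measurements[:3])


def sessions_to_proficiency(measurements, benchmark):
    """1-based index of the first session at or below the benchmark, else None.

    Assumption: lower values are better for time, path length and maximum force.
    """
    for session, value in enumerate(measurements, start=1):
        if value <= benchmark:
            return session
    return None


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


# Hypothetical completion times (s) per session for one task; benchmark is made up.
BENCHMARK = 100.0
trainees = {
    "A": [150, 140, 130, 99, 90],
    "B": [200, 190, 180, 150, 120, 95],
    "C": [260, 240, 220, 200, 180, 150, 120, 100],
    "D": [300, 290, 280, 270],  # never reaches the benchmark: excluded
    "E": [120, 110],            # fewer than three measurements: excluded
}

xs, ys = [], []
for series in trainees.values():
    if len(series) < 3:
        continue
    n_sessions = sessions_to_proficiency(series, BENCHMARK)
    if n_sessions is None:
        continue
    xs.append(baseline(series))
    ys.append(n_sessions)

intercept, slope = fit_line(xs, ys)
print(xs, ys, intercept, slope)
```

In this toy data set only trainees A, B and C enter the regression, which mirrors how excluding non-proficient trainees shrinks the sample for each parameter.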

Results

Learning curve analysis

A total of 6010 trials, performed by 42 trainees from 13 Dutch hospitals, were included for analysis. Figure 1 shows the proficiency graphs of the parameters maximum force (N), path length (mm) and time (s) for the Post and Sleeve laparoscopic task. Proficiency level graphs of all six tasks are provided in Supplemental file B.

Fig. 1 Post and Sleeve learning curves

For the Post and Sleeve, the benchmark of maximum force was reached at the 4th session, while the benchmark of the path length was reached at the 32nd session and the benchmark of time at the 21st session of the training. Supplementary Figures B2-B6 show an improvement of the mean and standard deviation over the initial training sessions, after which it gradually levels out in the plateau phase.

Table 1 shows the proficiency level of each parameter and the number of trainees that reached the proficiency level. For all tasks, the proficiency level of the maximum force was the first to be acquired, except for the Wire chaser, in which the benchmark for the maximum force was reached at the 38th session. The benchmarks of the parameters time and path length were reached at the same time for the Flap task and the Pattern cut, while for the other tasks the proficiency level for time was reached before the proficiency level of the path length. Moreover, 19 out of 42 trainees reached the proficiency level of the path length for the Loops and wire, 21 out of 42 trainees reached the proficiency level of the maximum force for the Wire chaser, and 31 out of 42 trainees reached the proficiency level of the path length for the Zigzag loop. These three parameters were reached by the lowest number of trainees. The remaining 15 parameters were reached by more than three quarters of the trainees.

Table 1 Mean session to proficiency and number of trainees that reached the proficiency level

Learning curve prediction

The results of the linear regression analyses are provided in Table 2. For 17 of 18 parameters, the baseline performance had a statistically significant relationship with the number of sessions needed to reach the benchmark. For the path length of the Loops and wire, this relationship was not statistically significant. Fifteen out of 18 dependent variables were not normally distributed and were therefore either log-transformed or square-root transformed, as shown in Table 2. The relationship between the number of sessions needed to reach the benchmark and the baseline performance was quadratic for two out of 18 parameters. For these parameters, curvilinear regression tests were performed in which squared independent variables were included for analysis.

Table 2 Results of linear regression analyses: baseline performances as a predictor of the number of sessions needed to reach the benchmark of the parameters time, path length and maximum force

Table 3 shows the linear regression equations for the estimation of the number of sessions needed to reach the benchmark. Transformed models are either Log – Linear (Log10) or Square-root – Linear. Post hoc power analysis revealed high power (> 0.8) for 16 out of 18 linear regression tests. The power of the path length within the Loops and wire was 0.138, while the power of time within the Zigzag loop was 0.610. See Supplementary file A for results of the post hoc power analyses (Supplementary Table A1).

Table 3 Learning curve Prediction: linear regression equations. Y = Number of sessions needed to reach the benchmark; X = Baseline performances of the parameters time, path length and maximum force
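To show how a Log10 – Linear or Square-root – Linear equation is converted back to a number of sessions, the sketch below applies the inverse transform to the fitted linear part. The coefficients are hypothetical placeholders, not the values from Table 3, and the function name and the optional squared term (for the curvilinear models) are illustrative assumptions.

```python
def predict_sessions(x, intercept, slope, transform="none", quad=0.0):
    """Predicted number of sessions Y for baseline performance x.

    transform: "none"  -> Y       = intercept + slope*x + quad*x^2
               "log10" -> log10 Y = intercept + slope*x + quad*x^2
               "sqrt"  -> sqrt Y  = intercept + slope*x + quad*x^2
    quad is the coefficient of the squared term; 0.0 gives a purely linear model.
    """
    linear_part = intercept + slope * x + quad * x * x
    if transform == "none":
        return linear_part
    if transform == "log10":
        return 10.0 ** linear_part   # invert the log10 transform
    if transform == "sqrt":
        return linear_part ** 2      # invert the square-root transform
    raise ValueError(f"unknown transform: {transform}")


# Hypothetical coefficients and baseline value, for illustration only.
print(predict_sessions(100.0, 0.5, 0.01, transform="log10"))
print(predict_sessions(100.0, 1.0, 0.03, transform="sqrt"))
```

Squaring only inverts the square-root model correctly when the linear part is non-negative, so predictions far outside the observed baseline range should be treated with caution.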

Discussion

This study showed that it was possible to predict the learning curve of laparoscopic technical skills in a basic laparoscopy course at an early stage of training. By performing a laparoscopic task three times, it is possible to calculate how many repetitions are needed to reach the benchmark for the force, motion and time parameters. For example, consider a trainee who completed the Post and Sleeve task three times with the following average outcome: time 238 s, path length 12,836 mm and maximum force 2.94 N. According to the calculations in Fig. 2, this trainee would be advised to perform 19 repetitions to reach the time benchmark, 31 repetitions to reach the path length benchmark and two repetitions to stay within the maximum force benchmark.

Fig. 2 Example of prediction calculation

This makes it possible to identify trainees who require more time and feedback for their laparoscopic training. Furthermore, it becomes possible to identify trainees who require less training time for basic laparoscopic skills and can therefore advance earlier to more complex laparoscopy. Lastly, using the current methods, this learning curve prediction model can be recreated for other laparoscopic (and robotic) training tasks and curricula.

Proficiency graphs displaying the group learning curve made it possible to analyze the learning curve and determine if and when the proficiency level is reached for each parameter of the six tasks. For all but one task, the benchmark of maximum force was the first to be reached. The benchmarks of the parameters time and path length were either reached at the same time, or the benchmark of time was reached before the benchmark of the path length. This indicates that trainees need the most time to improve their path length. This is supported by the analysis of the number of trainees that reached the proficiency level, in which the benchmark of the path length was reached the least often. This is consistent with our previous research [17]. At the start of the training, the majority of novices are focused on safe tissue manipulation and on completion of the task, resulting in a longer completion time and more instrument movements. Furthermore, efficient handling of instruments is more inherent to experts, and their proficiency levels in this metric are relatively high. Using the above-mentioned prediction model, more feedback and guidance can now be given in an early phase for path length parameters.

Furthermore, differences between tasks and parameters can be examined. The proficiency levels for the Zigzag loop were reached at a later stage of training, while the three benchmarks for the Pattern cut were reached within three sessions, suggesting that the Pattern cut is easier to perform than the other tasks.

The learning curve of all residents improved rapidly during the first sessions, after which it gradually leveled out in the plateau phase, as expected [30]. This suggests that it is possible to predict the learning curve at an early stage of training. Stefanidis et al. (2017) stated that baseline performances, which were defined as average scores of the first three measurements, might be of value in the prediction of skill acquisition in laparoscopic skills training with virtual reality simulators [31]. This was consistent with our analysis, in which the mean of the first three measurements was defined as the baseline performance and included as a predictor variable in the prediction of the learning curve in the basic laparoscopy course.

A personalized curriculum could be improved by showing the trainee's performance level and the proficiency levels during training. This enables comparison of the trainee's performance with the proficiency level, the group mean and quartiles 1 to 3. Displaying these proficiency graphs during training can enhance the individual feedback that is received directly by trainees [32]. As an example of personalized training, the Amsterdam UMC and the 13 affiliated teaching hospitals have a Minimally Invasive Surgery Curriculum in which personalized training is implemented. Since 2018, the Basic Laparoscopy Course has been mandatory for junior residents. The course participants receive box training with objective feedback and are examined on fix4life human cadavers in the Amsterdam Skills Centre. After obtaining the certificate, the residents perform laparoscopic procedures in the OR.

A limitation of this study is that trainees who did not reach the proficiency level were excluded from the learning curve prediction analysis. As a result, the prediction model may be overly optimistic. This exclusion was necessary because, for statistical reasons, only trainees who achieved proficiency could be reliably used to build the prediction model. However, the prediction model is still able to distinguish between underperformers and overperformers in an early phase. This implies that overperformers would not need to use expensive training facilities for an extended period, and that underperformers will be identified in an early phase for additional personalized training. A personalized prediction model can be applied universally for all trainees.

In conclusion, measurement of objective force, motion and time parameters can predict the time needed to reach proficiency, allowing tailored and personalized training.