Objective assessment of laparoscopic targeting skills using a Short-Time Power of Difference (STPOD) method

Purpose To ensure that the use of surgical training tools results in improvement of surgical skills, it is necessary to be able to measure and assess surgeons’ skills. We established the Short-Time Power of Difference (STPOD) method as an evaluation tool for evaluating targeting technique. The STPOD method evaluates the distance from the actual movement of the forceps to the shortest linear path between two points in a short time period. We examined the effectiveness of the STPOD method as a new forceps kinematic analysis. Methods Six residents were categorized as novices and six urologists as experts. All participants performed box trainer training and LapPASS® Simulator training. During the procedure, objective scores (time, distance, and STPOD) were recorded. STPOD (Power) evaluated motion smoothness and STPOD (Stop) evaluated the stop time of the forceps. Results STPOD (Stop) on the right side of the experts was significantly lower than that of the novices in the box trainer. Furthermore, there were significant differences in the distances of left side and STPOD (Power) between the experts and the novices in the simulator. In the correlation of parameters between the box trainer and the simulator, time showed the strongest correlation, STPOD (Power) and distance showed a mild correlation. Conclusion We showed the construct validity of STPOD (Power) and STPOD (Stop) using both the box trainer and the simulator. This method is a good evaluation tool for assessing a physician’s skill; however, there are much more complex motions that are performed in actual surgery. Future studies are needed to focus on evaluation in an environment closer to actual surgery and comparing with other existing methods.


Introduction
To ensure patient safety, it is desirable for surgeons to practice surgical procedures before performing them. In addition, in order to improve their skill levels and shorten learning curves and procedure times, surgeons are required to practice procedures outside of the operating room. Simulation tools meet such demands. To ensure that such training results in improvement of surgical skills, it is necessary to be able to measure and assess surgeons' skills. Thus, it is important to understand the performance differences between experienced and novice surgeons.
Previous studies distinguished experienced surgeons from novices using performance scores based on performance time, bleeding volume, and number of errors made during an operation. Some studies evaluated laparoscopic skills using objective evaluating methods such as the Global Operative Assessment of Laparoscopic Skills (GOALS) [1] and Global Evaluation Assessment of Robotic Skills (GEARS) [2].
Especially in laparoscopic surgery, some studies examined psychomotor skills. Using electromagnetic positiontracking sensors, kinematic analyses of motion, involving parameters such as time, path length, and speed, have been performed [3,4]. Compared with the novices, experienced surgeons were expected to handle manipulators more smoothly during laparoscopic surgery and spend a shorter time to imagine their next action due to their experience, which would result in a shorter time without moving the forceps. However, quantitative parameters, such as motion smoothness (MS) and the blank time when the forceps are not moved, are controversial. There is no established objective method for evaluating targeting technique (applying forceps to an object), which is the basic movement of laparoscopic surgery. Most existing evaluation methods of motion smoothness use acceleration in the form of three-dimension vectors. We initially used the same method; however, it was difficult to distinguish between novices and experts. Therefore, we established the Short-Time Power of Difference (STPOD) method as an evaluation tool for assessing targeting technique. Herein we examined the effectiveness (construct validation) of the STPOD method as a new forceps kinematic analysis method.

Materials and methods
This study was approved by the Institutional Review Board of Yokohama City University. Participants signed their informed consent to participate in the study. The data gathered were coded, and all reporting was confidential and did not impact the official evaluation. Participants could choose to withdraw at any point during the study and they were made explicitly aware of this at the time of informed consent. Six residents were categorized as laparoscopic novices (no experience in laparoscopic surgery) and six urology doctors as laparoscopic experts (> 20 laparoscopic procedures completed and having a surgical skill qualification issued by the Japanese Society of Endourology) [5]. The dominant hand was all right. All participants were oriented to the box trainer and the training simulator LapPASS ® (Mitsubishi Precision, Japan, https://www.mpcnet.co.jp/product/ lappass/) [6][7][8][9], and given a demonstration of both. The subjects in each category were then randomized into two groups using a randomizing program (https://en.calc-site. com/randoms/grouping). Group 1 underwent training with the box trainer followed by LapPASS ® and Group 2 with LapPASS ® followed by the box trainer ( Fig. 1a and b). The task in the box trainer was touching targets in order using forceps five times using each hand. One participant performed 10 times in total. The electromagnetic tracking system TrakSTAR ® (Mikimoto Beans, Japan, https://tracklab. com.au/products/brands/ndi/ascension-trakstar/) was used to acquire the position information of the forceps. The device was attached to the tip of the tuppel to get position information. The task in LapPASS ® consisted of participants performing hand-eye training six times. This hand-eye training consisted of applying the forceps to the targets in order.
The STPOD method is entirely different from existing methods for kinematic analysis. STPOD is a way to evaluate and quantify "How much the distance is from the position of the tip of the tuppel either to the shortest path between the starting point and the end point, or to the average of all positions visited during a short time period." The value obtained is denoted as P m . We calculate STPOD (Power) and STPOD (Stop) making a graph with P m on the y-axis and time on the x-axis. When P m is less than the threshold, the surgical tool is considered as "not moving." When P m is bigger than the threshold, the surgical tool is considered as "not linear nor smooth." It was calculated using a simple product-sum operation and was suitable for real-time evaluation. A schematic diagram of the STPOD method is shown in Fig. 2a The following is an explanation of the STPOD method.
In the time series data t 1 x 1 , t 2 x 2 , . . . , t e x e , extract short time period 'm' and define it as Here t e means time and x e means the position of the tip of the tuppel. x e represents a three-dimensional vector, but here is simplified to x e for clarity. Time period m means 0.5 [s] in the box trainer, and 1 [s] in the simulator. Next, we defined Power 'P m ' in this period as Here, f m t m i is a standard formula and is defined as aver- a m and b m were defined as: x m and t m were the averages of x m andt m , respectively. The latter part of the formula 0.5 − 0.5 cos 2π is the Hamming window function.
In conclusion, formula (1) represents the sum of the squares of the distance from the position of the tip of the tuppel either to the shortest path between the starting point and the end point, or to the average of all positions visited during time period m, given weight by the window function.
There are two evaluation indexes by the STPOD method: 'power,' which indicates wasteful movement, and 'blank time', which indicates when movement is stopped.

[Evaluation item 1: power]
We used the regression line as the standard. When 'P m ' is bigger than the threshold, the surgical tool is considered to be "not linear." We calculated the sum of the time sections above the threshold. A smaller value means that the surgical instrument moves linearly and smoothly (Fig. 2b).

[Evaluation item 2: blank time]
We used the average as the standard. When 'P m ' was less than the threshold value, the surgical tool is considered to be "stop." We calculated the sum of the time sections below the threshold. A smaller value means that there was shorter stagnation of movement (Fig. 2c).
We evaluated "time" and "distance" as conventional comparative values.
Statistical analysis was performed using EZR [10], which is a graphical user interface for R (The R Foundation for Statistical Computing, Vienna, Austria). Mann-Whitney U test was performed for each pair of groups. In order to correct for multiplicity, we performed the Bonferroni correction. As a result, the significance level of the box trainer (n = 4) changed fromp < 0.05 to p < 0.0125 (0.05/4 = 0.0125), and of the simulator (n = 5) changed to p < 0.01 (0.05/5 = 0.01).

Results
In the box trainer, there was a tendency for all the parameters to improve in both groups as the number of times increased (Fig. 3). For all repetitions together, STPOD (Stop) on the right side and time on the left side of the experts were significantly lower than those of the novices. This shows the construct validation of STPOD (Stop) in the box trainer ( Fig. 4) (Table 1).
In the simulator, both groups improved their score as the number of times increased (Fig. 5).
For all repetitions together, there was a significant difference in distance on the left side and STPOD (Power) between the experts and the novices. This shows the construct validation of STPOD (Power) in the simulator (Fig. 6) (Table  2).
By evaluating the correlation of parameters between the box trainer and the simulator, time showed the strongest correlation (γ = 0.719). Next, STPOD (Power) and distance showed a mild correlation (γ = 0.396 and γ = 0.347, respectively) (Fig. 7).

Fig. 5
Learning curve of the simulator between the expert and novice groups. "rt" stands for right hand and "lt" for left hand

Discussion
Conventionally, for the evaluation of the simulator, time, the amount of bleeding, and objective evaluations, such as GOALS and GEARS, have been used. When the surgeon performed the procedure carefully and spent a lot of time, the amount of bleeding was small and the results were often good; however, it was expected that the burden on the patient would be large. Previous studies reported that a shorter operation time lead to a reduction in postoperative complications [11][12][13][14]. Ideally, a smooth and short surgery is required; however, it has been difficult to evaluate smoothness. In general, it is predicted that the more experienced surgeons have smoother motions and obtain lower values for MS compared to less experienced ones. Furthermore, it is predicted that the experts need a shorter time to think about what they should do next and this results in a shorter blank time.
As an evaluation method, a scoring system, such as GOALS and GEARS, is used; however, these evaluation methods depend on the surgical expert's subjective decisions. For this reason, studies were conducted to distinguish MS between experts and novices using objective methods; however, this was challenging. Maithel et al. evaluated MS using the Computer-enhanced Laparoscopic Training System [15], and found no significant difference. Hofstad et al. evaluated psychomotor skills using the D-Box Basic Simulator [16]. The Aurora Electromagnetic Measurement System was used for the tracking of the forceps and MS was defined There was a significant difference only in the non-dominant hand when comparing novices to experts and intermediates. A further study assessed cholecystectomy performed in a porcine liver box model [17], and MS was defined the same as Hofstad et al.; however, there was no significant difference between the groups. On the other hand, Escamirosa et al. evaluated surgical skills by 13 motion analysis parameters using the EndoViS Training System [18]. The motion of the forceps was recorded by a video-tracking system. They defined MS as abrupt changes in acceleration resulting in jerky movements of the instrument (m/s 3 ). There was a significant difference between the experts and novices in all three tasks, peg transfer, pattern cutting, and intracorporeal knot suture. We initially used acceleration to evaluate MS using existing methods; however, it could not distinguish between the novices and experts. On the other hand, STPOD was able to distinguish the two groups. Thus, this new method is an alternative way to evaluate MS.
In the present study, right-hand distance of the box trainer in the expert group was longer than that of the novice group. This may reflect the habits of each surgeon. In actual laparoscopic surgery, when moving forceps between two points, the forceps are not moved in a straight line and are often moved after the forceps have been pulled back to the hand. Because of their real-world experience, experts may take a longer distance. This study was conducted in the order of right to left hand; thus, the participants recognized the position of the pins and became familiar with the movement in the right-hand session, and this may have resulted in the lack of a significant difference between the two groups in the left-hand session.
The most suitable objective method to evaluate blank time is controversial. Uemura et al. evaluated forceps movements using a labeling system based on predefined terminology. They concluded that skilled participants had a shorter blank time (time without forceps movement) than novices. They proposed that the time spent holding the forceps but not moving them was time spent thinking, and that the shorter blank time for the skilled users was due to their experience and stable manipulation movements [19]. In addition, Hofstad et al. reported that there was a significant difference in the idle percentage (percentage of total time the instrument is moved at speed < 2 mm/s) [17]. On the other hand, some reports concluded that there were no significant differences between expert and novices when defining idle time as the percentage of time where the instrument is considered to be still [4,18].
By using the STPOD method, we evaluated the degree of flicker of the forceps and the blank time when the forceps were not moved. We predicted that there would be a little flicker and blank time in the experts' surgery.
In this study, we found that the experts had less flicker and a shorter blank time in both the box and the simulator. Furthermore, although it is inferior to the objective evaluation method "time" that has been used for a long time, we showed a correlation of STPOD between the box and the simulator. Thus, the STPOD method is effective for evaluating forceps dynamics.
There are some limitations in this study. First, the number of subjects was limited. Second, we focused only on targeting and the task that the participants performed was just moving the forceps between fixed targets. In an actual surgery, there are much more complex motions. For this reason, we did not use GOALS evaluation. In the future, evaluation in an environment closer to actual surgery and comparing the STPOD method with other methods, such as GOALS, are necessary. Third, a comparison between our proposal and existing motion analysis parameters was not performed. Comparing STPOD with the existing method to evaluate motion smoothness is necessary.