Motion analysis for better understanding of psychomotor skills in laparoscopy: objective assessment-based simulation training using animal organs

Background Our aim was to characterize the motions of multiple laparoscopic surgical instruments among participants with different levels of surgical experience in a series of wet-lab training drills, in which participants need to perform a range of surgical procedures including grasping tissue, tissue traction and dissection, applying a Hem-o-lok clip, and suturing/knotting, and to digitize the level of surgical competency. Methods Participants performed tissue dissection around the aorta, dividing encountered vessels after applying a Hem-o-lok (Task 1), and renal parenchymal closure (Task 2: suturing, Task 3: suturing and knot-tying), using swine cadaveric organs placed in a box trainer under a motion capture (Mocap) system. Motion-related metrics were compared according to participants’ level of surgical experience (experts: ≥ 50 laparoscopic surgeries, intermediates: 10–49, novices: 0–9), using the Kruskal–Wallis test, and significant metrics were subjected to principal component analysis (PCA). Results A total of 18 experts, 12 intermediates, and 15 novices participated in the training. In Task 1, a shorter path length and faster velocity/acceleration/jerk were observed using both scissors and a Hem-o-lok applier in the experts, and Hem-o-lok-related metrics markedly contributed to the 1st principal component on PCA, followed by scissors-related metrics. Higher-level skills including a shorter path length and faster velocity were also observed in both hands of the experts in Tasks 2 and 3. Sub-analysis showed that, in experts with ≥ 100 cases, scissors moved more frequently in the “close zone (0 to < 2.0 cm from the aorta)” than in those with 50–99 cases. Conclusion Our novel Mocap system recognized significant differences in several metrics in multiple instruments according to the level of surgical experience.
“Applying a Hem-o-lok clip on a pedicle” strongly reflected the level of surgical experience, and zone-metrics may be a promising tool to assess surgical expertise. Our next challenge is to give completely objective feedback to trainees on-site in the wet-lab. Electronic supplementary material The online version of this article (10.1007/s00464-020-07940-7) contains supplementary material, which is available to authorized users.

the aorta, applying a Hem-o-lok in the vascular pedicle, and renal parenchymal closure. In order to complete each training task, participants need to employ various surgical skills using a range of laparoscopic surgical instruments, and we previously reported good construct validity of our cadaveric porcine organ model [1].
In the present study, we performed motion capture (Mocap) analysis of multiple surgical instruments among participants with different levels of experience of laparoscopic surgery in a series of wet-lab training sessions using our cadaveric porcine organ model. As mentioned above, because we had already confirmed good construct validity of the present training tasks in our previous study based on experts' video reviews, we consider that our model is appropriate to advance understanding of the components of surgical dexterity among several core surgical skills in laparoscopy. As described later, our Mocap system can recognize each instrument individually irrespective of instrument exchanges, which enables us to characterize the motions of multiple surgical instruments simultaneously in complex training tasks that require a range of surgical techniques, such as grasping tissue, tissue traction and dissection, applying a Hem-o-lok clip, and suturing/knotting. Our aims were to clarify the motion characteristics according to surgical experience in a series of wet-lab training sessions, and to digitize the level of surgical competency, which facilitates clear feedback of motion parameters to trainees.

Materials and methods
The institutional review board approved the present study (No. 018-0257). As described above, we previously reported our wet-lab training using cadaveric porcine organs. Briefly, participants performed three tasks: Task 1: tissue dissection around the aorta, dividing encountered mesenteric vessels after applying a Hem-o-lok, Task 2: tissue dissection and division of the renal artery, and Task 3: renal parenchymal closure. We observed good construct validity based on Global Operative Assessment of Laparoscopic Skills (GOALS) and our original assessment sheet, by two blinded experts' video reviews of all three tasks [1]. We used Task 1 and a modified Task 3 for the present Mocap analysis. Forty-five subjects voluntarily participated in the training. Written informed consent was obtained regarding the use of their data for research. The details of the present training tasks are described in the following sections. In all tasks, porcine cadaveric organs were placed in a box trainer (Endowork Pro®, Kyoto Kagaku, Japan, Fig. 1A, B). Porcine organs were purchased from a commercial vendor. Before the training, each task was explained by one of the authors (KE) using recorded movies. During the training, one of the four authors (TA, MH, JF, and NI) acted as the scopist, using a video system (VISERA Pro Video System Center OTV-S7Pro, Olympus, Japan, Fig. 1A) and a zero-degree lens. If participants, especially medical students, had problems with the simulation, each step of the training task was verbally guided by the scopist. After the training session, completed questionnaires were collected, including demographic data and experience of laparoscopic surgeries. In Japan, the Endoscopic Surgical Skill Qualification (ESSQ) system was initiated in 2004, in which two double-blinded referees evaluate an unedited surgical movie [16,17], and this certification status was also ascertained.
All training sessions were video-recorded, and the subjective mental workload was assessed by NASA Task Load Index after each training session for subsequent analysis.

Task 1
Participants are required to remove the tissues around the aorta, dividing encountered mesenteric vessels after applying a Hem-o-lok (Fig. 1C, D). Usually, 5-7 mesenteric vessels were divided during the task.

Tasks 2 and 3
Participants are given a 15-cm 2-0 CT-1 VICRYL® thread, and are required to pass the needle from right to left through the kidney parenchyma at three different sites on a kidney (Task 2, Fig. 1E). In Task 3, participants are asked to complete three-square single-throw knots at two different sites on a kidney (Fig. 1F).

Motion capture analysis
We previously reported the present Mocap system, which we newly developed [18]. Briefly, the Mocap system, which consists of six infrared cameras (OptiTrack Prime 41, NaturalPoint Inc., USA), simultaneously tracked the movements of multiple surgical instruments during a series of training steps (Fig. 2A). Infrared reflective marker sets with an individual arrangement pattern were attached to each instrument (e.g., Fig. 2C: scissors). Therefore, our system recognized each instrument individually regardless of exchanges of instruments during a procedure. The tip movements were calculated based on the positional relationship between the tip and handle (Fig. 2D: computer display showing the movements of surgical devices). Several markers were also attached to the left and right sides of a box trainer to identify the base position. In our previous study, a questionnaire survey revealed that participants did not feel significant disturbance from the attached marker sets during the manipulation of surgical instruments. The measurement outcomes analyzed in the present study were as follows:
i. Operative time (s): total time to complete a task.
ii. Path length (m): total length of the tip trajectory of an instrument,
$$\text{Path length} = \sum_{i=1}^{n-1} \sqrt{(x_{i+1} - x_i)^2 + (y_{i+1} - y_i)^2 + (z_{i+1} - z_i)^2},$$
where $n$ is the total number of frames, and $x_i$, $y_i$, and $z_i$ are the tip positions of an instrument in frame $i$. In this study, the trajectory that lies inside the box trainer is the measurement target, excluding that outside of it.
iii. Velocity (cm/s): average velocity of the tip of an instrument.
iv. Acceleration (cm/s²): average acceleration of the tip of an instrument.
v. Jerk (cm/s³): average jerk of the tip of an instrument. Jerk is the rate of change of acceleration, and it represents motion smoothness.
vi. Frequency of opening and closing (times): total number of iterations of opening and closing the jaws of forceps. "A series of opening and closing the forceps once" is counted as "one iteration".
vii. Distribution of velocity: the number of frames whose instrument velocity belongs to a certain velocity band, as a ratio of the total number of frames $n$. The velocity bands range from an idle state to a very high-velocity range.
viii. Distribution of working area: the ratio of the path length within a certain zone around the target object (Task 1: aorta, Tasks 2 and 3: kidney surface) to the total path length. In Task 1, at the beginning of training, we intentionally recorded both the starting and ending points (around 17-18-cm distance) by placing both forceps on the aorta for 5 s. In Tasks 2 and 3, the same procedure was performed on the incised line of the kidney parenchyma. Using these data, we defined the concentric cylinder shape of the working area in each case, and calculated the "Distribution of working area" described above.
ix. Average inserting time (s): the inserting time was calculated as the duration between insertion of an applier into a box trainer through a trocar and its removal. The average time was calculated for each case.
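As a minimal sketch of how the path length and the "Distribution of working area" could be computed from a list of tip positions, the Python example below sums the distances between consecutive frames and assigns each trajectory segment to a radial zone around the axis joining the recorded starting and ending points. The function names, the use of the segment midpoint for zone assignment, and the open-ended outer zone beyond 4.0 cm are illustrative assumptions rather than the authors' implementation; only the 0 to < 2.0 cm and 2.0 to < 4.0 cm boundaries are taken from the text. Positions are assumed to be in cm, and the exclusion of trajectory outside the box trainer is not modeled.

```python
import math


def path_length(points):
    """Sum of Euclidean distances between consecutive tip positions."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def radial_distance(p, a, b):
    """Distance from point p to the straight line through a and b.

    a and b stand for the recorded starting and ending points, i.e. the
    assumed axis of the concentric cylinders.
    """
    ab = [bj - aj for aj, bj in zip(a, b)]
    ap = [pj - aj for aj, pj in zip(a, p)]
    t = sum(u * v for u, v in zip(ap, ab)) / sum(u * u for u in ab)
    foot = [aj + t * u for aj, u in zip(a, ab)]  # orthogonal projection of p
    return math.dist(p, foot)


def working_area_distribution(points, a, b, edges=(0.0, 2.0, 4.0)):
    """Share of the total path length in each radial zone around axis a-b.

    Zones are [edges[k], edges[k+1]) in cm, with an open-ended last zone
    (assumption). Each segment is assigned to the zone of its midpoint,
    which is a simplification.
    """
    totals = [0.0] * len(edges)
    for p, q in zip(points, points[1:]):
        mid = [(pj + qj) / 2 for pj, qj in zip(p, q)]
        r = radial_distance(mid, a, b)
        zone = max(k for k, e in enumerate(edges) if r >= e)
        totals[zone] += math.dist(p, q)
    whole = sum(totals)
    return [t / whole for t in totals] if whole else totals


# Toy trajectory (cm): two segments 1 cm from the axis, then two farther out.
tip = [(0, 1, 0), (1, 1, 0), (2, 1, 0), (2, 3, 0), (3, 3, 0)]
print(path_length(tip))
print(working_area_distribution(tip, a=(0, 0, 0), b=(4, 0, 0)))
```

Assigning whole segments by midpoint keeps the sketch short; at typical motion-capture frame rates the segments are short enough that splitting them at zone boundaries would change the shares only marginally.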
In this study, the trajectory of the tip of an instrument ($x_i$, $y_i$, and $z_i$) was smoothed by the Savitzky-Golay filter [19], and its derivatives $d^j x_i/dt^j$, $d^j y_i/dt^j$, and $d^j z_i/dt^j$ ($j = 1$ to $3$) were also obtained by the filter. The polynomial order of the filter was set to 3, and the number of sampling frames of the filter was set to 31.
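The filtering step can be illustrated with a small, dependency-free Python sketch of a Savitzky-Golay filter using the stated settings (polynomial order 3, window of 31 frames). It fits a least-squares cubic to each window and reads the smoothed value or the j-th derivative off the fitted polynomial at the window center. The function names, the frame interval `dt` (the capture frame rate is not stated here), and the choice to return values only for frames with a full window are assumptions made for illustration; this is not the authors' code.

```python
from fractions import Fraction
from math import factorial


def savgol_weights(window=31, order=3, deriv=0):
    """Weights whose dot product with a window of samples gives the
    deriv-th derivative (per frame**deriv) of the least-squares
    polynomial at the window center."""
    m = window // 2
    ts = range(-m, m + 1)
    n = order + 1
    # Normal equations G u = e_deriv, with G[p][q] = sum over t of t**(p+q).
    aug = [[Fraction(sum(t ** (p + q) for t in ts)) for q in range(n)]
           + [Fraction(int(p == deriv))] for p in range(n)]
    # Gauss-Jordan elimination in exact rational arithmetic.
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    u = [row[n] for row in aug]
    return [float(factorial(deriv) * sum(u[p] * t ** p for p in range(n)))
            for t in ts]


def savgol_apply(values, dt, window=31, order=3, deriv=0):
    """Smoothed values (deriv=0) or derivatives for frames that have a
    full window around them; dt is the frame interval in seconds."""
    w = savgol_weights(window, order, deriv)
    m = window // 2
    scale = dt ** deriv
    return [sum(wk * values[i + k - m] for k, wk in enumerate(w)) / scale
            for i in range(m, len(values) - m)]


# Example on one coordinate: x(t) = t**3 sampled at a hypothetical 100 frames/s.
dt = 0.01
x = [(i * dt) ** 3 for i in range(100)]
velocity = savgol_apply(x, dt, deriv=1)      # close to 3 t**2
acceleration = savgol_apply(x, dt, deriv=2)  # close to 6 t
jerk = savgol_apply(x, dt, deriv=3)          # close to 6
```

Applying the same functions to the x, y, and z coordinates separately yields the per-axis derivatives; how those are combined into the reported average velocity, acceleration, and jerk is not specified in the text.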

Analyses and statistics
Measurements were compared according to participants' levels of surgical experience (experts: ≥ 50 laparoscopic surgeries, intermediates: 10-49, novices: 0-9). In the present analyses, we chose a cutoff of 50 cases to define the "expert" category based on a paper demonstrating a shorter operative time on treating over 50 cases [20], an expert opinion [21], and our previous validation study of the present model [1]. The Kruskal-Wallis test was utilized to assess differences among the three groups. If the groups significantly differed, the Mann-Whitney U test was utilized for pairwise comparisons between groups. The ESSQ status was also used for comparison. Principal component analysis (PCA), a data reduction technique, was conducted in order to understand the motion metrics that explained the level of surgical experience in the present tasks. In this study, the metrics showing a significant difference (p < 0.05) in the Kruskal-Wallis test were used as input data for the analysis. To reduce the effects of outliers, the input data were normalized by a robust Z-score. The normalized data $z_i$ can be calculated as follows:
$$z_i = \frac{X_i - X_m}{\mathrm{NIQR}},$$
where $X_i$ denotes the original data, $X_m$ is the median, and NIQR is the normalized interquartile range.
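The robust Z-score normalization can be sketched in a few lines of Python. The sketch assumes the common definition NIQR = 0.7413 × IQR (the factor that makes the IQR comparable to a standard deviation for normally distributed data) and the "inclusive" quartile method of the standard library; neither detail is stated in the text, and the function name is hypothetical.

```python
import statistics


def robust_z_scores(values):
    """Robust Z-score: (X_i - median) / NIQR.

    NIQR is taken as 0.7413 * (Q3 - Q1), an assumed convention.
    Raises ZeroDivisionError if the interquartile range is zero.
    """
    med = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    niqr = 0.7413 * (q3 - q1)
    return [(v - med) / niqr for v in values]


# One extreme value barely affects the scaling of the other observations.
print(robust_z_scores([1, 2, 3, 4, 5]))
print(robust_z_scores([1, 2, 3, 4, 50]))
```

Because the median and the interquartile range are insensitive to a single extreme value, the scores of the first four observations are identical in both calls, which is the property that motivates this normalization before PCA.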
Kruskal-Wallis and Mann-Whitney U tests were performed using JMP 14 (SAS, Japan), and PCA was conducted using R (Ver. 3.6.0).

Results
Table 1 shows a summary of participants' backgrounds. Thirty-nine urologists, one junior resident, and five medical students voluntarily participated in the present study. The previous experiences of laparoscopic surgery were as follows: 0-9: n = 15, 10-49: n = 12, 50-99: n = 6, 100-499: n = 9, ≥ 500: n = 3. Fifteen participants had the ESSQ qualification. Two surgeons (one expert and one intermediate) were left-handed. However, they were included in the present analysis because they performed actual surgeries with a right-handed style. Table 2 summarizes the metrics measured by the present Mocap system according to previous surgical experience. Figure 3 also shows box plots of the path length, velocity, acceleration, and jerk of scissors and the Hem-o-lok clip applier in Task 1. In Task 1, there were significant differences (p < 0.05) in the path length, velocity, acceleration, and jerk among the three groups using both scissors and the Hem-o-lok clip applier, showing the superiority of speed-related parameters and economic movements (shorter path length) of surgical devices managed by the right hand of operators in the more experienced group. Regarding the Croce grasping forceps, managed by the left hand, there were significant differences in the path length and frequency of opening and closing, showing economic movements in the more experienced group. In Tasks 2 and 3, there were significant differences in the path length, velocity, and acceleration among the three groups in both hands.
Overall, regarding the pairwise comparisons of the motion metrics showing significant differences on the Kruskal-Wallis test, the difference between experts and novices was large and remained significant, while the difference between novices and intermediates, or that between intermediates and experts, was sometimes small and non-significant. Figure 4 shows representative results for the trajectory of the instrument tip of an expert, an intermediate, and a novice in the three tasks. Supplementary Table 1 shows the measurement outcomes divided by the ESSQ qualification status. In Task 1, participants with the ESSQ qualification demonstrated superior speed-related parameters and economic movements with right-hand devices, and economic movements with left-hand devices. In Tasks 2 and 3, there were significant differences in the path length and velocity in both hands between participants with and without the ESSQ qualification. Using data in Table 2, we created bar charts showing the distribution of the median velocity of each instrument among the three groups (Fig. 5). Overall, there was a trend toward a shorter idle state and a longer time in the higher velocity ranges in the expert group for all instruments excluding the Croce grasping forceps used in Task 1, and significant differences in the ratio of the velocity range were frequently observed in instruments managed by the right hand. Regarding the Hem-o-lok clip applier, sub-analysis of the ratio of the velocity range according to the working area showed significantly shorter idle states and quicker movements in the "near zone (2.0 to < 4.0 cm from the aorta)" in the expert group (Fig. 6).
Regarding analysis of the distribution of the working area, the share of the Hem-o-lok clip applier trajectory lying in the "near zone (2.0 to < 4.0 cm from the aorta)" was larger in the novice and intermediate groups (novices: median 29.0%, intermediates: median 27.3%, experts: median 18.3%, p = 0.0123), which suggested hovering of the instrument before determining the closure site on the vascular pedicle.
We further compared the metrics between the experts with 50-99 surgical cases and those with 100 or more cases (Table 3). In Task 1, right-hand scissors moved more frequently in the "close zone (0 to < 2.0 cm from the aorta)" in experts with 100 or more cases (experts with ≥ 100 cases: 83.9%, experts with 50-99 cases: 60.5%, p = 0.0145). In Task 3, experts with ≥ 100 cases showed a shorter operative time and shorter path length for the right-hand needle holder. There was no significant difference in any parameters in Task 2. Figure 7 shows the results of PCA regarding the level of surgical experience. Figure 7A, C, and E show loading plots of the 1st and 2nd principal components of each task, and Fig. 7B, D, and F show the corresponding score plots of the 1st and 2nd principal components. In Task 1, we did not include the "Low-velocity time of Croce grasping forceps" in the PCA because it showed a V shape among the three groups, and not a constant tendency according to surgical experience, and in Task 3, we did not include the "Very high-velocity time" because the calculated values were extremely low (0-0.7%). As shown in Fig. 7A, C, and E, the principal loading vectors were roughly distributed in two directions. The right-directed vectors were associated with speed-related metrics (e.g., velocity and acceleration, shown by red and green arrows, respectively), and the left-directed vectors were associated with efficiency-related metrics (e.g., path length, task time, and frequency of opening and closing, shown by blue arrows). In Task 1, both categories contributed to the axes of the 1st and 2nd principal components, and Hem-o-lok clip applier-related metrics strongly contributed to the axis of the 1st principal component (Fig. 7A). In Task 2, efficiency-related metrics (path length and task time) strongly contributed to the axis of the 1st principal component (Fig. 7C), while both categories did in Task 3 (Fig. 7E).
In Task 1, 68% of the total variance was explained by the 1st and 2nd principal components, and in Tasks 2 and 3, 81 and 82% of the total variance were explained by the 1st and 2nd principal components, respectively. Figure 7B, D, and F show principal component score plots for each task. Overall, experts' scores were distributed in the right zone, intermediates' scores in the middle, and novices' scores in the left zone in all three tasks. In addition, the plots of novices without any surgical experience [Novice (0)] were mainly distributed in the leftmost area, which corresponds to the lowest skill level zone. Several participants were distributed in a different category-zone from that expected according to the level of surgical experience.

Discussion
To our knowledge, this is the first study of motion tracking of multiple surgical instruments simultaneously in a series of training drills using an animal organ model. As described in Materials and Methods, because our system can recognize each instrument individually regardless of exchanges of instruments during a procedure, it facilitated the present Mocap analyses of relatively complicated surgical tasks. Task 1 (tissue dissection around the aorta) was developed to help young trainees learn laparoscopic dissection skills around a vessel and Hem-o-lok clip application on a vessel. Trainees were required to remove the tissue around the aorta, dividing encountered mesenteric vessels after applying a Hem-o-lok, which means that trainees need to frequently exchange the instruments according to the situation. As shown in Table 2 and Fig. 3, superior speed-related parameters (velocity, acceleration, and jerk) and economic movements (shorter path length) involving scissors managed by the right hand, and economic movements (shorter path length and lower frequency of opening and closing) involving Croce grasping forceps managed by the left hand were observed in the more experienced group. Furthermore, although "applying a Hem-o-lok" was a quick procedure, there were significant differences in the path length, velocity, acceleration, jerk, and average inserting time among the three groups. When applying a Hem-o-lok, surgeons were required to feed an applier toward the objective vessel without bumping or injuring intervening obstacles, ideally along the shortest route, close the Hem-o-lok, and remove the applier in reverse along the same route. After that, surgeons needed to bring the scissors back to the working area in the same manner, which partially influenced the total path length of scissors.
Our observations strongly suggest that getting surgical devices in/out smoothly and correctly using a limited, two-dimensional monitor, namely safe and efficient exchange of surgical instruments with limited visual information, requires highly trained visual spatial skills, and that this ability reflects the level of surgical experience in laparoscopy well. We consider that a training task designed to learn visual spatial skills to exchange surgical instruments safely should be included in a laparoscopic training curriculum. Regarding the suturing/knot tying tasks (Tasks 2 and 3), our observations were in line with previous findings that a shorter task time, shorter path length, and faster velocity were observed in an expert group.
Regarding the distribution of the working area, Buckley et al. previously reported "zone" metrics, defined as the percentage of time spent with the instruments within pre-defined areas. In their study, ten medical students, ten surgical residents, and five experts performed a laparoscopic suturing task using ProMIS III® Simulator, and there was a significant difference in the average "in-zone (0-6 cm) score" among the three groups. The average right/left in-zone scores were 88/83% for experts, 72/69% for surgical residents, and 49/50% for medical students [11]. In the present study, we directly calculated the ratio of the path length within a certain area from the target object (Task 1: aorta, Tasks 2 and 3: kidney surface) to the total path length. Regarding suturing/knot tying tasks (Tasks 2 and 3), we did not observe a significant difference in the distribution of the working area among the three groups (Table 2). In Task 1, we observed that the trajectory of the Hem-o-lok applier was longer in the "near zone (2.0 to < 4.0 cm from the aorta)" in novice and intermediate groups, which suggested hovering of the instrument before determining the closure site on the vascular pedicle. Furthermore, we observed that experts with 100 or more cases handled scissors more dexterously in the "close zone (0 to < 2.0 cm from the aorta)", which suggested short, deft movements around the objectives. On considering the results together with the study of Buckley et al., "zone-metrics" may be a promising parameter associated with the level of surgical expertise, and a future study is necessary to confirm its validity.

Fig. 3 Box plots of path length, velocity, acceleration, and jerk of scissors and the Hem-o-lok clip applier in Task 1. There were significant differences (p < 0.05) in the path length, velocity, acceleration, and jerk among the three groups using both scissors and the Hem-o-lok clip applier, showing the superiority of speed-related parameters and economic movements (shorter path length) of surgical devices managed by the right hand in the more experienced group. Outlier box plots represent the median (center line) and 25th and 75th percentiles (box), and the ends of the whiskers are the outermost data points from their respective quartiles that fall within the distance computed as 1.5 times the interquartile range (IQR). E experts, I intermediates, N novices
Fig. 4 Representative trajectories of the instrument tip of an expert, an intermediate, and a novice in the three tasks. … Fig. 1A and B, and the origin of the coordinate is the starting point of a target object recorded at the beginning of training (see definition of "Distribution of working area" in "Materials and methods"). Overall, a shorter path length was observed in the expert group
Fig. 5 Bar charts showing the distribution of the median velocity for each instrument among the three groups. Overall, there was a trend toward a shorter idle state and longer state of the higher velocity range in the expert group for all instruments except the Croce grasping forceps used in Task 1. Proportions of less than 1% in the very high-velocity range are not shown in the figure. *, †, and ‡ indicate statistically significant differences (p < 0.05) among the 3 groups. E experts, I intermediates, N novices

The idle time means the time period when instrument movement/interaction is minimal. Previous studies showed significant differences in idle times between a novice surgeon and an experienced surgeon regarding laparoscopic suturing, a more complex procedure, and an open surgery suturing task [22-24]. These results can be interpreted as indicating that a novice surgeon needs more time for motor planning and decision-making than a more experienced surgeon. As shown in Fig. 5, our results also showed a trend toward a shorter idle state and a longer time in the higher velocity ranges in the expert group for all instruments except the Croce grasping forceps, and this was more apparent for instruments managed in the right hand. Finally, in order to simplify data and identify the most relevant motion metrics that differentiated levels of surgical experience, we performed PCA of the measured data. Regarding the PCA score plots, experts' scores were generally distributed in the right zone, intermediates' scores in the middle, and novices' scores in the left zone in all three tasks.
Several participants were in a different category-zone from that expected according to the level of surgical experience, which suggested that their surgical skills, as detected by Mocap-based objective skill assessment, may correspond to the category-zone in which they were plotted rather than to the level implied by their previous surgical experience. As described in Results, the analyzed metrics were roughly grouped into two categories: speed-related and efficiency-related metrics. Among the metrics, Hem-o-lok-related metrics strongly contributed to the axis of the 1st principal component. As shown in Fig. 6, experts better handled a clip applier in the "near zone (2.0 to < 4.0 cm from the aorta)" with a shorter idle state and quicker movements, while the analysis of the distribution of the working area revealed that the trajectory of the Hem-o-lok applier was longer in the "near zone (2.0 to < 4.0 cm from the aorta)" in novice and intermediate groups.
These observations strongly suggest that experts operate in an autonomous state, without the instrument wandering before the closure site on the vascular pedicle is determined.
Limitations of this study include the small sample size and heterogeneity, for example, differences in background (medical students/a junior resident/urologists) and the inclusion of two left-handed surgeons. Experience does not always reflect the level of actual skills and expertise, although 15 of the 18 experts had the ESSQ credential. Mocap data do not necessarily reflect errors and quality outcomes. Regarding intermediate and expert groups, only urological doctors participated in the present study. Although we used 50 cases as a cutoff point for the definition of the expert group based on our previous validation study of the present model, hundreds of cases may be needed to make one an expert surgeon in real-world clinical practice. In order to confirm the generality of our observations and gain further insights into expertise in laparoscopic surgery, we have started a second round of data collection, inviting laparoscopic surgeons other than those with urological backgrounds. Nevertheless, we believe that our study contributes to in-depth understanding of surgical dexterity and the process of learning laparoscopic surgical skills in terms of motion metrics. Such knowledge could help educators develop a training curriculum and provide valuable feedback to trainees. Our next challenge involves revising the computer program for metrics calculation and developing an appropriate evaluation form that gives "completely objective" real-time feedback to trainees, and this could become a very powerful educational tool along with experts' feedback. Furthermore, automatic motion analysis using machine learning is also one of our goals.
Our study is an initial step in a more ambitious research plan to develop a Mocap system that can be utilized in live animal surgery, cadaveric surgical training, or a real clinical setting. This would involve examining the most suitable position of the tracking camera in the operating theater to avoid interruptions in a potentially busy surgical environment. Additionally, easily sterilizable, durable, and light-weight material for the artificial markers used to tag the surgical devices should be sought in a future study [10]. We have just started several improvements, including the use of another motion camera system with better portability that does not require calibration, in order to perform Mocap analysis in live animal/cadaveric surgical training as the next stage. Because the present Mocap system can track multiple surgical instruments simultaneously, it provides a novel type of surgical record like a "music score including several musical instruments" during complex procedures, which might become a novel educational tool. We are aiming to develop integrated simulation training using different training models, in conjunction with the feedback of motion parameters to participants.

… and the exchange of instruments should be included in a training curriculum. Zone-metrics may be a promising tool to assess surgical expertise. Our next challenge is to give completely objective feedback to trainees on-site, which could become a very powerful educational tool along with experts' feedback.

Fig. 6 Sub-analysis of the velocity range of the Hem-o-lok clip applier according to the working area. A significantly shorter idle state and quicker movements were observed in the "near zone". Proportions of less than 1% in the very high-velocity range are not shown in the figure. *, †, and ‡ indicate statistically significant differences (p < 0.05) among the 3 groups. E experts, I intermediates, N novices