The operating room environment is replete with stressors and distractions that increase the attentional demands of what are already complex and demanding psychomotor procedures [1–4]. Disruptions (e.g., pagers, irrelevant conversations) may occur as frequently as three times every minute in the operating room [5, 6], so the ability to focus on the task at hand, avoiding the effects of noise and distractions, is an important skill for surgeons [1, 7]. Indeed, recent research suggests that errors leading to surgical complications are most often caused by mistakes in technique or lapses of attention to detail [8]. Because the attention control of novices appears to be particularly disrupted by auditory distractions [3, 9], helping trainees to minimize such cognitive overloading should be considered an important aspect of surgical training [10].

One recent approach to assessing the impact of distractions on surgeons’ attention is to examine alterations in their gaze strategy. The impact of gaze disruptions on surgical intraoperative performance is not well known, although anecdotal evidence suggests that such disruptions are common [6]. Sutton et al. [6] used extracorporeal video footage to assess how frequently the surgeon’s gaze was diverted from the operation’s video display. They found that on average, 40 diversions occurred every 15 min, providing ample opportunity for errors in attention to occur [8, 11]. However, the full impact of attention disruptions on detection errors is likely to be underestimated using this approach, because it fails to account for instances in which a surgeon may be looking at the “wrong” location on the video display itself.

Advances in eye-tracking technology have made more fine-grained analyses of operator attention possible [12–14]. For example, Wilson and colleagues recently demonstrated that in virtual reality (VR) simulations, experienced operators use more effective gaze strategies than novices, fixating on the most relevant locations and adopting more optimal psychomotor control [13, 14]. The authors related these findings to those from cognitive neuroscience, which suggest that skilled psychomotor behavior involves the ability to predict the consequences of one’s actions and implement mapping rules that relate motor and sensory signals [15]. In laparoscopic surgery, experts primarily fixate the target to be grasped and seldom need to check the location of the tools (target locking strategy), whereas novices, still developing the mapping rules, switch between tracking the tool as it moves toward the target and fixating the target itself (switching strategy) [13, 14, 16].

The investigation of such expert–novice comparisons is an important precursor for surgical education research, because only by understanding more about the psychomotor processes underpinning expertise can proficiency-related training interventions be developed [17–19].

The transition from novice to expert motor performance is characterized by increases in biomechanical, metabolic, and neural efficiency [20]. In particular, a gradual reduction of attention demands during motor learning is thought to reflect progression from an initial verbal–analytical stage of performance, in which knowledge is easily accessed by consciousness (explicit, declarative knowledge), to an autonomous stage of performance, in which knowledge is unconscious (implicit, procedural knowledge) [21].

Although this is not in doubt, a growing body of evidence suggests that it may be possible, even advantageous, to avoid the initial verbal–analytical stage of performance by learning implicitly. In numerous studies, Masters and colleagues have shown that implicit motor learning promotes primarily the accretion of procedural knowledge, which is not readily available to conscious introspection by the performer and makes few demands on attention [22, 23]. By generating movement control that makes few demands on the already taxed cognitive resources of the surgeon, implicit motor learning results in stable motor performance under psychological stress, fatigue, multitasking, distractions, or over time [24, 25].

Contemporary research in sport has attempted to apply the implicit motor learning framework in the context of gaze control by training novice performers to adopt the efficient gaze strategies of experts from the outset of learning [26–28]. In this way, novices do not pass through the typical sensory–motor mapping stages of learning (from reacting to movement outcomes to focusing on the desired end point of a movement [15]). Gaze training appears to expedite the learning of psychomotor skills compared with traditional, movement-focused training [26–28], and, as with implicit motor learning, the benefits are more pronounced under stress; novices taught to focus on the key visual cues in the environment are better able to deal with the attentional demands associated with stress than novices taught via a traditional, explicit motor learning approach. Modelling expert gaze strategies may help to reduce the attention demands of complex movement, increasing psychomotor efficiency and freeing resources to be applied to concurrent tasks [17].

For surgical technical training to be clinically effective, it must transfer from simulator and bench models to the demanding environment of the operating room. For example, the basic technical performance of proficiency-trained learners has been shown to break down under the typical multitasking demands experienced in the operating room [1, 4, 9, 10, 29, 30]. The current study was designed to model these multitasking demands in a laboratory setting and is the first to examine the utility of gaze-training in a surgical environment. Three training interventions are compared: one focusing on training expert gaze control, one focusing on training expert motor/movement control, and one where no specific guidance is provided (discovery learning). We generated three hypotheses related to the expected benefits of gaze training for learning basic technical skills and for their robustness under multitasking demands:

Hypothesis 1 (Baseline)

There will be no differences in completion time between gaze-trained, movement-trained, and discovery-learning participants in the baseline condition. All participants will display “novice-like” gaze (switching) and tool control (inefficient path lengths) strategies.

Hypothesis 2 (Control)

Gaze-trained participants will display superior performance (faster completion times) after training (control condition) than movement-trained or discovery-learning participants. An expert-like gaze and tool control strategy will underpin the performance advantage of the gaze-trained participants.

Hypothesis 3 (Multitask transfer)

Gaze-trained participants will display stable performance when multitasking compared with movement-trained or discovery-learning participants.



A total of 30 novice participants volunteered to take part in the study (11 males, 19 females; mean age, 25.16 years; range 8 years). Participants were both left- and right-hand dominant (24 right, 6 left) and consisted of novice medics (final-year medical students or foundation-year medical trainees) who had not received any laparoscopic training. They were divided randomly (using a Latin squares design) into three treatment groups as discussed below (see Training groups).
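The allocation of 30 participants to three equal groups can be illustrated with a minimal sketch. It assumes simple balanced randomization (shuffle, then deal round-robin) and is not a reconstruction of the Latin squares procedure named above; the function name `allocate`, the participant numbering, and the seed are hypothetical.

```python
import random

def allocate(participant_ids, groups, seed=None):
    """Shuffle participants, then deal them round-robin so groups are equal-sized.

    Assumption: simple balanced randomization, not the study's own procedure.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    allocation = {group: [] for group in groups}
    for index, pid in enumerate(shuffled):
        allocation[groups[index % len(groups)]].append(pid)
    return allocation

GROUPS = ["gaze", "movement", "discovery"]
# Hypothetical participant numbers 1..30; the seed only makes the sketch repeatable.
allocation = allocate(range(1, 31), GROUPS, seed=42)
for group, members in allocation.items():
    print(group, sorted(members))
```

Dealing round-robin after a shuffle guarantees ten participants per group, which matches the group sizes reported in the text.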

Apparatus and task

Testing took place on a LAP Mentor (Simbionix USA Corp., Cleveland, OH) VR laparoscopic surgical simulator, based at the Centre for Innovation and Training in Elective Care, Torbay Hospital, UK. The “eye-hand coordination” task from the basic skills training module was used for this study, because previous research has demonstrated that the task validly differentiates expert and novice surgeons [13, 31, 32]. To complete the task, ten flashing balls set at different heights and depths must be touched by using one of two instruments (one held in each hand). One of the instruments is blue and the other is red, and they become visible on the screen as soon as they are inserted into the trocars. During the task, flashing balls of each color must be touched using the tip of the same color instrument within a set time period.

Participants were fitted with an Applied Science Laboratories Mobile Eye gaze registration system (ASL; Bedford, MA), which measures point of gaze using dark pupil tracking (see Wilson et al. 2010 [13] for a detailed description of this equipment). The system incorporates a pair of lightweight glasses fitted with eye and scene cameras and a set of three LEDs that project harmless, invisible near infrared (IR) light onto the eye. By teaching the system how the angles calculated by the eye camera relate to the image from the second camera that is viewing the environment (the scene camera), the eye tracker can compute what the eye is pointed at. A circular cursor, representing 1° of visual angle with a 4.5-mm lens, indicating the location of gaze in a video image of the scene (spatial accuracy of ±0.5° visual angle; 0.1° precision), is viewed in real time and recorded at 25 Hz for subsequent offline analyses.

Training groups

Participants were randomly assigned to one of three training treatment groups: gaze training, movement training, or discovery learning, following a baseline attempt at the eye-hand coordination task. The gaze-trained group was shown a video, derived from the eye tracker, of an expert’s visual control whilst performing the eye-hand coordination task (completed in 30 s). Participants were made aware of the target-focused gaze strategy (lengthy and stable fixations on each of the target balls), and the manner in which the gaze shifted from one target to another in a fast, smooth fashion [13]. They were then advised to try to mimic the gaze strategy of the expert on subsequent attempts. After completion of a second attempt, participants were shown their own video data, as captured by the eye tracker. Participants were asked to comment on differences between their own video and the expert prototype they had previously seen. This feedback process was replicated for a third attempt. Participants were then asked to complete another seven attempts, with verbal feedback of their gaze behavior provided by the experimenter after each trial.

The movement-trained group was shown the same expert video but without the gaze cursor present. Instead, participants were made aware of the manner in which the tool moved smoothly toward the target following a straight and direct path. Participants were then advised to try to mimic the smooth tool movements of the expert. The training protocol then followed the same structure as for the gaze-trained group; participants watched videos of their own performance (without the gaze cursor) after their second and third attempts and compared these to the expert prototype. Only verbal feedback about movement control was provided for the subsequent seven attempts.

The discovery-learning group was given no video feedback or training instructions but was allowed to examine their performance and movement scores from the LAP Mentor after every trial. A sample size of ten trainees in each group was based on (1) the numbers typically used in laparoscopic surgery technical skills training studies (for example, see Arora et al. [33], in which 10 participants were used in each training group), and (2) a power analysis performed using data from a relevant gaze training study also adopting ten trainees in each group. Vine and Wilson [27] compared the differential effects of gaze training and explicit motor training on motor performance and reported very large effect sizes in both a control condition (d = 2.82) and in a stressful/transfer condition (d = 3.92); d > 0.8 is considered to be a large effect [34].
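For readers unfamiliar with the effect size d cited above, the following sketch shows one common way to compute Cohen's d for two independent groups, using the pooled standard deviation. The function name `cohens_d` and the example scores are hypothetical and are not data from Vine and Wilson [27]; the source does not state which variant of d was used.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)

# Hypothetical scores, chosen only so the arithmetic is easy to follow:
# means 4 and 3, both sample variances 4, so d = 1 / 2.
d = cohens_d([2, 4, 6], [1, 3, 5])
print(f"d = {d:.2f}")          # d = 0.50
print("large effect:", d > 0.8)  # threshold quoted in the text [34]
```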

Secondary task (tone counting)

A soundtrack developed with Labview software (National Instruments Inc.) played four distinguishable sounds from the Microsoft sound library (buzzer, ping, tone, and bell ring) in a randomized order (one sound every second) to the participant via speakers attached to a Dell PC. Audible stimuli have been shown to have a significant distraction effect during surgical performance [1–5]. Furthermore, distinguishing between relevant and irrelevant auditory stimuli may be important in the operating room environment [1, 5]. Participants were instructed to listen for the target sound (bell ring), count the number of times it was played, and ignore the other three “distracting” sounds. Participants were played a 30-second example of the soundtrack for familiarization purposes, before a baseline trial (see Procedure).
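The structure of the secondary task can be sketched as follows. This is an illustration only: the original soundtrack was built in Labview, and the text does not describe the randomization scheme, so independent uniform draws are an assumption here. The names `make_soundtrack` and `count_targets` are hypothetical.

```python
import random

SOUNDS = ["buzzer", "ping", "tone", "bell"]  # the four sounds named in the text
TARGET = "bell"                              # the sound participants must count

def make_soundtrack(duration_s, seed=None):
    """Return one sound label per second.

    Assumption: each second is an independent uniform draw from the four sounds.
    """
    rng = random.Random(seed)
    return [rng.choice(SOUNDS) for _ in range(duration_s)]

def count_targets(soundtrack, target=TARGET):
    """Number of times the target sound occurs (the 'actual' count)."""
    return sum(1 for sound in soundtrack if sound == target)

track = make_soundtrack(60, seed=1)  # 60 s, as in the baseline tone counting trial
print(len(track), "sounds,", count_targets(track), "targets")
```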


Procedure

Participants arrived at the Training Centre individually at prearranged times. They read an information sheet, which described the goals of the study, before completing a demographic questionnaire and providing written, informed consent. Participants were fitted with the eye tracker, which was calibrated using six visual landmarks on the LAP Mentor display screen. Calibration was checked after every trial to ensure that the eye tracker had not moved. Recalibration was only required on two occasions throughout the whole testing period.

Training consisted of ten trials of the eye-hand coordination task, because previous research has demonstrated that six to ten trials are sufficient for a plateau in performance to occur for this basic task [32, 35]. The first attempt acted as a baseline measure of gaze control and performance, following which participants completed nine further attempts with instructions relevant to their specific training intervention (see Training groups section). A control test was then performed with no additional guidance provided to assess learning. After a brief rest, participants performed a baseline trial of the tone counting task (60 s), followed by two attempts to perform the eye-hand coordination task and the tone counting task concurrently (multitasking). At the end of testing, participants were thanked for their time and debriefed about the purpose of the study.



Performance in this procedural task was assessed in terms of task completion time [13, 31, 32].

Process measures

The total path length (TPL) travelled by each tool was selected from the LAP Mentor parameter options to reflect efficient tool control [35]. For an indication of efficient gaze control, a measure of “target locking” was computed by subtracting the percentage of time spent fixating the tool from the percentage of time spent fixating the target ball (throughout a trial). Therefore, a more positive score reflects more time spent target locking, whereas a score of ‘0’ reflects equal time spent fixating the tools and targets (switching strategy). A negative score reflects more time spent fixating the tools than the targets. A fixation was defined as gaze held at a single location (within 1° visual angle) for at least 120 ms (≥3 frames of video) [13, 14]. Fixations to “other” areas of the screen were ignored for the purpose of this analysis.
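The target locking measure can be sketched in a few lines. The sketch assumes that each 25 Hz video frame has already been coded as "target", "tool", or "other", and that a run of at least three identically coded frames approximates the 120 ms, 1° fixation criterion. It also assumes that the percentages are taken relative to combined target and tool fixation time, which the text does not specify. The function names are hypothetical.

```python
from itertools import groupby

FRAME_MS = 40          # 25 Hz video, so one frame lasts 40 ms
MIN_FIXATION_MS = 120  # fixation threshold from the text (3 frames)

def fixation_frames(labels):
    """Total frames per location, counting only runs long enough to be fixations."""
    min_frames = MIN_FIXATION_MS // FRAME_MS
    totals = {}
    for label, run in groupby(labels):
        run_length = len(list(run))
        if run_length >= min_frames:
            totals[label] = totals.get(label, 0) + run_length
    return totals

def target_locking(labels):
    """Percentage of fixation time on targets minus percentage on tools.

    Assumption: percentages are relative to target + tool fixation time;
    fixations on 'other' locations are ignored, as in the text.
    """
    totals = fixation_frames(labels)
    target, tool = totals.get("target", 0), totals.get("tool", 0)
    if target + tool == 0:
        return 0.0
    return 100.0 * (target - tool) / (target + tool)

# Hypothetical frame coding: the 2-frame tool glance is too short to count.
labels = ["target"] * 6 + ["tool"] * 2 + ["target"] * 3 + ["tool"] * 3 + ["other"] * 4
print(target_locking(labels))  # 50.0 (9 target frames vs 3 tool frames)
```

Under these assumptions the score ranges from −100 (only tool fixations) to +100 (only target fixations), with 0 corresponding to the switching strategy.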

Tone counting performance

The actual number of target sounds (bell rings) played during task completion was compared to the estimate provided by the participant. An error score (actual minus estimate) was then computed as a measure of performance, and a mean was computed for both multitasking trials. These mean error scores were all made positive (false-positives created a negative error score) and converted to a percentage for the purpose of subsequent analyses.
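The scoring steps above can be sketched as follows, in the order described: a signed error per trial, a mean across the two multitasking trials, an absolute value, and conversion to a percentage. The denominator of the percentage is not stated in the text; the mean actual count is assumed here. The function name and the example counts are hypothetical.

```python
def tone_error_percent(trials):
    """Percentage tone counting error for one participant.

    trials: list of (actual, estimate) pairs, one per multitasking trial.
    Assumption: the percentage is expressed relative to the mean actual count.
    """
    errors = [actual - estimate for actual, estimate in trials]
    mean_error = sum(errors) / len(errors)
    mean_actual = sum(actual for actual, _ in trials) / len(trials)
    return 100.0 * abs(mean_error) / mean_actual

# Hypothetical participant who over-counts on both trials (negative signed errors).
print(tone_error_percent([(10, 12), (10, 11)]))  # 15.0
```

One consequence of averaging signed errors before taking the absolute value is that an over-count on one trial and an equal under-count on the other cancel, e.g. `tone_error_percent([(10, 12), (10, 8)])` gives 0.0.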


Performance and tool path length measures were downloaded directly from the LAP Mentor software environment after each trial. The gaze data were analyzed frame-by-frame (25 frames for one second of video) using GazeTracker (Eye Response Technologies, VA, USA) video analysis software [13, 14]. For each ball-touch attempt, areas of interest (Lookzones) were created and maintained around the target ball and the relevant instrument as the video progressed. The software automatically provided the percentage fixation duration to each area of interest for the trial as a whole. The researcher analyzing the gaze data was blind to the assigned training group of each of the participants to protect against analysis bias.
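The idea of frame-by-frame coding against moving areas of interest can be illustrated with a short sketch. It is not a description of how GazeTracker works internally: circular zones, pixel coordinates, and the names `classify` and `percent_per_zone` are all assumptions made for illustration, and the sketch reports raw dwell percentages without applying the fixation criterion.

```python
from collections import Counter
from math import hypot

def classify(gaze, zones):
    """Return the name of the first zone containing the gaze point, else 'other'.

    zones: dict mapping zone name to (center_x, center_y, radius) for this frame.
    """
    gaze_x, gaze_y = gaze
    for name, (center_x, center_y, radius) in zones.items():
        if hypot(gaze_x - center_x, gaze_y - center_y) <= radius:
            return name
    return "other"

def percent_per_zone(frames):
    """Percentage of frames spent in each zone.

    frames: list of (gaze_xy, zones) tuples, one per video frame, so zones
    can be repositioned as the target ball and instrument move.
    """
    counts = Counter(classify(gaze, zones) for gaze, zones in frames)
    return {name: 100.0 * n / len(frames) for name, n in counts.items()}

# Hypothetical four-frame clip with fixed zone positions.
zones = {"target": (100, 100, 20), "tool": (300, 300, 20)}
frames = [((100, 100), zones), ((105, 98), zones), ((290, 305), zones), ((500, 50), zones)]
print(percent_per_zone(frames))  # {'target': 50.0, 'tool': 25.0, 'other': 25.0}
```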

Statistical analysis

To test our specific hypotheses, dependent variables were subjected to one-way analyses of variance (ANOVA; groups: gaze, movement, discovery) at three key time points: baseline, control, multitasking. Significant effects were followed up using Tukey’s post hoc comparison test to protect against the risk of type 1 errors due to multiple comparisons.
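The F statistic of a one-way ANOVA can be computed directly from the between-group and within-group sums of squares, as the sketch below shows. It uses only the Python standard library, so it stops at F and the degrees of freedom; obtaining a p value or running Tukey's test would require a statistics package and is not attempted here. The function name and the example values are hypothetical and are not data from this study.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of independent samples."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical data: SS_between = 6, SS_within = 6, so F = (6/2) / (6/6) = 3.
f_stat, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")  # F(2, 6) = 3.000
```

With three groups of ten participants, the same function yields 2 and 27 degrees of freedom, matching the F(2,27) values reported in the results.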



Baseline

ANOVA revealed no significant differences between the three groups for completion time, F(2,27) = 0.019, p = 0.982 (Fig. 1); target locking, F(2,27) = 0.029, p = 0.972 (Fig. 2); total path length, F(2,27) = 0.053, p = 0.949 (Fig. 3); or baseline tone counting performance, F(2,27) = 0.484, p = 0.622 (Table 1).

Fig. 1 Mean (±SEM) completion time (seconds) for the three training groups across baseline, control, and multitasking trials

Fig. 2 Mean (±SEM) percentage of target locking fixation time (total target fixation duration − total tool fixation duration) for the three training groups across baseline, control, and multitasking trials

Fig. 3 Mean (±SEM) total tool path length (cm) for the three training groups across baseline, control, and multitasking trials

Table 1 Mean (SD) percentage tone counting error scores (actual − estimate)

Control (posttraining)

ANOVA revealed significant differences in completion time (F(2,27) = 6.419, p = 0.005), with the gaze group displaying faster completion times than the movement group (p = 0.005) or the discovery group (although not significantly; p = 0.057; Fig. 1). Completion times for the movement and discovery group were not different (p = 0.535).

A significant effect also was evident for the percentage of time spent using target locking fixations, F(2,27) = 4.943, p = 0.015. The gaze group spent significantly more time using target locking fixations than the movement group (p < 0.012) or the discovery group (although again the effect was not significant; p = 0.136; Fig. 2). There was no difference in target locking for the movement and discovery groups (p = 0.51).

There were no differences in tool control (total path length), F(2,27) = 0.109, p = 0.897 between groups (Fig. 3).


Multitasking

ANOVA revealed a significant difference between groups for completion time (F(2,27) = 13.138, p < 0.001). The gaze group was significantly faster than both the movement (p < 0.001) and discovery (p = 0.03) groups (Fig. 1), which did not significantly differ (p = 0.077).

Differences also were evident for the percentage of time spent target locking (F(2,27) = 10.408, p < 0.001). The gaze group fixated the target significantly more than the movement (p < 0.001) or discovery (p = 0.008) groups (Fig. 2), which did not differ (p = 0.517).

There were no differences in tool control (total path length), F(2,27) = 0.817, p = 0.452, between groups (Fig. 3).

Tone counting performance was not significantly different between groups (F(2,27) = 2.977, p = 0.068), although the gaze group generally was more accurate than the movement or discovery groups (Table 1).


Keep your eye at the place aimed at, and your hand will fetch [the target]; think of your hand, and you will likely miss your aim (James 1890, p. 520).

This seminal quote from the psychologist William James [36] warns of the consequences of consciously directing attention to the control of movements in a reaching task. Instead, James recommends controlling the direction of gaze to the target and allowing the motor system to self-organize completion of the task. Support for James’ historic claim comes from research that examined the differences between the visuomotor control of experts and novices [13, 15] and the theory of reinvestment [22, 23], which suggests that attempts to consciously monitor and control movements can disrupt their performance (see also Wulf et al. [37]). The purpose of this study was to examine the efficacy of training technical laparoscopic skills using a gaze-focused, rather than a movement-focused, intervention. Three specific hypotheses were tested, as discussed below.

Hypothesis 1 (baseline)

We predicted that all participants would start from a similar novice level and display relatively slow completion times and poor gaze and tool control. The results supported this hypothesis, because all groups were equally slow (~60 s; Fig. 1), adopted a switching gaze strategy (equal fixations to targets and tools; Fig. 2), and used inefficient tool control strategies (tool paths of ~155 cm; Fig. 3). These results are important, because they corroborate findings for other novices performing this task [13] and, more importantly, demonstrate that there were no differences between the groups at the outset of the study. Any subsequent differences must be considered in terms of the specific interventions experienced by each treatment group.

Hypothesis 2 (control)

We predicted that gaze-trained participants would display faster completion times after training than movement-trained or discovery-learning participants and that expert-like gaze and tool control strategies would underpin the performance advantage of the gaze-trained participants. After only ten trials of learning, the gaze-trained group revealed a performance advantage over the other two groups. They more than halved their completion times from the baseline condition (55% reduction compared with 32% reduction for the movement group and 39% reduction for the discovery group; Fig. 1). Their performance was underpinned by a more expert-like gaze strategy consisting of more time spent fixating the targets and less time focusing on the tools (Fig. 2). In comparison, the movement and discovery groups continued to predominantly adopt a switching strategy (Fig. 2). These results suggest that the gaze-trained group were further along the performance curve for this task.
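The percentage reductions quoted above are relative changes from the baseline completion time. As a purely arithmetic illustration, the sketch below uses the approximate 60 s baseline mentioned in the discussion of Hypothesis 1 together with a hypothetical post-training time of 27 s, chosen only because it reproduces a 55% reduction; the group means themselves are not reported in the text.

```python
def percent_reduction(baseline, later):
    """Relative reduction from baseline, in percent (positive means faster)."""
    return 100.0 * (baseline - later) / baseline

baseline_s = 60.0  # approximate baseline completion time from the text
control_s = 27.0   # hypothetical post-training time, for illustration only
print(f"{percent_reduction(baseline_s, control_s):.0f}% reduction")  # 55% reduction
```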

Surprisingly, this divergence in gaze strategy did not transfer to the hypothesized differences in tool control (as indexed by total path length). Indeed, all three groups reduced their path length by approximately 33% between baseline and control conditions (Fig. 3). It appears that the tool control algorithms used by the LAP Mentor software are not as sensitive as the measure of gaze control (target locking) in discriminating between significantly different completion times. We originally postulated that training accurate gaze control might enable an efficient neural network for specialized motor planning that integrates visual information with motor commands, in effect allowing the motor system to self-organize in a more implicit manner [38]. More research is required to determine the specific tool control mechanisms through which expert-like gaze behavior exerts its performance advantage.

Hypothesis 3 (multitasking)

We predicted that gaze-trained participants would display stable performance when multitasking compared with movement-trained or discovery-learning participants, due to the reduced demands on attentional processing inherent in this form of training (i.e., implicit motor learning [22, 23]). The transfer of technical skill learning from simple to more demanding conditions reflective of those experienced in the operating room (e.g., multitasking) is critical if training programs are to have clinical utility [10]. The results revealed that the performance advantage found for the gaze-trained group after learning (control condition) was even more pronounced when multitasking. Whereas the gaze-trained group maintained completion times at control condition levels, the discovery-learning group was 8% slower and the movement-trained group was 22% slower under multitasking than under control conditions (Fig. 1).

The tone counting performance results also are important, because they reveal that the gaze-trained group achieved this technical skill performance advantage while maintaining secondary task performance. Indeed, although the results were not significant (p = 0.068), the gaze-trained group made one-third as many errors as the movement-trained group and half as many as the discovery-learning group when performing the tone counting task concurrently with the eye-hand coordination task (Table 1). Taken together, the primary and secondary task performance data support hypothesis 3 and suggest that the gaze-trained group had more “free” attentional resources to divide between tasks than the other two groups. As in the control condition, the percentage of time spent using target locking fixations (Fig. 2), rather than changes in tool control (Fig. 3), seems to underpin this performance difference.

General comments

It has been suggested that the neural mechanisms regulating goal-directed movements profit from the accurate and timely spatial information of the foveated target [39]. The current data add to previous research [13, 14] that suggests that this contention applies to laparoscopic tasks; faster completion times are underpinned by a target-locking strategy. Furthermore, the current research reveals that novices can be taught to model the gaze strategies of experts and do not have to follow typical visuomotor learning phases [15]. In this way, a gaze-training protocol has many similarities with implicit motor learning protocols. Implicit motor control demands fewer attention resources than conscious or explicit control [23] and is a key marker of automaticity [40]. Recently, Zhu et al. [41] showed that implicit motor learning resulted in reduced nonessential coactivation between the verbal–analytic and motor planning regions of the brain during performance of a laparoscopic task (see also [42]). However, whether a gaze-focused approach can be considered to be an implicit motor learning paradigm remains to be confirmed by future research. For example, Masters and colleagues have consistently shown that people who learn their movements implicitly can consciously report very little about the mechanics of the movements (see Masters et al. [25] for a review related to surgery). Our study provides no evidence with respect to conscious reports.

There are a number of other limitations of the current study, which future research should address. The chosen task was relatively simple, as evidenced by the fact that the novices in the current study performed quicker after only a few practice trials than experienced surgeons in previous studies [13, 31, 32]. The utility of gaze training therefore needs to be assessed on more complex, technical tasks, and perhaps even on tasks that require perceptual–cognitive expertise (clinical judgments and decision making), as well as visuomotor expertise. Additionally, the training period was much briefer than that adopted in validated training curricula [43] and there was no attempt to test the longer-term retention (durability) of this learning after a period of time [44].

To conclude, the findings suggest that gaze-training interventions have potential utility in surgical, as well as sporting, environments. Not only did gaze training expedite learning, but there is evidence that the gaze-trained participants had more spare attentional resources to complete the eye-hand coordination task under multitasking pressure. It appears that by adopting the visuomotor control demonstrated by expert surgeons, novices can “fast-track” some of their experience, thus climbing the performance curve to technical competency in less time and performing better when faced with challenging multitasking demands. The results support James’ [36] contention that not only is there potentially a cost in attempting to control movements, but that there is a benefit of focusing on controlling gaze accurately.