Introduction

Police officers often face potentially dangerous situations while on duty. Although worst-case scenarios—like armed attacks aimed at an officer’s life—are fortunately the exception, they may still happen at any given moment. Therefore, law enforcement personnel strongly rely on their ability to focus their attention, detect potential threats, evaluate situations quickly, and react adequately under pressure (Helsen and Starkes 1999; Vickers and Lewinski 2012).

In high-stress situations, factors like anxiety, low self-control, inexperience, and poor professional judgment are known to impair police officers’ shooting accuracy and decision-making. Hence, practical training, knowledge-based training, ethics training, and continued on-the-job training are crucial to maximizing performance in high-stress law enforcement situations (Biggs et al. 2015; Donner et al. 2017; Donner and Popovich 2019; Ho 1994; Nieuwenhuys and Oudejans 2010, 2011; Nieuwenhuys, Savelsbergh et al. 2012).

Another determining factor in reducing risks in this context is knowledge of tactical gaze control. Police officers should be focused more on certain “dangerous” areas of a suspect’s body than other “harmless” regions. For instance, an armed attacker trying to kill or critically injure an officer will most likely use their hands to do so. Firearms, knives, other dangerous objects, and even most detonators for explosives must be operated manually to pose an immediate threat to an officer’s life. Therefore, it is crucial that police officers shift their gaze and focus their visual attention on the hands and potential weapon concealments (mostly in the hip region) of a suspect. Even seemingly low-level routine situations can escalate instantly as it takes a trained assailant only a fraction of a second to draw a concealed gun.

In an experiment conducted by Blair et al. (2011), suspects carried a drawn gun low at their side, pointing to the ground. Aiming and shooting at an officer took on average 360 ms. Police officers returned fire on average 380 ms after the suspects’ initial movement—even when their gun was already aimed at the suspect. The reaction times reveal the importance of noticing almost immediately what a suspect does. Especially when the suspect draws a firearm, it may already be too late for self-protecting measures if the suspect’s gun is drawn unnoticed.

The present study’s underlying question is whether a high level of training or a high level of practical routine is more beneficial for effective tactical gaze control and reaction time in law enforcement “shoot” situations.

Do police officers of a tactical unit display different gaze patterns than patrol officers in high-risk law enforcement situations, and are there observable differences in reaction time? If so, do recruitment, training, or practical routine primarily cause these differences? Understanding what factors facilitate superior tactical gaze control and shorten reaction times in life-threatening encounters may allow for adjusting and designing improved law-enforcement training in the future.

Visual Perception and Attention

Considering that more than one-third of the human brain is devoted to visual perception, it becomes apparent that visual perception dominates sensory information processing (e.g., Ladwig et al. 2013; Sutter and Ladwig 2012; for an overview, see Sutter et al. 2013). Visual perception is not only driven automatically by bottom-up processes but is also shaped by top-down processes. For example, attention and domain-specific knowledge can initiate active gaze shifts or determine what information to extract. Therefore, active visual search, stimulus identification, and assessment play a significant role for human beings in perceiving their environment and interacting with it (Findlay and Gilchrist 2003; Gilchrist and Harvey 2006; Groner and Groner 1989).

However, the human visual system is limited in multiple ways that make it difficult or even impossible to perceive or extract information in certain situations. To see an object sharply and in full detail, the center of vision (the fovea) has to be directed onto it. Between these fixations, the eyes perform fast movements called saccades. During saccades, vision is suppressed (Findlay and Gilchrist 2003; Rayner 2009; Yarbus 1967).

There are also capacity limitations within the human information processing system that cause neglect or loss of information and result in biased or flawed interpretations. On most occasions, these simplified interpretations allow for an intuitive understanding of shape, depth, color, and size. The most striking examples of these effects are optical illusions, change blindness, and cognitive blind spots (Findlay and Gilchrist 2003; Gibson 2015; Rensink 2002, 2005; Rensink et al. 2016). Presented visual stimuli are not automatically perceived, and even a perceived stimulus is not automatically processed cognitively. Nevertheless, the human information processing system tends to trick its owner into the illusion of a comprehensive and loss-free vision.

There are multiple ways to maximize the efficiency of vision and overcome the human information processing system’s limitations during a specific task. Actively shifting gaze and attention toward an object makes it possible to see it sharply and to detect changes faster and more reliably than with peripheral vision or when focusing on another object (Carrasco 2011; Castelhano et al. 2009; Dewhurst and Crundall 2008; Findlay and Gilchrist 2003; Hoffman 1998; Rensink et al. 2016).

This is in line with the spotlight theory of visual attention. It claims that focusing visual attention on one area at a time benefits visual perception (Eriksen and St. James 1986; Posner et al. 1980). However, studies on divided attention, such as McMains and Somers (2004), found that participants could also attend to two separate regions of space almost simultaneously by rapidly shifting their spotlight of visual attention between the two.

A gaze control technique that has been proven to be beneficial in general aiming tasks and performance under pressure is the so-called quiet eye, which was first described by Vickers (1996a, 1996b). A quiet eye is characterized by the steady fixation on the target rather than the medium (for example, the service pistol) right before and during shooting. Aside from practical police scenarios, this effect has also been observed in sports shooting, darts, and various types of ball sports using eye tracking equipment (Behan and Wilson 2008; Causer et al. 2010; Edworthy et al. 2000; Lebeau et al. 2016; Piras and Vickers 2011; Vickers 2011; Vickers and Lewinski 2012; Vickers and Williams 2007; Vine et al. 2014; Vine and Wilson 2010, 2011; Wood and Wilson 2012).

Gaze Control in Law Enforcement

Previous research indicates that experienced and well-trained police officers display superior gaze patterns and reaction times, cope with anxiety better, and generally achieve better results in threat detection than naïve individuals or novice officers (Helsen and Starkes 1999; Körber 2016; Nieuwenhuys and Oudejans 2010, 2011; Vickers and Lewinski 2012; for an overview, see Heusler and Sutter 2020). A major difference in how expert and near-expert police officers control their gaze in tactical situations is utilizing the aforementioned quiet eye technique. Another factor is the anticipation of the location from which a concealed gun is most likely to be drawn (Vickers and Lewinski 2012).

Anxiety has been shown to have a strong negative impact on police officers’ overall performance in high-stress situations (Nieuwenhuys et al. 2009; Nieuwenhuys, Canal-Bruland et al. 2012; Nieuwenhuys and Oudejans 2010, 2011; Nieuwenhuys, Savelsbergh et al. 2012; Renden, Landman, Geerts, et al. 2014; Renden, Landman, Savelsbergh, et al. 2015). However, perceptual performance and gaze control can be improved after only a few or even just one training sequence. Simulation training and practical training seem to be the most beneficial in this context (Helsen and Starkes 1999; Neuberger 2013; Nieuwenhuys and Oudejans 2011).

A person’s face is a very salient stimulus, and some facial expressions attract both the observer’s gaze and attention more than others (Calvo and Nummenmaa 2008; Cerf et al. 2007). Although a suspect’s facial expressions may allow for the interpretation of emotions and respectful eye contact may help to de-escalate in some situations, merely focusing on the face cannot predict an upcoming attack. Neglecting a suspect’s crucial hands/hip region while focusing on their salient facial expressions may lead to severe injuries or even death in high-stress situations. Therefore, visual attention and situational awareness are key to minimizing the risks in potentially life-threatening situations.

According to Neuberger (2013), there are at least three possible ways to improve the visual expertise of a population. First, individuals who perform far below average could be excluded or leave by choice, like police applicants who fail the entrance test or do not reach the requirements to complete basic training successfully. Second, visual expertise can be improved automatically through on-the-job experience and, third, through targeted training.

Proper tactical gaze control and active vision in law enforcement most likely rely on a mix of natural abilities, which are part of the individual’s inherent capabilities, and an acquired skill set, which must be honed through experience and training (Körber 2016). Although previous research on this topic showed that both practical experience and training facilitate superior performance in high-stress situations, it is not clear to what extent expertise and training contribute to successful tactical gaze control.

Hypotheses

Previous studies indicate that experience and a high level of training facilitate performance in threat-detecting tasks and improve tactical gaze control (Helsen and Starkes 1999; Körber 2016; Neuberger 2013; Nieuwenhuys and Oudejans 2011; Vickers and Lewinski 2012). Therefore, we expect officers of a tactical unit to display longer fixation durations on a suspect’s hands/hip region than patrol officers (hypothesis 1). Vice versa, we expect patrol officers to display longer fixation durations on the facial area of a suspect than officers of a tactical unit (hypothesis 2). Furthermore, we presume that officers of a tactical unit react faster, and therefore shoot earlier, than patrol officers when a suspect draws a hidden firearm (hypothesis 3).

Method

Participants

Thirty-nine participants (10 female) aged from 23 to 51 years (M = 30.92; SD = 5.87) volunteered for the experiment and gave their informed consent prior to the study. All participants were police officers of the Hesse State Police Force, and none of them had ever encountered a situation in which they fired their service weapon at a person. All participants had normal vision at a distance of two meters or corrected their vision accordingly using contact lenses.

Twenty-five of the participating police officers were regular patrol officers. Another fourteen police officers were members of the tactical unit “Mobiles Einsatzkommando” (MEK), a highly trained unit specialized in covert surveillance and high-risk arrests (Friedrich and Metzner 2002; Sünkler 2010). Two officers of the tactical unit had to be excluded from the study: One officer experienced technical issues during the video scenarios. The other could not be paired with a matching officer of the regular patrol group.

For the experiment, we split the sample of participants into three groups. Table 1 provides group characteristics and the level of significance for group comparisons. Group differences were analyzed using a univariate ANOVA with the between-subject factor “group” (for the analysis of differences in age and service experience) and a Kruskal–Wallis test for multiple samples (for the analysis of wakefulness). Group 1 (tactical unit = TU) consisted of 12 officers of the tactical unit (3 female) with a mean age of 32 years and an average of 8 years of service experience. For each participant of group 1, we selected a matching patrol officer (same gender, similar age, and similar service experience) and assigned them to group 2 (matched patrol = MP). Participants of group 2 (3 female) had an average age of 32 years and a mean of 9 years of service experience. Group 3 (unmatched patrol = UP) consisted of the remaining 13 patrol officers. Participants of group 3 (3 female) had an average age of 29 years and a mean of 5 years of service experience. Statistically, age and service experience did not differ significantly between the three groups.
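The matching step can be illustrated with a minimal sketch. It assumes a greedy nearest-neighbor rule on age and service experience within the same gender; the actual selection rule is not specified in the text, and the `Officer` type, the `match_patrol` function, and the distance measure are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Officer:
    ident: str          # anonymous identifier (hypothetical)
    gender: str
    age: int
    service_years: int


def match_patrol(tactical, patrol):
    """Greedily pair each tactical officer with the closest unused
    patrol officer of the same gender (assumed matching rule)."""
    available = list(patrol)
    pairs = {}
    for t in tactical:
        candidates = [p for p in available if p.gender == t.gender]
        if not candidates:
            continue  # no suitable partner; such an officer cannot be paired
        # Assumed distance: absolute age gap plus absolute service-years gap.
        best = min(
            candidates,
            key=lambda p: abs(p.age - t.age) + abs(p.service_years - t.service_years),
        )
        pairs[t.ident] = best.ident
        available.remove(best)
    # Patrol officers left over would form the unmatched patrol group.
    unmatched = [p.ident for p in available]
    return pairs, unmatched
```

In this sketch, a tactical officer without a same-gender candidate stays unpaired, which mirrors the exclusion of the one tactical officer who could not be matched.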

Table 1 Group characteristics

All participants were asked to assess their wakefulness right before the experiment on a 9-point Likert scale (1 = “very tired,” 9 = “fully awake”). The medians for wakefulness did not differ significantly between the groups (TU 5.5; MP 6.0; UP 5.0).

Apparatus and Materials

Questionnaire

The questionnaire was presented digitally on a 64-bit Windows 10 laptop computer running the OpenSesame v3.2.7 experiment building software (Mathot et al. 2012). The participants sat approximately 60 cm from the screen and completed the questionnaire: First, they provided demographic data (i.e., age, gender, years of service experience without basic training). Second, they answered whether they had ever shot their service gun at a person on duty. Third, they rated their current wakefulness on a 9-point Likert scale (1 = “very tired,” 9 = “fully awake”). Fourth, participants rated their training experience versus real-life police/citizen interactions on a 9-point Likert scale (1 = “I experience direct police/citizen interactions way more often in training,” 9 = “…way more often in real-life encounters”). For direct police/citizen interactions, the following examples were given: “Police checks, arrests, frisks, and so forth.”

Experiment

The experiment took place in a quiet and dimly lit room and was controlled by the same computer and software as the questionnaire. Figure 1 depicts the experimental setup. Participants stood on a marked spot, approximately 200 cm away from a white drop-down projector screen and two stereo audio speakers (Logitech Z200). A ceiling-mounted WUXGA laser projector (Epson EB-L610U; Epson EB-L510U) displayed the instructions and the video scenarios onto the screen. This setup enabled us to display the scenarios in life-size proportions (projection surface 190 cm × 150 cm) with stereo sound to maximize the closeness to reality.

Fig. 1 The experimental setup

The motor complexity of shooting a service weapon cannot be sufficiently simulated by merely pressing a button on a response box (Neuberger 2013). Therefore, we used a pistol-shaped USB arcade gun (AimTrak light gun with recoil by Ultimarc and customized firmware). The participants operated the arcade gun with both hands and the trigger with their dominant hand’s index finger. We equipped the gun with additional weights to match the weight of the participants’ loaded service pistol—a Heckler & Koch P30 (approx. 950 g). The gun was connected to the laptop computer. It simulated recoil and transmitted a signal via USB when the participants pulled the trigger.

We also used a mobile eye tracking device (“Pupil Core,” 200 Hz eye tracker by Pupil Labs). This device allowed participants to move their heads freely without feeling restricted or uncomfortable. As police officers are accustomed to shooting with safety glasses in their regular training, the lightweight (22.75 g) glasses-like apparatus did not feel unfamiliar. The binocular version with two 200-Hz cameras and combined infrared LEDs allowed for recording eye movements reliably in dark conditions or when a participant closed one eye—while aiming, for example. We calibrated the device in 2D mode (instead of 3D) since the participants’ distance to the canvas did not change. This allowed us to achieve eye tracking accuracy within the physiological limits of one degree or less (Kassner et al. 2014). A OnePlus 6 mobile phone was connected to the eye tracker and saved the recorded data.

The instructions were presented visually in black letters on a white background and in German. The first instruction informed the participants that they were currently on duty and about to hear a radio transmission (the cover story).

The second instruction notified the participants that they were about to experience video scenarios, which were independent of each other but all based on the previously heard radio transmission. They were further asked to stay on the marked spot on the floor and serve as the “securing officer” in a two-officer buddy team. Another officer in the scenarios would be the one doing the talking. We gave the participants the role of the securing officer, so they did not have to communicate or make any tactical decisions other than when to shoot.

The third instruction was presented before every scenario and instructed the participants to take a “compressed ready position,” meaning they should hold their drawn gun closely in front of their bodies. This instruction ensured that all participants started each scenario in the same way.

Stimuli

The cover story was presented auditorily. The audio file simulated a radio transmission from the police operations center (male speaker). Participants were informed of an armed fugitive who had just robbed a jewelry store and was currently on the run. The transmission further contained information on the perpetrator’s physical appearance, the getaway vehicle, and the type of firearm used in the robbery.

In the following video scenarios, the participants were faced with the fugitive, who matched the previously heard description perfectly. The silver gun was clearly visible and easily identifiable in every scenario. We chose a pistol as the fugitive’s weapon to avoid ambiguity and possible tactical differences between the tactical unit and patrol officers. While some officers might not fire their gun right away at a suspect carrying a knife or a comparable dangerous object, a gun pointed at a police officer leaves no room for hesitation.

In each scenario, the target gun was presented in a different quadrant on the screen (Fig. 2). This was done to minimize learning effects during the experiment and avoid confounding variables.

Fig. 2 The fugitive’s target gun was presented in one of the quadrants on the screen

A second officer (male; auditorily present only) covered the role of the “addressing officer” talking to the fugitive and giving him instructions. The addressing officer’s voice was played back in stereo, simulating his position relative to the participant to increase the situation’s realistic feeling further. All stimuli were designed and recorded in cooperation with active police trainers and police officers.

The cover story’s purpose was to provide the information necessary to identify the armed fugitive in the video scenarios and raise stress levels by increasing the degree of realism. It also served to put the participants in the correct “mindset” and make them aware that they might be confronted with a potentially life-threatening situation. Police officers heavily rely on dispatch information to assess threat levels (Taylor 2020).

This particular cover story was chosen because it applies to both patrol officers and officers of the tactical unit; both could face a comparable situation in their line of duty. The aim was to avoid favoring either group by selecting a scenario that the other group is less likely to experience.

Scenario A

In this scenario, the fugitive is standing in a parking area beside his getaway vehicle. Figure 3 depicts the initial situation of scenario A. Although the addressing officer instructs the fugitive to calm down and show his hands, the fugitive acts agitated throughout the scenario and refuses to comply. After 27 s, he distractingly points to his car with his left hand while drawing the gun with his right hand (Fig. 4).

Scenario B

This scenario shows the fugitive sitting on a beer case and a second male person sitting in the getaway vehicle’s car trunk. Figure 5 depicts the initial situation of scenario B. During the scenario, the fugitive talks to the addressing officer and appears to be compliant but still somewhat dominant. Both the fugitive and the second person willingly show their hands as requested by the addressing officer. After 34 s, while the fugitive is still talking, the previously passive person in the car trunk slowly draws the gun (Fig. 6).

Fig. 3 Scenario A: the initial situation

Scenario C

This scenario simulates a traffic stop situation with the addressing officer standing at the driver’s window of the getaway vehicle and the participant securing from the passenger’s side. Figure 7 depicts the initial situation of scenario C. When the addressing officer instructs the fugitive to turn off the ignition, the fugitive acts compliant at first but then, after 29 s, covertly draws the gun from the car’s central console (Fig. 8).

Scenario D

In this scenario, the fugitive is standing inside of a multi-story parking garage behind two bags and a jacket on the floor. Figure 9 depicts the initial situation of scenario D. Although acting somewhat reluctant, the fugitive still complies by showing his hands, kneeling, and lying down when the addressing officer tells him to. After 38 s, while lying on his stomach, the fugitive draws the gun from behind a bag (Fig. 10).

Control Variables

The following variables were measured to be utilized as control variables. We chose these control variables to ensure a maximum of comparability between the groups and minimize the effects of confounding variables.

TAP—Test of Attentional Performance

Police officers who want to join the TU have to prove their suitability for this demanding and highly specialized field of work in a multi-day assessment center. After successfully passing this assessment procedure (including tests on vigilance, cognitive abilities, and reaction time amongst other things), applicants are allowed to start the intense 6 months of special training.

Fig. 4 Scenario A: target gun drawn

Fig. 5 Scenario B: the initial situation

We conducted two subtests of the TAP—Test of Attentional Performance v2.3.1 (Fimm and Zimmermann 2017) to examine whether this preselection may have caused incomparability regarding general attentional performance between the groups. Both subtests utilized the following setup.

The participants sat in front of a 64-bit Windows 10 laptop computer with an 11.6-in. screen in a quiet and dimly lit room. To counter the effects of this mobile setup’s relatively small screen size, we scaled the display settings up to 120% and chose a close distance of approximately 40 cm between the participants and the screen. All visual instructions and stimuli were easy to identify. The TAP response button was placed at a convenient reaching distance in front of them. Both subtests offered a training mode in which the participants could run practice trials to familiarize themselves with the tasks and the equipment. They were instructed to concentrate and react as fast and correctly as possible.

The subtest “Crossmodal Integration” required the participants to detect the critical combination of a preceding acoustic stimulus (low- or high-pitched tone) and a subsequent visual stimulus (an arrow pointing down or up). The task was to operate the response button as fast as possible, whenever the pitch of the tone and the arrow’s direction corresponded (low and down/high and up).

The subtest “Divided Attention” examined the participants’ ability to simultaneously pay attention to a visual task and an auditory task. The participants had to press the response button as fast as possible when a target stimulus occurred in either of the tasks. For the visual part, participants had to observe multiple crosses in a 4 × 4 grid and operate the button when the crosses formed a small square. At the same time, low-pitched and high-pitched tones were played in alternating order. Participants also had to operate the button whenever two consecutive tones disrupted the order.

Training Versus Practical Experience

All participating police officers were in the same service grade level and received comparable basic police training at the beginning of their careers. However, when joining the TU, each police officer receives an additional 6 months of intense training in special police tactics. After successfully completing this training, teams of the TU continuously train these tactics multiple times per week. We assumed that regular patrol officers train less but encounter direct interaction with potentially dangerous individuals more frequently in their everyday work. Therefore, we asked participants to rate their training experience versus real-life police/citizen interactions in the questionnaire as described above.

Fig. 6 Scenario B: target gun drawn

Fig. 7 Scenario C: the initial situation

Anxiety

Anxiety critically impairs police officers’ performance in stressful situations (Nieuwenhuys et al. 2009; Nieuwenhuys, Canal-Bruland et al. 2012; Nieuwenhuys and Oudejans 2010, 2011; Nieuwenhuys, Savelsbergh et al. 2012; Renden, Landman, Geerts, et al. 2014; Renden, Landman, Savelsbergh, et al. 2015).

We asked participants to rate their anxiety level on a 9-point Likert scale (1 = “very calm,” 9 = “very anxious”), before the first and after each scenario. Based on the anxiety thermometer (Houtman and Bakker 1989), this allowed us to examine comparability between the groups and detect shifts in anxiety throughout the experiment.

Fig. 8 Scenario C: target gun drawn

Fig. 9 Scenario D: the initial situation

Procedure

The whole procedure took about 35 min for each participant and was conducted in a single room without considerable breaks. At the beginning, participants were handed an information sheet informing them about the voluntary and anonymous nature of the study. Officers were only included in the study when they gave their written informed consent.

Fig. 10 Scenario D: target gun drawn

All participants completed both subtests of the TAP in the same order (first “Crossmodal Integration” followed by “Divided Attention”). Before each subtest, the participants could run a training mode. The digital questionnaire was conducted right after the TAP. Participants who declared they had shot at a person on duty before would have been excluded from the study. However, all participating officers were allowed to continue.

The experiment started with putting on the mobile eye tracker, adjusting the cameras, and calibrating the device. In the next step, the participants were exposed to the cover story and given instructions. Then, we gathered the base-level anxiety (t0). The main part with the scenarios was conducted in a loop until all four video scenarios (randomized order) were displayed. Before each scenario, participants were instructed to take the “compressed ready position” (Fig. 11). Each video was preceded by a 3-s-long visual countdown (white digits on a black background). As soon as the USB arcade gun was fired, an acoustic signal sounded, and an “end of scenario” screen was displayed. Immediately after, we gathered the current anxiety level (t1/2/3/4). When all four scenarios were completed, a final screen indicated that the experiment was over. The experiment took about 20 min per participant.

Design and Statistical Analyses

The present study has been approved by the German Police University’s ethical review committee and by the heads of the participants’ respective police organizations. Personal data (age, gender, etc.) was stored anonymously using random three-digit identification numbers assigned to the participants.

Experiment

The methodology of the study followed a quasi-experimental design. To facilitate readability, we used the simplified term “experiment” throughout the paper.

Salvucci and Goldberg (2000) state that fixations rarely last less than 100 ms and often range from 200 to 400 ms, while Irwin (1992) sets a broader range from 150 to 600 ms. Wu et al. (2015) state that objects may already be identifiable on a superordinate level after 120 ms. Therefore, we defined 120 ms as the minimum fixation time in the present study.

Although fixations can typically be considered stationary for most practical applications, they are a dynamic state in which the eye makes continuous miniature movements (Ditchburn 1973; Findlay and Gilchrist 2003). To counter the effects of these small movements, technical deviations, and slight head movements of the participants during fixations, we chose a maximum dispersion tolerance of three degrees for the calculations.
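These two parameters map onto a dispersion-threshold fixation filter. The following is a minimal sketch of such a filter in the spirit of the dispersion-based identification described by Salvucci and Goldberg (2000); it is not the detector implemented in the eye tracking software. It assumes gaze samples given as (time in ms, x in degrees, y in degrees) and measures dispersion as horizontal plus vertical extent. Both choices are assumptions, as are the names `dispersion` and `detect_fixations`.

```python
def dispersion(window):
    """Horizontal plus vertical extent of a window of gaze samples (degrees)."""
    xs = [s[1] for s in window]
    ys = [s[2] for s in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(samples, min_duration_ms=120.0, max_dispersion_deg=3.0):
    """Return (start_ms, end_ms) pairs for windows that last at least
    min_duration_ms and stay within max_dispersion_deg."""
    fixations = []
    start = 0
    n = len(samples)
    while start < n:
        # Grow the initial window until it spans the minimum duration.
        end = start
        while end < n and samples[end][0] - samples[start][0] < min_duration_ms:
            end += 1
        if end >= n:
            break  # remaining samples cannot form a long enough window
        if dispersion(samples[start:end + 1]) <= max_dispersion_deg:
            # Extend the window while the dispersion limit still holds.
            while end + 1 < n and dispersion(samples[start:end + 2]) <= max_dispersion_deg:
                end += 1
            fixations.append((samples[start][0], samples[end][0]))
            start = end + 1
        else:
            start += 1  # drop the first sample and try again
    return fixations
```

With 200-Hz data (one sample every 5 ms), a gaze position held for only 50 ms is discarded by this filter, while a position held for about 200 ms is reported as a single fixation.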

Fig. 11 A tactical officer wearing the mobile eye tracker and holding the arcade gun in the “compressed ready position.” The mask is for identity protection and was not worn during the experiment

We added six visible 5 × 5 ArUco-style visual markers to the sides of each video scenario to allow the eye tracking software Pupil Player v1.12 to calculate the gaze data relative to the scenario. To detect the exact fixation times on the hands/hip region and the face, we predefined regions of interest (ROI) for each scenario. These quadrangular ROI were not visible for the participants but allowed automated analysis later in the process. The software detected and documented individual fixation times on each ROI, which we later summed up to calculate the total fixation times. Figure 12 shows an example screenshot of scenario D with the visual markers and the ROI (invisible for participants).

For each of the four scenarios, a timestamp was defined that matched the video frame in which the gun was first identifiable. To ensure consistency—even though participants shot at different times and, therefore, the length of the scenarios varied—we only analyzed the gaze data between the beginning of every scenario and its timestamp. The fixation times on the ROI hands/hip and ROI face were analyzed using a univariate analysis of variance (ANOVA) with the between-subject factor “group” (TU, MP, and UP). Further post hoc comparisons between groups were adjusted for multiple comparisons (Bonferroni correction).
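The summation of fixation times per ROI up to the timestamp can be sketched as follows. This is an illustration under stated assumptions, not the analysis pipeline itself: fixations are assumed to be tuples of (start in ms, end in ms, x, y) in scenario coordinates, the ROI is assumed to be a static list of vertices, and a fixation that runs past the timestamp is clipped at the timestamp. The clipping rule and the function names are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon,
    given as a list of (x, y) vertices in order."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def total_fixation_time(fixations, roi, cutoff_ms):
    """Sum the durations of fixations inside the ROI that start before
    cutoff_ms (the timestamp at which the gun becomes identifiable)."""
    total = 0.0
    for start_ms, end_ms, x, y in fixations:
        if start_ms >= cutoff_ms:
            continue  # fixation begins after the analysis window
        if point_in_polygon(x, y, roi):
            # Assumed rule: clip fixations that extend past the cutoff.
            total += min(end_ms, cutoff_ms) - start_ms
    return total
```

Calling `total_fixation_time` once with the hands/hip ROI and once with the face ROI yields the two dependent measures per participant and scenario.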

Reaction times were defined as the period between the timestamp and the moment when the USB arcade gun was fired (in milliseconds). We analyzed mean reaction times using a univariate ANOVA with the between-subject factor “group” (TU, MP, and UP).
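As a minimal sketch of this definition and of the test statistic, the following computes a reaction time from the timestamp and the shot time and derives the F value of a one-way between-subject ANOVA. The function names are hypothetical, trials without a shot are represented as `None` and simply dropped (an assumption), and the p value is omitted because it requires the F distribution from a statistics package.

```python
def reaction_time(timestamp_ms, shot_ms):
    """Time from the gun becoming identifiable to the shot; None if no shot."""
    if shot_ms is None:
        return None
    return shot_ms - timestamp_ms


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of values.
    None entries are treated as missing and ignored."""
    groups = [[v for v in g if v is not None] for g in groups]
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within
```

With three groups, the between-group degrees of freedom are 2, and the within-group degrees of freedom shrink by one for every missing value.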

In each group, one participant failed to shoot in one of the four scenarios. One MP failed to shoot in scenario A (the hands-in-pocket scenario) without giving an explanation. One TU and one UP did not shoot in scenario C (the car scenario). They did not realize that the fugitive pointed his gun at the fictional addressing officer, who was simulated on the other side of the car, and therefore did not see the need to shoot. The three missing reaction times were treated as missing values in the statistical analysis (3 of 148 values, i.e., 2.03% of all reaction time values).

Control Variables

The TAP data was used to examine comparability between the three groups. For the TAP subtest “Crossmodal Integration,” we analyzed the decision score (the number of correct reactions minus the number of false reactions) and the mean reaction time using a one-factorial design with the between-subject factor “group” (TU, MP, and UP).

Fig. 12 Example screenshot of scenario D with the visual markers and a visualization of the ROI. The ROI were not visible for the participants

For the TAP subtest “Divided Attention,” the decision score (the number of correct reactions minus the number of false reactions) and the mean reaction time were analyzed using a one-factorial design with the between-subject factor “group” (TU, MP, and UP).

We used Kruskal–Wallis tests for independent groups (TU, MP, and UP) to analyze the anxiety (t0/1/2/3/4) and training experience versus real-life encounters.

Results

Control Variables

Crossmodal Integration (TAP)

For the mean reaction time, the univariate ANOVA with the between-subject factor “group” did not reveal a significant main effect (p = .745; TU 388 ms; MP 392 ms; UP 409 ms).

For the mean decision score, the univariate ANOVA with the between-subject factor “group” also did not reveal a significant main effect (p = .059; TU 17.6; MP 17.3; UP 17.8).

Divided Attention (TAP)

For the mean reaction time, the univariate ANOVA with the between-subject factor “group” did not reveal a significant main effect (p = .662; TU 672 ms; MP 665 ms; UP 691 ms). For the mean decision score, the one-factorial ANOVA with the between-subject factor “group” also did not reveal a significant main effect (p = .612; TU 28.7; MP 30.4; UP 28.8).

Training Versus Practical Experience

The Kruskal–Wallis test for multiple samples revealed a significant difference between the groups (H(2) = 25.328, p < .001). Pairwise comparisons (adjusted using the Bonferroni correction) showed significant differences between TU and MP (z = − 4.592, p < .001) and between TU and UP (z = − 4.113, p < .001). The comparison between MP and UP did not reveal a significant difference (p = .569). The groups MP and UP had a higher level of practical routine regarding police/citizen interactions (Mdn MP = 9.0; UP = 8.0) than the TU (Mdn = 2.5).

Anxiety

The Kruskal–Wallis test for multiple samples did not show any significant differences between the group medians on base-level anxiety t0 (H(2) = .867, p = .648). The anxiety levels for TU and MP (Mdn = 5.0, each) and UP (Mdn = 6.0) were on a medium level prior to the experiment.

Fig. 13 Mean fixation times (ms) for the ROI hands/hip (left) and ROI face (right) per group. Error bars represent the standard error

The anxiety values measured after the individual scenarios (A, B, C, and D) did not reveal any significant differences between the groups (p > .05 for all). The same applies when the values are ordered chronologically (t1/2/3/4) rather than by scenario, as the stimuli randomization meant that participants experienced the scenarios in different orders. All group medians on anxiety were located tightly around medium intensity, ranging between 4.0 and 6.0 throughout the study.
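For readers who want to retrace the Kruskal–Wallis statistics, the following is a minimal sketch of the H statistic and its asymptotic p-value. It is not the software used in the study. With three groups, H is referred to a chi-square distribution with 2 degrees of freedom, whose upper tail probability is exp(−H/2). The function names and the toy data are hypothetical.

```python
import math
from itertools import chain


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H with average ranks for ties and the usual tie correction.

    Note: undefined (division by zero) if every observation has the same value.
    """
    pooled = sorted(chain.from_iterable(groups))
    n_total = len(pooled)

    # Assign the average rank to every distinct value (ranks are 1-based).
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t**3 - t
        i = j

    rank_part = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * rank_part - 3 * (n_total + 1)
    return h / (1 - tie_term / (n_total**3 - n_total))


def p_value_df2(h):
    """Chi-square upper tail probability for 2 degrees of freedom (three groups)."""
    return math.exp(-h / 2)


# Toy data (not study data): three fully separated groups of three ratings.
toy_h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(toy_h, 3), round(p_value_df2(toy_h), 4))

# Asymptotic p-values for the H(2) statistics reported in this section.
print(round(p_value_df2(0.867), 3))   # base-level anxiety t0
print(p_value_df2(25.328) < 0.001)    # training versus practical experience
```

Under this approximation, H(2) = .867 gives p = .648 and H(2) = 25.328 gives p < .001, in line with the reported values.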

Table 2 Overview of the mean fixation times and mean reaction times in ms

Fixation Times

Figure 13 depicts the mean fixation times in milliseconds on the ROI hands/hip and the ROI face for the three groups TU, MP, and UP. The univariate ANOVA with the between-subject factor “group” revealed a highly significant main effect for the fixation times on the ROI hands/hip (F(2,34) = 11.96, p < .001, η2 = .413). Post hoc comparisons between groups revealed significant differences between TU and MP (p = .01) as well as between TU and UP (p < .001). The group difference between MP and UP did not reach a level of significance (p = .363). The mean fixation times on the ROI hands/hip were significantly longer for TU than for MP and UP (TU 17,707 ms versus MP 12,476 ms and UP 9901 ms).

Fig. 14 Heat map visualization: focus on face (left) versus hands/hip (right)

The univariate ANOVA with the between-subject factor “group” revealed a highly significant main effect for the fixation times of the ROI face (F(2,34) = 8.27, p = .001, η2 = .327). Post hoc comparisons between groups revealed highly significant differences between TU and MP (p = .002) as well as between TU and UP (p = .006). The group difference between MP and UP did not reach significance (p = 1.0). The mean fixation times on the ROI face were significantly shorter for TU than for MP and UP (TU 2632 ms versus MP 8638 ms and UP 7914 ms). Table 2 provides an overview of the results.
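The post hoc p-values of exactly 1.0 are what a Bonferroni adjustment produces when it is applied to the p-values themselves. The sketch below assumes, as is common in statistics packages, that each raw p-value is multiplied by the number of pairwise comparisons (three for TU, MP, and UP) and capped at 1. The raw values are made up for illustration and are not the study's unadjusted results.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Made-up raw p-values for the three pairwise comparisons (TU-MP, TU-UP, MP-UP).
raw = [0.002, 0.03, 0.45]
adjusted = bonferroni(raw)
print([round(p, 3) for p in adjusted])
```

Any raw p-value above one third therefore appears as 1.0 after adjustment for three comparisons.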

The overall trend was observable throughout the experiment. The TU fixated their gaze longer on the ROI hands/hip and shorter on the ROI face than the MP and UP in all four video scenarios. Figure 14 shows an exemplary heat map visualization of two contrasting gaze patterns.

Reaction Times

The univariate ANOVA with the between-subject factor “group” did not reveal a significant main effect (F(2,34) = .28, p = .760, η2 = .016). The post hoc comparisons between groups were not significant (all p = 1.0). The mean reaction times did not differ significantly between TU, MP, and UP (TU 728 ms; MP 797 ms; UP 812 ms). Figure 15 and Table 2 provide an overview of the results.
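The effect sizes reported for the three ANOVAs can be cross-checked from the F values and their degrees of freedom. For a one-way between-subjects design, η2 equals F · df_between / (F · df_between + df_within). The sketch below applies this identity to the reported statistics; the function and dictionary names are illustrative.

```python
def eta_squared(f_value, df_between, df_within):
    """Eta squared for a one-way ANOVA, recovered from F and its degrees of freedom."""
    return f_value * df_between / (f_value * df_between + df_within)


# F values reported in the Results, all with df = (2, 34).
reported_f = {
    "fixation time, ROI hands/hip": 11.96,
    "fixation time, ROI face": 8.27,
    "reaction time": 0.28,
}

recomputed = {name: round(eta_squared(f, 2, 34), 3) for name, f in reported_f.items()}
for name, value in recomputed.items():
    print(f"{name}: eta squared = {value}")
```

The recomputed values (.413, .327, and .016) agree with the reported effect sizes.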

Discussion

The present study examined the gaze patterns of police officers in potentially life-threatening law enforcement situations. It focused on investigating whether a high level of training or a high level of practical routine in police/citizen interactions is more beneficial for tactical gaze control. For this purpose, we compared the eye tracking data and reaction times of tactical police officers, matched patrol officers, and unmatched patrol officers in relevant video scenarios. The three groups showed no significant differences in basic attentional performance, wakefulness, and anxiety.

Fig. 15 Mean reaction times (ms) per group. Error bars represent the standard error

We hypothesized that officers of a tactical unit would show longer fixation durations on a suspect’s tactically important hands/hip region than patrol officers (hypothesis 1). Our analyses reveal highly significant differences in fixation times on the hands/hip region that confirm hypothesis 1. On average, the TU exceeded the fixation times of the MP by 42% and UP by 78% on this tactically important ROI. The MP’s and UP’s results did not differ to a level of statistical significance. Furthermore, we hypothesized that patrol officers show longer fixation durations on a suspect’s facial region compared with officers of a tactical unit (hypothesis 2). The results show highly significant differences with the TU fixating the ROI face 69% less than the MP and 67% less than the UP, thus confirming hypothesis 2. Again, the MP’s and UP’s results did not differ to a level of statistical significance regarding this ROI. We also assumed that officers of a tactical unit react faster and therefore shoot earlier than patrol officers when a suspect draws a hidden firearm (hypothesis 3). However, the reaction times between TU, MP, and UP did not differ significantly.

In the questionnaire, patrol officers (both MP and UP) stated that they experience direct police/citizen interactions more often in real-life scenarios than in training. On the other hand, the TU stated that they experience these interactions more often in training scenarios than in real life, causing significant differences between the TU and both the MP and UP. No significant differences were visible between the MP and UP. These results indicate that training has a significantly higher impact on efficient tactical gaze control in potentially dangerous law enforcement situations than practical routine. However, shooting times did not show any significant differences among the groups.

Interestingly, the scenario in which the differences in tactical gaze behavior became most apparent was scenario A. In this scenario, the suspect appeared agitated and kept his hands in his pockets, refusing to follow instructions. While the TU seemed to recognize the immediate threat that the hidden hands posed, most patrol officers shifted their focus to the emotional suspect’s salient facial expressions. On average, the TU fixated the critical hands/hip ROI in this scenario 86% longer than the MP and more than twice as long as the UP (TU 18,800 ms; MP 10,100 ms; UP 9200 ms). Conversely, the MP fixated the suspect’s face more than six times as long and the UP more than five times as long as the TU (TU 1600 ms; MP 10,900 ms; UP 8500 ms).

Therefore, we suspect that police officers with a lower level of training are more inclined to have their gaze and attention “grabbed” by salient stimuli (like an angry person’s face) as part of unconscious bottom-up processes. On the other hand, highly trained officers seem to have the toolkit to suppress these processes and actively shift both gaze and attention (top-down) to tactically crucial areas.

Limitations and Future Studies

Eye tracking can only detect the physiological shift of gaze but not the shift of attention. Even when a person fixates a certain region of space, the spotlight of their visual attention might be focused on another region. However, the unequivocal visual identification of an object is only possible when the object in question is fixated upon and seen sharply. On the other hand, fixating on an object does not necessarily mean that the participant attends to it and processes the visual information (Carrasco 2011; Castelhano et al. 2009; Dewhurst and Crundall 2008; Findlay and Gilchrist 2003; Hoffman 1998; Rensink et al. 2016).

As the present study focused mainly on gaze patterns during the interaction between a police officer and a suspect before shooting, some factors limit the scientific validity of the recorded shooting times. First, the intention of the participant could not be assessed in this experimental setup. Even though an officer might have short reaction times, it is impossible to assess in hindsight what they based their decision on. The suspect in our video scenarios always drew a gun and never a harmless object. Thus, detecting false-positive reactions (for example, shooting a suspect who is drawing a wallet) was not possible.

Second, we did not measure accuracy, and we therefore did not control for a speed-accuracy tradeoff in shooting behavior. Faster reaction times tend to go hand-in-hand with lower accuracy (Grünbaum et al. 2017; Nieuwenhuys and Oudejans 2010).

Third, and quite surprisingly, scenario C (the scenario in which the suspect sits in the driver’s seat of the getaway vehicle) exposed tactical differences in how tactical officers and patrol officers handled the situation. This led to noticeable differences in the observed behavior compared to the other scenarios. While all MP and UP aimed their guns directly at the suspect, some TU took an atypical shooting stance to avoid a “friendly fire” situation in which they might have endangered the other officer on the driver’s side of the vehicle. This atypical shooting stance takes more time but is safer for other officers and bystanders in a realistic scenario.

Although analyzing the recorded reaction times may have limited scientific value in this context, it still allowed for observing tendencies in relation to the obtained gaze data.

The participants were not informed about what data was recorded for further assessment. Consequently, shooting the USB arcade gun to end each scenario added pressure and realism to their experience. The participants’ feedback on using the arcade gun with its integrated recoil system was positive. It reportedly provided a “familiar feeling and handling”—comparable to their training and practical on-the-job experiences.

Throughout the experiment, we measured the subjective levels of anxiety to ensure consistent comparability between the groups. While the data provided valuable feedback on the participants during the experiment, it does not necessarily translate to real-life scenarios. A police officer facing a potentially life-threatening situation in an experimental setup most likely feels less anxious than in a comparable real-life situation.

Future research can further build upon our results by examining the effects of law enforcement training versus practical routine. The inclusion of shoot/no-shoot decisions, the detection of shooting accuracy, the effects of artificially induced stress, and structured gaze control training could bring valuable new insights.

Conclusion and Practical Implementation

The present study’s results suggest that training has a higher impact on efficient gaze control in law enforcement than practical experience. The experiment revealed significant differences in gaze patterns between highly trained officers of a tactical unit and patrol officers with high practical experience. Officers with a high level of tactical training showed superior gaze patterns by focusing more on the critical hands/hip region and less on a suspect’s face. Although practical experience has been shown to be beneficial in past studies (for an overview, see Heusler and Sutter 2020), a high level of training is more advantageous in this context.

Past studies have shown observable differences in gaze behavior and perceptual performance after a few, or even only one, training sequence (Helsen and Starkes 1999; Neuberger 2013; Nieuwenhuys and Oudejans 2011). Accordingly, regular training and continuous sensitization on the subject may significantly decrease law enforcement personnel’s risk of death or severe injury in life-threatening encounters.

Additionally, improved visual perception may also lead to the correct identification of non-threatening objects in complex and high-stress situations, leading to correct no-shoot decisions and possibly saving unarmed citizens’ lives. Officers who focus on the critical hands/hip region instead of potentially misleading facial expressions have a higher chance of identifying weapons and harmless objects. Consequently, they may be faster and more accurate in their threat assessment under pressure.

Even when not actively fixating on critical regions, officers can still shift their visual attention covertly. This might be preferable in a situation where a single officer interacts with a suspect and avoiding eye contact to fixate the hands/hip region could escalate the situation. For two or more officers, it is most advisable to have one officer address the suspect and try to de-escalate the situation while the other officer(s) focus entirely on the critical regions of the suspect and the surroundings.

Teaching law enforcement personnel about the limitations of human visual perception, the importance of attention, and training them in tactical gaze control techniques can improve officers’ performance in life-threatening situations in which every millisecond counts.