1 Introduction

Simulation-Based Training (SBT) supports specialized training and adaptability by employing simulated platforms [1]. By allowing a shift toward “higher-order skills” (e.g., coordination and decision-making), SBT fills the gap between traditional, classroom-based learning and live training scenarios. Behavior cue detection is a central application of SBT in pattern recognition training, especially in the military training domain. Pattern recognition provides the ability to assess human behavior in complex combat environments through early identification of threatening actions via visual processing. This is achieved by a combination of bottom-up and top-down processing [2]: the bottom-up process gathers pieces of information to form generalizations without prior knowledge of the subject, while the top-down process takes the big picture and breaks it down into smaller pieces.

Current literature on pattern recognition training in the military references techniques that may be inadequate for advancing threat detection skills in warfighters (e.g., fingerprint matching, facial and handwriting recognition, speech detection). According to Fischer and Geiwetz [3], there is no formal training for detecting patterns in Soldiers’ environments, and most of their pattern recognition skills come from years of field experience. A study of pattern recognition training determined that Soldiers who received formal training performed better than those trained in the traditional classroom [3].

Both the Marine Corps and the Army have developed training curricula to improve Soldiers’ pattern recognition through observational games, as well as routine observations and reports [4, 5]. One such observational training game, Kim’s game, calls for the individual to memorize several objects in an organized manner for recall at a later time (Sniper Sustainment Training, n.d.). Kim’s game has been found to improve improvisation skills, responsiveness, and analytic thought processes [6]. It allows individuals to improve their memory and to increase their change detection ability and awareness of change blindness. The use of Kim’s game may help train Soldiers in behavior cue detection, as it enhances the skills necessary to observe an environment more critically, memorize rapidly, and deepen descriptive skills.

Two metrics that assist SBT experimentation are the Engagement Measure and the Flow State Short Scale. Engagement indicates how involved a participant is in the task, and it facilitates cognitive processes, achievement, higher-order perceptual skills [7–9], and training transfer [10]. Flow is defined as the point at which a participant becomes unaware of their surroundings [11]. According to Csikszentmihalyi [12], high Flow scores should correlate with high performance levels. The purpose of this experiment was to assess participant engagement and flow for a signal detection task between a Kim’s game group and a control group.

2 Methods

2.1 Participants

There was a total of 75 participants, comprising 41 females and 34 males. The Kim’s game group had 36 participants, while the control group had 39. Restrictions for participation consisted of US citizenship and an age range of 18 to 40 years (M = 22.72, SD = 3.75). Participants were required to have normal or corrected-to-normal vision and were administered the Ishihara Colorblindness Test. The constraints on visual ability were due to the critical role of discerning visual stimuli in the experimental task. Compensation for participation consisted of either a monetary reward ($10/hour) or class credit.

2.2 Experimental Design

This experiment followed a between-subjects design with two conditions: the control condition and the Kim’s game instructional strategy. The dependent variables were the Engagement, Flow, and performance measures.

2.3 Measures

Engagement Measure.

Charlton and Danforth’s [13] engagement measure asks questions on how involved the participant felt (e.g., “I sometimes found myself to become so involved with the scenarios that I wanted to speak to the scenarios directly”). The measure has a total of seven questions rated on a scale from 1 (strongly disagree) to 5 (strongly agree).

Flow State Short Scale.

Flow was assessed using Jackson, Martin, and Eklund’s [11] questionnaire, measuring the level of the participant’s mental state during the task. A sample question from this measure is, “I was completely focused on the task at hand.” Answers to each question were rated using a 1 (strongly disagree) to 5 (strongly agree) scale.

Post-test Detection Accuracy Scores.

Detection accuracy scores were calculated as a percentage based on the number of target stimuli correctly identified within the vignette. The ratio was determined by dividing the number of correctly identified targets by the total number of targets within each vignette.
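The scoring rule above can be sketched as a short function. This is illustrative only, not the authors’ actual scoring code; the function name and example counts are hypothetical:

```python
def detection_accuracy(correct_targets, total_targets):
    """Percentage of target stimuli correctly identified within a vignette."""
    if total_targets == 0:
        raise ValueError("a vignette must contain at least one target")
    return 100.0 * correct_targets / total_targets

# e.g., 6 of 8 targets correctly identified yields a score of 75.0
score = detection_accuracy(6, 8)
```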

Post-test False Positive Detection.

A false positive occurred when a participant identified a non-target model as depicting a target behavior cue. The non-target cues and model types flagged as false positives were recorded to determine any correlations between cue or model type and false positive detection.

Post-test Response Time.

Response time was the amount of time a participant took to react to an event appearing on the screen, either clicking the target to indicate a match or selecting the no-change icon to indicate no match. Time was measured in seconds.

2.4 Experimental Testbed

Virtual Battlespace 2 (VBS2) was used as the experimental testbed. The experiment was conducted using a standard desktop computer with a 22-inch display monitor. For this experiment, there were four target cues (i.e., slap hands, clenched fists, wring hands, and check six) and four non-target cues (i.e., idle talking, check watch, cross arms, rub neck). Table 1 provides the descriptions of the behavior cues, as well as the corresponding classification for each of the target cues. All eight cues were modeled using 3D models of various skin tones.

Table 1. Target and non-target behavioral cues. Adapted from Salcedo [14].

2.5 Procedure

Upon arrival, participants were randomly assigned to either the control or the Kim’s game condition. The experimenter and participant signed an informed consent form describing the voluntary nature of the experiment and its procedures. The experimenter then obtained pre-experimental information from the participant and administered the Ishihara Test for Color Blindness [15]. Participants who failed were dismissed; otherwise the experiment continued. After the participant completed a demographics questionnaire, an interface training lesson gave the user a chance to become familiar with the navigation and detection controls used in the following scenarios.

Participants completed an interface training that required a passing score of over 75%. Once finished, they completed a pre-test that gauged their initial ability to detect the target kinesic cues (i.e., nervousness and aggressiveness). A second interface training followed, which allowed users to note the color change, if any, among a group of barrels.

After training, participants were given a five-minute break, followed by kinesic cue training slides that demonstrated the aggressive and nervous behaviors (e.g., clenched fists classified as aggressive, wringing hands as nervous) via model icons. Following a second break, a 17-minute practice vignette asked participants to identify any changes in the models (e.g., a change from a non-target cue behavior to a target cue behavior). Upon completion, the Engagement Measure and Flow State Short Scale were administered. After a final break, a last interface training was provided for the post-test scenario, followed by a 40-minute post-test scenario. The experiment concluded with a debriefing, after which participants were dismissed.

3 Results

A one-way between-groups ANOVA revealed no statistically significant difference in Engagement between the Kim’s game and control groups. A second one-way between-groups ANOVA was conducted to determine the difference in Flow between the two groups. The Action Awareness Merging subscale differed significantly between the Kim’s game and control groups, F(1, 73) = 4.92, p = .03, with participants in the control group (M = 3.44, SD = 1.10) reporting higher Action Awareness Merging than the Kim’s game group (M = 2.89, SD = 1.04). There was also a significant difference between the two groups in the Clear Goals subscale, F(1, 73) = 4.11, p = .05; the control group (M = 4.08, SD = .70) reported clearer goals than the Kim’s game group (M = 3.75, SD = .69). Finally, a significant difference was found in Transformation of Time, F(1, 73) = 6.28, p = .01, which was reported higher in the control group (M = 3.69, SD = 1.00) than in the Kim’s game group (M = 3.06, SD = 1.09) (Table 2).

Table 2. ANOVA’s for flow between Kim’s game and control groups
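For reference, a one-way between-groups ANOVA of the kind reported above reduces to comparing between-group variance to within-group variance. The minimal pure-Python sketch below is an illustration, not the authors’ analysis script, and the example scores are invented:

```python
def one_way_anova(*groups):
    """One-way between-groups ANOVA: returns (F, df_between, df_within)."""
    k = len(groups)                    # number of groups
    n = sum(len(g) for g in groups)    # total sample size
    grand_mean = sum(sum(g) for g in groups) / n
    # between-groups sum of squares: group means vs. the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # within-groups sum of squares: individual scores vs. their group mean
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_b, df_w = k - 1, n - k
    return (ss_between / df_b) / (ss_within / df_w), df_b, df_w

# invented example scores for two groups
f, df_b, df_w = one_way_anova([1, 2, 3], [2, 3, 4])
```

The F value is then compared against the F distribution with (df_between, df_within) degrees of freedom to obtain the reported p value.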

Correlational data showed weak, positive correlations between post-test detection accuracy and the Engagement survey’s Total Engagement (r = .23), More Time in the Virtual Environment (r = .28), and Buzz of Excitement (r = .24) scores (Table 3).

Table 3. Correlation between Engagement and Post-Test Performance

There were also weak, negative correlations between false positive detection and the Engagement survey’s Total Engagement (r = –.24) and Buzz of Excitement (r = –.25) scores. A moderate, positive correlation was found between the Flow subscale Concentration on Task at Hand and post-test detection accuracy; none of the other subscales showed significant correlations.
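The correlations reported here are Pearson product-moment coefficients. As a reference, the computation can be sketched as follows (an illustration with invented data, not the authors’ analysis code):

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))   # co-deviation term
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# perfectly linear invented data gives r = 1.0
r = pearson_r([1, 2, 3], [2, 4, 6])
```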

Finally, neither Engagement nor Flow was a statistically significant predictor of post-test performance. However, the results indicated that Flow made the largest contribution to post-test performance; more specifically, Concentration on Task at Hand contributed most to post-test detection accuracy, and Clear Goals to post-test false positive detection.

4 Discussion

Although there were no significant results from the Engagement survey to explain practice performance or the post-test results, participants in both the Kim’s game and control groups reported feeling some level of engagement. This could be because the questionnaire focused more on engagement with the VE itself and less on the individual’s perception of their own engagement level. Another possibility is that the survey was not the most appropriate measure for the task at hand.

The weak, positive relationships between Total Engagement and post-test detection accuracy, and between More Time in the VE and post-test detection accuracy, could reflect that participants felt more comfortable with the task the more time they spent in the VE, resulting in higher engagement and better performance. The weak, negative relationship between Total Engagement and false positive detection may indicate that the more engaged a participant was, the more closely they paid attention, and the fewer false positives they made. Finally, the negative correlation between Buzz of Excitement and false positive detection could be due to practice in earlier tasks during the experiment, which increased participants’ confidence and led to fewer mistakes.

Participants in the control group reported higher levels on all significant subscales of Flow (i.e., Action Awareness Merging, Clear Goals, and Transformation of Time) than participants in the Kim’s game group after completing the practice scenario. This could be due to the structure of the practice scenario: the control group received an uninterrupted, continuous series of events, while individuals in the Kim’s game group received a more discrete task. The more ‘seamless’ task given to the control group could have affected time perception and led participants to lose track of time. The moderate, positive relationship between Concentration on Task at Hand and post-test detection accuracy may be explained by the idea that the more individuals concentrated on the task, the greater their detection accuracy. These results are also supported by the multiple linear regression, in which Concentration on Task at Hand was the largest contributor to post-test detection accuracy.

The control group also had shorter response times than the Kim’s game group, which suggests the control group experienced a higher level of flow during the post-test. This could be due to a lack of flash recognition training in the Kim’s game group, a technique used to improve visual memory recall [16]. In order for the brain to store visual information for later recall, it needs to quickly and accurately process incoming stimuli [17]. The length of the flashing time affects an individual’s ability to recall information [18]. The Kim’s game group received visual stimuli at a faster rate than the control group, which could account for these results.

5 Limitations

One limitation found in this experiment was the Engagement survey. Upon observing the results, it was concluded that the survey may not be an appropriate measurement for this task. The scale may need to be redesigned for future experiments to account for the lack of “sensitivity” it has for assessing performance of behavior cue detection.

6 Conclusion

Overall, the control group reported higher levels of Engagement and Flow across all subscales. It is possible that the control group felt more comfortable because they received more repetition of the same practice, rather than a new task like the Kim’s game group. Designing a new Engagement survey or including physiological measures in future studies might yield different results.