1 Research Foundation

A Warfighter must accurately detect and interpret threats in the field. One type of threat takes the form of kinesic cues, defined here as nonverbal body movements that communicate one’s intent [1]. The target kinesic cues in this study were either aggressive or nervous behaviors signifying a potential threat. Aggressive behaviors included clenching one’s fists or slapping one’s hands, while nervous behaviors included checking one’s “six o’clock” or wringing one’s hands (i.e., clasping and rubbing the palm of one hand with the other hand). Currently, a need exists to increase threat detection accuracy in a Warfighter context. This need has inspired the use of instructional strategies in a virtual environment for training perceptual skills.

1.1 Instructional Strategies

Instructional strategies were sought to aid knowledge acquisition of target cues in a virtual environment. The instructional strategies included Kim’s Game, Highlighting, and Massed Exposure (ME). Kim’s Game was adopted from Rudyard Kipling’s book Kim, where the game focused on teaching change detection and recall. Adopting the premise behind Kim’s Game allowed for the creation of a virtual version presented in the form of an instructional strategy. The strategy presented a discrete task whereby the trainee had to identify a change (or no change) in a virtual scene. Specifically, the trainee observed a scene of non-target cues for 8 s, a blank screen for 1 s (i.e., an interstimulus interval), and a second scene of cues for 8 s. The second scene required change detection, as one cue may have changed from a non-target to a target cue. The trainee then chose whether or not a change had occurred in the final scene. This tasks the trainee with recalling the first stimulus and then detecting whether a change has occurred [2].
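The trial structure described above can be sketched as a small data structure. This is only an illustrative sketch, not the testbed's implementation: the class name `KimsGameTrial` and its methods are hypothetical, and only the 8 s, 1 s, and 8 s durations come from the description.

```python
from dataclasses import dataclass

SCENE_SECONDS = 8.0   # duration of each scene, as described in the text
BLANK_SECONDS = 1.0   # interstimulus interval, as described in the text


@dataclass(frozen=True)
class KimsGameTrial:
    """Hypothetical model of one discrete Kim's Game trial."""

    # True if one cue switched from a non-target to a target cue.
    change_present: bool

    def phases(self):
        """Ordered (label, seconds) pairs for the three viewing phases."""
        return [
            ("first scene", SCENE_SECONDS),
            ("blank screen", BLANK_SECONDS),
            ("second scene", SCENE_SECONDS),
        ]

    def viewing_seconds(self):
        """Total presentation time before the trainee responds."""
        return sum(seconds for _, seconds in self.phases())

    def is_correct(self, reported_change: bool) -> bool:
        """A response is correct when it matches whether a change occurred."""
        return reported_change == self.change_present


trial = KimsGameTrial(change_present=True)
print(trial.viewing_seconds())
print(trial.is_correct(reported_change=False))
```

Under these assumptions, each trial presents 17 s of stimuli before the change or no-change response is collected.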

Another instructional strategy used in this study was Highlighting, which is the use of a non-content stimulus within the simulated environment to orient the trainee’s attention toward the target cue. A translucent blue box appeared over the target cues, leaving the user to classify the cue as aggressive or nervous. ME involves the saturation of practice opportunities by presenting a greater number of target cues to the user than in the Control condition. The Control condition was a continuous task with no formal training (i.e., the absence of an instructional strategy). Overall, each strategy requires the trainee’s selective attention in order to accurately detect the target cue among non-target cues [3].

1.2 NASA-Task Load Index

This research utilized the NASA-Task Load Index (NASA-TLX) survey, which is a subjective tool used to assess perceived workload for a given task. The full survey consists of six subscales: mental demand, physical demand, temporal demand, performance, effort, and frustration [4]. Each subscale is scored on a 100-point scale in 5-point steps. This report covers only the effort, frustration, and performance subscales. Specifically, effort referred to how much physical and mental strength was needed to accomplish the task. Frustration examined the level of stress, versus relaxation, felt during the task. Finally, performance described the level of confidence the participant had in their performance of the task. Unlike the other subscales, performance was scored with ‘0’ representing perfect performance and ‘100’ representing the lowest performance. The mental demand subscale, as well as a global demand scale, was evaluated and presented in [5] and will not be duplicated in this research effort.

1.3 Detection Accuracy

Detection accuracy was logged and scored based on the selection of correct target cues. The score was the percentage of target cues correctly detected out of the total number of target cues presented.
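The scoring rule above amounts to a simple percentage. A minimal sketch follows, assuming the counts of detected and presented target cues are available; the function name and the example counts are hypothetical and are not taken from the study's logs.

```python
def detection_accuracy(detected_targets: int, presented_targets: int) -> float:
    """Percentage of presented target cues that were correctly detected."""
    if presented_targets <= 0:
        raise ValueError("presented_targets must be positive")
    if not 0 <= detected_targets <= presented_targets:
        raise ValueError("detected_targets must lie between 0 and presented_targets")
    return 100.0 * detected_targets / presented_targets


# Hypothetical counts, for illustration only.
print(detection_accuracy(detected_targets=3, presented_targets=4))
```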

1.4 Research Questions

This research explores how workload responses (i.e., effort, frustration, and performance) vary across conditions (i.e., Kim’s Game, ME, Highlighting, and Control) in a between-subjects design. Changes in cognitive demands and resource allocation may result in differences in reported workload and, ultimately, in detection accuracy. This has inspired an investigation into the following research questions:

  1. Which strategy will score highest or lowest for effort, frustration, performance, and detection accuracy?

  2. How do the strategies differ in workload and detection accuracy?

  3. Which strategy is most effective in yielding an appropriate amount of workload?

2 Data Collection

The experimenter greeted each participant and verified that he or she met the inclusion requirements (i.e., the participant was a U.S. citizen, was at least 18 years of age, had normal or corrected vision, and had not participated in prior Simulation-Based Training experiments). After the participant entered the experiment room, the experimenter obtained the participant’s signed consent and administered the color blindness test [6]. If the participant passed the color blindness test, he or she was assigned to one of the four conditions.

Next, the experimenter asked the participant to view a scenario (on a 22-inch desktop computer) which familiarized him or her with the user interface and the testbed’s Virtual Battlespace 2 (VBS2) software. Following this event, the experimenter requested that the participant complete a pre-test scenario, using prior experience to detect and classify cues as exhibiting either aggressiveness or nervousness. Afterward, the experimenter presented a PowerPoint training on behavior cue detection with explanations for aggressive and nervous cues. The presentation also highlighted non-target cues (such as idle talking and crossed arms) as well as instructions for completing the task. Next, the experimenter displayed one of the four conditions (i.e., Kim’s Game, ME, Highlighting, or Control) on the desktop and asked the participant to complete the scenario. Following the scenario, which was displayed for approximately 15 min, the experimenter requested that the participant complete the online version of the NASA-TLX.

After the NASA-TLX was completed, a final PowerPoint with instructions to complete the post-test was presented by the experimenter. Afterwards, the participant completed the post-test, which tested knowledge acquisition and logged the detection accuracy scores used in the present analysis. Following this, the participants were debriefed, compensated (with either $10 per hour or class credit), and dismissed. The total duration of the experiment was roughly 3 h.

3 Results

Preliminary analyses indicated that participants ranged in age from 18 to 38 (M = 22.12, SD = 3.44). The sample of 130 included 65 male and 65 female participants. Closer inspection showed that the data were not normally distributed. Therefore, it was more meaningful to examine the medians for the NASA-TLX subscales (i.e., effort, frustration, and performance) and detection accuracy. Across conditions, Kim’s Game had the highest median scores for the effort (60), performance (35), and frustration (45) subscales and the lowest detection accuracy score (85.94%). Highlighting had the lowest median scores for the three workload subscales of effort (14), performance (3), and frustration (5) and the highest detection accuracy score (96.71%). ME scored neither the highest nor the lowest in effort (47), performance (24), frustration (30), or detection accuracy (94.19%). Finally, the Control was neither highest nor lowest in effort (53), performance (17), frustration (26), or detection accuracy (94.57%).
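As an illustration of summarizing non-normal data by condition, the sketch below computes one median per group using Python's standard `statistics.median`. The function name, the condition labels, and the scores are hypothetical placeholders and are not the study's data.

```python
from statistics import median


def medians_by_condition(scores):
    """Map each condition label to the median of its list of scores."""
    return {condition: median(values) for condition, values in scores.items()}


# Hypothetical subscale scores, for illustration only.
example_scores = {
    "Condition A": [10, 20, 30],
    "Condition B": [5, 15, 25, 35],
}
print(medians_by_condition(example_scores))
```

The median is less sensitive to extreme scores than the mean, which is why it is the more meaningful summary when a distribution is skewed.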

As previously mentioned, the non-normal distribution prompted an inspection of the data to identify extreme outliers. A closer look at the detection accuracy scores revealed extreme outliers in three out of the four groups: select ME, Highlighting, and Control scores were beyond the lower outer fence. Additionally, Kim’s Game’s effort subscale showed extreme outliers beyond the lower outer fence. Finally, Highlighting’s performance subscale displayed extreme outliers beyond the upper outer fence.
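For readers unfamiliar with outer fences, the sketch below uses the conventional definition, in which the fences lie three interquartile ranges below the first quartile and above the third quartile. It is an assumption that the analysis used this multiplier and this quartile method; the function names and the example scores are hypothetical.

```python
from statistics import quantiles


def outer_fences(values, k=3.0):
    """Return (lower, upper) fences at Q1 - k*IQR and Q3 + k*IQR.

    k=3.0 is the conventional multiplier for outer fences (assumed here).
    The "inclusive" quartile method is also an assumption; other methods
    give slightly different quartiles.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def extreme_outliers(values, k=3.0):
    """Values that fall strictly beyond either outer fence."""
    lower, upper = outer_fences(values, k)
    return [v for v in values if v < lower or v > upper]


# Hypothetical scores, for illustration only.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(outer_fences(scores))
print(extreme_outliers(scores))
```

A score below the lower fence corresponds to "beyond the lower outer fence" in the text, and a score above the upper fence corresponds to "beyond the upper outer fence".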

4 Discussion

Based on the results, responses to the following research questions are provided.

4.1 Which Strategy Will Score Highest or Lowest for Effort, Frustration, Performance and Detection Accuracy?

Kim’s Game scored highest in effort, frustration, and performance, but lowest in detection accuracy. The discrete nature of the Kim’s Game target cue presentation may have stimulated more attention, which increased effort and frustration levels. As a result, performance scores were higher (i.e., perceived performance was poorer), likely because the user felt challenged by Kim’s Game. In terms of detection accuracy, the Control appears to have outperformed Kim’s Game. In addition, the Control was comparable to ME in terms of detection accuracy. Overall, from a design perspective, a baseline helps determine if a strategy is warranted for downstream training acquisition.

Highlighting scored lowest in effort, frustration, and performance, but highest in detection accuracy. The subscale scores suggest that Highlighting was perceived as a simpler task than Kim’s Game. Prompting toward the target cue in Highlighting may explain this perception, leading to lower scores in effort and performance. Thus, Highlighting may be beneficial for training users who require low-frustration learning environments. For example, children with autism may benefit from training that starts with the least intrusive approach [7].

4.2 How Do the Strategies Differ in Workload and Detection Accuracy?

The workload and detection accuracy scores differ across the three instructional strategies, signifying distinct trends to inform future strategy selection. At first glance, Highlighting seems equivalent to the Control in terms of detection accuracy. However, the performance scores for the Highlighting and Control conditions differ (see Results for an overview). The difference may be explained by increased task confidence induced by the non-content stimuli. Highlighting, therefore, may be applied to inexperienced trainees to increase their confidence levels. For example, Highlighting may be applied to help first-year nurses gain confidence in the technical skills required for treating patients in stressful environments [8].

When ME is compared to Highlighting, considerable differences in workload exist for all subscale scores. Yet, differences in detection accuracy scores between the two strategies were minimal. Given the differences in workload subscales but similarities in detection accuracy scores, ME can assist with training tasks that are neither too challenging nor too simplistic. ME may be suited for high-stakes security fields that tax effort but need high detection accuracy. To illustrate, ME may be introduced to baggage-screening training, which requires the screener to identify Improvised Explosive Devices (IEDs) [9].

4.3 Which Strategy is Most Effective in Yielding an Appropriate Amount of Workload?

The appropriate amount of workload for an instructional training strategy depends on the desired application. Tasks that entail a large amount of workload for adequate performance may use Kim’s Game, whereas training instances that demand execution of simple tasks may be geared toward Highlighting. Alternatively, ME may be used as an intermediate training strategy that provides extensive amounts of practice with moderate workload demands. The placement of instructional strategies within a training continuum can cater to specific trainee needs. For example, Highlighting may act as a simplified approach to train detection. Once the trainee has mastered basic task components, ME may be presented for practice purposes, to create repetition and moderate workload demands. Lastly, Kim’s Game may act as a testing mechanism for detection skills. Kim’s Game is an appropriate testing agent because it generates the high workload demands expected in the real world. Therefore, high detection accuracy in Kim’s Game suggests ideal performance under high-workload conditions.

5 Limitations

A limitation of this study was the discrete aspect of Kim’s Game, when compared to the continuous nature of Highlighting, ME, and Control. This ecological difference in design may have acted as an extraneous variable, given that the pre-test and post-test also used a continuous method. Furthermore, outliers were detected in all conditions, which may have skewed the data.

6 Conclusion

This study analyzed the descriptive statistics of three instructional strategies for the NASA-TLX subscales of frustration, effort, and performance, as well as for detection accuracy. Based on the research findings, recommendations were given to map strategies according to a training continuum. Potential applications include baggage screening, special education, and healthcare. Future directions of research include a follow-up study to assess learning retention for detection skills and an assessment of other subscales of the NASA-TLX. Finally, to improve the objectivity of the results, physiological measurements of trainee workload should be integrated.