1 Introduction

Effective communication is critical to the success of human-agent teams. Agents (embodied or software entities capable of independent action) and humans can communicate through a range of voice, visual, haptic, and other modalities. In this paper, we focus on human-agent communication mediated by visual displays. In a variety of work domains involving unmanned and automated systems, visual displays are ubiquitous, and are often the human operator’s primary means for maintaining awareness of system and agent status, detecting problems, and investigating, diagnosing, and correcting problems. The focus of the current paper and a companion paper [1] is on improving how agent status is expressed and represented in status displays to support the task and cognitive needs of the operators responsible for managing the agents. In particular, we focus on the needs of operators faced with managing greater numbers of processes, agents, vehicles, and systems [2] across different work domains.

Operators in human-agent teams engage in supervisory control tasks [3]. In supervisory control, monitoring occurs on a frequent and often intermittent basis, as operators concurrently engage in several other critical activities [4, 5]. As operator workload continues to grow, it is becoming increasingly important for operators to monitor proactively to detect and resolve emerging issues before they worsen [4, 6, 7, 8].

To proactively monitor the status of multiple agents through visual displays, operators must search through status indicators to detect signs of current and emerging problems that warrant further investigation. To understand how to support effective search through status indicators, we can draw from the extensive literature on visual search, human attention, and the factors that control the distribution of attention in a scene (see [9, 10] for reviews). Existing design guidance informed by this research underscores the importance of making task-relevant information accessible through displays [11], and of cueing operator attention to the highest priority information through salient visual cues [12, 13]. The challenge lies in appropriately applying this guidance, and in balancing information access and attention management. As Simon [14] noted over forty years ago: “…in an information-rich world, the wealth of information…creates a poverty of attention and a need to allocate that attention efficiently among the overabundance of information sources that might consume it…”. In visual displays, providing access to too much information reduces the relative salience of information due to competition and clutter; conversely, de-cluttering can enhance relative salience but hinder access to the de-cluttered information [15].

To explore these issues in applied designs, we considered how information access and visual salience in standard formats in today’s systems relate to proactively monitoring multiple agents. In terms of information access, we noted that many standard formats promote “value monitoring” as opposed to “trend monitoring.” In terms of visual salience, we noted that if standard formats use salient cues at all, they direct attention to current problems as opposed to future problems. Motivated by these observations, we developed a novel trend-icon hybrid indicator format (Trendicon, patent pending, Fisher Rosemount Systems, Inc.) that provides access to information about worsening trend, and makes indicators with more extreme worsening trends more salient. In a recent experiment [7], Trendicons were contrasted with a standard “value monitoring” format (Numeric Values) and Trend Graphs as a baseline “trend monitoring” format. Participants attempted to proactively detect problems while monitoring displays with varying numbers of status indicators. Using Trendicons, the majority of problem detections were proactive; with Numeric Values and Trend Graphs, however, problem detections were largely reactive, occurring after an event became critical. Further, Trendicons led to a roughly fivefold increase in the estimated number of indicators that participants could proactively oversee, compared to Numeric Values and Trend Graphs. Trendicons’ performance advantages were hypothesized to stem from improved access to worsening trends and more salient cues for more extreme worsening trends. Although Trend Graphs provide access to trend information, we hypothesized that their lack of direct access to worsening trend, together with their undifferentiated coding and salience of worsening versus resolving trends, was responsible for the associated reactive problem detections.

To investigate these hypotheses more deeply and extend them more broadly to other status formats, we conducted an analysis of information access and salience mapping across a range of status formats. The analysis is extensive and is therefore split across two papers: the current paper focuses primarily on information accessibility, while a companion paper [1] assesses the assignment of visual salience to information. Across both papers, our goal is to provide an objective basis for pairing status indicator formats to operators’ tasks, to improve task support and performance.

1.1 Status Indicator Formats Considered

Figure 1 presents a qualitative analysis of information accessibility across a range of standard, novel, and alternative indicator formats. Exemplars are illustrated for each format. Our analysis was limited to formats depicting the status of single agents (vs. multivariate status). The top of Fig. 1 shows an agent’s indicator value plotted over time relative to a desired normal value (50), caution thresholds above (≥ 75) and below (≤ 25) normal, and warning thresholds further from normal (≥ 80 and ≤ 20). In the example, value and status gradually worsen, extend beyond caution and warning thresholds, and then resolve back into normal status. Key points along the time course are shown in each indicator format, for comparison. Considering how the formats appear at each time point through normal, worsening, and resolving status reveals strengths and weaknesses of using the formats to monitor current and future status. Each of the first seven formats is described next.

Fig. 1. Information access across status indicator formats as an agent’s indicator value and status change over time. Formats mainly provide access to current value, current deviation, or trend; less direct access to current value (V) and deviation (D) across formats is noted. Formats provide no access, indirect access, or direct access to worsening trend, and identical, variable, or unique coding for worsening vs. resolving trend.

Numeric Values show a digital representation of the agent’s current value. In Gauges, a needle points to the agent’s current value along a circular dial marked with ticks, numeric labels, and yellow caution and red warning regions. In Bullet Graphs [16], a filled bar conveys current value on a linear scale with yellow caution and red warning regions (see [17] for other variations). Stoplight Coding codes threshold-based categorical caution (yellow) and warning (red) states. Rather than applying Stoplight Coding to each format in Fig. 1, we treated it as an independent status format to assess its properties separately from those of the other formats. Trend Signature Plots represent recent trend behavior with one of seven standard first- and second-order trend patterns, as described in [18]: steady state, ramping up or down, increasing or decreasing at an increasing rate, and increasing or decreasing at a decreasing rate. Trend Graphs continuously plot values along a trend line relative to a black centerline and yellow caution and red warning threshold lines. Trendicons represent current deviation from normal with an analog indicator bar extending from a centerline, and present trend information including recent trend (directional shape), rate of change (number of jet trails), and worsening trend (bold for worsening); see [7] for a detailed design description. Key findings from a systematic analysis of the information accessibility of these seven formats are reported next. This analysis motivated the design of three alternative status formats shown at the bottom of Fig. 1 and discussed below.
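
To make these trend-signature categories concrete, they can be operationalized as the signs of the estimated first and second derivatives of the recent value history. The following Python sketch is our own minimal interpretation of the seven patterns listed in [18]; the function name, the quadratic-fit approach, and the tolerance parameter are illustrative assumptions rather than the published implementation.

```python
import numpy as np

def classify_signature(values, tol=0.05):
    """Classify a recent value history into one of the seven standard
    first- and second-order trend signatures (after [18]). The quadratic
    fit and tolerance are illustrative assumptions."""
    t = np.arange(len(values))
    # The linear and quadratic coefficients of a quadratic fit estimate
    # the first and second derivatives of the recent trend.
    c2, c1, _ = np.polyfit(t, values, deg=2)
    if abs(c1) <= tol:
        return "steady state"
    if c1 > tol:  # increasing trend
        if c2 > tol:
            return "increasing at an increasing rate"
        if c2 < -tol:
            return "increasing at a decreasing rate"
        return "ramping up"
    # decreasing trend: more negative curvature means a steepening decline
    if c2 < -tol:
        return "decreasing at an increasing rate"
    if c2 > tol:
        return "decreasing at a decreasing rate"
    return "ramping down"
```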

Formats cluster into three main categories of information accessibility: each format provides access primarily to current value (Numeric Values, Gauges, Bullet Graphs), current deviation (Stoplight Coding), or trend information (Trend Signature Plots, Trend Graphs, Trendicons). These groupings are shown in Fig. 1. Some current value-based and trend-based formats also provide less direct access to current deviation and current value. For example, Gauges require operators to compare needle position to normal (vertical) to obtain a rough estimate of current deviation.

No standard format provides direct access to worsening trend, the key attribute needed for proactive monitoring. For some formats, information about worsening trend can be derived by combining multiple components or sequencing information. However, it is difficult to engage in an efficient search for features that are not diagnostic, and for multiple features that must be compared, combined, or held in memory (vs. diagnostic, simple, and explicitly represented task-relevant features). For example, in the search for worsening trend in Trend Graphs, orientation alone is not a diagnostic cue; it must be combined with the direction/position of trending information relative to normal to assess whether values are worsening by heading away from normal or resolving by returning to normal. As another example, the direction and rate of change of the Gauge’s needle must be assessed over time to detect values heading away from normal. In contrast, a simple search strategy for worsening trends of “search for bold-outlined Trendicons” can be executed easily and effectively.
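
The combination rule for Trend Graphs described above reduces to a simple sign test: an indicator is worsening when its rate of change points away from the normal value. A minimal Python sketch of this logic follows; the function and parameter names are ours, for illustration only.

```python
import numpy as np

def is_worsening(values, normal=50.0):
    """Return True if the indicator is heading away from normal.

    Slope alone is not diagnostic: a positive slope is worsening above
    normal but resolving below it, so slope must be combined with the
    current position relative to normal."""
    t = np.arange(len(values))
    slope = np.polyfit(t, values, deg=1)[0]  # recent rate of change
    deviation = values[-1] - normal          # current position vs. normal
    return slope * deviation > 0             # moving further from normal
```

For example, is_worsening([60, 63, 66]) and is_worsening([40, 37, 34]) both return True (heading away from normal in opposite directions), whereas is_worsening([66, 63, 60]) returns False (resolving), illustrating why orientation alone cannot drive the search.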

There is insufficient information access (and salience variation) to differentiate worsening vs. resolving status in standard formats. Current value-based formats appear similar for similar current values even when one is worsening and the other is resolving (see Fig. 1). The threshold-based Stoplight Coding does not distinguish between worsening and recovering states within the same caution range or warning range. Although color is a salient visual cue, its mapping for Stoplight Coding draws attention similarly for two potentially different cases (resolving vs. worsening) that may require different actions. Thus, time may be unproductively spent investigating lower priority indicators with resolving trends (constituting a “false alarm”), while responses to higher priority indicators with worsening trends may be delayed or missed; in both cases, there are important implications for safety. In contrast, worsening and resolving Trendicons are easily distinguished with bold (worsening) vs. not bold (not worsening) coding to aid prioritization and reduce false alarms and misses.

The systematic format analysis reported here, in combination with empirical findings described earlier [7], motivated the design and testing of three additional status formats, shown at the bottom of Fig. 1. To address the lack of formats providing access to worsening trend, we created a Linear Estimator that conveys worsening trends through prediction of future problems. The logic underlying this format leverages previous work on predictive automation and state estimation [19, 20]. A least-squares linear regression over the last 10 time points is used to estimate a critical threshold crossing, with darker achromatic fill for indicators predicted to cross threshold sooner. Trend Graphs were augmented with Linear Estimators and a current deviation bar (similar to the one in Trendicons) to facilitate access and map visual salience to more extreme deviations and worsening trends. In this Augmented Trend Graph, older historical trend information was also faded to de-clutter past events. Finally, the Linear Estimator was applied to the border of the Trendicon’s directional shape to create an Alternative Trendicon intended to improve access to worsening trends and aid in prioritizing multiple worsening indicators.
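
As a concrete illustration of the Linear Estimator logic, the Python sketch below fits a least-squares line to the last 10 samples, extrapolates to the nearer critical bound (the ≥ 95 and ≤ 5 values from the experiment scenario in Sect. 2), and maps shorter predicted times-to-threshold to darker achromatic fill. The function names and the 30-step horizon are illustrative assumptions, not the implemented design.

```python
import numpy as np

def time_to_critical(values, hi=95.0, lo=5.0, window=10):
    """Estimate the time until a critical threshold crossing by fitting
    a least-squares line to the last `window` samples and extrapolating
    to the nearer critical bound. Returns np.inf if no crossing is
    predicted."""
    recent = np.asarray(values[-window:], dtype=float)
    t = np.arange(len(recent))
    slope, intercept = np.polyfit(t, recent, deg=1)
    if abs(slope) < 1e-9:
        return np.inf                 # flat trend: no predicted crossing
    target = hi if slope > 0 else lo  # bound in the direction of travel
    steps = (target - (slope * t[-1] + intercept)) / slope
    return max(steps, 0.0)

def estimator_fill(values, horizon=30.0):
    """Map predicted time-to-threshold onto an achromatic fill level in
    [0, 1]: 1 = darkest fill (crossing imminent), 0 = no fill (no crossing
    within the horizon). The horizon value is an assumption."""
    return float(np.clip(1.0 - time_to_critical(values) / horizon, 0.0, 1.0))
```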

2 Empirical Validation of Design Concepts

An empirical study was conducted to investigate the impact of the visual features intended to improve information access and attention management in the bottom three formats in Fig. 1. Since status displays are often consulted only occasionally during monitoring, the ability to easily detect and prioritize current and future problems is paramount. The current study tested the effectiveness of four status indicator formats (three new formats in Fig. 1 vs. Trend Graphs) in tasks that simulated intermittent monitoring under varying time pressure: quickly identifying the worst current indicator, quickly identifying the next indicator expected to go critical (“next-critical”), and thoughtfully prioritizing the three next-critical indicators. To test the robustness of the formats for monitoring multiple agents, the number of emerging problems was varied.

The goals of the experiment were to (1) validate the expected performance enhancements from improved information access and salience assignment in the novel and alternative formats compared to Trend Graphs, (2) determine the extent to which the strong Trendicon advantage found in a dynamic monitoring task [7] extends to intermittent monitoring, and (3) assess whether the available time for monitoring leads to performance tradeoffs among formats. Specifically, are formats that provide more detailed information access superior when more vs. less time is available for monitoring (and vice versa)? We hypothesized that more detailed trend information would improve accuracy on the less time-pressured ranking task for the Augmented Trend Graphs vs. Alternative Trendicons, and that the reduced information in the Alternative Trendicons would support the more time-pressured next-critical task.

2.1 Methods

Participants. Twenty-four students (15 male; mean age = 21 years) recruited from local universities were paid for their participation. All reported normal or corrected-to-normal vision, gave informed consent, and were naïve to the intent of the study. The experiment was approved by a federally approved Institutional Review Board.

Stimuli. Each trial consisted of 24 status indicators shown in one of the four formats (Trend Graphs, Trendicons, Augmented Trend Graphs, or Alternative Trendicons; see Fig. 1), presented in a 4 by 6 array. Each indicator subtended roughly 1.9 degrees of visual angle. Both Trend Graph formats showed historical indicator values plotted over 60 time points. Both Trendicon formats used the previous 10 values to calculate recent changes. The historical data driving each indicator was randomly generated and identical across formats; indicator positions were shuffled within trials. Indicator values could range from 0 to 100, with a normal value of 50. Indicators with emerging problems were created by inserting a constant sloped deviation into the time course at a random time. These emerging problem indicators were constrained to be outside the normal range, but not yet in a critical state. Current deviations from normal for emerging problem indicators ranged from 12 to 45 units. Within a trial, the current values of indicators with emerging problems were roughly equally spaced in this range. All other indicators had current deviations from normal of 10 units or less (values between 40 and 60). Stimuli were constrained so that on half the trials the indicator with the current worst value was also the next to cross a critical threshold; these trial types were randomly ordered. Correct responses for identification or ranking of indicators projected to cross a critical threshold were obtained by fitting a linear regression line for each indicator, starting from the deviation onset, and calculating the time until critical.
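
For concreteness, an emerging-problem time course of the kind described above might be generated as in the following Python sketch. The parameter names, onset range, and noise-free ramp are assumptions for illustration; the actual stimulus-generation code may have differed in detail.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def emerging_problem(n=60, normal=50.0, min_dev=12.0, max_dev=45.0):
    """Generate an indicator time course with a constant-slope deviation
    inserted at a random onset, ending outside the normal range but short
    of the critical bounds (>= 95 or <= 5). Values follow the Stimuli
    description; the onset range and noise-free ramp are assumptions."""
    onset = int(rng.integers(10, n - 10))       # random deviation onset
    final_dev = rng.uniform(min_dev, max_dev)   # current deviation at t = n
    sign = rng.choice([-1.0, 1.0])              # deviate high or low
    slope = sign * final_dev / (n - onset)      # constant-slope ramp
    values = np.full(n, normal)
    values[onset:] += slope * np.arange(1, n - onset + 1)
    return values
```

The correct next-critical responses can then be derived by fitting a regression line from the deviation onset and computing the time until a critical bound is reached, as described above.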

Design and Procedure. Four indicator formats (Trend Graphs, Trendicons, Augmented Trend Graphs, and Alternative Trendicons) and three numbers of emerging problems (3, 6, and 9) were tested in a within-subjects design. The order of indicator formats was blocked and counterbalanced between participants. Within each format block, tasks were blocked in a fixed sequential order (current worst, next-critical, ranking). The number of emerging problems (3, 6, or 9) was randomized over trials. There were 30 trials within each experimental format block. Each experimental block was preceded by a practice block of 12 practice trials. Participants’ reaction times and indicator selections were recorded. To engage participants and discourage errors, performance feedback was given at the end of each trial.

Participants were asked to role-play a supervisor whose job was to monitor a team of autonomous robot vacuums as they cleaned houses. Participants were told that they would remotely view the energy use of robots through an array of 24 indicators, each showing the status of one robot. The participant’s goal was to identify robots in danger of crossing into critical energy use ranges, either too high (≥ 95) or too low (≤ 5), so that a fictitious technician could prioritize robots in need of fixing. This proxy task for monitoring agent status was used to allow naïve participants to quickly learn and perform the task. Participants performed three tasks that involved monitoring current status and anticipating future status in arrays of static indicators: (1) quickly identify the indicator currently closest to a critical threshold (current worst task), (2) quickly identify the next indicator to cross a critical threshold (next-critical task), and (3) carefully rank the three next-critical indicators in order of anticipated threshold crossing (ranking task). Participants clicked on indicators to indicate their responses.

After participants gave their informed consent, the proctor provided instructions on the indicator formats and tasks, and demonstrated the task in detail. After participants successfully demonstrated comprehension of the task and formats, they completed the practice and experimental trials, followed by an experiment debrief and payment for participation. The entire procedure lasted about 90 min.

3 Results

Dependent variables were analyzed with repeated measures ANOVAs including within-subject factors of format and number of emerging problems. In cases where Mauchly’s test for sphericity was significant, reported p values are Greenhouse-Geisser corrected. Reported effect sizes are generalized η² statistics. Each dependent variable and task, including each ranking response, was analyzed separately.
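
For readers wishing to reproduce this style of analysis, a minimal Python sketch using statsmodels’ AnovaRM on long-format data is shown below. The column names are placeholders, not taken from the original analysis; note that AnovaRM reports uncorrected p values, so the Greenhouse-Geisser correction and generalized η² used here would need to be computed separately (e.g., with a dedicated package).

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Long-format data: one row per subject x format x problem-count cell.
# Column names are placeholders for illustration.
df = pd.read_csv("cell_means.csv")  # columns: subject, format, n_problems, rt

res = AnovaRM(
    data=df,
    depvar="rt",
    subject="subject",
    within=["format", "n_problems"],
).fit()
print(res)  # F and uncorrected p per effect; apply GG correction separately
```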

Participants’ average accuracy and response times are plotted in Figs. 2 and 3. We investigated the results statistically through planned comparisons between targeted format pairs corresponding to the three goals of the experiment described in Sect. 2. Main effects of the number of emerging problems in the comparisons below do not relate to the hypotheses and are not reported.

Fig. 2. Mean proportion of correct responses (top) and mean of participant median response times (bottom) across indicator formats and number of emerging problems for speeded identification of the current worst indicator (left) and the next indicator to go critical (right). Error bars are ± 1 standard error of the mean.

Fig. 3. Mean proportion of correct responses (top) and mean of participant median response times (bottom) across indicator formats and numbers of emerging problems for carefully ranking the next three indicators to go critical. Error bars are ± 1 standard error of the mean.

(1) The expected performance enhancements from improving information access and salience assignment were found. Augmented Trend Graphs significantly improved performance relative to baseline Trend Graphs, resulting in faster and more accurate performance in all tasks (current worst, F(1,23) = 4.43, p = .046, η² = 0.04; all other tasks F(1,23) ≥ 13.00, p ≤ .001, η² ≥ 0.11).

Performance with Augmented Trend Graphs matched and in some cases surpassed performance with Trendicons. Responses for the second and third rankings were faster and more accurate with Augmented Trend Graphs (all F(1,23) ≥ 8.40, p ≤ .008, η² ≥ 0.06). Augmented Trend Graphs were also less impacted by the number of emerging problems than Trendicons in the current worst task (F(2,46) = 4.13, p = .022, η² = 0.02), and the second (F(2,46) = 5.75, p = .006, η² = 0.04) and third (F(2,46) = 15.31, p < .001, η² = 0.13) rank responses. Although response times with Trendicons were generally slower overall, they were less impacted by number of emerging problems in the next-critical (F(2,46) = 6.77, p = .003, η² = 0.01) and the first ranking tasks (F(2,46) = 7.40, p = .002, η² = 0.03).

The addition of the Linear Estimator enhanced performance with the Alternative Trendicons vs. the original Trendicons. Other than response times for the current worst task, performance was significantly better across the board for Alternative Trendicons (all F(1,23) ≥ 4.52, p ≤ .044, η² ≥ 0.04). The effect of more emerging problems was also less pronounced for Alternative Trendicons on the accuracy of the third ranking task (F(1,23) = 9.73, p < .001, η² = 0.09).

(2) The strong benefits of Trendicons over Trend Graphs observed in a dynamic monitoring task [7] generally extended to the static task paradigm in the current experiment. Performance was significantly more accurate for Trendicons vs. Trend Graphs for the current worst, next-critical, and ranking tasks (F(1,23) ≥ 5.86, p ≤ .024, η² ≥ 0.05), and responses were significantly faster for the ranking tasks (F(1,23) ≥ 4.40, p ≤ .047, η² ≥ 0.02) and marginally faster for the next-critical task (F(2,46) = 3.94, p = .059, η² = 0.05).

(3) Although the Augmented Trend Graphs and Alternative Trendicons resulted in similar performance on most measures, some of the expected time availability tradeoffs in formats were found. While accuracy for the (speeded) next-critical task did not differ significantly, response times for the next-critical task were faster for Alternative Trendicons vs. Augmented Trend Graphs (F(1,23) = 4.53, p = .044, η² = 0.04). Alternative Trendicons were also less impacted by the number of emerging problems (F(2,46) = 5.42, p = .008, η² = 0.01). Thus, for the time-pressured next-critical task, the format with access to less detailed trend information was faster without sacrificing accuracy, and was less impacted by the number of emerging problems. Alternative Trendicons were also more accurate for the first ranking response (F(1,23) = 12.42, p = .002, η² = 0.11).

4 Discussion

The design and use of status formats are often based on the convenience of using legacy formats already implemented in systems, the tradition of using the same formats over many years, and the familiarity of the formats to operators and system designers. This tendency to retain legacy formats can result in inadequate support when operator task needs change and evolve. To provide an objective basis for the design and use of status indicator formats that match operator task needs, we conducted a systematic analysis of a range of status indicator formats to assess their viability for supporting proactive monitoring of multiple agents.

This analysis revealed that many formats commonly used in today’s control systems are not well-suited to support proactive monitoring, and are not poised to support the task needs of future operators who will need to proactively monitor multiple agents in densely populated visual displays. No standard formats directly code or cue attention to worsening trend, a key attribute needed to proactively monitor multiple agents. Noting these shortfalls, we focused our design guidance and decisions on what information to make accessible, how to map information importance to visual salience in formats, and how to maintain salience of the most important features in densely populated multi-agent displays to effectively manage attention. Based on the results of our analyses, we created novel formats and augmented existing formats and demonstrated their performance advantages through an empirical study.

Recognizing resistance to accepting new formats due to cost and biases toward familiar formats, we have proposed possible augmentations (i.e., the Linear Estimator) for sub-optimal standard formats to improve performance, as demonstrated with the Augmented Trend Graph. Specifically, the Augmented Trend Graphs improved accessibility and cueing of task-relevant information (worsening trend) missing from Trend Graphs. We also demonstrated performance gains from augmenting an already-improved format (the Trendicon) with the Linear Estimator coding, which improved salience assignment and access to information needed to prioritize multiple worsening indicators. The Alternative Trendicons’ speed advantage and robustness to the number of emerging problems compared to the Augmented Trend Graphs suggest that the icon format may remain more robust when larger numbers of emerging problems must be searched through and prioritized.

Beyond the point designs discussed here, our analysis can be extended to assess the effectiveness of other formats (e.g., configural displays representing multivariate status, [21]) in different tasks. It can also identify features that can be applied to existing sub-optimal formats when changes to systems are limited, complement other design approaches (e.g., Ecological Interface Design, [22]), and inspire novel designs through combinations of features and development of new representations.

In related work, we are exploring computational modeling approaches to assess attention management and predict performance across formats to complement our behavioral data and inform the design of alternative formats [1]. Follow-on research will extend these modeling approaches to assess and validate the information access estimates reported here.

5 Conclusion

In the current paper, we analyzed and manipulated core properties of visual status indicator formats impacting human monitoring performance. This analysis revealed limitations of applying certain standard formats to multi-agent monitoring tasks, provided design mitigations to improve format effectiveness, and motivated the design and empirical testing of novel improved formats. Human-computer interaction researchers and system designers must ensure that information displays support the changing roles and task needs of human operators. This requires a principled and deliberate approach to design that considers the task needs and cognitive attributes of human operators, and that recognizes when formats require redesign, augmentation, or replacement.