
1 Introduction

This work follows a large volume of prior research on 3D user interactions [3, 6, 10, 24], immersive analytics [1, 4, 5, 18], and the combination of the two [9, 17, 21, 23]. Although the task-specific layout of an immersive data visualization is arguably the most important factor determining its utility [15], a non-intrusive and intuitive user interface (UI) and the overall user experience (UX) also strongly influence its usability and utility. In this paper, we report on the applicability of various user interaction (UIa) methods for immersive analytics of node-link diagrams.

Fig. 1.

A computer network’s topology visualized with VDE, using a Mixed Reality headset.

Work on Virtual Data Explorer (VDE, Fig. 1) started in 2015, initially as a fork of OpenGraphiti and then rebuilt from scratch as a Unity 3D project [14]. One of the factors that motivated the move away from OpenGraphiti at the time was its lack of support for user interactions in virtual reality, an omission that became particularly significant when the Oculus Touch controllers were released in late 2016 and enabled sufficiently precise user interactions to be implemented with Unity 3D. User feedback solicited from early VDE users motivated various alterations and additions to the interactions implemented for virtual and mixed reality in VDE.

2 Objective

Encoding information into depth cues while visualizing data has been avoided in the past for a good reason: on a flat screen, it is not helpful [19]. Nevertheless, recent studies have confirmed [23] that with equipment that provides the user with stereoscopic perception and parallax, three-dimensional shapes can be useful in providing users with insight into the visualized dataset [12]. Additionally, researchers have found that test subjects managed to gather data and to understand the cyber situation presented to them, with high performance scores, after only a few sessions, even if the task seemed difficult to them on the first try [8].

The motivating factors for creating VDE were the challenges that cyber defense analysts, cyber defense incident responders, network operations specialists, and related professionals face while analyzing the datasets relevant to their tasks. Such datasets are often multidimensional but not intrinsically spatial. Consequently, analysts must either scale down the number of dimensions visible at a time for encoding into a 2D or 3D visualization, or they must combine multiple visualizations displaying different dimensions of that dataset into a dashboard. The inspiration for VDE was the hope that immersive visualization would enable the 3D encoding of data in ways better aligned to subject matter experts’ (SMEs’) natural understanding of their datasets’ relational layout, better reflecting their mental models of the multilevel hierarchical relationships of groups of entities expected to be present in a dataset and the dynamic interactions between these entities [13].

Therefore, the target audience for the visualizations created with VDE are the SMEs responsible for ensuring the security of networks and other assets. SMEs utilize a wide array of Computer Network Defense (CND) tools, such as Security Information & Event Management (SIEM) systems, which allow data from various sources to be processed and alerts to be handled [15]. CND tools allow analysts to monitor, detect, investigate, and report incidents that occur in the network, as well as provide an overview of the network state. To provide analysts with such capabilities, CND tools depend on the ability to query, process, summarize, and display large quantities of diverse data with fast and unexpected dynamics [2]. These tools can be thought of along the lines of the seven human-data interaction task levels defined by Shneiderman [22]:

  1. Gaining an overview of the entire dataset,
  2. Zooming in on an item or subsets of items,
  3. Filtering out irrelevant items,
  4. Getting details-on-demand for an item or subset of items,
  5. Relating between items or subset of items,
  6. Keeping a history of actions, and
  7. Allowing extraction of subsets of items and query parameters.

These task levels have been taken into account while developing VDE and most have been addressed with its capabilities. When appropriate, Shneiderman’s task levels are referred to by their sequential number later in this paper.

3 Virtual Data Explorer

VDE enables a user to stereoscopically perceive a spatial layout of a dataset in a VR or MR environment (e.g., the topology of a computer network), while the resulting visualization can be augmented with additional data, such as TCP/UDP/ICMP session counts between network nodes [16]. VDE allows its users to customize visualization layouts via two complementary text configuration files that are parsed by the VDE Server and the VDE Client.
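Neither configuration grammar is reproduced here; the following is only an illustrative sketch, in Python and with an invented JSON-like encoding and hypothetical field names (layout, groups, edges), of how such a layout description could be expressed and parsed. It is not VDE's actual format.

```python
import json

# Hypothetical, minimal layout description; VDE's actual configuration
# grammar and field names may differ.
LAYOUT_CONFIG = """
{
  "layout": "network-topology",
  "groups": [
    {"name": "blue-team-1", "shape": "grid",   "members": ["10.0.1.0/24"]},
    {"name": "servers",     "shape": "sphere", "members": ["10.0.2.0/24"]}
  ],
  "edges": {"source": "netflow", "metric": "session_count"}
}
"""

def parse_layout(text: str) -> dict:
    """Parse a layout description and apply trivial validation."""
    config = json.loads(text)
    for group in config.get("groups", []):
        if "name" not in group:
            raise ValueError("every group needs a name")
    return config

if __name__ == "__main__":
    layout = parse_layout(LAYOUT_CONFIG)
    print([g["name"] for g in layout["groups"]])
```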

To accommodate timely processing of large query results, data-processing in VDE is separated into a server component (VDES). Thread-safe messaging is used extensively - most importantly, to keep the Client (VDEC) visualization in sync with (changes in) incoming data, but also for asynchronous data processing, for handling browser-based user interface actions, and in support of various other features.
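As an illustration of this pattern (not VDE's actual Unity/C# implementation), the following Python sketch shows a data-processing thread handing incremental updates to a render loop through a thread-safe queue; the function names and message shapes are hypothetical.

```python
import queue
import threading
import time

# Thread-safe channel between the data-processing side and the render loop.
updates: "queue.Queue[dict]" = queue.Queue()

def server_listener() -> None:
    """Simulate the server side (VDES) pushing incremental data updates."""
    for i in range(3):
        updates.put({"node": f"10.0.0.{i}", "sessions": i * 10})
        time.sleep(0.1)

def render_loop(frames: int = 30) -> None:
    """Drain pending updates once per frame, then 'render' that frame."""
    for _ in range(frames):
        while True:
            try:
                msg = updates.get_nowait()
            except queue.Empty:
                break
            print("apply update:", msg)   # update the in-scene object here
        time.sleep(1 / 30)                # stand-in for one rendered frame

if __name__ == "__main__":
    threading.Thread(target=server_listener, daemon=True).start()
    render_loop()
```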

A more detailed description of VDE is available in [11].

3.1 Simulator Sickness

Various experiments have shown that applying certain limitations to a user's ability to move in the virtual environment - limiting their view and other forms of constrained navigation - will reduce confusion and help prevent simulator sickness while in VR [7]. These lessons were learned while developing VDE and adjusted later, as others reported success with the same or similar mitigation efforts [20]. Most importantly, if an immersed user can only move the viewpoint (e.g., its avatar) either forwards or backwards in the direction of the user's gaze (or head direction), the effects of simulator sickness can be minimized or avoided altogether [12]. This form of constrained navigation in VR is known as "the rudder movement" [20].
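A minimal sketch of this constraint, assuming a simple vector representation of the viewpoint and gaze (the function name and parameters are hypothetical, not VDE code):

```python
import numpy as np

def rudder_move(position: np.ndarray,
                gaze_direction: np.ndarray,
                throttle: float,
                speed: float = 2.0,
                dt: float = 1 / 72) -> np.ndarray:
    """Move the viewpoint only along the (normalized) gaze direction.

    throttle in [-1, 1]: positive moves forward, negative backward.
    No strafing or vertical drift is allowed, which is the essence of
    the 'rudder movement' constraint.
    """
    direction = gaze_direction / np.linalg.norm(gaze_direction)
    return position + direction * throttle * speed * dt

if __name__ == "__main__":
    pos = np.zeros(3)
    gaze = np.array([0.0, 0.0, 1.0])      # looking straight ahead
    for _ in range(72):                    # one second at 72 fps
        pos = rudder_move(pos, gaze, throttle=1.0)
    print(pos)                             # ~2 m forward along the gaze
```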

3.2 Virtual or Mixed Reality

Although VDE was initially developed with Virtual Reality headsets (Oculus Rift DK2 and later CV1 with Oculus Touch), its interaction components were always kept modular so that once mixed reality headsets such as the Meta 2, Magic Leap, and HoloLens became available, support for them could be integrated into the same codebase.

The underlying expectation for preferring MR to VR is the user's ability to combine stereoscopically perceivable data visualizations rendered by an MR headset with relevant textual information presented by other sources in the user's physical environment (a SIEM, dashboard, or another tool), most likely on flat screens. This requirement was identified from early user feedback: trying to input text or to define and refine data queries while in VR would be vastly inferior to the textual interfaces that users are already accustomed to operating in conventional applications on a flat screen. Hence, rather than spend time inventing 3D data-entry solutions for VR, it was decided to focus on creating and improving stereoscopically perceivable data layouts and to let users rely on their existing tools to control the selection of data that is then fed to the visualization.

A major advantage provided by the VR environment, relative to MR, is that VR allows users to move (fly) around in a larger scale (overview) visualization of a dataset while becoming familiar with its layout(s) and/or while collaborating with others. However, once the user is familiar with the structure of their dataset, changing their position (by teleporting or flying in VR space) becomes less beneficial over time. Accordingly, as commodity MR devices became sufficiently performant, they were prioritized for development - first, the Meta 2, later followed by support for the Magic Leap and HoloLens.

Fig. 2.

Head-Up Display showing labels of the visualized groups that the user focuses on, retaining visual connections to them with Bézier curves. The HUD is also used for other interaction and feedback purposes.

3.3 User Interface

In the early stages of VDE development on Unity 3D, efforts were made either to use existing VR-based menu systems (VRTK, later MRTK) or to design a native menu that would allow the user to control which visualization components are visible and/or interactive, to configure the connection to the VDE Server, to switch between layouts, and to exercise other control over the immersive environment. However, controlling VDE's server and client behavior, including data selection and transfer, turned out to be more convenient when done in combination with the VDES web-based interface and with existing conventional tools on a flat screen. For example, in the case of cybersecurity-related datasets, the data source could be a SIEM, log-correlation, netflow, or PCAP analysis environment.

3.4 Head-Up Display

Contextual information is displayed on a head-up display (HUD) that is perceived to be positioned a few meters away from the user in MR and about 30 m away in VR. The HUD smoothly follows the direction of the user's head in order to remain in the user's field of view (see Fig. 2). This virtual distance was chosen to allow a clear distinction between the HUD and the network itself, which is stereoscopically apparent as being nearer to the user.
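One plausible way to implement such a smoothly following HUD is exponential smoothing toward a point at a fixed virtual distance along the head direction. The sketch below is an assumption-laden Python illustration, not VDE's Unity implementation; the distance and smoothing factor are invented.

```python
import numpy as np

def hud_target(head_position: np.ndarray,
               head_forward: np.ndarray,
               distance: float) -> np.ndarray:
    """Point at the chosen virtual distance along the head direction."""
    forward = head_forward / np.linalg.norm(head_forward)
    return head_position + forward * distance

def hud_follow(hud_position: np.ndarray,
               head_position: np.ndarray,
               head_forward: np.ndarray,
               distance: float = 30.0,     # ~30 m in VR, a few meters in MR
               smoothing: float = 0.05) -> np.ndarray:
    """Move the HUD a fraction of the way toward its target each frame,
    so it lags slightly behind head motion instead of snapping."""
    target = hud_target(head_position, head_forward, distance)
    return hud_position + (target - hud_position) * smoothing

if __name__ == "__main__":
    hud = np.array([0.0, 0.0, 30.0])
    head_pos = np.array([0.0, 1.7, 0.0])
    head_fwd = np.array([0.3, 0.0, 1.0])   # user has turned slightly right
    for _ in range(120):                    # ~2 s at 60 fps
        hud = hud_follow(hud, head_pos, head_fwd)
    print(hud)  # converges toward the point 30 m along the new direction
```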

3.5 User Interactions

The ability to interact with the visualization, namely, to query information about a visual representation of a datapoint (e.g., a semi-transparent cube for a node or a line for a relation between two nodes) using input devices (e.g., hand- and finger-tracking, input controllers) is imperative. While gathering feedback from SMEs [12], this querying capability was found to be crucial for the users' immersion in the VR data visualization, allowing them to explore and to build their understanding of the visualized data.

Fig. 3.

In an MR environment, the user pinches a node, which is sized accordingly, to move it around and explore its relations. Notice the two gray spheres indicating the locations where the MR device (Magic Leap) perceives the tips of the user's thumb and index finger to be: due to the device's lack of precision, these helper markers are used to guide the user. Note that the distortion is further aggravated by the way the device records the video and overlays the augmentation onto it. For comparison with the Virtual Reality view, please see Fig. 5.

Fig. 4.

MR view of the Locked Shields 18 Partner Run network topology and network traffic visualization with VDE; the user is selecting a Blue Team's network visualization with the index finger to have it enlarged and brought into the center of the view. Please see the video accompanying this paper for better perception: https://coda.ee/HCII22

The MR or VR system's available input methods are used to detect whether the user is trying to grab something, point at a node, or point at an edge. In the case of MR headsets, these interactions are based on the user's tracked hands (see Fig. 3 and Fig. 4), and in the case of VR headsets, pseudo-hands (see Fig. 5 and Fig. 6) are rendered based on hand-held input controllers.

A user can:

  1. point to select a visual representation of a data object - a node (for example, a cube or a sphere) or an edge - with a "laser" or the dominant hand's index finger, either of the virtual rendering of the hand or of the user's real tracked hand (in the case of MR headsets). Once selected, detailed information about the selected object (node or edge) is shown on a line of text rendered next to the user's hand (Shneiderman Task Level 4); a minimal picking sketch follows this list.

  2. grab (or pinch) nodes and move (or throw) them around to better perceive their relations by observing the edges originating or terminating in that node: humans perceive the terminal locations of moving lines better than those of static ones (Shneiderman Task Levels 3, 5).

  3. control the data visualization layout's properties (shapes, curvature, etc.) with the controller's analog sensors (Shneiderman Task Levels 1, 5).

  4. gesture with the non-dominant hand to trigger various functionalities. For example: starfish - toggle the HUD; pinch with both hands - scale the visualization; fist - toggle edges; etc.
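To make the first interaction concrete, the sketch below illustrates one common way to implement point-to-select: casting a ray from the pointing hand and reporting the nearest node whose bounding sphere it intersects. The node data, radii, and function names are hypothetical; VDE's actual picking (e.g., Unity colliders and raycasts) may differ.

```python
import numpy as np

# Illustrative node table: position and picking radius per node id.
NODES = {
    "10.0.1.5": {"pos": np.array([1.0, 1.5, 3.0]), "radius": 0.1},
    "10.0.2.7": {"pos": np.array([-0.5, 1.2, 2.0]), "radius": 0.1},
}

def pick_node(ray_origin: np.ndarray,
              ray_direction: np.ndarray,
              nodes: dict = NODES):
    """Return the id of the nearest node hit by the pointing ray, or None."""
    d = ray_direction / np.linalg.norm(ray_direction)
    best, best_t = None, float("inf")
    for node_id, node in nodes.items():
        to_center = node["pos"] - ray_origin
        t = float(np.dot(to_center, d))            # closest approach along ray
        if t < 0:
            continue                                # node is behind the hand
        miss = np.linalg.norm(to_center - t * d)   # distance the ray passes by
        if miss <= node["radius"] and t < best_t:
            best, best_t = node_id, t
    return best

if __name__ == "__main__":
    hand = np.array([0.0, 1.3, 0.0])               # hand (or laser) origin
    pointing = np.array([-0.2, -0.05, 1.0])        # pointing direction
    print("selected:", pick_node(hand, pointing))  # details-on-demand target
```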

In addition to active gestures and hand recognition, the user's position and gaze (rather than just their head direction) are used, if available, to decide which visualization sub-groups to focus on, to enable textual labels, to hide enclosures, and to enable update routines, colliders, etc. (Shneiderman Task Levels 2, 3, 4, 5, 7). Therefore, depending on the user's direction and location amongst the visualization components and on the user's gaze (if eye-tracking is available), a visualization's details are either visible or hidden, and if visible, then either interactive or not.

The reasons for such a behavior are threefold:

  1. Exposing the user to too many visual representations of the data objects will overwhelm them, even if occlusion is not a concern.

  2. Having too many active objects may overwhelm the GPU/CPU of a standalone MR/VR headset - or even of a computer rendering into a VR headset - due to the computational costs of colliders, joints, or other physics (see the "Optimizations" section, below).

  3. By adjusting their location (and gaze), the user can:

     (a) See an overview of the entire dataset (Shneiderman Task Level 1),
     (b) Zoom in on an item or subsets of items (Shneiderman Task Level 2),
     (c) Filter out irrelevant items (Shneiderman Task Level 3),
     (d) Get details-on-demand for an item or subset of items (Shneiderman Task Level 4),
     (e) Relate between items or subsets of items (Shneiderman Task Level 5).

Fig. 5.

In a VR environment, the user grabs a node that is sized to fit into one's palm. For comparison with the Mixed Reality view, please see Fig. 3.

Figure 7 and Fig. 8 show this behavior, while the video (https://coda.ee/HCII22) accompanying this paper makes understanding such MR interaction clearer than is possible from a screenshot, albeit less so than experiencing it with an MR headset.

Fig. 6.

The user touches an edge with the index finger of the Oculus avatar's hand to learn details about that edge.

Fig. 7.

Once the user moves closer to a part of the visualization that might be of interest, textual labels are shown first for upper-tier groups, while the rectangular representations of these groups are hidden as the user gets closer, to enable focusing on the subgroups inside, and then on the nodes with their IP addresses as labels. To convey the changes in the visualization as the user moves, screenshots are provided sequentially, numbered 1–4. For comparison with the Virtual Reality view, please see Fig. 8.

Fig. 8.

Once the user moves closer to a part of the visualization that might be of interest, textual labels are shown first for upper-tier groups, while the rectangular representations of these groups are hidden as the user gets closer, to enable focusing on the subgroups inside, and then on the nodes with their IP addresses as labels. To convey the changes in the visualization as the user moves, screenshots are provided sequentially, numbered 1–4. For comparison with the Mixed Reality view, please see Fig. 7.

3.6 Textual Information

Text labels of nodes, edges, and groups are a significant issue: they are expensive to render due to their complex geometrical shapes, and they risk occluding objects that fall behind them. Accordingly, text is shown in VDE only when necessary, to the extreme that a label is made visible only when the user's gaze is detected on a related object. Backgrounds are not used with text in order to reduce their occlusive footprint.
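A gaze test of this kind can be approximated by checking whether a label's anchor object lies within a narrow cone around the gaze ray. The following Python sketch is illustrative only; the threshold angle and names are assumptions, not VDE's actual values.

```python
import numpy as np

def should_show_label(gaze_origin: np.ndarray,
                      gaze_direction: np.ndarray,
                      label_anchor: np.ndarray,
                      max_angle_deg: float = 3.0) -> bool:
    """Show a label only while its anchor lies within a narrow cone
    around the gaze ray (the threshold is an illustrative guess)."""
    to_obj = label_anchor - gaze_origin
    cos_angle = np.dot(to_obj, gaze_direction) / (
        np.linalg.norm(to_obj) * np.linalg.norm(gaze_direction))
    return cos_angle >= np.cos(np.radians(max_angle_deg))

if __name__ == "__main__":
    eye = np.zeros(3)
    gaze = np.array([0.0, 0.0, 1.0])
    print(should_show_label(eye, gaze, np.array([0.02, 0.0, 2.0])))  # True
    print(should_show_label(eye, gaze, np.array([0.5, 0.0, 2.0])))   # False
```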

3.7 Optimizations

The basis for VDE: less is more.

Occlusion of visual representations of data objects is a significant problem for 3D data visualizations on flat screens. In VR/MR environments, occlusion can be mostly mitigated by stereoscopic perception of the (semi-transparent) visualizations of data objects and by parallax, but may still be problematic [5].

While occlusion in MR/VR can be addressed by measures such as transparency, transparency adds significant overhead to the rendering process. To optimize occlusion-related issues, VDE balances the need for transparency of visualized objects against rendering cost by adjusting the number of components currently visible (textual labels, the complexity of objects that are farther from the user's viewpoint, etc.) based on the current load (measured FPS), on the objects' positions relative to the user's gaze (in view, not in view, behind the user), and on the user's virtual distance from these objects. This XR-centric approach to semantic zooming provides a natural user experience, visually akin to the semantic zooming techniques used in online maps, which smoothly but dramatically change the extent of detail as a function of zoom level (showing only major highways or the smallest of roads, toggling the visibility of street names and point of interest markers).
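The following sketch illustrates the general idea of such load- and distance-driven level-of-detail selection; the thresholds and tier names are invented for illustration and are not VDE's actual heuristics.

```python
def detail_level(fps: float, distance: float, in_view: bool) -> str:
    """Pick a coarse level of detail from the current frame rate, the
    object's distance from the viewpoint, and whether it is in view.
    All thresholds below are illustrative assumptions."""
    if not in_view:
        return "hidden"          # behind the user: no label, no collider
    if fps < 45 or distance > 20:
        return "low"             # simple mesh, no label, no transparency
    if distance > 5:
        return "medium"          # group labels only
    return "high"                # node labels, colliders, transparency

if __name__ == "__main__":
    print(detail_level(fps=72, distance=2.0, in_view=True))    # high
    print(detail_level(fps=40, distance=2.0, in_view=True))    # low
    print(detail_level(fps=72, distance=50.0, in_view=False))  # hidden
```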

Although the colors and shapes of the visual representations of data objects can be used to convey information about their properties, user feedback has confirmed that these should be used sparingly. Therefore, in most VDE layouts, the nodes (representing data objects) are visualized as transparent off-white cubes or spheres, the latter only if the available GPU is powerful enough. Displaying a cube versus a sphere may seem a trivial difference, but considering the sizes of some of the datasets visualized (>10,000 nodes and >10,000 edges), these complexities add up quickly and take a significant toll.

4 Conclusion

Immersive visualization of large, dynamic node-link diagrams requires careful consideration of visual comprehensibility and computational performance. While many node-link visualization idioms are well studied in 2D flat-screen visualizations, the opportunities and constraints presented by VR and MR environments are distinct. As the pandemic made a larger-scale study with many participants impossible, VDE instead underwent a more iterative review process, drawing input from representative users and domain expertise. The approach described herein reflects many iterations of performance testing and user feedback.

Optimizing user interactions for VDE presented the design challenge of providing an interface that intuitively offers an informative presentation of the node-link network both at a high-level "overview" zoom level and at a closely zoomed "detail" view, with well-chosen levels of semantic zoom available along the continuum between these extremes. Constrained navigation further optimizes the user experience, limiting confusion and motion sickness. Dynamic highlighting, through the selection and controller-based movement of individual nodes, enhances the users' understanding of the data.