1 Introduction

Extended Reality (XR) technologies provide a versatile tool to inspect and interact with three-dimensional virtual elements in physical or completely computer-generated environments. Throughout the last decade, XR technologies have been attributed considerable potential to support training and workflows in many different domains, such as the automotive and aerospace industries, healthcare, interior design, factory layout planning, construction sites, maintenance and repair tasks, as well as co-located and distributed collaboration in general. Even within each of these domains, XR technologies can support multiple different workflows and tasks. For instance, in the automotive industry, XR can be employed for early design and engineering reviews of virtual prototypes to save material costs, as well as for training the teleoperation of machines and providing support by remote experts.

While the field of applications for XR technologies is vast, their actual profit-making implementation in real-world scenarios is still limited. This is due to the use-case-driven development of XR applications, which results in solutions that are limited in terms of hardware, tasks, and number of users. As a consequence, frequent system adaptations are required, which incur temporal and cognitive overheads that outweigh XR's actual potential and hinder its application in the real world.

Seeking to address these issues, this paper presents a framework for Scalable Extended Reality spaces that provides scalability across different degrees of virtuality, different devices, and different numbers of users. The development of the framework followed a human-centered design process: We first defined its context of use through high-level use cases that exploit XR's key benefits and can be combined with each other to describe specific low-level use cases. Next, we defined functional and non-functional requirements, based on which we developed a framework design solution. Finally, theoretical walkthroughs are provided to demonstrate its applicability to different use cases.

2 Background

2.1 Terminology

Extended Reality (XR) is currently used as an umbrella term for environments along the so-called Reality-Virtuality Continuum introduced by Milgram et al. [1]. It ranges from reality (i.e., physical environments) through Mixed Reality (MR) to completely computer-generated virtual environments (i.e., Virtual Reality, VR). Initially, MR encompassed physical scenes augmented with virtual components (Augmented Reality, AR) as well as virtual scenes augmented with real components (Augmented Virtuality, AV). Due to both its complex implementation and its limited use cases, AV failed to materialize, such that the terms AR and MR are now often used synonymously. The term MR, as used in this paper, encompasses AR as physical scenes with purely virtual overlays but also extends to more complex environments that virtually augment reality while considering the scene's physical constraints.

XR applications can be accessed with a variety of technologies. First, head-mounted displays (HMDs) can be employed for both MR and VR applications. Especially for MR applications, handheld displays (HHDs) such as smartphones or tablets can be used as well. Besides these mobile access options, there also exist projection-based setups: the term Spatial Augmented Reality (SAR) is commonly used for spaces in which physical scenes are augmented with virtual elements projected directly into the scene. CAVE [2] systems, on the other hand, provide projection-based access to VR environments.

This paper focuses on the development of a human-centered framework that provides mobile access points to MR and VR scenes, i.e., MR-HMDs, VR-HMDs, and MR-HHDs. In this context, we distinguish between on-site and off-site users: On-site users access the system from the actual working environment. This can be the site at which a real machine to be operated is located, at which already existing physical parts of a prototype are located, or at which co-located collaborators are located. Off-site users, on the other hand, access the system from a different location. These can be persons operating a machine remotely as well as remote experts joining the system as distributed collaborators in virtual replications of the on-site environment. Furthermore, we distinguish between real and virtual components: We use the term real component to refer to a physically existing object and the term virtual component to refer to an exclusively virtual object (i.e., one without a physically existing counterpart). It is important to note that real components may be displayed as virtual replications off site. On-site users may interact with exclusively virtual components and with real components. Off-site users may interact with exclusively virtual components as well as with real components through interaction with their virtual replications. Finally, we use the terms static and dynamic to indicate whether scene components are expected to change their position or orientation during the session.
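
For illustration, the following minimal sketch (hypothetical names, not tied to any particular implementation) expresses this terminology as a simple data model:

```python
from dataclasses import dataclass
from enum import Enum, auto

class UserRole(Enum):
    ON_SITE = auto()    # accesses the system from the actual working environment
    OFF_SITE = auto()   # accesses the system from a different location

class ComponentKind(Enum):
    REAL = auto()       # physically existing object (replicated virtually off site)
    VIRTUAL = auto()    # exclusively virtual object without a physical counterpart

class Mobility(Enum):
    STATIC = auto()     # not expected to change position or orientation during the session
    DYNAMIC = auto()    # position or orientation may change during the session

@dataclass
class SceneComponent:
    component_id: str
    kind: ComponentKind
    mobility: Mobility
```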

2.2 Developing Collaborative Extended Reality Applications

A major impediment to the development and application of XR technologies in collaborative real-world settings is the variety of hardware producers and their platform-specific requirements. While game engines like Unity and Unreal support the development of XR applications for different operating systems, these platforms still require the integration of different APIs. Seeking to reduce this fragmentation, OpenXR was introduced as a cross-platform API. Furthermore, real-time collaboration between different XR applications requires low-latency, wireless communication between the respective devices. For example, Photon Engine can be employed to handle communication between multiple clients, as it offers SDKs for various platforms. Despite the existence of these commercial solutions, backend development is still far from straightforward. Previous research [3,4,5] presented architectures and frameworks that combine the different available solutions and seek to facilitate the backend development of collaborative, multi-device XR applications.

Apart from issues related to multi-platform development, network communication, and calibration, the application of XR technologies in real-world, profit-making settings is also impeded by inadequate user interfaces, which opens up further fields of research. To enable collaboration between on-site collaborators in MR scenes and off-site collaborators in VR scenes, the remote collaborator needs to be provided with a detailed virtual replication of the on-site environment in real time. Existing replication techniques such as 360-degree videos [6], RGB-D cameras [7], or light fields [8], however, provide different quality-latency trade-offs. Another important research topic concerns the semantic segmentation of these virtual replications [9], which is required to make single components of the replication referenceable. While most XR devices provide out-of-the-box interaction techniques, plenty of research is being conducted to enhance their usability. Previous research has, for example, focused on reducing fatigue while performing in-air gestures to interact with virtual components displayed through a MR-HMD [10]. For MR-HHDs, device-based interaction that maps a HHD's movement to virtual objects has been proposed as an alternative to fatigue-prone touch-based interaction techniques that require the device to be held with one hand [11]. In the context of collaborative settings, further research has been conducted on adequate visualizations of remote collaborators [12].

The framework presented by Pereira et al. [4] was implemented with Unity and is intended to support collaborative interior design. Multiple clients can connect to a server through the integration of Photon Engine, which handles communication and maintains synchronization between a shared and several device-specific scenes. While their framework provides access points for VR-HMDs and MR-HHDs, it focuses on the interaction with virtual components only and does not provide interaction with virtually replicated parts of the physical scene. While users are given the option to highlight points of interest, the proposed avatar visualization is likely to produce visual clutter as the number of users increases. Furthermore, different interaction paradigms were implemented for object manipulation with VR-HMDs (motion controllers) and MR-HHDs (touch input), which may complicate switching between access points for the user.

Kostov and Wolfartsberger [3] presented a proof-of-concept application that supports collaborative training for engine construction. Their application is accessible through a VR-HMD, MR-HMD, MR-HHD, and a desktop PC. Communication between the clients and synchronization of the environment are handled through Unity's networking library. Again, the virtual replication of physical scenes remains unaddressed. In contrast to Pereira et al. [4], Kostov and Wolfartsberger [3] highlight the challenges related to the different device-specific input paradigms. Seeking to reduce the complexity for both developers and users, they implemented the same button-based user interface for all devices. By clicking the different buttons with the device-specific input modality, users can manipulate virtual objects in predefined increments. This interaction approach, however, limits flexibility and disregards the 3D nature of XR technologies.

In contrast to [3, 4], the framework presented by García-Pereira et al. [5] provides the VR-HMD user with a static virtual reconstruction of the physical environment that is scanned in advance. Physical markers are used to align the orientation of the virtual and the physical world. Further access points are provided through a desktop PC and a HHD that can display both the VR and the MR scene. To develop the XR applications and set up the server, Unity and Node.js were employed. Again, different device-specific interaction techniques were provided to interact with virtual components: an external sensor was attached to the HMD to capture hand gestures, the HHD offered a touch-based interface, and the desktop application responded to mouse clicks. While more advanced avatars showing each user's point of view and a hand ray were generated for each access point, the problem of potential visual clutter remains unaddressed.

A detailed review of literature related to the research topics mentioned above, as well as a future research agenda, can be found in [13]. While the investigation of all these agenda items and the development of a total solution is far beyond the scope of a single research paper, consideration must be given in advance to how results from independent research in the different fields will eventually be integrated. Addressing this issue, we build on the general concept of scalable XR presented in [13] and develop a human-centered framework that considers scalability regarding the number of users, the degree of virtuality, and the type of device.

3 XRS Framework: Basic Concept

XR technologies have been attributed considerable potential to decrease temporal and cognitive efforts as well as material costs in many domains. However, existing XR applications are limited to single use cases, specific hardware, or two collaborators. This lack of scalability causes large overheads of time and cognitive resources when switching between devices or applications, such that the reduced efficiency outweighs the potential attributed to XR technologies and impedes their application in real-world settings. Addressing these issues, we present a framework for Scalable Extended Reality spaces as introduced in [13]. The development of our framework followed the human-centered design approach defined in ISO 9241-210: First, we specified the context of use by abstracting specific use cases into high-level use cases that exploit XR's key benefits. The following steps of the human-centered design process, i.e., specifying requirements, developing design solutions, and evaluation, were completed based on these high-level use cases.

3.1 Scalable Extended Reality (XRS)

Seeking to address these scalability limitations and to increase XR's application in the real world, we introduced the term Scalable Extended Reality (XRS) as a concept for XR spaces that provide multidimensional scalability enhancements (see Fig. 1). Firstly, they should scale between different degrees of virtuality, i.e., from completely virtual spaces, over spaces in which single physical elements are augmented with multiple virtual elements, to physical scenes that are augmented with single virtual elements. Secondly, they should scale between different devices, i.e., the space should be accessible via HHDs and HMDs. Lastly, XRS spaces are supposed to scale between different numbers of users, i.e., from single users to multiple, possibly distributed collaborators. As such, XRS spaces could serve as highly flexible, long-term training or working environments [13].

Fig. 1. Scalable Extended Reality (XRS) Concept; modified replication from [13].

3.2 Context of Use

To understand and specify the context of use, we derived, from various XR applications proposed in previous research, abstract high-level use cases that exploit XR's key benefits.

3.2.1 XR’s Fields of Application

Throughout the last decade, research on potential application areas for XR technologies has revealed promising use cases in different fields. Since XR technologies provide a more intuitive way to inspect and interact with three-dimensional elements than conventional desktop applications, they have been attributed considerable potential to support design and engineering reviews in different domains. For instance, Wolfartsberger [14] developed and evaluated a VR system to support the design review of power units, Gong et al. [15] developed a multi-user VR application that allows globally distributed users to cooperate in an automotive design review task, and Kaluza et al. [16] integrated methods from visual analytics into a MR application to support decision-making in automotive life cycle engineering. Another promising field of application for XR technologies is factory layout planning, as presented by Gong et al. [17], who developed a VR system for factory layout planning that seeks to facilitate the modeling process and to improve decision-making through more accessible visual representations. XR applications may be used in a similar way to support interior design. For instance, Vazquez et al. [18] developed a MR tool that provides scale-accurate augmentations with virtual furniture. Furthermore, XR technologies may find application at construction sites, both for safety training in VR scenarios as presented by Wu et al. [19] and for supporting workers in monitoring and documentation tasks with virtual augmentations as presented by Zollmann et al. [20]. A detailed review of potential aerospace applications of VR technologies was given by Pirker [21]: Use cases listed in the paper include training in simulations, teleoperation of remote machines, testing, design reviews, collaboration, and remote assistance. Further interesting fields of application for XR technologies can be found in healthcare. The literature review conducted by Sadeghi et al. [22] revealed several XR applications in the context of cardiothoracic surgery, including surgical planning, training in virtual simulators, and intraoperative guidance. For instance, in assistive scenarios, visual information augmenting the surgeon's field of view can be scaled and placed according to the surgeon's personal preferences [23]. Similarly, maintenance tasks can be supported by XR technologies, as relevant information can be projected directly into the worker's field of view [24]. These instructions can be provided either by the system automatically or by a remote collaborator. Finally, XR technologies provide a powerful tool for supporting collaboration between co-located and distributed collaborators in more complex 3D tasks that cannot be completed via 2D desktop sharing. For instance, Bai et al. [25] presented a system that allows a local working space to be shared with a remote collaborator who can deliver support in terms of visual cues that augment the local worker's field of view.

3.2.2 XR’s High-Level Use Cases

As summarized in Sect. 3.2.1, the application of XR technologies is deemed beneficial in many domains. To develop our framework, we abstracted these specific use cases and grouped them into the following five high-level use cases. By implication, the framework can then be adapted to any low-level use case that can be described by the blueprints of the high-level use cases or any combination of them.

XR technologies can be applied to support the training of complex or safety-critical tasks, as virtual environments reduce safety issues, such that less supervision is required and trainees can practice the task independently and more often. Virtual training environments can easily be set up multiple times, such that accessibility to the training environment is improved and multiple trainees may practice a task at the same time. Furthermore, XR technologies can be used to support both co-located and distributed collaboration. In co-located scenarios, multiple collaborators may each be provided with a customized access to the XR space. As such, they can be provided with the level and representation of information that fits their responsibilities, experience, and personal preferences. Distributed collaborators can join this collaborative session in a virtual replication of the scene. In both cases, collaborators can be provided with awareness cues displaying the other collaborators' locations and activities. Similar to the use case of distributed collaboration, XR applications can be used for teleoperating machines and robots. Depending on how far the worker is from the actual working place, he or she can be provided with a MR or VR scene in which the machine can be operated through virtual user interface components that allow the effect of a command to be reviewed in a virtual simulation prior to its actual execution. Another promising field of application for XR technologies concerns the design and development of physical items. Temporal and financial costs of prototyping can be reduced by the integration of virtual components.

3.2.3 Dependencies Between XR’s Key Benefits and High-Level Use Cases

The relevance of the high-level use cases described in Sect. 3.2.2 can be explained by their exploitation of XR’s key benefits. The benefits exploited by each high-level use case and possible combinations of high-level use cases are displayed in Fig. 2.

Training scenarios are supported by the seamless integration of real and virtual elements, which allows virtual augmentations to be displayed at exactly the right time and place. Hence, trainees do not have to shift their focus between multiple sources and can keep concentrating on their actual task. Since these virtual elements can be quickly modified, the trainee can be provided with the degree of virtuality and level of information that is needed. Similarly, in collaborative scenarios, each user may be provided with a customized access to the XR space, as virtual augmentations can be easily modified. To facilitate communication and prevent misunderstandings between collaborators, virtual elements that display user activities can be seamlessly integrated into the co-located collaborators' field of view, whereas distributed collaborators may be provided with real-time virtual replications of the on-site environment. Depending on a teleoperator's location, he or she may be provided with a virtual replication of the on-site environment or with virtual augmentations that are seamlessly integrated into the physical scene. The design and development process of physical items also benefits from the digital nature of virtual prototypes, which can be modified quickly and thus accelerate workflows and save material costs. As the physical item evolves, the degree of virtuality can decrease, and the existing physical parts of the prototype can be seamlessly augmented with virtual elements. Depending on the specific use case, decision-making and quality control can further be supported by the in-context visualization and analysis of data from relevant digital twins or sensors.

Apart from implementing each of these high-level use cases independently, they can also be combined with each other. For instance, design and development tasks can be performed by single users, together with co-located or distributed collaborators, or both. Furthermore, teleoperating a machine can itself be the subject of a training scenario. At the same time, teleoperation of robotic arms may be implemented to allow distributed collaborators to remotely manipulate physical objects on site (see Sect. 5.7). Training tasks can be set up in collaborative scenarios, too. In this way, the trainee can be supported by distributed or co-located collaborators who watch the trainee complete the task and may intervene if necessary.

Fig. 2. XR's key benefits exploited by high-level use cases (red, green, and orange arrows on the left) and possible combinations of high-level use cases (blue lines on the right).

4 XRS Framework: Requirements

Based on the identified high-level use cases as listed in Sect. 3.2.2, we specified functional and non-functional requirements for a human-centered XRS framework.

4.1 Functional Requirements

The functional requirements of the XRS framework concern the hardware and technology that is employed to access the XRS space, the interaction modalities the users are provided with, and the visualization of real and virtual components as well as of the users’ locations and activities.

Access

RQ 1: On-site users can access the XRS space via a MR-HMD or a MR-HHD.
RQ 2: Off-site users can access the XRS space via a VR-HMD.

Interaction

RQ 3: On-site users can reference real components.
RQ 4: On-site users can reference virtual components.
RQ 5: On-site users can manipulate real components.
RQ 6: On-site users can manipulate virtual components.
RQ 7: Off-site users can reference real components.
RQ 8: Off-site users can reference virtual components.
RQ 9: Off-site users can manipulate real components.
RQ 10: Off-site users can manipulate virtual components.

Visualization

RQ 11: Each collaborator sees where the other collaborators are.
RQ 12: Each collaborator sees what the other collaborators do.
RQ 13: Off-site users are provided with a virtual replication of static real components.
RQ 14: Off-site users are provided with a virtual replication of dynamic real components.
RQ 15: On-site users are provided with visual representations of virtual components that are seamlessly integrated into the physical scene.
RQ 16: Off-site users are provided with visual representations of virtual components that are seamlessly integrated into the virtual scene.

4.2 Non-functional Requirements

The application of such a system in real-world settings further requires the maintenance of usability across different system configurations.

RQ 17: Users can intuitively switch between devices.
RQ 18: Users can intuitively switch between degrees of virtuality.
RQ 19: Usability is maintained with an increasing number of collaborators.
RQ 20: The interaction techniques for manipulating and referencing virtual elements provide high usability.

5 XRS Framework: Design Solution

Based on the functional and non-functional requirements specified in Sect. 4, we developed a framework design solution for XRS spaces (see Fig. 3) that incorporates the following system features.

5.1 Access Points and Data – RQs 1, 2, 17, 18

Our framework provides three different access points to the XRS space: On-site users can access the XRS space either via MR-HMDs or via MR-HHDs, and off-site users can access the XRS space via VR-HMDs. Firstly, a virtual replication of the static real components is generated. This builds the basis for the VR scene through which off-site users access the XRS space. Next, virtual replications of dynamic real components as well as of on-site and off-site collaborators are created and added to the VR scene. Conversely, virtual replications of off-site users are integrated into the MR scene. Furthermore, exclusively virtual components are added to both the MR and the VR scene. Throughout the session, the clients read and write data from and to a database that stores information about each user and each real and virtual component.
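
As a rough illustration of this setup sequence, the following sketch (hypothetical names; an actual implementation would live in the respective clients and a networked backend) initializes a simple in-memory store shared by all clients:

```python
def initialize_session(db, static_replication, dynamic_real_components, virtual_components, users):
    """Illustrative setup flow; db is a shared store that all clients read from and write to."""
    # 1. Static real components are replicated once; this forms the basis of the VR scene.
    db["vr_scene"] = {"static_replication": static_replication}

    # 2. Dynamic real components get virtual replications that are updated during the session.
    db["vr_scene"]["dynamic_replications"] = {c.component_id: c.initial_pose
                                              for c in dynamic_real_components}

    # 3. All collaborators are replicated in the VR scene; only off-site users
    #    are integrated as virtual replications into the MR scene.
    db["vr_scene"]["avatars"] = [u.user_id for u in users]
    db["mr_scene"] = {"avatars": [u.user_id for u in users if u.role == "off_site"]}

    # 4. Exclusively virtual components are added to both the MR and the VR scene.
    for scene in ("mr_scene", "vr_scene"):
        db[scene]["virtual_components"] = {c.component_id: c.initial_pose
                                           for c in virtual_components}
    return db
```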

5.2 Subscribing to Collaborators – RQs 11, 12, 19

As the number of collaborators increases, usability may decrease, as adding visualizations of each collaborator's location and activity to the scene is likely to produce visual clutter and confusion. To prevent these issues and allow collaborators to keep concentrating on their actual task, they should only be provided with the information needed for task completion. To this end, each user can individually subscribe to visual representations of the other collaborators. The database holds information about each user's id, role (i.e., on-site or off-site user), activity (i.e., referencing or manipulating objects), position and orientation in space, and the individual subscriptions to other collaborators' locations and activities. For each collaborator, the database stores references to a set of collaborators whose location should be represented as an avatar and whose activities should be represented by visual cues such as hand pointers and gaze rays. In contrast to off-site users, on-site users cannot subscribe to avatars of other on-site users. These references can be set prior to the collaborative session and updated by the users during the session. Users can subscribe and unsubscribe to visual representations of other collaborators' locations and activities through different modalities that may be based on context menus, speech recognition, or direct interaction (e.g., looking or pointing at avatars to activate cues). The corresponding references in the database are then updated accordingly, and the collaborators' visual representation is adapted individually for each user.
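
A minimal sketch of such a user record and the subscription logic, assuming a simple in-memory store keyed by user id (hypothetical names, not the actual implementation), could look as follows:

```python
from dataclasses import dataclass, field

@dataclass
class UserRecord:
    user_id: str
    role: str                                      # "on_site" or "off_site"
    activity: str = "idle"                         # e.g., "referencing", "manipulating"
    pose: tuple = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)   # position (x, y, z) + orientation
    avatar_subscriptions: set = field(default_factory=set)  # users whose avatar is shown
    cue_subscriptions: set = field(default_factory=set)     # users whose hand pointer / gaze ray is shown

def subscribe_avatar(users, subscriber_id, target_id):
    subscriber, target = users[subscriber_id], users[target_id]
    # On-site users can naturally see their co-located collaborators, so avatar
    # subscriptions between two on-site users are not allowed.
    if subscriber.role == "on_site" and target.role == "on_site":
        raise ValueError("on-site users cannot subscribe to avatars of other on-site users")
    subscriber.avatar_subscriptions.add(target_id)

def unsubscribe_cues(users, subscriber_id, target_id):
    users[subscriber_id].cue_subscriptions.discard(target_id)
```

Subscribing or unsubscribing then reduces to updating these sets, which each client reads when rendering its individual view.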

5.3 Visualizing Static Scene Components – RQ 13

We refer to scene components as static if they are not meant to change their position or orientation during the session (e.g., the room in which on-site collaborators are located). To provide scalability between degrees of virtuality, off-site users should be provided with a virtual replication of these static real components. As described in [13], physical scenes can be virtually replicated with different techniques that differ in terms of their quality-latency trade-off. Hence, static scene components that require few to no updates during the collaborative session should be replicated with techniques providing the highest quality at the cost of high latencies. If the static scene components need to be referenceable, the virtual replication needs to be semantically segmented.
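
As a purely illustrative example of this decision, a replication technique could be selected based on the expected update frequency and on whether the components must be referenceable (hypothetical thresholds and technique names):

```python
def choose_replication_technique(expected_updates_per_hour: float, must_be_referenceable: bool) -> dict:
    """Illustrative heuristic for the quality-latency trade-off of static scene replication."""
    if expected_updates_per_hour == 0:
        technique = "offline high-quality scan"      # captured and processed in advance
    elif expected_updates_per_hour < 1:
        technique = "periodic high-quality rescan"   # quality prioritized over latency
    else:
        technique = "real-time capture"              # latency prioritized over quality
    return {
        "technique": technique,
        # Semantic segmentation is required whenever single components must be referenceable.
        "semantic_segmentation": must_be_referenceable,
    }
```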

Fig. 3. The Human-Centered Scalable Extended Reality (XRS) Framework.

5.4 Visualizing Dynamic Scene Components – RQs 14, 15, 16

Dynamic scene components, on the other hand, refer to real and virtual components whose position and orientation are expected to change frequently throughout the session. We use the term dynamic scene components for objects that physically exist on site, for their virtual replications, and for exclusively virtual objects. While users themselves could be considered dynamic components, we treat them separately in the next section because they have more properties that may change during the session than objects. Physically existing components do not have to be visualized for on-site users but must be visualized for off-site users. In contrast to static scene components, they are manipulable, and hence the position and orientation of their virtual counterparts must be updated more often. The generation of the virtual replications of these dynamic scene components depends on which of each physical object's properties are manipulated during the session. For instance, if only the position and orientation are manipulated, the object can be replicated in advance and integrated into the off-site collaborator's virtual environment, such that during runtime only the position and orientation of the object must be tracked, exchanged, and updated. However, if the object's appearance itself is manipulated during the session, the effects of these manipulations on the object's appearance must be tracked and updated accordingly. The visualization of dynamic virtual components (i.e., components that are exclusively virtual in both the MR and the VR scene) is less complex: The object's properties can be stored in a database which is updated in line with user interactions. As such, each client can read the same information about the object (e.g., its position, orientation, size, status, or current owner) from the database.
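
The per-frame handling of dynamic components could be sketched as follows, assuming the store layout from the sketch in Sect. 5.1 and tracked objects that expose their pose and, where needed, their captured appearance (hypothetical attribute names):

```python
def update_dynamic_scene(db, tracked_real_components, frame):
    """Illustrative per-frame update of dynamic scene components in the shared store."""
    replications = db["vr_scene"]["dynamic_replications"]
    for component in tracked_real_components:
        # Pose-only manipulation: the replication was built in advance,
        # so only its tracked position and orientation are exchanged and updated.
        replications[component.component_id] = component.tracked_pose(frame)
        if component.appearance_changes:
            # Appearance manipulation: the changed appearance itself (e.g., mesh or texture)
            # must be captured and propagated as well.
            replications[component.component_id + "/appearance"] = component.captured_appearance(frame)

    # Exclusively virtual components: every client reads the same records from the store
    # (position, orientation, size, status, current owner, ...).
    return list(db["vr_scene"]["virtual_components"].items())
```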

5.5 Visualizing User Location and Activity – RQs 11, 12

To support both co-located and distributed collaboration, users should be provided with information regarding their collaborators' location and activity when needed. While co-located collaborators using MR devices can naturally see each other, remote collaborators using VR devices need to be provided with virtual replications of their collaborators whose position and orientation are updated in real time. Similarly, on-site users need to be provided with information about their remote collaborators. Besides information on a user's location, information regarding a user's current activity (e.g., referencing or manipulating objects) can also be relevant for both co-located and distributed collaborators. Hence, each user is provided with a visual representation of the other collaborators that corresponds to the user's individual subscriptions stored in the database, as described in Sect. 5.2. As such, off-site users can, for example, activate avatar representations of their collaborators with or without hand pointers and gaze rays. The same holds for on-site users that subscribe to off-site collaborators. In contrast, on-site collaborators can only subscribe to hand pointers and gaze rays of other on-site collaborators.
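
Building on the user records sketched in Sect. 5.2, the following illustrative function resolves which collaborator representations a given viewer's client should render:

```python
def visible_representations(users, viewer_id):
    """Which collaborator representations the viewer's client should render this frame."""
    viewer = users[viewer_id]
    representations = []
    for other_id, other in users.items():
        if other_id == viewer_id:
            continue
        show_avatar = other_id in viewer.avatar_subscriptions
        show_cues = other_id in viewer.cue_subscriptions
        # Co-located (on-site) collaborators are physically visible to an on-site viewer,
        # so at most their activity cues (hand pointer, gaze ray) are rendered.
        if viewer.role == "on_site" and other.role == "on_site":
            show_avatar = False
        if show_avatar or show_cues:
            representations.append({"user": other_id, "avatar": show_avatar, "cues": show_cues,
                                    "pose": other.pose, "activity": other.activity})
    return representations
```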

5.6 Referencing Scene Components – RQs 3, 4, 7, 8

To support co-located and distributed collaboration, collaborators need to be able to reference scene components. This means they should be able to execute an action (e.g., pointing at a scene component) such that this scene component is highlighted for all collaborators that subscribed to the user referencing the component. For instance, this can be implemented by adapting the object's visual representation (e.g., by changing the color of a virtual component or of a real component's virtual replication in VR, or by augmenting real components with virtual overlays in MR) or by playing 3D audio. Referencing an object can also be interpreted as the selection of an object, which can be followed by a manipulation.
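
Again building on the user records from Sect. 5.2, referencing could be recorded as the referencing user's current activity and resolved into highlights on the subscribers' clients (illustrative sketch, hypothetical encoding of the activity field):

```python
def reference_component(users, user_id, component_id):
    """Record that user_id is currently referencing component_id."""
    users[user_id].activity = "referencing:" + component_id

def highlighted_components(users, viewer_id):
    """Components the viewer's client should highlight, based on its cue subscriptions."""
    viewer = users[viewer_id]
    highlighted = set()
    for other_id in viewer.cue_subscriptions:
        activity = users[other_id].activity
        if activity.startswith("referencing:"):
            highlighted.add(activity.split(":", 1)[1])
    # The client then highlights these components, e.g., by recoloring a virtual component
    # or a virtual replication (VR) or by adding a virtual overlay to a real component (MR).
    return highlighted
```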

5.7 Manipulating Dynamic Scene Components – RQs 5, 6, 9, 10

Our framework focuses on the manipulation of virtual and real components in terms of translation and rotation. To let on-site and off-site collaborators manipulate virtual components, appropriate input techniques are required such that the updated position and orientation of the virtual component can be computed and its visual representation adapted accordingly. The manipulation of real components in the on-site environment, especially by off-site collaborators, is more complex. To this end, we propose the integration of a robotic system that allows collaborators to remotely translate or rotate real components on site: First, off-site collaborators manipulate the virtual replication of the real component. As soon as the off-site collaborator confirms the manipulation, the updated position and orientation of the virtual replication is sent to the robot application, which automatically computes the motion planning required for the robot to adapt the corresponding real component's position and orientation. A similar approach could also be useful for on-site users that want to manipulate large or heavy objects: Physical objects could be augmented with virtual overlays that are manipulated by the on-site user to control the robot. In the case of a training session, the connection to the robotic system can be disabled prior to the start of the session.
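
The confirmation step could be sketched as follows, assuming the store layout from Sect. 5.1 and a hypothetical robot-side API that accepts a target pose:

```python
def confirm_manipulation(db, component_id, target_pose, robot=None):
    """An off-site user confirms the manipulation of a real component's virtual replication."""
    # Update the virtual replication first, so all subscribed clients see the intended result.
    db["vr_scene"]["dynamic_replications"][component_id] = target_pose

    # Forward the confirmed pose to the robot application, which computes the motion plan
    # and adapts the corresponding real component on site. In a training session the robot
    # connection is disabled (robot is None), so nothing is executed in the real world.
    if robot is not None:
        robot.move_component(component_id, target_pose)  # hypothetical robot-side API
```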

5.8 Scalable Interaction Techniques – RQs 17, 18, 20

To provide scalability between different devices and degrees of virtuality and to allow users to intuitively switch between them, scalable interaction techniques for referencing and manipulating virtual components or virtual replications of real components are needed. These interaction techniques must scale across all access points to the XRS space (i.e., MR-HHDs, MR-HMDs, and VR-HMDs). In this context, we consider an interaction technique scalable if switching between access points is possible without large cognitive overheads required to relearn and re-adapt to the system. In other words, users should be able to switch between access points intuitively. While existing interaction techniques rely on different input paradigms (e.g., in-air gestures for MR-HMDs, touch for MR-HHDs, controllers for VR-HMDs), scalable interaction techniques should be based on similar input paradigms that provide high learnability and memorability. Despite the relevance of this question, it has been addressed by very few research papers so far. For example, Kostov and Wolfartsberger [3] implemented a cross-device button-based user interface for VR-HMDs, MR-HMDs, and MR-HHDs. However, the selection of these buttons still relies on the device-specific input modality and disregards advanced, spatial interaction paradigms. Apart from enhanced scalability, the interaction techniques themselves should provide high usability as defined in ISO 9241-11. The development of such scalable interaction paradigms is a complex and interesting topic for future research. The full interaction scalability between on-site and off-site users implemented in our framework builds the basis for the future integration of such scalable interaction techniques.
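
As an illustration of the underlying idea, not a concrete proposal for such techniques, a thin per-device adapter layer could map device-specific input events to shared, device-independent actions so that the interaction logic behind them stays identical across access points (hypothetical event fields):

```python
from abc import ABC, abstractmethod

class InteractionAdapter(ABC):
    """Maps device-specific input events to device-independent actions ("reference",
    "manipulate") so the shared interaction logic is identical across all access points."""

    @abstractmethod
    def to_action(self, raw_event: dict) -> dict:
        ...

class InAirGestureAdapter(InteractionAdapter):   # MR-HMD
    def to_action(self, raw_event: dict) -> dict:
        return {"action": raw_event["gesture"], "target": raw_event["hit_object"],
                "pose": raw_event["hand_pose"]}

class TouchAdapter(InteractionAdapter):          # MR-HHD
    def to_action(self, raw_event: dict) -> dict:
        return {"action": raw_event["touch_phase"], "target": raw_event["hit_object"],
                "pose": raw_event["device_pose"]}

class ControllerAdapter(InteractionAdapter):     # VR-HMD
    def to_action(self, raw_event: dict) -> dict:
        return {"action": raw_event["button"], "target": raw_event["hit_object"],
                "pose": raw_event["controller_pose"]}
```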

6 XRS Framework: Walkthrough

As a first evaluation of our framework, this section provides theoretical walkthroughs for two use cases that can be described by a combination of the high-level use cases introduced in Sect. 3.2.2.

6.1 Collaborative Prototyping

The design and development of physical items such as cars usually involves co-located as well as distributed collaborators from different fields. Throughout the design and development process, each of these collaborators has different tasks and responsibilities that require different levels of information. The framework presented in this paper can help provide these collaborators with this information exactly when and where it is needed.

At the beginning of the design and development process, during the ideation phase, users may be immersed in a completely virtual environment to collect ideas and develop a proof of concept. If no physical parts of the product exist yet, there is no on-site environment and all users join the XRS space in a VR scene in which they can reference and manipulate dynamic virtual components (i.e., the components of the product to be developed). Each collaborator may subscribe to other collaborators to receive information regarding their location and activity. In each frame, the position, orientation, and activity of each user are written to the database and retrieved by the collaborators according to their subscriptions.

As soon as the first physical parts of the product exist, co-located collaborators that are located on site may switch to a MR scenario in which the physically existing real components are augmented with virtual components that display the missing parts. In iterative prototyping stages, different virtual configurations of the missing parts can be tested and reviewed against key parameters that may be generated in real time by integrated digital twins. To do so, on-site users can reference and manipulate the real and virtual components as described by the framework. Collaborators that are located off site can join the XRS space through the same VR scene as before. To this end, a virtual replication of the static real components is generated, which builds the basis for the VR scene. Properties of dynamic real components are tracked, and their virtual replications are integrated into the VR scene and updated together with the exclusively virtual components, which are continuously tracked as well. Off-site users can then reference and manipulate both virtual replications of real components and exclusively virtual components as described in the framework. As in the exclusively virtual space at the beginning, both on-site and off-site users may subscribe to each other to receive information about each other's location and activities.

The degree of virtuality decreases throughout these iterative prototyping stages until the physical prototype is only augmented with single virtual components. The implementation of scalable interaction techniques for referencing and manipulating objects allows users to intuitively switch between degrees of virtuality (which may change continuously from virtual to real as the product evolves or abruptly if on-site users become off-site users or vice versa). Considering real-world settings, it is very likely that one person is involved in multiple design and development processes of different products at once. As the current stage of development may differ between these products, this person may have to switch between devices and degrees of virtuality multiple times per day. Furthermore, on-site users may become off-site users depending on their location which again requires switching between devices and degrees of virtuality. Our framework implements multidimensional scalability enhancements to reduce the temporal and cognitive efforts of switching between these technologies and allows users to keep focusing on the actual task.

6.2 Training and Teleoperation

To guarantee the correct and safe execution of complex and dangerous tasks, workers need to undergo appropriate training in advance. However, access to the corresponding machines may be limited, as these are in use or occupied by other trainees. Furthermore, especially dangerous tasks may require supervision by experts whose availability is limited. Transferring these training sessions to virtual scenarios can reduce the need for supervisors, as safety-critical parts are eliminated. On top of that, the training of multiple trainees can be parallelized, as virtual environments can be replicated, given that enough hardware is available.

The application of our proposed framework allows trainees to start in a VR scene in which they are provided with a virtual replication of the actual working environment and can practice operations by manipulating virtual replications. In this case, interactions with virtual replications are not executed in the real working environment. As the trainees make learning progress, they may switch to MR scenarios where they can practice operations in the actual working environment while safety-critical parts may still be virtualized. Through the implementation of scalable interaction techniques that rely on the same interaction paradigms as in the VR scene, they can concentrate on the actual task (i.e., the operation to be practiced). Once training is complete and they move on to operating in the real world, they can initially still be provided with visual overlays that display in-context information.

At the same time, the novice worker can request help from a remote expert who is located off site and joins the XRS space in a VR scene. Again, this VR scene is generated from the virtual replication of static real components, which is then augmented in real time with virtual replications of dynamic real components and with exclusively virtual components. Depending on the specific task, the remote expert's interactions with virtual replications may be executed on site. Thus, remote experts may act as teleoperators that manipulate virtual replications of a machine in VR to command a robot or machine on site to execute the operation in the real world. The application of our framework allows similar user interfaces to be designed for remote assistance and teleoperation – tasks that are likely to be completed by the same person. As such, persons that act as both remote experts and teleoperators benefit from the multidimensional scalability enhancements that provide them with a highly memorable user interface.

7 Conclusion

This paper presents a human-centered framework for XRS spaces that implements multidimensional scalability enhancements regarding different degrees of virtuality, different devices, and different numbers of users. As such, we contribute to the list of highly relevant research topics presented in [13] and seek to foster the application of XR technologies in profit-making, real-world use cases, which is currently impeded by cognitive and temporal overheads that outweigh the potential inherent in XR technologies.

The presented framework provides three access points for on-site and off-site users that can reference and manipulate both real and virtual components. To this end, off-site collaborators are provided with a virtual replication of the on-site environment that integrates virtual replications of real components and exclusively virtual components into a VR scene, which can be accessed with a VR-HMD. On-site users can access the XRS space with MR-HMDs or MR-HHDs. The visualization of scene components is handled through a database that stores information about each user and each real and virtual component. While all this information is accessible to all users, not all users may need all of it for effective task completion. Hence, users may subscribe to their collaborators individually to obtain the needed level of information. Furthermore, our framework provides full scalability regarding interactivity through the integration of a robotic system that allows remote users to manipulate real components on site. This scalability is highly relevant, as users should be able to switch between degrees of virtuality and devices depending on their role and location without losing options for interaction. In the future, the framework may also be extended with scalable interaction techniques that rely on similar input paradigms for referencing and manipulating real and virtual components through all access points.

The framework was developed based on five high-level use cases that exploit XR’s key benefits: Design and development of physical items, training, teleoperation, co-located and distributed collaboration. Since these high-level use cases serve as blueprints that can be combined with each other to describe specific low-level use cases in the real world, the framework can by implication be used by collaborators as well as by single users in many different domains. Hence, our framework provides the foundation required to implement specific XRS applications which will be part of our future work.