1 Introduction

Orientation and navigation are fundamental aspects of everyday life. Estimating distances and angles, recalling and recognizing object locations, and memorizing routes or landmarks are just a few of the many basic tasks performed in outdoor as well as indoor environments (e.g., Keil et al. 2019; Plumert et al. 2005; Postma and De Haan 1996). To optimize the effectiveness and efficiency of these tasks, people often use geospatial media, such as maps or map-like representations. The academic fields of cartography and spatial cognition have increasingly addressed the question of how to integrate and use modern technologies—usually originating in the IT and entertainment industries—to develop suitable geospatial media (e.g., Edler et al. 2019; Hruby 2019; Knust and Buchroithner 2014).

Going beyond traditional 2D print approaches, the portfolio of cartographic products has been strongly influenced by digital techniques since the establishment of the computer as a mass media device and, in addition, as a tool to create other mass media (e.g., Clarke et al. 2019; Taylor and Lauriault 2007; Müller 1997). Multimedia cartography—sometimes also referred to as “cybercartography” (Taylor 2005)—led to fundamental approaches of computer-based animated, interactive and multisensory (web) map applications (e.g., Kraak 1999; Peterson 1995; Krygier 1994). It is argued that animation techniques used in cartography have been influenced by the computer and video game industries (Edler et al. 2018a; Edler and Dickmann 2017; Ahlqvist 2011; Corbett and Wade 2005).

The development of computer-based animation techniques came along with appropriate software and hardware solutions. Higher software and hardware performance also allowed the further development of stable and detailed 3D visualization methods and techniques. For example, autostereoscopic displays made it possible to generate 3D depth effects, which has been explored in several studies on visualization and user experiments in cartography (e.g., Edler and Dickmann 2015; Bröhmer et al. 2013; Buchroithner 2007). Moreover, the open availability of game engines, such as Unity and Unreal Engine, supports the creation of individual 3D landscapes that can be accessed with virtual reality (VR) headsets, in real time and from the ego perspective—thus creating an impression of immersion. The potential of VR-based visualization is currently under study (e.g., Çöltekin et al. 2019; Edler et al. 2018b; Hruby et al. 2019; Kersten et al. 2018).

Closely related to 3D visualizations in VR are 3D visualizations in augmented reality (AR). AR techniques allow static or animated objects to be projected into real environments, thus extending real physical environments. Representing an early development stage, AR visualization techniques can be based on so-called mid-air displays, sometimes also referred to as free-space displays (Dickmann 2013). A mid-air display projects graphical objects onto free projection surfaces, such as a barely visible wall of fog (“fog screen”) created by an installed blower (DiVerdi et al. 2008).

A well-known example of an AR application that strongly interacts with space is the gaming app “Pokémon GO” (Zhao and Chen 2017). In this smartphone- or tablet-based game, users interact with audiovisually animated game characters that can be found in the real environment. In this way, the whole logic and process of the game is added like an additional information layer to the physical landscape. The smartphone or tablet is the ‘physical gateway’ to this augmentation. The camera of the device is used to record the area in front of the user, and the recordings are augmented with virtual objects in real time. As demonstrated by de Almeida Pereira et al. (2017), this technique can also be used to augment physical 2D maps with 3D geographic information, such as height maps.

Two head-mounted display (HMD) devices representing the state of the art of current AR are the Microsoft HoloLens (with version 2 recently introduced) and the HTC Vive Pro. The Microsoft HoloLens (Fig. 1) is designed as a pair of smart glasses. Stereoscopic images (holograms) are projected onto two small lenses in front of the eyes (Noor 2016). As these lenses are see-through (Gruenefeld et al. 2017), the projected holograms merge with the real environment. The HTC Vive Pro (Fig. 1) is a VR headset with AR capability. The environment in front of the user can be recorded with two cameras inside the HMD and rendered on two displays located in front of the user's eyes (HTC 2019). Additionally, the camera recordings can be augmented with artificial elements acting as stereoscopic holograms.

Fig. 1
figure 1

The Microsoft HoloLens (top) and the HTC Vive Pro (bottom) represent the state of the art of augmented reality HMDs

1.1 The Potential of AR Techniques for Experiencing Space

As both the Microsoft HoloLens and the HTC Vive Pro are capable of tracking head movements, they make it possible to create an impression of permanent presence of holographic geospatial objects. Even if the user walks around in a defined area, commonly an indoor area, holograms remain in place and adapt to the user's location and viewing perspective. This permanent and adaptable holographic projection may lead to visualization approaches that bring additional advantages for the cognitive processing of the experienced geospatial area.

Empirical research in cartography, spatial cognition and experimental psychology has recently led to some recommendations for the construction of user cognition-oriented cartographic media. For example, user experiments have shown that an additional layer of square grids improves performance in memorizing object locations (Bestgen et al. 2017; Kuchinke et al. 2016; Edler et al. 2014) and in estimating longer linear distances in maps (Dickmann et al. 2019). The grid-based memory effect also occurs if the grid structure is physically reduced to indicated (“illusory”) lines (Dickmann et al. 2017), is given a depth offset (Edler et al. 2015) or is changed to a hexagonal pattern (Edler et al. 2018c). Other studies reported that reducing the visibility of some map areas can direct visual attention towards other map areas (Keil et al. 2018), and that the display of landmarks can improve route knowledge (Ruddle et al. 2011) and orientation (Li et al. 2014).

The above-mentioned cognition-based effects on spatial performance measures in maps are promising results indicating that an extended communication of spatial information can bring advantages for the map user in terms of map perception, orientation, navigation and the formation of spatial knowledge. Similar effects will likely occur in real 3D environments augmented by holographic spatial objects. These holographic layers could offer an additional (geometrical) structure to support the cognitive processing of object locations, distance estimations and relative directions between objects.

Investigating possible effects on the perception of spatial information raises new methodological challenges. The possibility to implement spatial models in AR applications has already been investigated and described (Wang et al. 2018). However, to take full advantage of the possibilities of AR for geospatial applications, technical limitations of currently available AR devices must be addressed. These include the precise placement and stability of holograms in three-dimensional space, a crucial quality criterion for AR applications (Harders et al. 2008). Once stable solutions guaranteeing high spatial precision have been found, AR devices can become valuable methodological tools in geospatial experiments focused on fundamental questions of spatial cognition in 3D environments. In user studies, they could be used to project holographic objects into the environment. Moreover, AR devices could assist experimental investigators in arranging the spatial layout of movable real-world objects used in their studies. The projection of ‘virtual place markers’ can increase the precision of identical spatial object arrangements, which—from a methodological perspective—increases the comparability of acquired user data (between participants). Moreover, projected ‘virtual markers’ can support the analysis of user tasks, such as the identification and measurement of distortion errors, for example, in location memory tasks.

To exploit the possibilities of AR systems for geospatial user experiments, it is necessary to create technical methods that establish controlled procedures and standardize the placement of holographic objects in a real 3D setting. In the following sections, we describe the functionality of current AR systems, how technical factors of these systems affect the targeted placement of holograms, and the additional requirements of geospatial applications and experiments. To address and resolve the described discrepancy between requirements and limitations, we present a self-developed AR interface application capable of (re-)placing holograms with high accuracy during runtime. Code examples are provided for transparency, better understanding, and replicability.

2 Hologram Placement and Display

In many VR and monitor-based 3D applications, the positions of 3D objects are ‘hard-coded’, i.e., their positions are predefined and cannot be changed. The advantage is that all participants see exactly the same arrangement of visual stimuli, which is often a highly relevant precondition for geospatial experiments. However, in AR-based applications, several technical limitations demand a more flexible approach to the placement of visual stimuli.
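To make the notion of ‘hard-coded’ positions concrete, the following minimal Unity C# sketch fixes an object's position directly in the script; it is purely illustrative and not part of the application presented below.

```csharp
using UnityEngine;

// Purely illustrative sketch (not taken from the application described below):
// a 'hard-coded' stimulus whose position is fixed in the script and cannot be
// adjusted at runtime, so every participant sees it at the same coordinates
// relative to the application's internal origin.
public class HardCodedStimulus : MonoBehaviour
{
    void Start()
    {
        transform.position = new Vector3(1.5f, 0.0f, 2.0f);  // fixed position in meters
        transform.rotation = Quaternion.Euler(0f, 45f, 0f);  // fixed orientation in degrees
    }
}
```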

One of the greatest technical differences between the two most popular AR headsets, the Microsoft HoloLens (including version 2) and the HTC Vive Pro, lies in the tracking system used to register head movements and to adjust the displayed holograms. The Microsoft HoloLens uses an inside-out tracking system, i.e., the sensors used to track the position and rotation of the head are mounted to the HMD. Six cameras integrated in the headset scan the environment (Evans et al. 2017). The camera recordings are then used to build a spatial representation of all perceived objects (Liu et al. 2018). As this representation can be exported as a 3D model, the HoloLens can also be used to visualize 3D space (see Fig. 2).

Fig. 2
figure 2

Picture of a room (left side) and the 3D model of this room built by the Microsoft HoloLens based on camera recordings (right side). Notice how static objects (e.g., the doorframe and shelves) are represented very accurately whereas moveable objects (e.g., the monitor) are somewhat bulky. This bulkiness represents the attempt of the HoloLens to account for the modified arrangement of spatial objects after the objects have been slightly repositioned

Head movements are registered by matching objects recorded by the cameras in real time to objects already represented in the spatial model. Calculating the relative position towards these objects then makes it possible to triangulate the current head position and rotation inside the 3D space. The advantage of pure inside-out tracking is that no additional hardware is required. Given that the Microsoft HoloLens is a standalone device (Evans et al. 2017), people can walk freely and use the device seamlessly in different rooms or even on different floors. However, inside-out tracking also has some serious disadvantages concerning the placement of static holograms. First, image analysis, the precondition for image-based tracking, requires a lot of processing power (Liu et al. 2013). As the processing power of a standalone device is naturally limited, this leads to only moderate tracking accuracy and occasional tracking lags. Second, the cameras need to identify at least some reference objects within their spatial range, which according to our experience is approximately 5 m. Therefore, tracking lags regularly occur in large empty spaces, especially outdoors. Additionally, poor lighting conditions may negatively affect the capability to identify reference objects (Loesch et al. 2015). The mentioned tracking lags often lead to a distorted or shifted internal coordinate system. On these occasions, the positions of all ‘hard-coded’ holograms relative to real-world objects are also distorted or shifted and need to be readjusted. A third limitation of the Microsoft HoloLens concerning the placement of static holograms is code-based rather than tracking-based: each time an application is started, the internal coordinate system is set relative to the position of the headset. As it is almost impossible to place the headset at the exact same position each time an application is started, ‘hard-coded’ static holograms cannot be placed reliably at the same real-world position twice.

In contrast to the Microsoft HoloLens, the HTC Vive Pro uses a combination of inside-out and outside-in tracking. In contrast to inside-out tracking, outside-in tracking uses stationary sensors located around the tracked space to register head movements. In the case of the HTC Vive Pro, two SteamVR tracking infrared base stations (previously called Lighthouse) are placed in opposite corners of a tracked space with a maximum diagonal size of 5 m (HTC 2019). These base stations interact with photo sensors built into the headset and the hand controllers. By comparing the times at which different sensors perceive the signals of the base stations, the positions of the HMD and the controllers in 3D space can be triangulated. Additionally, inside-out tracking is used to generate a 3D model of the objects inside the tracked area. Similar to the Microsoft HoloLens, the 3D model is generated based on two cameras in the headset. Combining inside-out and outside-in tracking limits the mobility of the device, as the tracking area is confined to the range of the base stations. However, this technique has clear advantages in terms of tracking accuracy. First, the use of infrared base stations provides high tracking accuracy and precision (Ng et al. 2017). Second, the static positioning of the base stations also allows for much more accurate resets of the internal coordinate system after short tracking losses or lags. Therefore, the positions of static holograms relative to real-world objects need to be readjusted less often.

Readjustments of displayed holograms may not only be required because of the described technical limitations of AR systems. Some experimental designs or other AR applications may also require changing the real-world positions of holograms, or adding (spawning), selecting, or removing (despawning) holograms during runtime. For example, a small-scale holographic visualization of a construction project (see the example in Fig. 3) would benefit from the ability to easily modify building positions in order to visualize different architectural layouts. Although natural interaction such as dragging objects with a controller is theoretically possible, it is not suitable for the highly accurate and standardized placement of holograms required in scientific experiments investigating cognitive effects of holograms on the perception of geographic space. Additionally, the lack of a multi-button controller for the Microsoft HoloLens demands other ways of controlling hologram display and interaction effectively. Therefore, we decided to develop a holographic AR interface application that allows spawning, selection, despawning and highly accurate (re-)positioning of holograms during runtime. As visibility of the developed interface may unintentionally affect the behavior of participants in scientific experiments, we also provided a function to reduce interface visibility. To support the development of other demand-oriented AR interfaces, the workflow and functions of the developed interface are described in detail.

Fig. 3
figure 3

Holographic representation of the old industrial landscape Zeche Holland in Bochum-Wattenscheid displayed by the Microsoft HoloLens

3 Implementing Interactivity of Holograms

We chose to develop the AR interface application with the Unity game engine (version 2018.3), as it supports both the Microsoft HoloLens and the HTC Vive Pro. The code was written in C#. The holographic interface consists of UI (user interface) elements provided by Unity, namely Buttons and Texts. Buttons are used to trigger actions based on user input, and Texts are used to provide information about the Buttons. The UI elements are arranged on a Canvas, a layer required to display the UI elements. Additionally, an Event System is used to send user input to the UI elements. Figure 4 illustrates a minimalistic UI hierarchy in Unity.

Fig. 4
figure 4

Minimalistic example of a user interface in Unity. The button and text elements are set as children of the canvas. The automatically added event system handles user input
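For readers unfamiliar with Unity's UI system, the hierarchy of Fig. 4 could also be assembled from code. The following hedged C# sketch builds the same Canvas/Button/Text structure and Event System at runtime; in practice, such a hierarchy is typically created in the Unity editor rather than in a script.

```csharp
using UnityEngine;
using UnityEngine.UI;
using UnityEngine.EventSystems;

// Illustrative sketch only: the hierarchy shown in Fig. 4 (a Canvas with Button
// and Text children plus an Event System) assembled from code.
public class MinimalUiHierarchy : MonoBehaviour
{
    void Start()
    {
        // Canvas: the layer on which all UI elements are displayed.
        var canvasGo = new GameObject("Canvas",
            typeof(Canvas), typeof(CanvasScaler), typeof(GraphicRaycaster));
        canvasGo.GetComponent<Canvas>().renderMode = RenderMode.WorldSpace;

        // Event System: forwards user input (controller clicks, air taps) to the UI.
        new GameObject("EventSystem",
            typeof(EventSystem), typeof(StandaloneInputModule));

        // Button: triggers an action when the user interacts with it.
        var buttonGo = new GameObject("Button",
            typeof(RectTransform), typeof(Image), typeof(Button));
        buttonGo.transform.SetParent(canvasGo.transform, false);

        // Text: labels the Button so the user knows which action it triggers.
        var textGo = new GameObject("Text", typeof(RectTransform), typeof(Text));
        textGo.transform.SetParent(buttonGo.transform, false);
        var label = textGo.GetComponent<Text>();
        label.text = "Spawn";
        label.font = Resources.GetBuiltinResource<Font>("Arial.ttf");
        label.alignment = TextAnchor.MiddleCenter;
    }
}
```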

3.1 User Input

The mode of user input is determined by the AR hardware used. The HTC Vive Pro uses hand controllers that emit a ray similar to a laser pointer. The user can use this ray to aim at a Button of the holographic interface and interact with it by clicking a trigger button on the hand controller (see Fig. 5). The Microsoft HoloLens uses an invisible ray that is cast forward from the front of the headset. When the invisible ray hits a hologram, a circular cursor is displayed at the hit point. To aim at a Button, the user needs to turn the headset towards the Button. In other words, the Button needs to be in the center of the field of view. The user can then interact with it using the air tap gesture, which involves lifting and then bending the index finger (see Fig. 5).

Fig. 5
figure 5

The upper picture shows an HTC Vive controller and the ray used to aim at UI buttons. The two bottom pictures demonstrate the air tap gesture used by the Microsoft HoloLens to interact with UI buttons
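The gaze-based aiming of the HoloLens can be approximated with a simple forward raycast. The following sketch is a hedged illustration, not the HoloLens SDK implementation; it assumes that interactable holograms carry colliders and that a cursor object (“cursor”) exists in the scene.

```csharp
using UnityEngine;

// Hedged sketch of gaze-based aiming: a ray is cast forward from the headset
// camera, and a cursor object is placed at the point where the ray hits a hologram.
public class GazeCursor : MonoBehaviour
{
    public GameObject cursor;          // small circular cursor object (assumption)
    public float maxGazeDistance = 5f; // assumed maximum gaze range in meters

    void Update()
    {
        Transform head = Camera.main.transform;
        RaycastHit hit;

        if (Physics.Raycast(head.position, head.forward, out hit, maxGazeDistance))
        {
            // A hologram is in the center of the field of view:
            // show the cursor at the hit point, aligned with the surface.
            cursor.SetActive(true);
            cursor.transform.position = hit.point;
            cursor.transform.rotation = Quaternion.LookRotation(hit.normal);
        }
        else
        {
            cursor.SetActive(false);
        }
    }
}
```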

3.2 Spawning

As adding or removing spawnable holograms should be as easy as possible, we created an empty array of spawnable Prefabs. Prefabs are Unity elements that act like a copy template. They are objects with defined characteristics, such as shape, size or color. In this specific case, the Prefabs are the spawnable holograms. The size of the Prefab array can be changed, and Prefabs can be added or removed. When the application is started, a button is added to the AR interface for each element in the Prefab array (Fig. 12, area 1). Pressing one of these buttons creates an instance (a copy) of the associated Prefab and adds it to the list of spawned objects (see Fig. 6). By pressing the same button again, additional instances of the same hologram Prefab can be spawned. For each spawned hologram, a button is added to the list of spawned object buttons (Fig. 12, area 2).

Fig. 6
figure 6

The function “SpawnObject” creates a new hologram. The spawn position is set relative to the position of the interface (0.5 m below and 3.5 m in front of the interface). Subsequently, the list of spawned buttons gets refreshed to add a button for the newly spawned hologram
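A hedged C# sketch of the spawning logic described above and in Fig. 6 could look as follows; apart from the function name “SpawnObject” and the spawn offsets, all identifiers are illustrative assumptions rather than the original code.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hedged sketch of the spawning logic described in Fig. 6.
public class HologramSpawner : MonoBehaviour
{
    public GameObject[] spawnablePrefabs;                            // spawnable Prefab array (Fig. 12, area 1)
    public List<GameObject> spawnedObjects = new List<GameObject>();
    public Transform interfaceAnchor;                                // the holographic interface (Canvas)

    // Wired to the onClick event of the Prefab button with the same index.
    public void SpawnObject(int prefabIndex)
    {
        // Spawn position relative to the interface: 0.5 m below and 3.5 m in front.
        Vector3 spawnPosition = interfaceAnchor.position
                                + interfaceAnchor.forward * 3.5f
                                + Vector3.down * 0.5f;

        GameObject instance = Instantiate(spawnablePrefabs[prefabIndex],
                                          spawnPosition, Quaternion.identity);
        spawnedObjects.Add(instance);

        RefreshSpawnedObjectButtons();
    }

    void RefreshSpawnedObjectButtons()
    {
        // Placeholder: in the actual interface, one selection button is created
        // per entry in spawnedObjects (Fig. 12, area 2).
    }
}
```

In the interface, such a function would be wired to the onClick event of the corresponding Prefab button (Fig. 12, area 1).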

3.3 Selection

To select a spawned hologram, the user has to interact with the button representing the spawned object (Fig. 12, area 2). The button then invokes a function that defines the associated hologram as the active object (Fig. 7), which can then be despawned, moved or rotated. The associated button of the selected hologram is highlighted with a red background.

Fig. 7
figure 7

When a button of a spawned object is pressed, the associated object is defined as the active object
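The selection step could be sketched as follows; pairing spawned objects and their buttons via parallel lists is an assumption made for illustration, not necessarily how the original application is structured.

```csharp
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.UI;

// Hedged sketch of the selection step (Fig. 7).
public class HologramSelector : MonoBehaviour
{
    public List<GameObject> spawnedObjects = new List<GameObject>();
    public List<Button> spawnedObjectButtons = new List<Button>();
    public GameObject activeObject;   // the currently selected hologram

    // Wired to the onClick event of the spawned object button with the same index.
    public void SelectObject(int index)
    {
        activeObject = spawnedObjects[index];

        // Highlight the button of the selected hologram with a red background
        // and reset all other buttons.
        for (int i = 0; i < spawnedObjectButtons.Count; i++)
        {
            spawnedObjectButtons[i].GetComponent<Image>().color =
                (i == index) ? Color.red : Color.white;
        }
    }
}
```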

3.4 Despawning

Once a hologram has been selected, pressing the despawn button (Fig. 12, area 3) despawns the object using the Unity function “Destroy()” and removes it from the list of spawned objects (Fig. 8). Simultaneously, all spawned object buttons are removed and a new button is added for each element in the updated list of spawned objects (Fig. 12, area 2).

Fig. 8
figure 8

The function “DespawnObject” despawns the currently active object, removes the reference to this object from the list of spawned objects and removes the associated object button
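A hedged sketch of the despawning step, using the Unity function “Destroy()” mentioned above; all other identifiers are illustrative.

```csharp
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.UI;

// Hedged sketch of the despawning step (Fig. 8).
public class HologramDespawner : MonoBehaviour
{
    public List<GameObject> spawnedObjects = new List<GameObject>();
    public List<Button> spawnedObjectButtons = new List<Button>();
    public GameObject activeObject;   // the currently selected hologram

    public void DespawnObject()
    {
        if (activeObject == null) return;   // nothing selected

        int index = spawnedObjects.IndexOf(activeObject);

        // Remove the reference from the list of spawned objects and despawn the hologram.
        spawnedObjects.RemoveAt(index);
        Destroy(activeObject);
        activeObject = null;

        // Remove the associated object button; the actual interface rebuilds the
        // whole button list (Fig. 12, area 2), which has the same effect.
        Destroy(spawnedObjectButtons[index].gameObject);
        spawnedObjectButtons.RemoveAt(index);
    }
}
```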

3.5 Placement

To enable movement and rotation of spawned holograms, eight buttons were added to the interface (Fig. 12, area 4). Six buttons allow positive or negative movement along the three spatial axes. Two buttons allow clockwise or counterclockwise rotation. Pressing one of these buttons moves or rotates the currently selected hologram by a value defined in a scale variable (see Fig. 9).

Fig. 9
figure 9

Pressing the left or right arrow in the interface (see Fig. 12, area 4) invokes the function “setX”. This function checks if an object has been selected and, if this is the case, moves the object along the x axis. The scale variable defines the distance that the object is moved along the x axis
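The movement and rotation logic could be sketched as follows. Only “setX” and the scale variable are named in the paper; the remaining method names and the rotation step are assumptions that mirror the described behavior.

```csharp
using UnityEngine;

// Hedged sketch of the movement and rotation logic (Fig. 9).
public class HologramMover : MonoBehaviour
{
    public GameObject activeObject;   // the currently selected hologram
    public float scale = 0.1f;        // step size in meters (see Sect. 3.5)
    public float rotationStep = 5f;   // assumed rotation step in degrees

    // direction: +1 for the right arrow, -1 for the left arrow (Fig. 12, area 4)
    public void setX(int direction)
    {
        if (activeObject == null) return; // no object selected
        activeObject.transform.Translate(direction * scale, 0f, 0f, Space.World);
    }

    public void setY(int direction)
    {
        if (activeObject == null) return;
        activeObject.transform.Translate(0f, direction * scale, 0f, Space.World);
    }

    public void setZ(int direction)
    {
        if (activeObject == null) return;
        activeObject.transform.Translate(0f, 0f, direction * scale, Space.World);
    }

    // Clockwise (+1) or counterclockwise (-1) rotation around the vertical axis.
    public void rotate(int direction)
    {
        if (activeObject == null) return;
        activeObject.transform.Rotate(Vector3.up, direction * rotationStep);
    }
}
```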

As the required accuracy and the distance between the original and the desired location may vary, four buttons used to adjust the scale variable were added (Fig. 12, area 5). Pressing these buttons sets the scale to 1 mm, 1 cm, 10 cm or 1 m (see Fig. 10). The button corresponding to the currently selected scale is highlighted with a red background. Switching the scale enables quick movement of holograms across long distances, as well as accurate placement in the millimeter range.

Fig. 10
figure 10

The function “SetScale” changes the length by which objects are moved in one step. Additionally, it highlights the button linked to the currently selected scale value
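A hedged sketch of the scale switching could look as follows; the button array and its ordering are illustrative assumptions.

```csharp
using UnityEngine;
using UnityEngine.UI;

// Hedged sketch of the scale switching described in Fig. 10.
public class ScaleSwitcher : MonoBehaviour
{
    public float scale = 0.1f;        // current step size in meters
    public Button[] scaleButtons;     // buttons for 1 mm, 1 cm, 10 cm and 1 m (Fig. 12, area 5)

    readonly float[] scaleValues = { 0.001f, 0.01f, 0.1f, 1f };

    // Wired to the onClick event of the scale button with the same index.
    public void SetScale(int index)
    {
        scale = scaleValues[index];

        // Highlight the button linked to the currently selected scale value.
        for (int i = 0; i < scaleButtons.Length; i++)
        {
            scaleButtons[i].GetComponent<Image>().color =
                (i == index) ? Color.red : Color.white;
        }
    }
}
```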

3.6 Reducing Interface Visibility

A relevant precondition for using the developed interface in scientific experiments is the possibility to optionally reduce its visibility, as it could distract participants' attention and thereby affect the experimental results. Therefore, we implemented a Button that toggles the visibility of all other Buttons and Texts on or off (Figs. 11 and 12, area 6). The invoked function checks whether all other UI elements are currently visible (“UiVisible”). Then it loops through all static UI elements (“objectsToToggle”) and the dynamically generated Buttons of the spawnable and spawned objects (Fig. 12, areas 1 and 2) and either hides or unhides them using the Unity function “SetActive()”.

Fig. 11
figure 11

The function “toggleUi” either hides or displays all UI elements except the Button used to invoke the function
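A hedged reconstruction of this toggle function, based on the names “UiVisible”, “objectsToToggle” and “SetActive()” given above; the containers for the dynamically generated Buttons are assumptions.

```csharp
using UnityEngine;

// Hedged sketch of the visibility toggle described in Fig. 11.
public class UiVisibilityToggle : MonoBehaviour
{
    public bool UiVisible = true;
    public GameObject[] objectsToToggle;        // static UI elements (Buttons and Texts)
    public Transform spawnableButtonContainer;  // dynamic buttons, Fig. 12, area 1
    public Transform spawnedButtonContainer;    // dynamic buttons, Fig. 12, area 2

    // Invoked by the toggle Button (Fig. 12, area 6), which itself stays visible.
    public void toggleUi()
    {
        UiVisible = !UiVisible;

        // Hide or unhide the static UI elements.
        foreach (GameObject uiElement in objectsToToggle)
            uiElement.SetActive(UiVisible);

        // Hide or unhide the dynamically generated buttons.
        foreach (Transform button in spawnableButtonContainer)
            button.gameObject.SetActive(UiVisible);
        foreach (Transform button in spawnedButtonContainer)
            button.gameObject.SetActive(UiVisible);
    }
}
```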

Fig. 12
figure 12

Overview of the built AR interface. The green rectangles and numbers are inserted to visualize the different areas and are not part of the actual interface. Area one contains a Button for each spawnable object. Pressing one of the Buttons spawns an instance of the associated spawnable object. For each spawned object, an associated Button is added to area two. These are used to select spawned objects. The Button in area three is used to despawn the selected object. With the arrow Buttons in area four, the selected object can be moved and rotated. The Buttons in area five set the scale for the object positioning. Pressing the Button in area six toggles the visibility of all other UI elements on or off

4 Summary

Despite massive leaps forward during the last years, AR hardware has not yet been developed far enough to be classified as a consumer-friendly technology. In terms of Gartner's Hype Cycle approach (Jarvenpaa and Makinen 2008), one could say that AR has passed the peak of inflated expectations. Initial ideas for AR applications have been confronted with technical limitations still to be resolved, such as localization of the HMD position, processing power of untethered devices, and hologram placement. In this paper, the last-mentioned limitation has been addressed. In the context of geospatial experiments and applications, we specified the requirements for hologram positioning and display. These include precise and standardized hologram placement and the repositioning of holograms in the real world. The presented AR interface (Fig. 13) addresses these requirements and contributes to optimized experimental user testing in a real 3D spatial layout. The interface is specifically designed for user studies that focus on the cognitive processing of 3D spatial arrangements, such as object locations, distances and relative directions. Additionally, the presented solution can act as a template for the development of other task-oriented AR interfaces for geographic and cartographic applications in Unity. Application examples would be displaying holographic models of 3D environments at adjustable scale or selectively overlaying 3D height maps with additional information, such as population density (cf. de Almeida Pereira et al. 2017) or annual precipitation.

Fig. 13
figure 13

The finalized AR interface captured with the screenshot function of the Microsoft HoloLens

Furthermore, we illustrated which technical characteristics of current AR devices are in conflict with the identified requirements. In particular, the stability of holograms was argued to be affected by tracking issues of current AR headsets. As a workaround, we described the development process of an AR interface capable of adding, removing and placing holograms precisely in real-world space. This interface makes it possible to perform standardized scientific experiments with AR hardware by manually correcting erroneous hologram positions. To reduce interference with experimental visual stimuli, the visibility of the AR interface can be reduced to a minimum when it is not required. However, our proposed solution addresses only some limitations of current AR devices. The most crucial limitation, the inability to use current AR devices in large-scale and outdoor environments, still remains. As long as highly accurate and reliable tracking cannot be provided by AR hardware (e.g., realized by a combination of inside-out and satellite tracking), the use of AR devices will be limited to spatially confined environments.