1 Introduction

1.1 Navigation in a virtual architectural environment

Nowadays, museums are increasingly using digital technology to enhance the exhibition experience. Among museums’ educational and descriptive resources, virtual and interactive reconstruction of ancient buildings is very common, especially in interpretation centres related to archaeological or historical sites. Visiting an actual building, especially when the visitor enters for the first time, is an experience of discovery and exploration. When this familiar task is translated into a virtual walkthrough, intuitiveness and ease of use are fundamental to keeping users engaged.

The literature related to the study of how people explore spaces underlines two concepts that describe the two main strategies humans use to apprehend and understand the environment and trace trajectories to move across it. Those strategies are navigation and wayfinding, with the former sometimes considered a type of the latter [1].

According to Bowman, navigation is the most universal and common interaction task in 3D user interfaces. It often supports other tasks, and the secondary nature of the navigation procedure increases the need for usability [2]. Navigation is an essential user action in virtual environments, with travel and wayfinding components [3].

Travel is the motor component of navigation; wayfinding is the cognitive part of navigation and implies developing mental constructions to classify, order and relate spaces and places, creating complex cognitive models of the environment that are used to plan the route to follow. Such mental constructions can vary among different people and depend on factors such as viewpoint, scale or spatial complexity of the environment. Previous experiences can also affect this process [4, 5].

Navigation is related to using elements inside the space as references, such as landmarks, milestones, reference points and even external aids such as maps. Those reference points are of great importance in attracting the user’s interest to any given direction due to visual, cognitive and structural factors [6, 7]. Hence, appropriately managing these clues may contribute to improving the wayfinding process.

There is a growing demand for research on wayfinding and navigation. Some studies focus on navigation in virtual environments to help understand the wayfinding processes in the real world [8,9,10,11]. Others look for navigational aids to provide visual information about the virtual environment so that the user can find the destination [12,13,14] using 2D and 3D maps to make multi-storey buildings more understandable to users [15, 16].

Existing approaches to the problem of assisting the navigation along virtual environments include those centred on designing intelligent camera control systems that consider specific constraints to calculate partial paths [17]. In contrast, others focus on the computational analysis of the space to extract the way to follow before presenting the virtual space to the visitor [18].

Other approaches focus on avoiding collisions by tracing a path clear of obstacles [19] or making the user move sequentially through a predefined sequence of viewpoints of particular interest [20]. Those approaches have a common intention to provide the user with a path that takes them to specific viewpoints.

When facing the design of a museum installation, there are other essential aspects to consider related to the diverse user profile. Among those aspects, the influence of previous skills in moving inside virtual environments, such as video games, can be determinant. In this regard, several studies point to a better orientation inside the virtual space among those users accustomed to playing computer games [21,22,23,24]. These differences may be a priori a drawback in using a museum facility.

Therefore, it is necessary to find an interface and a form of interaction that fulfil two conditions. On the one hand, it should be accessible to any museum visitor regardless of age and previous experience with technology. On the other hand, it should let the user navigate virtual spaces while paying minimal attention to controlling the movement.

In the world of video games, several techniques are commonly used as navigation assistants for players. Non-playable characters guiding the player are common. The game may also indicate actions through onscreen icons, audio cues or camera adjustments to help players when they become lost or stuck.

Unlike video games, however, our system does not rely on visible assistants for guidance. Instead, our agent directly and continuously influences the user’s movement system but permits the user to use their body to control the camera direction if desired. The system facilitates navigation in the virtual environment; however, it does not dictate the trajectory or destination to reach since the centre of interest can change based on the user’s surroundings.

The tool is designed to present examples of digital architecture, historical heritage and virtual spaces, facilitating museum installation development. There are no objectives or goals to accomplish, unlike in video games. The user may disregard the assistance by turning in a different direction, and the experience continues without consequence.

This paper aims to contribute to filling an existing gap in navigation assistance in virtual architectural spaces, since no previous studies consider the simultaneous use of natural interaction. Our work also presents a novel way to assist users’ movements by employing an autonomous agent.

1.2 NUI interaction for the architectural walkthrough

Natural user interfaces (NUIs) provide novel ways of fostering active interaction and enhancing creative and playful engagement [25]. These technologies do not demand any specific technical or developmental skills to interact. In the case of NUIs applied to the dissemination of architectural heritage, the problem of controlling the walkthrough inside virtual environments without the need for handling physical interfaces has existed since the appearance of the first virtual reality installations decades ago.

Some authors, like Bowman [26], noted that detecting the user’s intentions by interpreting the user’s natural gestures could improve and facilitate the design and implementation of interaction models inside such virtual environments. Depth cameras are frequently used to detect user gestures. Among them, Kinect [27], the well-known depth camera introduced by Microsoft for the Xbox console, has been widely used in scientific research in the field of natural interaction.

The Kinect camera bases its operation on calculating the time used by light pulses to reach their destination and be captured by a sensor, thus obtaining the distance to the device of each visible point of the scene. The emergence of depth camera technology allowed developers to easily grasp the user’s pose and gestures without needing apparent physical interfaces. As a result, depth cameras are now being used in many fields, including architectural visualisation, virtual archaeology and virtual museums. Depth cameras are ideal for controlling user movement inside virtual buildings with minimum or no training and without needing to handle or touch any physical device. This aspect is becoming even more critical in post-COVID times. Nowadays, general museology uses this type of device to develop highly attractive and engaging installations to interact with digital content of any kind.

There are many examples of management of users’ movements inside virtual environments using depth cameras. Some of them belong to the field of virtual archaeology in the form of museum installations, where visitors can experience the visit of digital reconstructions of ancient buildings and structures [28,29,30,31]. Commonly, those installations consist of a display that presents the virtual building to the visitor, who stands in front of it, controlling the walkthrough and sometimes performing other kinds of interaction such as grabbing virtual objects or clicking, employing different gestures. They all use depth cameras to capture the user’s gestures. In some cases, they require additional instructions, such as drawings on the ground, to explain the operation of the device.

In order to research new ways of moving inside such virtual architectural spaces, this paper will focus on the experience of the virtual architectural walkthrough itself in search of better ways to explore digital buildings.

2 Previous research

We were aware of two problems with this kind of installation: first, the great diversity of gesture schemes used in the analysed examples and the absence of studies comparing their performance; second, the lack of usability studies related to the exploration of virtual architectural spaces.

2.1 Gestures for NUI interaction with a depth camera

In this respect, our previous research [32] tested six gestural schemes used for navigation. We considered both gestures for controlling the marching actions to start, stop and perform speed variations and gestures for managing orientation and steering. The studied set of gestures included the following:

March gestures:


  • Point forward with arm The user raises their hand to start moving forward and releases it to stop. The angle of separation of the elbow-wrist vector from the vertical determines the displacement speed;

  • Lean forward The user leans forward slightly to start the walk, reaching max speed quickly, and straightens up to stop. The angle formed between the vector with origin in the neck’s base, pointing forward and the horizontal determines the user’s intention to move if greater than a given threshold. This neck vector depends significantly on the user’s postural configuration in an idle pose, so a previous, automatic or semi-automatic calibration is required for every user;

  • Swing arms The user swings their arms back and forth like walking. The virtual camera moves forward at a constant speed, while the user keeps moving their hands;

  • Step forward The user starts on a mark on the floor. A step forward from this point initiates the movement; another step ahead increases speed. Stepping backwards reduces speed or even permits walking backwards. When the system detects orientation changes, it reduces the speed to facilitate turning.

Turn gestures:

  • Point sideways with arm The user steers by pointing their hand left or right. The separation of the elbow-wrist vector from the initial idle angle determines the angular speed of the turn;

  • Twist upper body Twisting the upper body, as in the natural rotation that occurs while changing the walking direction, indicates the desire to turn. The angle between the vector that connects both shoulders and the screen plane determines the angular speed.
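The angle tests behind the lean and twist gestures can be illustrated with a short Python sketch. It is not the authors' implementation: the function names, the coordinate convention (x sideways, y up, z perpendicular to the screen plane), the thresholds, the dead zone and the gain are all assumptions chosen for illustration.

```python
import math

def lean_angle_deg(neck_forward):
    """Dip of the forward-pointing neck vector below the horizontal, in degrees.

    neck_forward is an (x, y, z) vector with y pointing up (assumed convention).
    """
    x, y, z = neck_forward
    horizontal = math.hypot(x, z)
    return math.degrees(math.atan2(-y, horizontal))

def wants_to_march(neck_forward, idle_angle_deg, threshold_deg=10.0):
    """True when the lean exceeds the calibrated idle pose by a threshold.

    idle_angle_deg comes from the per-user calibration; threshold_deg is a
    hypothetical value, not one reported in the text.
    """
    return lean_angle_deg(neck_forward) - idle_angle_deg > threshold_deg

def twist_angle_deg(left_shoulder, right_shoulder):
    """Signed angle between the shoulder-to-shoulder vector and the screen plane.

    The screen plane is assumed to be perpendicular to the z axis, so the sign
    only tells which shoulder is further along z.
    """
    sx = right_shoulder[0] - left_shoulder[0]
    sy = right_shoulder[1] - left_shoulder[1]
    sz = right_shoulder[2] - left_shoulder[2]
    length = math.sqrt(sx * sx + sy * sy + sz * sz)
    if length == 0.0:
        return 0.0
    return math.degrees(math.asin(sz / length))

def turn_speed_deg_per_s(angle_deg, dead_zone_deg=5.0, gain=2.0):
    """Map the twist angle to an angular speed, ignoring small twists.

    dead_zone_deg and gain are illustrative assumptions.
    """
    if abs(angle_deg) <= dead_zone_deg:
        return 0.0
    return gain * (angle_deg - math.copysign(dead_zone_deg, angle_deg))
```

A dead zone of this kind keeps small postural sway from being read as a turn; whether the original system uses one is not stated in the text.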

The studied examples combine one gesture for displacement and another one for turning. The six combinations tested are summarised in Table 1.

Table 1 Movement schemes

Based on the results of this experiment, we selected three combinations (PP, LT, ST) for the following stage due to their best performance compared to the others. The users’ heterogeneity is one of the most important factors when designing NUI interfaces. Museum visitors’ age, education, and previous experience with digital interfaces and virtual content can be very diverse. They can also belong to various cultures and speak a variety of languages.

A lack of prior experience using a NUI-based installation can also be a drawback since users tend to focus on the system’s control rather than the virtual visit’s enjoyment. Therefore, the study’s next phase was intended to determine which of those previous movement paradigms was best suited for museum visitors, both expert and non-expert users, based on their prior experience with 3D video games.

Aspects analysed included, among others, navigation performance, interface intuitiveness, efficiency and space awareness [33]. Although all proposed schemes are functional, the frequent collisions with objects and walls are especially evident when going through doors, turning in corridors, etc.

This problem indicates that the user cannot enjoy the experience as it should due to issues in movement control. Two movement schemes (Step/Twist, Lean/Twist) were preferred, respectively, for non-regular and regular gamers. Based on this, the authors implemented the capability to detect and use both ST and LT schemes simultaneously, so users can move forward either by leaning or stepping forward, using, in any case, the upper body twist for turning.

2.2 Assisted navigation in desktop computers

When the digital architectural model is specially created to be explored interactively by non-technical visitors, such as in the case of a museum installation, other specific problems arise. Hence, users cannot maximise their performance and experience without adequate means of moving through virtual environments.

The model’s visual accuracy and realism are not enough to provide a good experience when the act of navigating throughout the virtual building is not easy, pleasurable, intuitive and fruitful.

In general, most of the studies in the field of navigation in virtual environments focus mainly on cognitive aspects, such as finding a route so that the user reaches the destination quickly and accurately, defining paths or measuring how long it takes to carry out a specific task.

One aspect not considered in these systems is that although they seek to reproduce the free exploratory experience of the person, the movement within these environments is rigid and mechanical, limited to the forms of interaction previously defined for the input and output devices.

Natural human movement has its own rhythms that arise from the user’s specific interest in a given event or object [34, 35]. Some of the limitations found in the reviewed systems are the following:

  1. Their main objective is the movement itself, travelling through space from one point to another at a constant speed. Hence, those systems cannot mimic the naturalness of the navigation experience, which has its own rhythms, speed variations and stops. Instead, the movement of humans within space is more similar to a nonlinear narrative system, where the user chooses where to go, what to look at, the pace, rhythm and pauses, controlling the evolution of his own experience;

  2. The use of cameras with a fixed field of view (FOV) leads to an unnatural perception. The human eye adapts its FOV to the size, distance and location of objects and the dimensions and scales of the spaces. In the experience of movement in the real world, as the user moves, the change between the different points of view and, therefore, the change in the perspective deformations of the architectural object do not happen abruptly; on the contrary, it occurs gradually.

Our approach follows and extends a model from the field of the Psychology of Perception already pointed out by Lewin in his definition of hodological space [36]. In Lewin’s model, every part of the space surrounding the user may contain zones that pull the user’s attention based on their personal interest. We also implement Gibson’s theory of affordances [37], summing potentially perceivable “signals” inside the virtual space. The theoretical framework for this study also includes Posner’s Attention Model [38], the attentional reference points proposed by Sorrows and Hirtle [7] and Colmenero’s attentional gradient [39].

In Lewin’s model, every part of the space has an associated scalar value, which he calls “valence” that measures the effect of attraction, although this author does not give any clue of how to calculate it.

The assistance we propose analyses the presence of elements of interest around the user, including objects, architectural features and spaces. The system then weighs their importance and their capability to attract attention, considering their intrinsic interest, distance to the visitor, angle of sight with respect to the user’s view direction and other variables. It then suggests a direction of travel by smoothly turning the camera towards the resulting area of interest. It is essential to highlight that although the assistant can take the user to the zones of interest of a building, the user never loses the ability to choose or modify the route.

The assistant takes the form of an autonomous agent, associated and moving with the avatar that represents the user. The agent checks the presence of some particular, non-visible elements that we call attractors, placed by the developer of the experience in the areas of interest (Fig. 1).

Fig. 1 Left to right: field of attention; parameters for calculating the attention of attractors; centre of attention

All attractors (A) that populate the space are considered generators of attractive force. We define the geometry of the resulting force field by taking into account several parameters that control the degree of attention that each attractor may grasp from the visitor considering its distance (d) and the subtended angle \(\left( {\widehat{{\mathbf{A}}} \cdot \widehat{{\mathbf{V}}}} \right)\).

Those parameters include intrinsic importance \({I}_{a}\), decay of attraction with the distance \(q_{d}\) (distance willingness), decay of attraction for non-frontal contemplation \(q_{\text{a}}\) (angular willingness) and decay due to previous contemplation, among others. A vector \(\widehat{\mathbf{F}}_{A}\) with origin in the attractor, pointing to the user, represents this attractive force, whose magnitude \(F_{A}\) represents the value of its valence:

$$\left( {\widehat{{\mathbf{A}}} \cdot \widehat{{\mathbf{V}}}} \right) \ge \cos \omega \to F_{A} = \frac{{I_{a} }}{{d^{{q_{d} }} }} \left( {\widehat{{\mathbf{A}}} \cdot \widehat{{\mathbf{V}}}} \right)^{{q_{{\text{a}}} }}$$
(1)

The system detects all attractors in what we define as the field of attention Ω (FOA), which may or may not coincide with the field of view, and calculates their valence. Occlusion by objects and walls is considered, even when behind transparent glass occluders.
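Equation (1) can be read as a small function. The following Python sketch is an illustration under stated assumptions, not the Paseante code: it takes \(\widehat{\mathbf{A}}\) to be the unit vector from the user to the attractor, \(\widehat{\mathbf{V}}\) the unit view direction and ω the half-angle of the field of attention, and it assigns zero valence outside that field. Occlusion testing and the decay due to previous contemplation are left out. The function name and default values are hypothetical.

```python
import math

def valence(user_pos, view_dir, attractor_pos, importance,
            q_d=1.0, q_a=1.0, half_angle_deg=60.0):
    """Valence F_A of one attractor following Eq. (1).

    importance is I_a, q_d the distance willingness, q_a the angular
    willingness. half_angle_deg stands in for omega (assumed meaning).
    """
    to_attractor = [a - u for a, u in zip(attractor_pos, user_pos)]
    d = math.sqrt(sum(c * c for c in to_attractor))
    v_len = math.sqrt(sum(c * c for c in view_dir))
    if d == 0.0 or v_len == 0.0:
        return 0.0
    # Dot product of the two unit vectors: cosine of the subtended angle.
    cos_angle = sum(a * v for a, v in zip(to_attractor, view_dir)) / (d * v_len)
    if cos_angle < math.cos(math.radians(half_angle_deg)):
        return 0.0  # outside the field of attention (assumed behaviour)
    # Clamp at zero so a wide field of attention never raises a negative base.
    return importance / d ** q_d * max(cos_angle, 0.0) ** q_a
```

With these conventions, an attractor straight ahead at distance 2 with importance 4 and q_d = 2 has a valence of 1, and the same attractor directly behind the user has a valence of 0.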

Once the attractors are weighted, the agent obtains a centre of attention (C) by following a procedure similar to calculating a centre of forces:

$$\mathbf{C}=\frac{\sum {\mathbf{A}}_{i}{F}_{{A}_{i}}}{\sum {F}_{{A}_{i}}}$$
(2)
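Equation (2) is a weighted average of attractor positions, with the valences as weights. A minimal Python sketch follows; the function name, the (position, valence) input format and the choice to return None when nothing attracts attention are assumptions for illustration.

```python
def centre_of_attention(attractors):
    """Valence-weighted centroid of attractor positions, as in Eq. (2).

    attractors is an iterable of (position, valence) pairs, where position is
    an (x, y, z) tuple. Returns None when the total valence is zero, a case
    the equation leaves undefined.
    """
    total = 0.0
    weighted = [0.0, 0.0, 0.0]
    for position, f in attractors:
        if f <= 0.0:
            continue  # attractors without valence do not contribute
        total += f
        for i in range(3):
            weighted[i] += position[i] * f
    if total == 0.0:
        return None
    return tuple(w / total for w in weighted)
```

For two attractors at x = 0 and x = 4 with valences 1 and 3, the centre lies at x = 3, closer to the stronger attractor.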

Once the centre of attention is calculated, the system smoothly turns the camera towards it if no turning input from the user is detected within a given time. By placing attractors inside the virtual environment at the most interesting parts and fine-tuning their characteristics, the designer of the walkthrough experience can influence the movement of the user by pointing them to the most exciting items, areas or places in their surroundings at any location, whether the user is standing still or moving.

The centre of attention is calculated in every frame of the simulation, considering the user changes in location and orientation. This system, which behaves as an autonomous agent inside the visual simulation, was given the name Paseante (the Spanish word for a person strolling).
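The per-frame behaviour described above (wait for a period without turning input, then turn gradually towards the centre of attention) could look roughly like the following Python sketch. It works on a 2D floor plan with yaw in degrees; the idle delay, the maximum turn rate and all names are assumptions, and the text does not specify how the original system smooths the rotation.

```python
import math

def shortest_angle_deg(current, target):
    """Signed smallest rotation from current to target, in (-180, 180]-like range."""
    return (target - current + 180.0) % 360.0 - 180.0

def assisted_yaw(yaw_deg, user_xy, centre_xy, idle_time_s, dt_s,
                 idle_delay_s=1.0, max_rate_deg_s=30.0):
    """Return the camera yaw after one simulation frame.

    idle_time_s is the time elapsed since the last turning input from the
    user. idle_delay_s and max_rate_deg_s are hypothetical tuning values.
    """
    if centre_xy is None or idle_time_s < idle_delay_s:
        return yaw_deg  # the user is steering, or there is nothing to look at
    target = math.degrees(math.atan2(centre_xy[1] - user_xy[1],
                                     centre_xy[0] - user_xy[0]))
    error = shortest_angle_deg(yaw_deg, target)
    max_step = max_rate_deg_s * dt_s
    # Limit the rotation per frame so the camera never snaps to the target.
    step = max(-max_step, min(max_step, error))
    return yaw_deg + step
```

Calling this function once per frame with the freshly computed centre keeps the target up to date as the user moves, which matches the every-frame recalculation described in the text.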

It was implemented using a game engine (Unreal Engine 4) and tested using different architectural models [40, 41]. It initially gave good results for interaction using a mouse and keyboard. Two main problems were reported:

Confusion regarding the awareness of being assisted. Initially, there were no clues for the user of the presence of the agent, so they frequently expressed their surprise when the movement of the viewpoint did not correspond in intensity or extension with their input, even if the resulting direction was the one intended.

We solved this problem by giving a visual clue when the agent was activated (no turn input for a while), making some horizontal translucent black bars appear at the top and bottom of the screen, while the direction was changing smoothly to the centre of attention.

This sign provides feedback to the user, indicating that the computer is taking some control, generally following their intention.

Excessive intrusion. The effect of the assistance has to be progressive, since an abrupt change of direction is counterproductive. Users sometimes even tried to counteract it.

We solved this issue by gradually applying the effect, so a bit of user movement in the initial moments of the assistance is enough to counter the effect. After that, users quickly learn to let the effect continue if it matches their intention or to correct it if they prefer to go elsewhere.
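One way to apply the assistance gradually is to scale the agent's turn rate by a weight that grows from 0 to 1 after activation. The Python sketch below uses a smoothstep ramp; the ramp shape, its duration and the additive blending with the user's input are assumptions, since the text does not describe the exact scheme.

```python
def assist_weight(time_since_activation_s, ramp_s=2.0):
    """Ease-in weight in [0, 1] using a smoothstep curve.

    ramp_s is a hypothetical ramp duration in seconds.
    """
    t = min(max(time_since_activation_s / ramp_s, 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)

def blended_turn_rate(user_rate, assist_rate, time_since_activation_s, ramp_s=2.0):
    """Combine the user's turn rate with the weighted assistance turn rate."""
    return user_rate + assist_weight(time_since_activation_s, ramp_s) * assist_rate
```

Because the weight is close to zero just after activation, a small user turn in the opposite direction dominates the blended rate, which is the behaviour the paragraph above describes.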

3 Assisted navigation in NUI-based installations

Natural user interfaces (NUI) can help to obtain more pleasant, user-friendly movement schemes. Nevertheless, to make the experience fruitful, the user has to achieve the objectives planned by the designer of the visit. Those goals may include locating certain places, contemplating specific spaces or examining objects.

Since most users will enter the virtual building for the first time, we think some kind of assistance would be convenient for them to enjoy the visit properly.

Once the previous work determined the most suitable movement schemes for NUI interaction inside such virtual architectural environments, it is the moment to study if assisted navigation can finally benefit the walkthrough experience in those NUI installations and to what extent.

Given the background and the problems extracted from the set of NUI installations analysed in previous works, we can ask ourselves a question and establish a hypothesis:

Question: Can a NUI installation’s navigation experience in a virtual architectural environment be improved?

  • Hypothesis: Navigation assisted by an autonomous agent of the Paseante type can facilitate the control of the system by the user and improve the overall experience.

  • Sub-hypothesis: There may be differences in the perception of assisted navigation depending on whether or not the visitor is a frequent video game user.

4 Methodology

The authors took a group of users to study the effect of the autonomous agent assisting their navigation in virtual architectural environments. They completed a series of tests comparing the walkthrough experience in two different conditions: unassisted and assisted by the agent.

The research participants were briefly informed about the available gestures’ mechanics (lean or step forward to advance, twist the upper body to turn). After trying both movements in the virtual environment, each user naturally settled on their preferred choice (Table 2).

Table 2 Selected movement schemes

The authors collected quantitative data, a series of performance-based measures, including task completion times, the number of collisions suffered during navigation tasks and the number of gestures needed to complete the walkthrough. Furthermore, we used questionnaires to acquire qualitative data about the users’ experience with the navigation system, including notes and responses to open interview questions.

4.1 Test research participants

Twenty-two participants (54.5% male, 45.5% female) took part in this study. Their age ranged from 18 to 57 years (M = 26.9, SD = 11.3). Most were university students (72.8%), and the remaining 27.2% were faculty and other staff. All people participated voluntarily.

They were asked about their previous experience and gaming abilities. In this study, 54.5% were casual gamers (18.1% played games rarely, 36.4% played video games occasionally), and 45.5% were frequent gamers.

4.2 Session procedure

Before the beginning of the test, the moderator explained the session’s mechanics to the participants and required them to fill out a brief demographic questionnaire and self-reported gaming experience.

We used a within-participants research approach in this comparative usability study. Each session took approximately 30 min on average to complete for each individual.

To carry out this case study, the authors used the facilities of the University of A Coruña, Spain. The experiment set consisted of a room with a 65″ Ultra HD 4K TV screen and a Kinect sensor below it. Some marks on the floor, located approximately 2.40 m in front of the screen, indicated the experience’s starting point. Before starting each task, the system automatically calibrated each participant’s height and idle lean angle.

The experiment was composed of five stages, with three different test sets of increasing complexity.

Training path (Task #0): This test consisted of an easy walkthrough inside a very simple space along three different parallel corridors of decreasing width (Fig. 2). Here the participants got familiar with the body gestures and their effect on their displacement. The goal was to cross the corridors avoiding touching the walls. This initial task was essential to check the participants’ understanding of the system.

Fig. 2 View of the training set for Task #0

Complex test space (Task #1): Participants had to follow a course along a sequential series of corridors with different sizes, widths, distances, angles and degrees of complexity, indicated by direction arrows (Fig. 3). The authors set up open and closed turns at specific points to increase the route’s complexity, avoiding the systematic alternation between right and left. The task consisted of two stages. In the first one, we measured the perceived difficulty of completing the proposed walkthrough and the required time. In this stage, participants completed the established route without navigation assistance.

Fig. 3 Set for Task #1: a general view (ceiling removed for better display), b user view, c attractors placed in the set (invisible to users)

Later, they took the same route using assisted navigation. Several attractors were put in place to assist users in getting oriented at every turn. Paseante helped them to steer to entrances and exits of corridors as they appeared in their field of attention.

The purpose of this first task was to gather metrics to evaluate the assisted navigation system’s performance (A. Task #1), comparing it with the unassisted navigation system (U. Task #1). Accordingly, we measured the time spent to complete the task, the number of direction changes, the number of start–stop sequences, the number of collisions detected and the time that users remained in a collision state (i.e. sliding against a wall).

We used these metrics to contrast the users’ subjective opinions against their related movement measures.

First questionnaire: Users answered a short questionnaire about their impressions of the two navigation modes. Users scored zero to ten for every aspect related to ease, amount of attention put on controlling the movement, physical and mental effort and user comfort (Table 3).

Table 3 Comparing the two navigation systems: metrics and questionnaire

Villa Savoye (Task #2): The authors asked the participants to make a free walkthrough inside a digital model representing this famous house (Fig. 4).

Fig. 4 a, c, e User view in different parts of the Villa Savoye. b, d, f Attractors, invisible to users, placed around the house

Many attractors were previously placed inside the house attached to points of interest, such as distinct architectonic elements and spaces, access to principal rooms, unique furniture, doorways and windows with exciting views. The objective of the second task was the comparative study of navigation performance with and without assistance while travelling through a virtual model of an actual building.

This task is much more complex than that carried out in the test space due to the building features since it represents a prototypical example of the architecture of the twentieth century that invites contemplation of the different singular spaces and elements.

In this task, users were asked to tour the house, climb the ramp from the ground floor to the upper terrace and contemplate the different spaces.

As in Task #1, this task consisted of two steps, the first unassisted (U. Task #2) and the second with assistance (A. Task #2). There was no time limit for this task, since timing pressure would run contrary to the contemplative aim of the tour of the house spaces and we wanted a better impression of how users perceived the ease of interaction with the interface.

Second questionnaire. Users answered some final questions regarding general aspects of the experience, such as ease, enhancement, intrusiveness, adaptability, movement assistance and contemplation assistance (Table 4).

Table 4 Questionnaire regarding the Paseante assisted navigation system

4.3 Measurements

Each participant completed the same number of tasks. Performance-based measures were derived from the participants’ navigation behaviour both without and with assistance. The authors applied a user-centred methodology based on the measurement and systematic analysis of the values used to define user experience [42, 43].

4.3.1 User navigation performance

The experiment compared users’ navigation performance in two conditions: unassisted and assisted by the agent, evaluating the influence of their previous expertise with 3D video games. Beginning with the concept of usability [44, 45], we measured successful task completion rates (effectiveness) and the mean task #1 completion times in seconds (efficiency). Concerning task #1 time, detecting collisions and determining contact points is of fundamental importance for understanding their impact on navigation skills.

The system counted the number of collisions (hits) and the number of frames of the simulation that every participant spent colliding with walls or objects. With these data and considering a frame rate of 60 fps, we obtained the percentage of collision time.
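The percentage of collision time follows from the frame count and the frame rate. The Python sketch below assumes that the reference duration is the task completion time in seconds, which the text implies but does not state; the function name is hypothetical.

```python
def collision_time_percent(frames_in_collision, completion_time_s, fps=60):
    """Share of the task time spent in a collision state, in percent.

    Assumes the simulation ran at a constant frame rate (60 fps in the text)
    and that completion_time_s is the total duration of the task.
    """
    total_frames = completion_time_s * fps
    if total_frames <= 0:
        raise ValueError("completion time must be positive")
    return 100.0 * frames_in_collision / total_frames
```

For example, 360 frames in collision during a 120 s run correspond to 6 s of contact, or 5% of the task time.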

The system also measured the number of direction changes and start–stop movements made by the participants during task #1.

4.3.2 Natural interface behaviour

One of the goals of natural interface design is to develop systems that interfere as little as possible with the user’s experience. When the interface behaves as we expect it to [46], it responds to the user’s desires in a fluent, comfortable and confident way, allowing the user to focus their attention on the experience instead of on the interface. We used two variables to measure this aspect: attention share and ease of use.

4.3.3 User experience

Finally, we measured user experience for each task through the users’ responses to specific questions related to the different aspects of the experience, such as physical effort, mental effort, comfort, enhancement, interference, adaptability, movement assistance and contemplation assistance [47].

Furthermore, the emotional factor influences the potential to learn new skills and acquire new knowledge, which are key points of this kind of installation.

4.4 Data collection

Two moderators observed and interacted with the users during the session as they completed the tasks using the different movement paradigms.

We video-taped the users performing the different tasks and noted their spontaneous comments about their impressions related to the experience.

The system automatically registered the tasks’ completion times and frames in collision state.

After each task and at the end of the session, the participants rated each navigation system on a 10-point scale.

5 Results

This study used the IBM SPSS 26 statistics software. The analysis compared several measures from the two navigation systems (unassisted and assisted), obtained from the same user group in two consecutive phases.

5.1 Task 1: Complex test space

This task tested the performance of Paseante in a challenging architectural space. The goal was to complete, in the minimum possible time, a walkthrough containing a series of corridors of different sizes, widths, distances, orientations and obstacles of varying degrees of complexity.

Because the sample is small and not normally distributed, this study used a nonparametric Wilcoxon test (α = 0.05).
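For readers unfamiliar with the test, the following sketch illustrates a paired comparison of this kind. It assumes the paired (signed-rank) form of the Wilcoxon test, since the same users tried both modes. It is not the SPSS procedure used in the study: the function name is hypothetical, the data are invented, and the exact method shown only applies when there are no zero differences and no tied absolute differences. Statistical libraries offer more general implementations, for example `scipy.stats.wilcoxon`.

```python
def wilcoxon_signed_rank(x, y):
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Sketch only: requires no zero differences and no ties among the
    absolute differences. Returns (W, p) with W = min(W+, W-).
    """
    diffs = [a - b for a, b in zip(x, y)]
    abs_d = [abs(d) for d in diffs]
    if 0 in abs_d or len(set(abs_d)) != len(abs_d):
        raise ValueError("zero or tied differences: exact sketch does not apply")
    n = len(diffs)

    # Rank the absolute differences from 1 (smallest) to n (largest).
    order = sorted(range(n), key=lambda i: abs_d[i])
    rank = [0] * n
    for r, i in enumerate(order, start=1):
        rank[i] = r

    w_plus = sum(rank[i] for i in range(n) if diffs[i] > 0)
    total = n * (n + 1) // 2
    w = min(w_plus, total - w_plus)

    # counts[s] = number of sign assignments whose positive-rank sum is s.
    counts = [1] + [0] * total
    for r in range(1, n + 1):
        for s in range(total, r - 1, -1):
            counts[s] += counts[s - r]

    p = min(1.0, 2 * sum(counts[: w + 1]) / 2 ** n)
    return w, p


# Invented completion times (seconds) for eight hypothetical users.
unassisted = [120, 98, 143, 110, 131, 105, 150, 117]
assisted = [109, 90, 128, 104, 119, 100, 130, 108]
w, p = wilcoxon_signed_rank(unassisted, assisted)
significant = p < 0.05  # compare against alpha = 0.05
```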

We measured the task completion time, the number of direction changes, the number of start–stop sequences, the number of collisions detected and the time the user remained in a collision state.
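The text does not specify how direction changes and start–stop sequences were counted. As a purely illustrative sketch, one plausible approach derives them from per-frame speed and turning-rate samples. The function names, the threshold `eps` and the input format are assumptions, not the method used in the study.

```python
def count_start_stops(speeds, eps=0.05):
    """Count transitions from moving to stopped (assumed definition).

    speeds: per-frame forward speeds; |v| <= eps is treated as stopped.
    """
    count = 0
    moving = False
    for v in speeds:
        now_moving = abs(v) > eps
        if moving and not now_moving:
            count += 1
        moving = now_moving
    return count


def count_direction_changes(turn_rates, eps=0.05):
    """Count sign flips of the turning rate (assumed definition).

    Samples with |w| <= eps are ignored, so a brief straight segment
    between a left and a right turn still counts as one change.
    """
    count = 0
    last_sign = 0
    for w in turn_rates:
        if abs(w) <= eps:
            continue
        sign = 1 if w > 0 else -1
        if last_sign != 0 and sign != last_sign:
            count += 1
        last_sign = sign
    return count


# Made-up samples: the user stops twice and reverses turning direction twice.
stops = count_start_stops([0, 1, 1, 0, 0, 1, 0])
changes = count_direction_changes([0.5, 0.4, 0, -0.3, -0.2, 0.6])
```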

All participants, 22 out of 22, completed Task #1 in both unassisted navigation mode (U. Task #1) and assisted navigation mode (A. Task #1).

Comparative analysis of these factors helped determine whether assisted navigation was more effective than unassisted navigation.

The results show significant differences between the navigation systems for all users (Wilcoxon p < 0.001). Assisted navigation shows a strong effect size (g) for all variables except the number of start–stops, where the effect size was moderate (see Table 5).
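The text does not state which g statistic was computed. Assuming it denotes Hedges’ g with a pooled standard deviation and the usual small-sample correction, it could be obtained as in the sketch below. Paired designs sometimes use a different standardiser, so values from SPSS may differ. The function name and data are illustrative.

```python
from math import sqrt
from statistics import mean, stdev


def hedges_g(a, b):
    """Hedges' g for two samples, using a pooled SD (assumed variant)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    s_pooled = sqrt(((n1 - 1) * stdev(a) ** 2 + (n2 - 1) * stdev(b) ** 2) / df)
    d = (mean(a) - mean(b)) / s_pooled   # Cohen's d
    correction = 1 - 3 / (4 * df - 1)    # small-sample bias correction
    return correction * d


# Toy data: means 4 and 3, both SDs equal to 2, so d = 0.5 and g = 0.8 * 0.5.
g = hedges_g([2, 4, 6], [1, 3, 5])
```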

Table 5 Comparison of users’ measures of Task #1 for assisted (A) and unassisted (U) navigation for all users

Completion time in Task #1 was shorter with assistance than in the unassisted navigation mode. Specifically, the assisted navigation system produced an 11% reduction in the test completion time and a 79% reduction in time in a collision state. This makes sense since, in such complex spaces, narrow passages and tricky turns gave users more opportunities to collide and required several direction changes and start–stop movements.

Likewise, for all participants, the number of collisions in Task #1 was much lower with assistance than in the unassisted navigation mode. It is important to note that collisions did not affect task completion.

Measures obtained when the assisted navigation system is active show a 41% reduction in the number of starts and stops, a 25% reduction in direction changes and a 60% reduction in collisions.
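The reductions reported above are relative differences with respect to the unassisted mode. A minimal sketch of that arithmetic follows; the function name and example values are invented for illustration and are not taken from Table 5.

```python
def percent_reduction(unassisted, assisted):
    """Relative reduction of a measure, as a percentage of the unassisted value."""
    if unassisted <= 0:
        raise ValueError("the unassisted baseline must be positive")
    return 100.0 * (unassisted - assisted) / unassisted


# Invented example: a mean completion time falling from 200 s to 178 s
# corresponds to an 11% reduction.
reduction = percent_reduction(200.0, 178.0)
```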

Regarding the influence of previous video game experience, the time reduction was similar for the two profiles. Frequent gamers completed the task more quickly and with fewer collisions than casual gamers in both modes. Comparing assistance modes by gaming experience, assisted navigation times were 16.7% lower for casual gamers and 9.8% lower for frequent gamers (Fig. 5).

Fig. 5 Task #1 completion time according to previous gaming experience

In general, users made fewer start–stop movements and direction changes with assisted navigation, regardless of their previous experience with video games (Fig. 6).

Fig. 6 Distribution according to previous gaming experience

After completing Task #1, participants rated how easy it was for them to adapt their movement to Paseante. The assessment was positive, with a mean of 8.4, 95% CI [7.6, 9.1].
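A mean with a 95% confidence interval of this kind can be illustrated as follows, assuming a t-based interval (the text does not state how the interval was computed). The function name and ratings are hypothetical. The critical value is passed in explicitly; for 22 participants (21 degrees of freedom) the two-sided 95% value is approximately 2.080.

```python
from math import sqrt
from statistics import mean, stdev


def mean_ci(values, t_crit):
    """Return (mean, lower, upper) for a t-based confidence interval.

    t_crit is the two-sided critical value for len(values) - 1 degrees
    of freedom, supplied by the caller (e.g. from a t table).
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are required")
    m = mean(values)
    half_width = t_crit * stdev(values) / sqrt(n)
    return m, m - half_width, m + half_width


# Invented 0-10 ratings from 22 hypothetical participants.
ratings = [9, 8, 10, 7, 9, 8, 6, 10, 9, 8, 7, 9, 10, 8, 5, 9, 8, 10, 7, 9, 8, 10]
m, lower, upper = mean_ci(ratings, t_crit=2.080)
```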

Considering gaming experience, both groups rated the system very positively and very similarly, with frequent gamers rating it slightly higher (M = 8.5, SD = 1.72) than casual gamers (M = 8.3, SD = 1.72). Subsequently, users rated each aspect of the navigation systems on a 0- to 10-point scale.

Table 6 summarises the four aspects analysed for all users and considers their gaming experience. Generally, the assisted navigation system received better assessments than the unassisted mode.

Table 6 Comparison of users’ perception of attention share, physical and mental effort, and comfort for assisted (A) and unassisted (U) navigation for all users

Attention share. This value measures the amount of attention put into controlling the movement versus the attention dedicated to contemplating the environment and enjoying the experience. The users’ responses indicate that assisted navigation allows paying more attention to contemplation, while in the unassisted navigation mode, users pay more attention to controlling the system.


Physical effort. This aspect relates to the perceived physical effort required to complete the task, rated from strenuous to effortless. The responses indicate that assisted navigation requires less physical effort than unassisted navigation, which the participants perceived as more tiring.


Mental effort. This question asks the user about the mental effort required to complete the course. Answers range from complex to simple. Results indicate that assisted navigation requires less mental effort than unassisted navigation.


User comfort. Users rated the comfort level from uncomfortable and tense to relaxed and comfortable. Assisted navigation was rated as more relaxed than the unassisted mode, which produced a less comfortable experience.

Assisted navigation had a very significant effect on all variables analysed, especially attention (see Table 6).

Table 7 shows the results for these variables considering the users’ previous gaming experience, which is important regarding universal accessibility.

Table 7 Comparison of users’ perception of attention share, physical and mental effort, and comfort for assisted (A) and unassisted (U) navigation according to their previous gaming experience

Assisted navigation is restful for both player profiles and even more beneficial for users less experienced with these technologies. Notably, frequent gamers showed no statistically significant difference in physical effort (p > 0.05).

Figure 7 groups the results for each variable and assistance mode considering previous gaming experience. The results indicate that assisted navigation improves contemplation and is simpler, more restful and more comfortable, to a similar degree for both player profiles.

Fig. 7 Comparative scheme of users’ perception of attention share, physical and mental effort, and comfort for assisted (A) and unassisted (U) navigation for casual and frequent gamers

5.2 Task 2

Both navigation systems were applied to emulate one of the most paradigmatic architectural walkthrough experiences: La Promenade Architecturale of the Villa Savoye, proposed by Le Corbusier in 1929 [48].

The objective of this task was to verify whether the navigation assistance helps the user understand what they see, since many inexperienced users may feel lost or not have a clear idea of what to look at or where to look.

Here it is important to note that, although assisted navigation can orient users during their walkthrough, guiding them to the most exciting parts of the building, they never lose the ability to modify the current course instantly or to choose a different route. Therefore, it is interesting to determine whether assisted navigation combines well with the naturalness of the movements and facilitates the contemplation of the spaces.

Recall that the purpose of Task #1 was to analyse aspects related to the system’s adaptability to the user and its effectiveness in enhancing their movements in a three-dimensional environment. That task evaluated factors such as completion time, trajectory accuracy and collision avoidance, while also putting the user under some stress due to the time pressure and challenging turns.

In contrast, Task #2 was specifically designed to assess the system’s contribution to the user’s experience as a visitor to a building or a museum (in fact, the set was a replica of a real-world museum building). Task #2 involved a leisurely walk with no time limit, and the user was free to choose their preferred route without any pressure to follow a specific path. Because the two tasks measured different aspects under different stress conditions, it cannot be assumed that the learning achieved in Task #1 had a significant impact on the results of Task #2.

For this reason, after finishing the walkthrough, we asked participants to assess the assisted navigation system on aspects related to ease, enhancement, intrusiveness, adaptability, movement assistance and contemplation assistance.

The participants rated the assisted navigation system on a 0- to 10-point scale, with 0 being the worst and 10 being the best. The results obtained for all analysed variables, considering all users first and then considering previous gaming experience, are as follows:

  • Ease. We measured the effect of assisted navigation in making the walkthrough easier. Users generally considered that the system makes the route easier compared to unassisted navigation (M = 7.6, 95% CI [6.7, 8.4]). Considering gaming experience, casual gamers assessed the system more favourably (M = 8.3, 95% CI [7.7, 8.9]) than frequent gamers (M = 6.6, 95% CI [4.9, 8.3]);

  • Enhancement. We measured the effect of assisted navigation in enhancing the user’s movement throughout the simulation. Users generally considered that the system improves their movements compared to unassisted navigation (M = 7.3, 95% CI [6.5, 8.1]). Previous gaming experience did not produce any noticeable differences;

  • Intrusiveness. We analysed to what extent the assisted navigation system would be intrusive to the user by acting unexpectedly, taking too much control, etc. Participants considered the assistant not intrusive or only slightly intrusive (M = 7.4, 95% CI [6.7, 8.1]). Considering gaming experience, although both groups assessed the system favourably, casual gamers considered the assistant less intrusive (M = 7.8, 95% CI [7.1, 8.6]) than frequent gamers (M = 6.8, 95% CI [5.4, 8.1]);

  • Adaptability. This aspect evaluates the user’s difficulty in adapting to the system. Participants considered it easy to adapt to the assistance (M = 7.3, 95% CI [6.5, 8.0]). Considering previous experience, both groups assessed the system positively with very similar values, with frequent gamers reporting the best adaptation (M = 7.4, 95% CI [6.1, 8.7]);

  • Movement assistance. Overall, the perception of movement assistance is high (M = 7.3, 95% CI [6.6, 8.0]). This perception is higher for users less experienced in video games (M = 7.8, 95% CI [6.7, 8.8]) and slightly lower for frequent gamers (M = 7.2, 95% CI [5.9, 8.5]);

  • Contemplation assistance. To assess this aspect, we used subjective responses about the ease of contemplating objects, spaces and particular elements inside the house and the extent to which Paseante facilitated this. The perception of contemplation assistance is high (M = 7.9, 95% CI [7.3, 8.5]). This perception is higher for casual gamers (M = 8.0, 95% CI [7.1, 8.9]) and slightly lower for frequent gamers (M = 7.7, 95% CI [6.7, 8.7]).

Figure 8 shows users’ subjective perception of the six variables analysed, considering their previous gaming experience.

Fig. 8 Task #2 variable measures according to degree of previous experience

From these results, one can observe that users with less experience in video games generally show more appreciation for the assistance in moving and contemplating. However, Paseante seems to be useful for all users.

6 Conclusions

In light of the results of this experiment, it seems clear that, as a starting point, natural interaction is an excellent tool for controlling navigation inside architectural environments. All users completed all tasks without difficulty and with very little explanation of the system’s operation and mechanics, simply by letting their bodies transmit their intentions. Moreover, the results indicate that an autonomous agent assisting the navigation further facilitates the walkthrough experience.

The results for Task #1 indicate that the levels of efficacy and efficiency in completing the course increase when the assistance is activated. In general, collisions lessened when participants used assisted navigation, indicating that the trajectories followed the users’ intentions more closely. Also, the number of changes in orientation, accelerations and decelerations required to follow a course is significantly smaller with assisted navigation, even for people with previous experience with similar environments such as video games. The time taken to complete the task is also reduced. All this indicates that assisted navigation improves the ease and precision of the walkthrough.

An essential aspect of the architectural walkthrough experience is that the amount of attention required to control the movement has to be as small as possible, allowing the user to concentrate on enjoying the stroll, contemplate the building, feel its spatiality and appreciate the objects that it hosts. In this regard, and considering the results of both Task #1 and Task #2, the proposed assisted navigation agent fulfils the expectations. Although the participants considered the system only slightly intrusive and easy to adapt to, further research could reduce its intrusiveness and make the system even more transparent to the user.

Regarding the effort needed to accomplish the tasks, assisted navigation requires less physical and mental effort and permits a more relaxed experience.

The explorative experience of architectural spaces requires the visitor to follow their own pace and rhythm, determined by their particular interest in specific elements and parts of the building. Therefore, it is important to remark on the assessment of the aspects related to movement assistance and contemplation assistance. In this regard, the perception of both kinds of assistance is positive for all users.

The study attempted to recruit individuals of varying ages and levels of experience in video games; however, the fact that the participants were predominantly from a university environment can be a limitation. Based on the data collected in this case study, which identified areas for improvement in the experience, the authors are considering a new study on a functional application involving a more diverse group of individuals who better resemble regular museum visitors.

Finally, for most of the variables analysed, the less experienced video game users gave higher scores than frequent gamers. This suggests that assisted navigation is an adequate aid for installations designed for the general public, such as those in museums, exhibitions and interpretation centres. People who attend such events can be of any origin, age and expertise, including older people unfamiliar with the technology. In this regard, Paseante could be beneficial for populations such as older adults, as it allows for interaction with the three-dimensional environment without requiring prior technical knowledge and with minimal physical effort and attentional demand. In addition, the system can simultaneously use multiple combinations of articulation groups to infer the user’s intention to turn and move forward. Consequently, the gestural approach used in Paseante can help enable interaction for individuals with different disabilities, as long as the participant can demonstrate their intention to move through some form of body language. Assisted navigation combined with NUIs, using gestures very close to those people use in their daily lives, can be especially useful and convenient for this population segment.

Future research in this direction includes applying this technology to more immersive virtual environments, such as those in the realm of extended realities, where deviceless interaction can foster highly engaging experiences.

7 Supplementary materials

The following are available online at https://videalab.udc.es/paseante. Video S1: Paseante: An authoring tool for architectural walkthrough design based on a game engine.