Navigation to a remembered location can be informed by a multitude of cues, some internal and some external to the navigator. For example, a student navigating to the library uses known landmarks (environmental cues) to guide movement through the campus environment and toward the goal location. Additionally, the student can continue walking and progressing toward the goal while looking away from those landmarks to read a text message. At that moment when vision is directed away from the environment, the student’s movement is guided primarily by body-based (internal) cues, until vision is again directed toward the environment. Upon looking up, the student may find that a known landmark is in a slightly different location than expected, creating a conflict between environmental and internal cues that must be resolved in order to continue navigation. In this case, a small conflict is probably attributable to sensory system noise, but a large conflict could be due to something else, such as confusion about landmark identity. This article focuses on how environmental and internal navigation cues are combined during navigation and considers situations in which the cues provide consistent or contradictory information about self-location.

Two salient environmental cues to navigation are geometric cues and landmark cues (see Cheng & Newcombe, 2005). Geometric cues are those that can be defined by principles of geometry, and include extended surfaces and angles formed by the intersections of surfaces. One example of a geometric cue is room shape. Landmark cues are those that cannot be defined solely by the geometry of extended surfaces and typically are more localized than geometric cues. For example, a landmark may include a distinctive statue on campus or artwork on a room wall.

Internal cues (i.e., cues internal to the navigator) may also be used in navigation. Path integration is the process of updating self-position over time by combining multiple self-motion cues, including vestibular and proprioceptive stimulation, efferent motor commands, and optic and acoustic flow. Path integration accumulates error over time and with movement, such that the greater the walking and turning, the more error occurs in an estimate of self-location (Klatzky et al., 1990).

Real-world navigation typically involves multiple cues, internal and external, some of which provide redundant information about the navigator’s current location relative to the goal. The goal of the current project was to evaluate how room shape and path integration are combined during navigation.

According to the adaptive-combination model (Ratliff & Newcombe, 2008), navigators weight and combine cues. Cue weights are thought to be determined by cue reliability, validity, and salience, as well as previous experience. For example, a nearby landmark allows for greater precision as compared to a distant landmark when returning to a remembered goal location (Zhao & Warren, 2015b). According to this logic, room shape should receive greater weight in smaller rooms than in larger rooms. Indeed, when room shape and a landmark provided conflicting information about the location of a goal, participants assigned higher weight to the room shape cue when the room was small as compared to when the room was large (Ratliff & Newcombe, 2008).

One approach to studying the integration of multiple cues to navigation involves comparison of human behavior to predictions based on Bayesian principles (Butler, Smith, Campos, & Bülthoff, 2010; Cheng, Shettleworth, Huttenlocher, & Rieser, 2007; Nardini, Jones, Bedford, & Braddick, 2008), which can be used to determine the optimality with which multiple cues are combined using maximum-likelihood estimation. In this context, optimal combination maximizes response precision (i.e., minimizes response variability) when multiple cues are available. Response precision does not necessarily correspond to response accuracy, which is affected by constant error (i.e., response bias). This research follows seminal work on cue combination in the perception of properties such as object thickness (Ernst & Banks, 2002), object location (Alais & Burr, 2004), and surface slant (Hills, Watt, Landy, & Banks, 2004). When provided with two cues, either of which could be used to perform the same task, the optimal combination is a weighted average in which the weights are inversely proportional to the variance associated with each cue. In other words, the more reliable cue (i.e., the one that produces the least response variance) should receive higher weight than the less reliable cue. For example, a navigator who walks along a circuitous outbound path before attempting to return directly to the path origin can use path integration and room shape to return. However, if the path integration cue would result in greater response variance when returning to the origin than would the room shape cue, then path integration should receive proportionally lower weight than room shape. According to rules derived from Bayes’s (1763) theorem (Blake, Bülthoff, & Sheinberg, 1993; Ernst & Banks, 2002; Knill, 1998; Landy, Maloney, Johnston, & Young, 1995), the optimal weights (W) associated with two cues (X and Y) are

$$ {W}_X={\sigma^2}_Y/\left({\sigma^2}_X+{\sigma^2}_Y\right), $$
(1)
$$ {W}_Y={\sigma^2}_X/\left({\sigma^2}_X+{\sigma^2}_Y\right), $$
(2)

and $W_X$ and $W_Y$ sum to 1.
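As a minimal sketch of Eqs. 1 and 2, the optimal weights can be computed directly from the two single-cue variances. The function name `optimal_weights` and the example standard deviations are hypothetical and chosen only for illustration; they are not values from the present experiments.

```python
from typing import Tuple


def optimal_weights(var_x: float, var_y: float) -> Tuple[float, float]:
    """Return (W_X, W_Y) following Eqs. 1 and 2.

    Each cue's weight is proportional to the OTHER cue's variance,
    so the more reliable (lower-variance) cue receives the higher weight.
    """
    if var_x <= 0 or var_y <= 0:
        raise ValueError("single-cue variances must be positive")
    total = var_x + var_y
    return var_y / total, var_x / total


# Hypothetical single-cue standard deviations in meters (illustration only).
sd_room, sd_pi = 0.2, 0.4
w_room, w_pi = optimal_weights(sd_room ** 2, sd_pi ** 2)
print(w_room, w_pi)  # roughly 0.8 for room shape and 0.2 for path integration
```

With these illustrative values, the cue with half the standard deviation receives four times the weight, because the weights depend on variances rather than standard deviations.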

Determination of the optimal cue weight requires testing participants under single-cue conditions to determine the variances associated with each cue. In the earlier navigation example, this would involve measuring response variance when returning to the path origin (variance would be assessed on the basis of repeated responses) on single-cue trials in which only path integration or only room shape was available.

The determination of actual cue weights (as compared to the optimal cue weights in Eqs. 1 and 2) involves a cue conflict situation (Alais & Burr, 2004; Ernst & Banks, 2002; Hills et al., 2004; Nardini et al., 2008; Zhao & Warren, 2015b). In the navigation example, the room could be rotated around the navigator by a subthreshold amount prior to responding. This conflict places the room-defined target location in conflict with the path-integration-defined target location, much like a student who looks up and sees that a known landmark is not in the predicted location. When the navigator attempts to return to the origin using conflicting cues, the relative proximity of the response to each cue-indicated location indicates the navigator’s actual cue weightings. This is calculated as follows:

$$ {\mathrm{rprox}}_X=\frac{\frac{1}{d_X}}{\frac{1}{d_Y}+\frac{1}{d_X}}=\frac{d_Y}{d_Y+{d}_X}, $$
(3)

where $\mathrm{rprox}_X$ is the relative proximity of the response to cue X, $d_X$ is the distance of the response from the correct response location indicated by cue X, and $d_Y$ is the distance of the response from the correct response location indicated by cue Y.
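The relative-proximity calculation in Eq. 3 can be sketched as follows. This is an illustrative implementation, not the authors' analysis code; the function name and the example coordinates are hypothetical. The simplified right-hand form of Eq. 3 is used because it remains defined when the response falls exactly on a cue-indicated location.

```python
import math


def relative_proximity(response, loc_x, loc_y):
    """Relative proximity of a response to the location indicated by cue X (Eq. 3).

    All arguments are (x, y) coordinates in the same units.
    Returns a value between 0 (response on the cue-Y location)
    and 1 (response on the cue-X location).
    """
    d_x = math.dist(response, loc_x)  # distance to the cue-X location
    d_y = math.dist(response, loc_y)  # distance to the cue-Y location
    if d_x + d_y == 0:
        raise ValueError("the two cue-indicated locations coincide with the response")
    return d_y / (d_x + d_y)


# Hypothetical layout: the two cue-indicated locations are 0.4 m apart,
# and the response lies 0.1 m from the cue-X location on the line joining them.
rprox_x = relative_proximity((0.1, 0.0), (0.0, 0.0), (0.4, 0.0))
print(rprox_x)  # close to 0.75
```

A response closer to the cue-X location yields a value above 0.5, which is interpreted as a higher actual weight for cue X.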

When multiple cues are combined, response variance can be reduced as compared to single-cue responses. In the navigation example, the distribution of repeated responses will be smaller when both path integration and room shape are available, as compared to when only one cue (either room shape or path integration) is available. For given cue weights, whether optimal or not, response variance when both cues are present is predicted from single-cue response variance:

$$ {\sigma^2}_{X+Y}={w^2}_Y{\sigma^2}_Y+{w^2}_X{\sigma^2}_X, $$
(4)

where $w_X$ and $w_Y$ are the weights given to the cues (see Footnote 1). Given optimal cue weights (Eqs. 1 and 2), the variance when two cues are present will be less than either of the variances of the single-cue conditions. Furthermore, the variance reduction will be greatest when the single-cue variances are the same.
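The prediction in Eq. 4 can be illustrated with a short sketch. The function name and the numerical values are hypothetical; the only assumption beyond Eq. 4 is that the two weights sum to 1, so that the weight for cue Y is obtained from the weight for cue X.

```python
def combined_variance(var_x: float, var_y: float, w_x: float) -> float:
    """Predicted dual-cue variance from Eq. 4 for a given weight on cue X.

    The weight on cue Y is taken to be 1 - w_x (weights sum to 1).
    """
    if not 0.0 <= w_x <= 1.0:
        raise ValueError("w_x must lie between 0 and 1")
    w_y = 1.0 - w_x
    return w_x ** 2 * var_x + w_y ** 2 * var_y


# Hypothetical single-cue variances (illustration only).
var_x, var_y = 0.04, 0.16
w_x_optimal = var_y / (var_x + var_y)          # Eq. 1
optimal = combined_variance(var_x, var_y, w_x_optimal)
equal_split = combined_variance(var_x, var_y, 0.5)
print(optimal, equal_split)  # the optimal weighting gives the smaller variance
```

With optimal weights the predicted variance falls below the smaller single-cue variance, whereas a non-optimal weighting (here, an equal split) can yield a variance above that of the more reliable cue alone.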

If individuals do not integrate cues, but instead alternate between cues placed in conflict, the response variance can also be predicted (equation provided in Nardini et al., 2008):

$$ {\sigma^2}_{X+Y}={p}_X\left({\mu^2}_X+{\sigma^2}_X\right)+{p}_Y\left({\mu^2}_Y+{\sigma^2}_Y\right)-{\left({p}_X{\mu}_X+{p}_Y{\mu}_Y\right)}^2, $$
(5)

where $p_X$ is the probability of following cue X, $p_Y$ is the probability of following cue Y, and $\mu$ and $\sigma^2$ are the mean and variance of the responses made when following the corresponding cue. The probabilities of following the two cues sum to unity. This alternation model predicts the variance associated with the probability of following each cue rather than the weight assigned to each cue. The probability of following either cue is calculated using participants’ relative proximity to each cue-indicated correct location (the same variable used to measure cue weights under the assumption of cue integration). Alternation between cues leads to higher response variance than in single-cue conditions because of the separation between cues, and the mean of the mixture of cues is a linear function of their mixture probabilities. For example, if the navigator follows the path integration cue on 50% of trials and follows the room shape cue on 50% of trials, then the distribution of response locations will be bimodal and the variability of all of the responses will be larger than if the navigator had combined cues (Eq. 4) or simply followed the most reliable cue on 100% of trials. Furthermore, the mean of the bimodal distribution will reflect the probability with which the participant relied on the two cues, such that a mean response that falls directly between the two cue-indicated locations reflects a 50% probability of following each cue. Interpreting relative proximity as the probability of following each cue, we can determine whether actual response variance differs from model-predicted variance.
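The alternation prediction can be sketched as the variance of a two-component mixture: the probability-weighted second moments of the two single-cue response distributions minus the squared mean of the mixture. The sketch below treats responses in one dimension, along the line joining the two cue-indicated locations, which is a simplifying assumption; the function name and values are hypothetical.

```python
def alternation_variance(p_x, mu_x, var_x, mu_y, var_y):
    """Variance of responses when cue X is followed with probability p_x
    and cue Y with probability 1 - p_x (a two-component mixture).

    mu_x, mu_y: mean response positions when following each cue.
    var_x, var_y: single-cue response variances.
    """
    if not 0.0 <= p_x <= 1.0:
        raise ValueError("p_x must lie between 0 and 1")
    p_y = 1.0 - p_x
    second_moment = p_x * (mu_x ** 2 + var_x) + p_y * (mu_y ** 2 + var_y)
    mixture_mean = p_x * mu_x + p_y * mu_y
    return second_moment - mixture_mean ** 2


# Hypothetical case: cue-indicated locations 0.4 m apart, equal single-cue
# variances of 0.04, and each cue followed on half of the trials.
alt = alternation_variance(0.5, 0.0, 0.04, 0.4, 0.04)
print(alt)  # close to 0.08, twice the single-cue variance in this example
```

In this illustration the separation between the cue-indicated locations adds variance beyond that of either single cue, which is the signature that distinguishes alternation from integration.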

Nardini et al. (2008; see also Chen, McNamara, Kelly, & Wolbers, 2017) examined landmark and path integration cue use among children and adults and compared performance to the Bayesian integration model and the alternation model. Adults and children navigated a darkened room with only path integration and landmark cues (three unique glowing objects mounted on the walls). Participants picked up a series of three objects from the floor of the enclosure and then attempted to return to the location of the first retrieved object. In essence, they walked along a two-segment outbound path before attempting to return directly to the path origin, a task sometimes referred to as triangle completion. Path integration and landmarks were both available on the outbound path, and experimental manipulations created two single-cue (path-integration-only or landmark-only) conditions and two dual-cue conditions (both cues available and cues at a 15° conflict) for the return path.

In the path-integration-only condition, the landmarks were removed so that participants had to rely on path integration alone for the return path. In the landmark-only condition, the participants were disoriented, rendering path integration an unreliable cue and causing participants to rely on landmarks alone for the return path. In the cue-combined condition, both path integration and the landmarks remained intact for the return path. In the cue conflict condition, the landmarks were rotated by 15° (a subthreshold amount) relative to the center of the enclosure, which placed the path-integration-indicated location in conflict with the landmark-indicated location. Unlike the introductory example of the student who looks up to find that a known landmark is in an unexpected location, the participants in the conflict condition of Nardini et al. (2008) were not explicitly aware of the cue conflict. Adults, but not children, demonstrated reduced response variance in the combined conditions relative to single-cue conditions. Additionally, the relative proximity of responses for adults in the cue conflict condition reflected the optimal weightings predicted by the variances in the single-cue conditions (Eqs. 1 and 2). It was concluded that adults combined landmark and path integration cues in a Bayesian optimal manner whereas children alternated between cues.

Zhao and Warren (2015b) examined cue combination and competition when landmark arrays and path integration conflicted by 0°–135°. Participants performed triangle completion under single-cue and dual-cue conditions. The response variabilities, which represented response precision, in the dual-cue conditions were consistent with near-optimal combination of landmark arrays and path integration up to a conflict of 90°. However, the homing direction, which represented response accuracy, was dominated by one cue over the other. Zhao and Warren (2015b) concluded that cues are integrated to increase response precision, but they compete to determine response accuracy.

Under conditions of large cue conflict, animal and human navigation research has indicated that one cue is sometimes abandoned in favor of another. When human navigators were presented with 115°–135° conflict between landmark and path integration cues, homing direction was controlled by path integration (Zhao & Warren, 2015b). Research with rats has produced similar results, whereby large conflict with path integration caused rats to abandon a single landmark (Shettleworth & Sutton, 2005), although they continued to follow a set of multiple landmarks (Suzuki, Augerinos, & Black, 1980). Cue combination under conditions of small cue conflict and reliance on path integration in situations of large cue conflict makes sense, because a small cue conflict is attributable to sensory noise, whereas a large conflict could be caused by memory failure or confusion about the identities of landmarks (e.g., confusion as to which tree marked the goal location), in which case one cue should be ignored. In this sense, abandonment of landmarks under large-cue-conflict conditions may be suboptimal in the Bayesian sense but might be a logical choice when considering other factors that could have caused the large conflict. A similar argument has been made to address large sensory conflicts in the perceptual domain (Knill, 2007; Körding et al., 2007).

The results of cue combination studies (Chen et al., 2017; Nardini et al., 2008; Zhao & Warren, 2015b) indicate that adult humans can combine path integration and landmark cues to reduce response variability in a Bayesian optimal manner as long as the conflict between cues is not too large. However, it is unknown whether path integration and room shape are optimally integrated during navigation. Humans do appear to integrate room shape and path integration (Kelly, McNamara, Bodenheimer, Carr, & Rieser, 2008, 2009), but that research has lacked the necessary conditions to evaluate whether cue integration is optimal. Participants in those studies navigated in rooms varying in shape, some of which provided ambiguous orientation cues. For example, a disoriented navigator in a square room only has a 1-in-4 chance of reorienting correctly due to the rotational symmetry of the room. However, when allowed to use path integration, participants remained oriented while walking in a square room, but not in a circular room. This suggests that participants were integrating the shape of the square room with path integration information to remain oriented. Participants clearly integrated the two cues (path integration and room shape) when navigating in the square room, but there are at least two explanations for how they could do so: (1) participants could have integrated cues in a Bayesian optimal manner, or (2) participants could have occasionally referenced the shape of the square room in order to “zero-out” accumulated error in path integration. However, those studies lacked the necessary single- and dual-cue conditions to determine whether humans optimally integrate path integration and room shape.

The present study was designed to determine whether path integration and room shape are optimally integrated during navigation. These experiments employed virtual reality to present the visual environment. Research on cue combination during navigation using virtual environments (Chen et al., 2017; Zhao & Warren, 2015b) has produced results consistent with those from real environments (Nardini et al., 2008). Experiment 1 examined whether path integration and room shape cues are combined in a Bayesian optimal manner using methods similar to those used in past research (Chen et al., 2017; Nardini et al., 2008). To preview, Experiment 1 used single cue and dual cue conditions to determine the optimality with which room shape (a geometric cue) and path integration are combined. Experiment 2 replicated the main finding of Experiment 1 and compared cue integration under small and large conflict conditions. The small cue conflict was of the same magnitude as that in Experiment 1, and was not noticeable to participants, whereas the large cue conflict condition was noticeable.

Two predictions were made under the assumption that participants would optimally combine room shape and path integration. First, the standard deviation of responses would be lower when path integration and room shape are both available than in either of the single-cue conditions, consistent with optimal standard deviations predicted by the single-cue conditions (Eq. 4). Second, the mean response locations on cue conflict trials (Eq. 3) would reflect the optimal cue weights predicted by the single-cue conditions (Eqs. 1 and 2). These predictions are based on similar findings in past work evaluating the combination of path integration and landmarks (Chen et al., 2017; Nardini et al., 2008; Zhao & Warren, 2015b).

Experiment 1

Experiment 1 was conducted to examine whether adults optimally combine path integration and room shape. The size of the room was also manipulated between participants, on the basis of previous research suggesting this manipulation may influence relative cue weights (Ratliff & Newcombe, 2008; Sovrano, Bisazza, & Vallortigara, 2007).

Method

Participants

Forty-eight undergraduate students (26 men, 22 women) from Iowa State University participated for course credit. Two additional students did not complete the study due to simulator sickness. Participants were assigned to either a large or small room condition. Gender was balanced across conditions.

Stimuli, design, and procedure

The virtual environment was displayed on a head-mounted display (HMD; nVisor SX111, NVIS, Reston, VA), which presented stereoscopic images at 1,280 × 1,024 resolution with 102° horizontal × 64° vertical field of view. Images were refreshed at a rate of 60 Hz and reproduced head movement and orientation of the participants as they navigated the virtual environment. In this way, participants were able to physically walk and turn to move through the virtual environment. Head orientation was tracked in three dimensions using an orientation sensor (InertiaCube2; Intersense, Bedford, MA), and head position was tracked in three dimensions using an optical tracking system (Precision Position Tracker, PPTX4; WorldViz, Santa Barbara, CA). Vizard software (WorldViz, Santa Barbara, CA) was used to render graphics on a desktop computer with Intel Core2 Quad processors and Nvidia GeForce GTX 285 graphics card.

The virtual environment consisted of a rectangular room that had one wall removed to create a three-walled room on an endless grassy plane. The room size was 4 × 4 m in the small-room condition and 8 × 8 m in the large-room condition (Fig. 1). Target posts were placed in the same physical locations, regardless of the room size, and were 0.1 m in diameter (Fig. 1). Figure 2 shows the participant’s view looking into the large room.

Fig. 1. Room sizes (rectangles) and post locations (circles) used in Experiment 1. Larger colored circles indicate the target post locations from Nardini et al. (2008); smaller colored circles indicate the additional target post locations included in the present study. All target posts were of identical size during the experiment. The participant’s start location (lowest circle) was 1 m behind the participant’s end location. The target post locations (top arc of circles) were 1.75 m from the trial end location. The mid-path gray posts were 1.31 m from the trial end location, at angles of –33°, –11°, 11°, and 33° from the center.

Fig. 2. Participant’s starting view into the large room, with an example target post.

On the outbound path, participants were always provided with two cues to navigation: the virtual three-walled room and path integration. Participants began each trial standing outside of the room at the location of a blue post (blue circle in Fig. 1), facing into the virtual room. At the start of each trial a red target post appeared at one of 14 locations in the virtual room (red circles in Fig. 1). Participants were instructed to remember the location of the red target post for the duration of that trial. Participants then walked to the target post, which disappeared upon their arrival. A gray post then appeared at one of two locations on the opposite side of the virtual room (e.g., if the target post was on the right side of the room, the gray post appeared on the left side of the room). Participants walked to the gray post and, again, the post disappeared and was replaced by the final gray post. The final gray post was always in the same location, 1 m in front of the blue start post. Participants walked to the final gray post and then turned to face the blue start post. When the head position and orientation data indicated that participants were at the final gray post facing the blue start post, the entire virtual world disappeared and was replaced by a gray screen. The gray screen was displayed for 15 s, and participants were instructed to count backward from a randomly chosen start number (provided verbally by the experimenter) by increments of three.

After 15 s had elapsed, participants stopped the counting task and attempted to return directly to the location of the red target post under one of four conditions. In the path-integration-only (PI-only) condition, the virtual ground plane reappeared prior to the participant’s response, but the virtual room was absent. In the room-only condition, participants were spun gently in place to disorient them prior to the virtual room and ground plane reappearing, thereby rendering the path integration cues unreliable. In the combined-cue condition, participants could use the room shape and path integration cues on the return path (i.e., the virtual ground plane and room appeared, and participants were not disoriented). Finally, in the cue conflict condition, the virtual room was rotated by 15° during the 15-s delay (during which time the room was not visible), placing in conflict the correct target post location indicated by room shape and the location indicated by path integration. In each of the four conditions, the primary dependent measure was participants’ standing positions when they believed they had reached the target post location.
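To illustrate how a 15° rotation of the room separates the two cue-indicated target locations, the following sketch rotates a target location about a pivot point. The pivot used here (the trial end location, placed at the origin) is an assumption made only for illustration, because the rotation center is not specified above; the function name is likewise hypothetical.

```python
import math


def rotate_about(point, pivot, degrees):
    """Rotate an (x, y) point counterclockwise about a pivot by the given angle."""
    theta = math.radians(degrees)
    dx, dy = point[0] - pivot[0], point[1] - pivot[1]
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return (pivot[0] + dx * cos_t - dy * sin_t,
            pivot[1] + dx * sin_t + dy * cos_t)


# ASSUMPTION: the room rotates about the trial end location (origin here).
pivot = (0.0, 0.0)
pi_defined = (0.0, 1.75)                               # target 1.75 m from the end location
room_defined = rotate_about(pi_defined, pivot, 15.0)   # target carried along with the room
separation = math.dist(pi_defined, room_defined)
print(separation)  # under this assumed pivot, a little under half a meter
```

Under a different rotation center the separation between the path-integration-defined and room-defined locations would differ, so this value should be read only as an example of the geometry.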

Participants completed a practice block of trials with one of each of the four trial types displayed in a predetermined order (combined, PI-only, room-only, conflict), followed by four test blocks. Each test block consisted of four trials, one of each trial type in a random order. Although other studies exploring the optimality of cue combination have often included many more trials (e.g., Ernst & Banks, 2002; Zhao & Warren, 2015b), the number of trials per condition and the general research design in the present project are consistent with those in two previous studies (Chen et al., 2017, and Nardini et al., 2008). More data per participant would produce lower standard errors of the estimated population response variability but would also result in a higher dropout rate, due to simulator sickness.

Analyses

Because the target location was randomly selected from 14 possible locations, responses were aligned (rotated and translated) to a single target location prior to the analysis. Using the aligned responses, analyses focused on comparison of the standard deviations of responses across repeated trials around each participant’s mean response location. The analyses followed those of past work (Chen et al., 2017; Nardini et al., 2008), whereby the standard deviation was calculated on the basis of the absolute distance of responses relative to the response mean, separately for each participant (Zhao & Warren, 2015b, used a similar approach to evaluate response variability but analyzed the responses in terms of the direction rather than the distance to the target). Consistent with past work (Chen et al., 2017; Nardini et al., 2008), the response proximity relative to each of the two cues on conflict trials was used to determine the actual cue weightings, which were then compared to the optimal cue weights calculated from the variances of the single-cue conditions. Using Eq. 3, the relative proximity of a participant’s mean response to the room cue was determined by calculating the absolute distance of the mean response to the path-integration-defined location and dividing that by the sum of the absolute distances to the path-integration-defined location and the environment-defined location (see Footnote 2). The relative-proximity measure is related to response accuracy, but the focus is on determining the relative influences of the individual cues. In contrast, Zhao and Warren (2015b) analyzed response direction, in part because their data indicated that response accuracy was guided by a single cue and did not reflect cue combination.
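One way the alignment and the distance-based standard deviation might be computed is sketched below. Two assumptions are made and should not be read as the authors' exact procedure: responses are aligned by rotating them about the trial end location so that every target falls on a common bearing, and the standard deviation is taken as the root of the summed squared distances from the mean response divided by n − 1. All names are hypothetical.

```python
import math


def align_response(response, target, pivot=(0.0, 0.0)):
    """Rotate a response about `pivot` so that its target lies on the +y axis.

    Returns the aligned response in pivot-centered coordinates.
    """
    tx, ty = target[0] - pivot[0], target[1] - pivot[1]
    rx, ry = response[0] - pivot[0], response[1] - pivot[1]
    # Rotation that brings the target's bearing onto the +y axis.
    theta = math.pi / 2 - math.atan2(ty, tx)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return (cos_t * rx - sin_t * ry, sin_t * rx + cos_t * ry)


def response_sd(aligned):
    """Spread of aligned responses around their own mean location.

    ASSUMPTION: sqrt(sum of squared distances from the mean / (n - 1)).
    """
    n = len(aligned)
    if n < 2:
        raise ValueError("at least two responses are needed")
    mean_x = sum(p[0] for p in aligned) / n
    mean_y = sum(p[1] for p in aligned) / n
    ss = sum((x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in aligned)
    return math.sqrt(ss / (n - 1))


# Hypothetical trial: target to the right of the pivot, response 0.1 m off.
aligned = align_response((1.75, 0.1), (1.75, 0.0))
print(aligned)  # approximately (-0.1, 1.75)
```

Because the measure is computed around each participant's own mean response, it reflects precision and is unaffected by any constant bias in the responses.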

Due to the potentially large number of statistical comparisons, the data were analyzed according to the following principles. First, the dual-cue conditions (combined and conflict) were compared with the single-cue conditions in order to evaluate whether the availability of multiple cues caused reduction of the response standard deviation. If the response standard deviation in a dual-cue condition was lower than those in both single-cue conditions, then the data from that dual-cue condition were compared to model predictions of cue combination. If the response standard deviation in a dual-cue condition was not lower than those in both single-cue conditions, no comparison was made with the model predictions of optimality because an optimal combination should, by definition (Eq. 4), reduce the response standard deviation below those in the single-cue conditions (see Footnote 3). When comparing data to the cue combination model predictions, the optimal cue weights were used first to predict the optimal standard deviations. If the observed standard deviation was higher than the model prediction using optimal weights, a follow-up test compared the observed standard deviation with model predictions using the actual cue weights derived from the conflict condition.

Results

A trial response was considered outlying if it fell beyond three times the interquartile range above the third quartile of the average response distance from the target location for that condition; 1.8% of the total trials were removed as outliers. Scatterplots showing individual responses are presented in Supplemental Fig. S1.
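The outlier criterion can be sketched as a simple filter on response distances within a condition. The quartile interpolation method (the default of `statistics.quantiles`) is an assumption, since the method used to compute quartiles is not stated; the function name and the example distances are hypothetical.

```python
import statistics


def remove_far_outliers(distances, multiplier=3.0):
    """Keep distances at or below Q3 + multiplier * IQR.

    `distances` holds the response distances from the target for one condition.
    ASSUMPTION: quartiles use the default method of statistics.quantiles.
    """
    q1, _median, q3 = statistics.quantiles(distances, n=4)
    threshold = q3 + multiplier * (q3 - q1)
    return [d for d in distances if d <= threshold]


# Hypothetical response distances in meters (illustration only).
distances = [0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.5, 3.0]
kept = remove_far_outliers(distances)
print(kept)  # the 3.0 m response is excluded; the rest are retained
```

Because the criterion is one-sided, only unusually distant responses are excluded; responses very close to the target are never treated as outliers.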

We observed no effect of room size on either the response standard deviation or the relative proximity of responses to the target-indicated correct locations (see the Supplemental Results for more detailed analyses). Due to the lack of a room size effect, the room size variable was removed from all subsequent analyses. The effect of condition (combined, room-only, PI-only, and conflict) on response standard deviations (see Fig. 3) was analyzed in a one-way analysis of variance (ANOVA). The effect of condition was significant, F(3, 141) = 12.65, p < .001, $\eta_G^2$ = .14 (see Footnote 4). Planned contrasts revealed that the response standard deviation in the combined condition (M = 0.17, SD = 0.10) was significantly lower than those in the room-only condition (M = 0.23, SD = 0.14), F(1, 47) = 5.63, p = .022, $\eta_G^2$ = .11, and the PI-only condition (M = 0.36, SD = 0.25), F(1, 47) = 28.58, p < .001, $\eta_G^2$ = .18. The response standard deviation in the conflict condition (M = 0.22, SD = 0.12) was not significantly lower than that in the room-only condition, F(1, 47) = 0.011, p = .916, $\eta_G^2$ = .00, but it was significantly lower than that in the PI-only condition, F(1, 47) = 10.64, p < .01, $\eta_G^2$ = .10.

Fig. 3. Light gray bars indicate the average response standard deviations as a function of condition in Experiment 1. Dark gray bars indicate the predicted standard deviations based on models of either optimal combination or response alternation. Error bars represent ±1 standard error.

We predicted that responses on the cue conflict trials would reflect the optimal cue weightings predicted by the single-cue conditions. Optimal weights for the room shape and path integration cues were calculated for each participant using the variances from each of the single-cue conditions following Eqs. 1 and 2. The actual weights for room shape and path integration were calculated as the relative proximity of responses to the room-defined and path-integration-defined locations on conflict trials, following Eq. 3. A paired-samples t test compared the calculated optimal weight for the room shape cue for each individual participant to the participant’s actual room shape weighting. The optimal room weight (M = 0.65, SD = 0.28) and the actual room weight (M = 0.58, SD = 0.08) were not significantly different, t(47) = 1.62, p = .112, 95% CI [–0.02, 0.15], suggesting that participants optimally weighted path integration and room shape in the cue conflict condition.

Figure 3 shows the optimal response standard deviation calculated using Eq. 4, assuming optimal cue weights. There was no significant difference between the actual standard deviation of responses in the combined condition (M = 0.17, SD = 0.10) and the predicted optimal standard deviation of responses (M = 0.16, SD = 0.09), t(47) = 0.80, p = .428, 95% CI [–0.05, 0.02], suggesting that participants optimally combined room shape and path integration. Furthermore, the standard deviation predicted by the alternation model (M = 0.39, SD = 0.14) was significantly higher than those in the combined-cue, t(47) = 10.91, p < .001, and conflict, t(47) = 6.46, p < .001, conditions.

Although we found no significant differences between the actual and optimal cue weights or between the actual and predicted standard deviations of responses in the combined condition, it is difficult to draw theoretical conclusions from nonsignificant null hypothesis tests. However, the theoretical conclusions here depend on the equivalence of observations (Gallistel, 2009). Therefore, we also subjected these comparisons to Bayesian analyses. Unlike null hypothesis testing, these analyses can quantify evidence in support of the null hypothesis (Gallistel, 2009). As displayed in Supplemental Table S1, the results supported the equivalence of the actual and optimal cue weights as well as the actual and predicted standard deviations of responses in the combined-cue condition.

Discussion

The response standard deviation when returning to the path origin was reduced in the combined condition relative to the single-cue conditions and was consistent with the optimal standard deviation. Additionally, the responses on cue conflict trials reflected the optimal weightings predicted by the single-cue conditions. However, the response variability on cue conflict trials was suboptimal. Taken together, these findings partially support the proposal that humans optimally integrate room shape and path integration cues during navigation.

Counter to our prediction, the room size manipulation did not affect the responses, for which at least two explanations are possible (see the Supplemental Results for more detailed consideration). First, the room size in the small-room condition was similar to that in the large-room condition used in previous research (Ratliff & Newcombe, 2008). Second, the past studies showing room size effects were methodologically distinct from this experiment. Therefore, it is possible that the previously reported room size effects would not generalize to the present stimuli and methodology.

Experiment 1 supported the optimal integration of geometric room shape and path integration cues during navigation. In Experiment 2 we sought to replicate this result and additionally to determine whether the size of conflict between environmental and path integration cues influences the cue weights and cue combination. Previous research had indicated that participants may favor path integration over a single landmark and over three landmarks when the different cue types are placed in large conflict (Zhao & Warren, 2015b). In contrast, it is possible that participants will favor room shape over path integration, similar to the preference for following room shape rather than landmarks in previous research (Ratliff & Newcombe, 2008).

Experiment 2

Under conditions of large cue conflict, research has indicated that landmark cues are sometimes abandoned and path integration used exclusively (Zhao & Warren, 2015b). This could be considered optimal, because small conflicts are probably attributable to sensory noise, whereas large conflicts could be due to memory failures or confusion about landmark identity. Experiment 2 was designed to evaluate the influence of environmental and path integration cues under conditions of small and large cue conflict.

Participants navigated in the presence of one of three environmental cues: a single landmark, three landmarks, or a surrounding room. Cue manipulations prior to the return path created single-cue and dual-cue conditions, including conditions of small and of large cue conflict, in which the environmental cue was rotated 15° or 90°, respectively. The predictions regarding combined and small-conflict conditions were identical to those in Experiment 1. Of particular interest in Experiment 2 was the large-conflict condition and whether the cue weights and response variability would depend on environmental cue type. Zhao and Warren (2015b) found that human participants ignored a configuration of three landmarks placed in sufficiently large conflict with path integration. In light of those results, it was expected that the participants in Experiment 2 would favor path integration over both a single landmark and three landmarks when the cues were placed in large conflict. However, geometric cues have never been tested in a large conflict with path integration. It is possible that participants would favor room shape over path integration, similar to the preference for following room shape over landmarks in previous research (Ratliff & Newcombe, 2008).

Method

Participants

Sixty Iowa State University undergraduate students (33 women, 27 men) participated in exchange for course credit. Twenty-nine additional students did not complete the study due to simulator sickness (see Footnote 5).

Participants were randomly assigned to three between-participants cue conditions: room, one landmark, and three landmarks. There were 20 participants in each of the three environmental cue conditions, and gender was balanced across conditions.

Stimuli, design, and procedure

Experiment 2 included three between-participants environmental cue conditions: one landmark, three landmarks, and room shape (Fig. 4). The landmarks used in Experiment 2 were gray cylindrical posts with differentiating unique, nonsymmetrical objects (car, cup, and train) placed on top at approximately eye height. All cylindrical posts were 0.1 m in diameter, and the objects were approximately 0.3 m in the longest dimension. In the one-landmark condition, participants completed the experiment with one landmark used as an environmental cue, located in the position of the most central landmark used by Nardini et al. (2008). In the three-landmark condition, participants experienced three landmarks arranged to match the positions of the landmarks used by Nardini et al. In the room condition, participants experienced the small room used in Experiment 1; we did not manipulate room size in Experiment 2.

Fig. 4. Environmental cue and post locations used in Experiment 2. Black lines indicate the room walls in the room condition (left), and black circles indicate the cue post locations in the multiple-landmark (middle) and single-landmark (right) conditions.

The task was identical to that of Experiment 1. Cue manipulations prior to the return path created five within-participants conditions: combined, environmental-cue (EC) only, PI-only, small-conflict, and large-conflict. In the small-conflict condition the environmental cue was rotated by 15°, as in Experiment 1. In the large-conflict condition the environmental cue was rotated by 90°, a noticeable amount (see Footnote 6).

Results

A trial response was considered outlying if it fell beyond three times the interquartile range above the third quartile of the average response distance from the target location for that condition; 0.83% of the total trials were removed as outliers. Scatterplots showing individual responses are presented in Supplemental Fig. S2.
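The outlier rule described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, the quartile convention (Python's "inclusive" method) is an assumption because the text does not say which was used, and the distances in the usage note are made up.

```python
import statistics


def outlier_cutoff(distances):
    """Upper fence for one condition: Q3 + 3 * IQR of the response distances.

    The quartile method is an assumption; the article does not specify one.
    """
    q1, _median, q3 = statistics.quantiles(distances, n=4, method="inclusive")
    return q3 + 3 * (q3 - q1)


def remove_outliers(distances):
    """Keep only responses at or below the upper fence for their condition."""
    cutoff = outlier_cutoff(distances)
    return [d for d in distances if d <= cutoff]
```

Applied to a hypothetical list of response distances (in meters) such as `[0.1, 0.2, 0.3, 0.4, 0.5, 5.0]`, `remove_outliers` would drop only the final value, since it lies far above the fence.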

We observed no effect of environmental cue type (room, single landmark, three landmarks) on the response location standard deviation or the relative proximity of responses to the target-indicated correct locations (see the Supplemental Results). Therefore, the environmental cue variable was removed from all subsequent analyses. Standard deviations based on absolute response distance (see Fig. 5) were analyzed in a one-way ANOVA on condition (combined, EC-only, PI-only, small-conflict, and large-conflict). There was a significant main effect of condition, F(4, 236) = 14.87, p < .001, $\eta^2_G$ = .14. Planned contrasts revealed that the standard deviations of responses in the combined condition (M = 0.20, SD = 0.15) were significantly lower than those in the EC-only condition (M = 0.26, SD = 0.16), F(1, 59) = 6.10, p = .016, $\eta^2_G$ = .03, and the PI-only condition (M = 0.40, SD = 0.21), F(1, 59) = 42.20, p < .001, $\eta^2_G$ = .23. The standard deviations in the small-conflict condition (M = 0.23, SD = 0.14) were not significantly different from those in the EC-only condition, F(1, 59) = 0.82, p = .369, $\eta^2_G$ = .01, but were significantly lower than those in the PI-only condition, F(1, 59) = 27.91, p < .001, $\eta^2_G$ = .19.

Fig. 5. Light gray bars indicate the average response standard deviations as a function of condition (path integration [PI] only, environmental cue [EC] only, large conflict, small conflict, and combined) in Experiment 2. Dark gray bars indicate the predicted standard deviations based on models of either optimal combination or response alternation. The alternation model is shown only based on the small-conflict condition (the large-conflict alternation model is not shown due to its impact on the figure scale). Error bars represent ±1 standard error.

We predicted that the responses on small-conflict trials would reflect the optimal cue weightings predicted by the single-cue conditions. Optimal weights for the environmental and path integration cues were calculated using the variances from each of the single-cue conditions, following Eqs. 1 and 2. A paired-samples t test compared the calculated optimal weights for the environmental cue for each individual participant to the actual environmental-cue weightings on small-conflict trials (i.e., the relative proximities of the response locations to the environmental-cue-indicated correct target locations in the small-conflict condition). The optimal weight (M = 0.66, SD = 0.26) and the actual weight (M = 0.55, SD = 0.10) were significantly different, t(59) = 3.11, p = .003, 95% CI [0.04, 0.18], indicating suboptimal cue weighting for the environmental and path integration cues.
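The two quantities compared in this test can be sketched as follows. Eqs. 1 and 2 are not reproduced in this section, so the weight function assumes the standard reliability-based form, in which each cue is weighted by the other cue's share of the total variance. The relative-proximity function assumes that the actual weight is taken from the distances between a mean response location and the two cue-indicated target locations. All function and parameter names are hypothetical.

```python
import math


def optimal_ec_weight(sd_ec, sd_pi):
    """Assumed form of Eqs. 1 and 2: w_EC = var_PI / (var_EC + var_PI).

    The path integration weight is then 1 - w_EC.
    """
    var_ec, var_pi = sd_ec ** 2, sd_pi ** 2
    return var_pi / (var_ec + var_pi)


def relative_proximity_to_ec(mean_response, ec_target, pi_target):
    """Actual environmental-cue weight on conflict trials (assumed definition).

    Returns 1.0 when responses sit on the EC-indicated location and 0.0 when
    they sit on the PI-indicated location.
    """
    d_ec = math.dist(mean_response, ec_target)
    d_pi = math.dist(mean_response, pi_target)
    return d_pi / (d_ec + d_pi)
```

Plugging the group-level single-cue standard deviations (0.26 for EC-only, 0.40 for PI-only) into `optimal_ec_weight` gives a weight of roughly 0.70. This need not equal the reported mean optimal weight of 0.66, because the article computes the weight separately for each participant before averaging.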

Figure 5 shows the optimal response standard deviation calculated using Eq. 4 and assuming optimal cue weights. A paired-samples t test compared each individual’s actual standard deviation of responses in the combined condition to the standard deviation predicted using the optimal cue weights and Eq. 4. We found no significant difference between the actual standard deviation of responses in the combined condition (M = 0.20, SD = 0.15) and the predicted standard deviation of responses based on optimal cue weights (M = 0.19, SD = 0.11), t(59) = 0.80, p = .425, 95% CI [–0.05, 0.02], indicating that participants optimally combined the environmental and path integration cues. Although it is unclear why the actual weights (determined using the relative response proximity in the small-conflict condition) on small-conflict trials were found to be suboptimal, response variability in the combined condition was consistent with optimal cue integration. The standard deviation predicted by the alternation model (M = 0.42, SD = 0.13) was significantly higher than those in the combined-cue, t(59) = 10.35, p < .001, and conflict conditions, t(47) = 8.52, p < .001.
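The two model predictions contrasted here can be illustrated with a short sketch. Eq. 4 is not reproduced in this section, so the first function assumes the standard optimal-combination result for two independent cues. The alternation function assumes a one-dimensional mixture model in which the navigator follows one cue per trial. The names, the `p_ec` parameter (probability of following the environmental cue), and `separation` (distance between the two cue-indicated locations) are hypothetical.

```python
import math


def optimal_combined_sd(sd_ec, sd_pi):
    """Assumed form of Eq. 4: var = var_EC * var_PI / (var_EC + var_PI)."""
    var_ec, var_pi = sd_ec ** 2, sd_pi ** 2
    return math.sqrt(var_ec * var_pi / (var_ec + var_pi))


def alternation_sd(sd_ec, sd_pi, p_ec, separation=0.0):
    """SD of a two-component mixture: one cue is followed on each trial.

    The last term grows with the distance between the cue-indicated
    locations, so alternation predicts more variability under conflict.
    """
    var = (p_ec * sd_ec ** 2
           + (1 - p_ec) * sd_pi ** 2
           + p_ec * (1 - p_ec) * separation ** 2)
    return math.sqrt(var)
```

Under these assumptions the optimal prediction always falls at or below the smaller single-cue standard deviation, whereas the alternation prediction falls between the two single-cue values when the cues agree and rises as the cue-indicated locations move apart. The second property is in line with the much larger alternation prediction reported for the large-conflict condition.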

Although there was no significant difference between the actual standard deviation of responses in the combined condition and the predicted standard deviation of responses, it is difficult to draw theoretical conclusions from nonsignificant null hypothesis tests. Therefore, we also subjected this comparison to Bayesian analyses (Gallistel, 2009). As displayed in Supplemental Table S1, the results supported the equivalence of the actual and predicted standard deviations of responses in the combined-cue condition.

A prediction for the large-conflict condition was that participants would no longer optimally integrate cues due to the noticeable conflict. A paired-samples t test compared the calculated optimal weight for the environmental cue for each individual participant to the actual environmental cue weighting determined by the relative proximity of the response locations to the environmental-cue-indicated correct location. The optimal weight (M = 0.66, SD = 0.26) and the actual weight (M = 0.34, SD = 0.19) were significantly different in the large-conflict condition, t(59) = 8.35, p < .001, 95% CI [0.24, 0.39], indicating that participants did not weight cues optimally in the large-conflict condition, and instead were more likely to rely on path integration.

The response standard deviation in the large-conflict condition (M = 0.34, SD = 0.23) was higher than that predicted using optimal cue weights based on the single-cue conditions (M = 0.19, SD = 0.11), t(59) = 4.66, p < .001, 95% CI [–0.21, –0.08], but lower than that predicted by the alternation model (M = 1.25, SD = 0.17), t(59) = 28.58, p < .001, 95% CI [0.85, 0.98]. Since the large-conflict data were inconsistent with both the optimal cue combination model and the alternation model, those data were also compared to cue combination model predictions based on the suboptimal weights derived from response proximity on the large-conflict trials. Here we found no significant difference between the actual standard deviation of responses (M = 0.34, SD = 0.23) and the predicted standard deviation of responses based on the actual cue weightings (M = 0.29, SD = 0.15), t(59) = 1.59, p = .116, 95% CI [–0.11, 0.01], suggesting that response variability was consistent with the cue combination model predictions given the actual cue weightings.
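The prediction based on the actual (suboptimal) weights can be sketched with the general variance formula for a weighted average of two independent cues. This is an assumption about how such a prediction might be computed, not the authors' code, and the function name is hypothetical.

```python
import math


def weighted_combined_sd(sd_ec, sd_pi, w_ec):
    """Predicted SD when cues are averaged with an arbitrary EC weight.

    Assumes independent cue noise:
    var = w_EC^2 * var_EC + (1 - w_EC)^2 * var_PI.
    """
    w_pi = 1.0 - w_ec
    return math.sqrt((w_ec * sd_ec) ** 2 + (w_pi * sd_pi) ** 2)
```

With the group-level single-cue standard deviations (0.26 and 0.40) and the mean large-conflict weight of 0.34, this formula gives roughly 0.28. That is higher than the value obtained with the variance-minimizing weight, which illustrates why shifting weight toward the noisier path integration cue is expected to raise response variability. As before, the group-level value is not expected to match the per-participant average reported above.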

To evaluate whether the weights for environmental and path integration cues differed between large and small conflict, a paired-samples t test compared each individual’s relative proximity to the environmental-cue-defined correct location in the large-conflict condition (M = 0.34, SD = 0.19) and the relative proximity to the environmental-cue-defined correct location in the small-conflict condition (M = 0.55, SD = 0.10). In the small-conflict condition, participants responded significantly closer to the environmental-cue-defined correct location than they did in the large-conflict condition, t(59) = 7.75, p < .001, 95% CI [–0.26, –0.15].

Discussion

Response variability when returning to the path origin was reduced in the combined condition relative to the single-cue conditions, and was consistent with the predicted optimal standard deviation. However, response variability under small cue conflict was no better than with the least variable single cue. Taken together, these findings partially support the proposal that humans optimally integrate room shape and path integration cues during navigation. Cue type (room shape, single or multiple landmarks) had no differential effect on either response variability or cue weight.

The cue weights in the conflict conditions of Experiment 2 depended on the size of the conflict. In the small-conflict condition, participants responded closer to the environmental-cue-indicated correct location than to the path-integration-indicated location, whereas in the large-conflict condition they responded closer to the path-integration-indicated correct location (see Footnote 7). The greater reliance on path integration under large cue conflict is consistent with previous research on human navigation with path integration and multiple landmarks (Zhao & Warren, 2015b), and the present results extend that work by showing that navigators place greater weight on path integration, regardless of the environmental cue (single landmark, multiple landmarks, or room shape), under large-conflict conditions.

General discussion

The results of this study extend the understanding of cue combination during human navigation. Previous research indicates that humans combine room shape and path integration cues during navigation (Kelly et al., 2008); however, it is unclear from that work how cue combination occurs. The present experiments provide evidence that adult humans combine geometric cues defined by room shape with path integration cues at optimal or near-optimal levels, similar to the optimal combination of landmark and path integration cues reported elsewhere (Chen et al., 2017; Nardini et al., 2008; Zhao & Warren, 2015b). Additionally, Experiment 2 replicated past work (Chen et al., 2017; Nardini et al., 2008; Zhao & Warren, 2015b) indicating that landmark (both single and multiple) and path integration cues are combined in a Bayesian optimal manner to increase precision in cases of no or only small conflict between cues.

Rats have been reported to abandon a single landmark (Shettleworth & Sutton, 2005), but not multiple landmarks (Olton & Samuelson, 1976; Suzuki et al., 1980), in situations involving large conflict with path integration, whereas human participants ignored a configuration of three landmarks placed in large conflict with path integration (Zhao & Warren, 2015b). In the large-conflict condition of Experiment 2, participants assigned a higher weight to path integration than to any environmental cue (single landmark, three landmarks, or room shape), despite the relatively greater precision afforded by the environmental cues. This result may be consistent with the proposal that path integration serves as a backup navigation system (Cheng et al., 2007). According to that proposal, path integration provides a ground truth to be followed exclusively when environmental cues are unavailable or in large conflict. In the context of the large-conflict trials in Experiment 2, participants may have assigned a higher weight to path integration upon noticing that the environmental cues had shifted. However, other research has cast doubt on the idea that path integration functions as a backup system. Zhao and Warren (2015a) found that participants could not use path integration to perform a simple navigation task after previously visible environmental cues had been unexpectedly removed. On subsequent trials, however, participants were able to use path integration alone, indicating that they changed strategies after learning that environmental cues might not be completely reliable. The participants in the present study experienced all conditions during a practice block, and it is possible that this practice affected their strategies on experimental trials.

Experiments 1 and 2 both revealed a significant reduction of the response standard deviation in the combined-cue condition relative to the single-cue conditions, indicating that participants utilized both path integration and environmental cues when navigating in the virtual environment. However, both experiments also revealed that response variability in the small-conflict condition was no different from that in the best single-cue condition. Failure to find a significant reduction in response variability in the small-conflict conditions might reflect slightly different cue weights from trial to trial. Because each cue indicated a different correct target location in the conflict condition, such trial-to-trial variation would artificially inflate the response standard deviation, although near-optimal weights should still result in a variance reduction relative to the single-cue conditions. It is also possible that the suboptimal performance in the cue conflict condition represents a mixture between cue integration and cue alternation strategies.

Although these experiments utilized virtual environments, the results are likely to generalize to navigation in real environments. Past research on cue combination using virtual environments (Chen et al., 2017) has produced conclusions consistent with those from studies using real environments (Nardini et al., 2008). Virtual reality is a particularly useful tool for creating cue conflicts, but cue conflicts also arise regularly in the real world, as in the example of the student who looks up after viewing a text message and sees that the surrounding landmarks are not in the exact locations predicted by path integration. Such discrepancies are commonly caused by sensory noise, which accumulates quickly within the path integration system (e.g., Kelly et al., 2008; Klatzky et al., 1990).

The results of the present study indicate that adult humans can optimally combine environmental cues such as room shape and landmarks with path integration cues to improve navigation precision. Additionally, human navigators tend to follow path integration over environmental cues under conditions of large cue conflict, regardless of whether the environmental cue is defined by landmarks or room geometry.