Neural evidence supports a novel framework for spatial navigation
- Cite this article as: Chrastil, E.R. Psychon Bull Rev (2013) 20: 208. doi:10.3758/s13423-012-0351-6
The spatial knowledge used for human navigation has traditionally been separated into three categories: landmark, route, and survey knowledge. While behavioral research has retained this framework, it has become increasingly clear from recent neuroimaging studies that such a classification system is not adequate for understanding the brain. This review proposes a new framework, with a taxonomy based on the cognitive processes and subprocesses involved in spatial navigation. The neural correlates of spatial memory can inform our understanding of the cognitive processes involved in human navigation, and conversely, the specific task demands of an experiment can inform the interpretation of neuroimaging results. This review examines the neural correlates of each cognitive process separately, to provide a closer inspection of each component of spatial navigation. While landmark, route, and survey knowledge are still important components of human navigation, the neural correlates are not neatly ascribed to these three categories. The present findings provide motivation for a more detailed examination of the cognitive processes engaged during wayfinding.
Keywords: Navigation; Spatial memory; Cognitive neuroscience; Medial temporal lobe; Spatial cognition
Human spatial navigation has typically been classified into three general types of knowledge: landmark, route, and survey knowledge (Siegel & White, 1975). This threefold framework has long been held as the standard for behavioral research in human navigation. Landmarks are prominent features in the environment, which can be the location of an associated action or can serve as a beacon to aim toward. Route knowledge is usually defined by place–action associations, and it enables one to follow a known path from one location to another, perhaps encountering many landmarks along the way. Survey knowledge includes some information about the overall layout and how routes fit together—including knowledge of metric distances and angles—which gives one the ability to take novel shortcuts between locations.
The primary goal of this review is to outline a new framework for spatial knowledge. In addition, it aims to introduce the neural correlates of navigation to researchers who may be familiar with the behavioral framework of landmark, route, and survey knowledge, and conversely to motivate neuroimaging research from a behavioral perspective. The neural correlates of landmark, route, and survey knowledge are discussed, allowing for examination of the areas where interpretations of the behavioral and neural data overlap, and where they do not. This review first argues that while the three levels of knowledge in the framework are still useful for understanding behavior, this paradigm has reached the limit of its usefulness for understanding the brain. In general, the framework is not specific enough to incorporate the cognitive processes involved within each type of spatial knowledge; each form of spatial knowledge needs to be broken into smaller components in order to render a more complete understanding of how the brain works. The review next provides a detailed taxonomy of the cognitive processes and subprocesses involved in each type of spatial knowledge. This taxonomy includes an additional type of spatial knowledge, graph knowledge, which consists of the topological connections between locations, resembling a network map. Then, the heart of the discussion is a systematic examination of the relationships between the cognitive processes of spatial navigation and their underlying neural substrates, providing support for the proposed framework. Finally, the review focuses on some of the limitations of the neuroimaging literature.
The landmark, route, and survey framework
In the current framework, route and survey knowledge have been associated with different spatial reference frames. Route knowledge is typically associated with an egocentric (observer-based) reference frame, because routes tend to be a series of left and right turns seen or imagined from a ground-level perspective. Learning a route may require resources related to learning sequences and views, since associating the view (including landmarks) at a given location is important for knowing what action to take in the sequence of the route. In contrast, survey knowledge is associated with an allocentric (world-based) reference frame, because it incorporates information about the distances and angles between locations in the environment, regardless of the location or perspective of the navigator, and is often conceived of as a map-like structure or bird’s-eye view.¹ Survey knowledge is generally considered to be viewpoint-independent, because the learned spatial relationships are between different locations in the environment, rather than between environment locations and the navigator (Burgess, 2006; P. Byrne, Becker, & Burgess, 2007; Roche, Mangaoang, Commins, & O’Mara, 2005; Siegel & White, 1975).
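The difference between the two reference frames can be made concrete with a small sketch. It is offered only as an illustration and is not drawn from any of the cited studies. The function name `egocentric_bearing`, the coordinate convention (x pointing east, y pointing north, headings in degrees clockwise from north), and the example positions are all assumptions introduced here.

```python
import math

def egocentric_bearing(navigator_xy, heading_deg, target_xy):
    """Direction of a target relative to the navigator's facing direction.

    Returns degrees in [-180, 180): 0 = straight ahead, positive = to the right.
    """
    dx = target_xy[0] - navigator_xy[0]
    dy = target_xy[1] - navigator_xy[1]
    # Allocentric (world-based) direction from navigator to target:
    # measured clockwise from north, independent of where the navigator faces.
    allocentric_deg = math.degrees(math.atan2(dx, dy))
    # Egocentric (observer-based) direction: subtract the navigator's heading
    # and wrap the result into [-180, 180).
    return (allocentric_deg - heading_deg + 180) % 360 - 180

# Hypothetical layout: the navigator stands at the origin, the bank lies due east.
bank = (1.0, 0.0)
facing_north = egocentric_bearing((0.0, 0.0), 0.0, bank)   # bank is to the right
facing_east = egocentric_bearing((0.0, 0.0), 90.0, bank)   # bank is straight ahead
```

In this sketch the allocentric relation between the two locations stays the same while the egocentric bearing changes with the navigator's heading, which is the sense in which survey knowledge is described as viewpoint-independent.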
Research on the neural correlates of human navigation has often followed this framework of landmark, route, and survey knowledge (e.g., Aguirre, Detre, Alsop, & D’Esposito, 1996; Janzen & Weststeijn, 2007; Latini-Corazzini et al., 2010; Maguire, Frackowiak, & Frith, 1997; Shelton & Gabrieli, 2002; Voermans et al., 2004; Wolbers & Büchel, 2005; Wolbers, Weiller, & Büchel, 2004). However, recent evidence from the neural literature, as well as from some behavioral studies, has suggested that human navigation cannot simply be categorized into these three components. Indeed, many cognitive processes and subprocesses are involved in learning these three aspects of the environment, particularly route and survey knowledge.
It has been hypothesized that as human navigators learn a new environment, they transition smoothly from one form of knowledge into the next. Siegel and White (1975) proposed that children first learn landmark information, followed by routes, and eventually learn survey knowledge of an environment; this trajectory of spatial learning has since been extrapolated to adults. However, recent behavioral research has cast doubt on this process of gradual transitions between forms of spatial knowledge. For example, Taylor, Naylor, and Chechile (1999) found that experimentally manipulating attention to certain aspects of the environment influences the spatial knowledge that is acquired. There also appear to be important individual differences in learning survey and route knowledge (Wolbers & Hegarty, 2010). For instance, after driving participants through two connected routes, Ishikawa and Montello (2006) found that a small minority of them had relatively accurate survey knowledge after one exposure, and another 25% eventually achieved accurate survey knowledge after ten exposures. Only half of the participants improved their survey knowledge at all over time, and some participants showed very poor performance throughout the testing, suggesting that the sequence of learning route and survey information may not always progress in a strict and inevitable order. Although many researchers have acknowledged that Siegel and White’s (1975) sequence is inaccurate, as yet no alternative framework fully describes the trajectory of spatial learning.
The neuroimaging literature has also exposed the limitations of the current framework. For example, Latini-Corazzini et al. (2010) trained their participants to route and survey criteria in a novel environment by showing videos of a route and then allowing them to follow the route using a keyboard. In the subsequent route test, participants were to indicate the direction of the turn at the next decision point in the route. For the survey test, participants determined in which of four directions a target landmark was located relative to their current location. Scans during the tests revealed that both tasks activated multiple brain regions, including the hippocampus, parahippocampal gyrus, caudate, and retrosplenial cortex. Importantly, this result suggests that multiple aspects of a neural network are activated when performing navigational tasks, and that route and survey knowledge do not necessarily have distinct neural correlates.² It seems likely that when faced with a difficult task, navigators use all of the resources available to them. On the other hand, finding such a broad network of brain activity is somewhat uninformative regarding how those neural structures support spatial navigation. It is unclear from this particular type of study which aspects of the navigational process the hippocampus is sensitive to and which aspects activate the caudate. While it is unlikely that separate brain regions support route or survey knowledge, it is possible that these types of spatial knowledge involve multiple cognitive processes that do have distinct neural correlates.
The neural data support this notion that a finer-grained breakdown of the cognitive processes and subprocesses involved in spatial knowledge is in order. A task of navigating through a town might elicit activity in numerous brain regions, leading to confusion about how these regions support spatial navigation. The confusion remains when the tasks are separated into route and survey tasks: Sometimes a particular brain region will be activated during one route task, but then not in other route tasks. Examining neural activity related to landmark, route, and survey knowledge without considering the cognitive processes involved might lead to several consequences. First, this approach runs the risk of assigning a specific brain region as the seat of one type of spatial knowledge when the region may actually support only one piece of that knowledge. Second, if the same brain regions are active across multiple spatial navigation tasks, this approach could conceal real differences between types of spatial knowledge. Once a closer examination is made of the neural correlates of the processes and subprocesses of spatial navigation, a more coherent picture begins to emerge. The details of the behavioral task must be matched carefully with the cognitive processes involved in order to make sense of the neural correlates.
A taxonomy of spatial knowledge
Table 1. A taxonomy of spatial knowledge
• Place recognition
  - Scenes and views
  - Place within the larger environment
• Sequence learning
• Identifying decision points
• Response learning
• Forming associations
• Locating the goal
  - Relate goal and current location
  - Transformation between allo- and egocentric perspectives
• Path integration
One of the shortcomings of the current framework is that it misses an important type of spatial knowledge—graph knowledge—situated between route and survey knowledge. For example, in a number of experiments participants have been asked to remember areas that they have learned very well, and then to recall a route between two locations (e.g., Maguire et al., 1997; Rosenbaum, Ziegler, Winocur, Grady, & Moscovitch, 2004; Spiers & Maguire, 2006). While participants in those tasks are planning a route, they are not necessarily relying on route knowledge, which would only imply that participants knew one way to get from the start location to the goal, through a series of left and right turns, without requiring greater knowledge of the environment around them. Instead, those navigators likely could provide several possible routes, using their knowledge of the streets and roads in the environment to determine the most direct path. Indeed, the navigator might have never taken that particular route before. However, route planning is also not a test of survey knowledge, since accurate performance would not require knowledge of metric distances and angles between locations. Thus, an intermediate type of spatial knowledge is required, one in which navigators must use their knowledge of how the paths connect to find a suitable way to the goal.
Graph knowledge fits this intermediate role, consisting of a network of location nodes, connected by path edges. A graph describes how locations are connected to each other, without necessarily containing the metric distance and angle information between locations that would be required for full survey knowledge (R. W. Byrne, 1979; Chown, Kaplan, & Kortenkamp, 1995; Kuipers, Tecuci, & Stankiewicz, 2003; Trullier, Wiener, Berthoz, & Meyer, 1997; Werner, Krieg-Brückner, & Herrmann, 2000). Such graphs can also be labeled with some relative or ordinal metric information, such as “path A is longer than path B,” but they are not metric to the same degree as survey knowledge. In addition, a graph can be built up through the combination of routes, forming a larger network, but it is more flexible than the simple place–action associations of a route. Note that these categories of spatial knowledge form a hierarchy, such that survey knowledge includes graph knowledge, and likewise, graph knowledge incorporates route knowledge. Thus, a navigator with complete metric survey knowledge can derive knowledge of the graph, routes, and landmarks from the complete survey information.
Many of the “wayfinding” tasks described in this review most likely correspond to tests of graph knowledge. In these tasks, the navigator must synthesize knowledge of the overall layout of the city streets with an understanding of how the major and minor roads connect different locations in the city, which would also allow them to take novel detours. Graph knowledge involves more than the place–action associations of a route, but it does not require the metric information of survey knowledge. Indeed, often the knowledge that people rely on to travel around their neighborhood or city is graph knowledge. For example, R. W. Byrne (1979) noted that navigators answering questions about their town tended to simplify turns to 90° and judged routes with more segments as being longer, suggesting a network map rather than a metric map. Many tests of graph knowledge have been oversimplified and called route tests, while others have been portrayed as more complex tests of survey knowledge. This review includes graph knowledge in the taxonomy and will point out where tasks and processes fit into this category.
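The contrast between route knowledge and graph knowledge can be illustrated with a minimal sketch. It is a toy example rather than a model taken from the literature. The town layout, the names `TOWN_GRAPH`, `LEARNED_ROUTE`, `follow_route`, and `plan_path`, and the choice of breadth-first search are all assumptions made for illustration. The graph stores only which places are connected, with no distances or angles, and a path's length is counted in segments.

```python
from collections import deque

# Hypothetical town: each place (node) lists the places reachable by one path (edge).
# No metric distances or angles are stored.
TOWN_GRAPH = {
    "home": ["bank", "park"],
    "bank": ["home", "market"],
    "park": ["home", "market", "school"],
    "market": ["bank", "park", "school"],
    "school": ["park", "market"],
}

# Route knowledge: one fixed, previously learned sequence of places.
LEARNED_ROUTE = ["home", "bank", "market", "school"]

def follow_route(route, blocked=frozenset()):
    """A fixed route offers no alternatives: it fails if any place on it is blocked."""
    return None if any(place in blocked for place in route) else list(route)

def plan_path(graph, start, goal, blocked=frozenset()):
    """Breadth-first search for a path with the fewest segments, avoiding blocked places."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in graph[path[-1]]:
            if neighbor not in visited and neighbor not in blocked:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no connection between start and goal

# With the market closed off, the learned route fails but the graph yields a detour.
route_result = follow_route(LEARNED_ROUTE, blocked={"market"})
detour = plan_path(TOWN_GRAPH, "home", "school", blocked={"market"})
```

Under these assumptions the graph supports novel detours that a single route cannot, yet nothing in it would allow a straight-line shortcut across unconnected ground, which would require the metric information of survey knowledge.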
Incorporating an additional type of spatial knowledge into the taxonomy may help explain some of the inconsistent findings in the neural literature, but it does not provide any simple answers as to how the brain processes spatial information. The major shortcoming of the current framework is that these four spatial-learning categories may themselves be made up of multiple cognitive processes. For example, the broad category of route knowledge is composed of many different types of knowledge (Wiener, Buchner, & Holscher, 2009). Neural evidence suggests that different brain regions may be involved in learning a route for the first time, following a known route, deciding which route to take, traveling on routes in familiar as compared to unfamiliar areas, and distinguishing between overlapping routes. In addition, determining a path to a target can be broken down into many different steps, such as identifying the goal, planning the path, identifying choice points, making the appropriate maneuvers at the choice points, and so on. Landmark, route, graph, and survey knowledge involve many cognitive processes that must be untangled to fully understand the mapping between brain and behavior.
The taxonomy presented here identifies multiple cognitive processes and subprocesses involved in spatial navigation and classifies where these processes fit within the scope of landmark, route, graph, and survey knowledge. The purpose of the taxonomy is to provide a systematic analysis of the interaction between brain and behavior that can facilitate future research in both domains. The taxonomy outlines seven main cognitive processes involved in spatial navigation: place recognition, sequence learning, identifying navigationally relevant decision points, response learning, forming associations, locating the navigational goal, and path integration. Few of these cognitive processes are required by all four types of spatial knowledge. Conversely, no one type of spatial knowledge draws on all of the processes and subprocesses. Some of these functional processes and subprocesses may overlap and interact, leaving open the possibility that the taxonomy could be somewhat fluid. For example, forming associations appears to be relevant on its own, in addition to being an aspect of both place recognition and response learning, but it might also have differing implications that depend on the type of spatial knowledge involved. Some of the cognitive processes could be broken down further into subprocesses if future research reveals additional differences. Likewise, some of the processes could arguably be combined. The remainder of this review will focus on the neural evidence supporting this breakdown of the four categories of spatial knowledge into cognitive processes.
Neural techniques and brain regions in navigation research
Before delving into the evidence regarding the neural substrates of spatial learning, it is important to take a moment to examine the relevant neural techniques and the key brain regions that are typically associated with spatial navigation. Research on the neural correlates of spatial navigation has primarily relied on imaging techniques focused on regions of the medial temporal lobe (see Bandettini, 2009, and Lane et al., 2009, for comparisons of the methods).
Functional magnetic resonance imaging (fMRI) is the most commonly used technique in studies on the neural correlates of spatial navigation. fMRI measures brain activity via changes in blood flow (Huettel, Song, & McCarthy, 2004). This technique uses the blood oxygen level dependent (BOLD) contrast, in which active brain regions are assumed to require more blood oxygen than inactive areas. The change in blood flow takes some time, however, so while fMRI has a high degree of spatial precision, temporally it is less precise. fMRI relies on subtraction techniques, wherein the neural activity during a task is compared to the neural activity during a baseline task. It is also worth noting that neural activity often appears to be correlated with performance (e.g., Baumann, Chan, & Mattingley, 2010; Hartley, Maguire, Spiers, & Burgess, 2003; Mellet et al., 2010; Rauchs et al., 2008); studies that find no effects might be missing key interactions if they are not also informed by appropriate correlation analyses. Another fMRI technique relies on neural adaptation to a repeated stimulus. When the same stimulus is presented multiple times in succession, the BOLD response is not as large as in the first presentation; a reduced BOLD response is thus interpreted as the brain region in question treating the two stimuli as being the same.
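The logic of the subtraction and adaptation approaches can be sketched in a few lines. This is a deliberately simplified illustration: actual fMRI analyses rely on statistical modeling that is omitted here, and the function names `subtraction_contrast` and `adaptation_index`, as well as all signal values, are invented for the example.

```python
from statistics import mean

def subtraction_contrast(task_signal, baseline_signal):
    """Mean signal in a region during the task minus mean signal during baseline."""
    return mean(task_signal) - mean(baseline_signal)

def adaptation_index(first_response, repeat_response):
    """Proportional drop in response from the first to the repeated presentation.

    A larger value means stronger adaptation, read as the region treating
    the two stimuli as the same.
    """
    return (first_response - repeat_response) / first_response

# Invented signal values for a single region, in arbitrary units.
contrast = subtraction_contrast([1.2, 1.4, 1.3], [1.0, 1.0, 1.0])
adaptation = adaptation_index(first_response=2.0, repeat_response=1.5)
```

A positive contrast here would be read as greater activity during the task than during baseline, so the interpretation depends heavily on what the baseline task itself requires.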
Positron emission tomography (PET) uses glucose tracers to detect areas of the brain that are using the most glucose, with the assumption that those areas are the most active. Like fMRI, PET has high spatial resolution but less temporal resolution, and also relies on similar subtraction techniques. Because of the injection of tracers, it can be viewed as more invasive, and tends to be used less often than fMRI (see Cabeza & Nyberg, 2000, for a review of the use of fMRI and PET to study cognition).
Another set of techniques for assessing neural activity is magnetoencephalography (MEG) and the related electroencephalography (EEG), which gauge the electrical activity produced by the brain (Schomer & Lopes da Silva, 2010). MEG and EEG measure neural oscillations—the rhythmic neural activity produced by the brain—especially the theta rhythm, which oscillates around 4–12 Hz (see Klimesch, 1999, and Nyhus & Curran, 2010, for reviews of the role of theta in episodic memory). These techniques offer much greater temporal resolution than does fMRI, but typically they have less spatial resolution.
Finally, some neural techniques involve recording from single cells. Cells are categorized as more or less active by the rates at which they fire. This technique offers a high degree of spatial and temporal resolution and can distinguish between cell types or subregions within a single brain structure, such as the hippocampus. Because it is fairly invasive, single-cell recording is typically done in rats or monkeys, but it is also performed occasionally on humans undergoing surgery.
Neural correlates of spatial navigation
Now we are set for the core of this review: an examination of the neural correlates of the cognitive processes involved in spatial navigation, and of their relationship to the proposed framework. Evidence for the neural substrates of each cognitive process is presented here, proceeding through the taxonomy (Table 1) in roughly the order that the processes might be used during a navigational experience, although the order of acquisition may not be fixed. Each element within the taxonomy is accompanied by a quote to provide a concrete instance of what the processes entail. The quotes do not imply conscious statements, but instead serve merely as examples of each process or subprocess, to make the intent of the functional breakdown clear.
Place recognition (“I am at the bank”)
The process common to all four types of spatial knowledge is place recognition. A navigator must know where they are located in order to take the appropriate action. Place recognition can have different implications, however, depending on the navigator’s spatial knowledge for a given environment. One can use a scene or view-based system to recognize a unique place, which may not require any greater spatial knowledge. More detailed knowledge of a place can situate the location within the larger context of the environment.
Scenes and views (“I am at the bank”)
All four types of spatial knowledge rely on visual information to obtain knowledge of location. Landmark and route knowledge, however, do not require additional spatial localization within the environment.
The parahippocampus appears to be involved in view-based scene recognition. Maguire, Frackowiak, and Frith (1997) asked London taxi drivers³ to describe the appearance of famous landmarks, a spatial task that did not involve the temporal aspects of a route. This landmark task showed increased PET activation of the parahippocampal gyrus (which includes the parahippocampal, perirhinal, and entorhinal cortices), but not the hippocampus, as compared with baseline. The parahippocampal cortex and nearby fusiform gyrus (together often referred to as the “parahippocampal place area” or PPA) also show increased fMRI activity for scenes and landmarks presented in spatial context relative to faces and objects presented in isolation (Epstein & Kanwisher, 1998). Direct brain recordings of single neurons in humans have revealed that cells in the parahippocampal gyrus are more likely to respond to a particular view than are hippocampal cells, which are more likely to respond to a unique place in a desktop virtual environment (Ekstrom et al., 2003). Finally, participants with greater navigational ability showed larger fMRI adaptation effects in the parahippocampal cortex when observing different views of the same place (Epstein, Higgins, & Thompson-Schill, 2005). Such adaptation suggests that good navigators are able to incorporate different views into a viewpoint-independent representation of the same place.
On the other hand, patients with hippocampal damage but intact parahippocampi have been shown to have difficulty identifying a location when the task relies on short-term memory and involves integrating across different viewpoints (Hartley et al., 2007). Patients with hippocampal damage showed no deficits identifying oddball faces or objects that were observed from different viewpoints, but they did have deficits in identifying scenes from different viewpoints. Patients with damage to perirhinal areas had difficulty with all tasks requiring viewpoint-independent identification (Lee et al., 2005). Together, these findings suggest that the parahippocampal cortex is sensitive to scene and view information, rather than to location per se; however, that area may be involved in consolidating the location information from views. The hippocampus may also be important for scene recognition when integration across multiple viewpoints is required.
Place recognition within the larger environment (“I am at the bank, which is on the west side of town and is near the fire station”)
An additional step in place recognition is to situate the location within the broader environmental context. Situating the place could involve metric knowledge of how close the location is to the boundaries of the environment, or knowledge of the relative locations of nearby landmarks. Both graph and survey knowledge require some information about a location beyond simple recognition; they require knowledge of how that place fits within the larger scope of the environment. Place learning of this type is often contrasted with response learning, which will be discussed in detail in a later section (see the section “Response Learning”).
Place cells in the rat hippocampus are tuned to specific features of the environment, but these features may correspond to one or more locations (O’Keefe & Burgess, 1996) and are dependent on the local environmental geometry. In the human posterior hippocampus, some place cells respond mainly to the present location, with other cells responding to the person’s goal as well (Ekstrom et al., 2003). Boundary vector cells in the entorhinal cortex and nearby subiculum fire when near the borders of the environment (Lever, Burton, Jeewajee, O’Keefe, & Burgess, 2009; Solstad, Boccara, Kropff, Moser, & Moser, 2008).
Because hippocampal cells in the rat have been shown to respond preferentially to the boundary geometry of an enclosure (O’Keefe & Burgess, 1996), Doeller, King, and Burgess (2008) proposed that learning related to environmental boundaries in humans would be associated with higher activation in the hippocampus, consistent with place learning. They had participants navigate to the learned locations of target objects in desktop virtual reality (VR), while the locations of the objects were held constant either relative to boundaries and distal orientation cues or relative to a local landmark. The results showed that boundary learning did in fact promote global activation in the hippocampus. The parahippocampal gyrus also appeared to be activated during learning relative to boundaries; while the authors had no specific hypotheses about this region, it seems possible that it was involved in learning specific views or the spatial context required to orient to the distal cues. Further research has shown that the left hippocampus is active when imagining scenes with walls or boundaries, as compared with equally complex scenes consisting of towers (Bird, Capponi, King, Doeller, & Burgess, 2010), suggesting that the hippocampus is important for relating location to environmental boundaries. Finally, hippocampal and parahippocampal theta oscillations are correlated with accuracy in locating a target relative to the boundaries of the enclosure (Cornwell, Johnson, Holroyd, Carver, & Grillon, 2008; Kaplan et al., 2012).
Whereas the hippocampus appears to be important in placing locations near boundaries and other environmental features, the parahippocampal cortex also appears to be involved in situating a location within the larger environmental context. Recent research in humans has suggested that the parahippocampal cortex is responsible for general contextual learning and for forming associations between objects and their locations. To dissociate spatial and nonspatial contexts, Aminoff, Gronau, and Bar (2007) presented objects in three ways: in a group with a particular spatial arrangement, grouped together with other objects without any spatial pattern, or in isolation without any context. Overall, the parahippocampal cortex showed more global activity to objects learned with a context than to those without any context. However, a further functional and anatomical separation was found, with spatial contexts producing a greater response in the posterior parahippocampal cortex, and nonspatial contexts, a greater response in the anterior region (see also Bar & Aminoff, 2003; Bar, Aminoff, & Schacter, 2008; Sommer, Rose, Glascher, Wolbers, & Büchel, 2005). The borders of the so-called PPA overlapped significantly more with spatial than with nonspatial association areas. However, the contextual effects appear to be strongest at slow presentation rates, making it possible that mental imagery of the spatial location could be behind some of the activity in the parahippocampal cortex (Epstein & Ward, 2010).
In addition, the hippocampus has been found to respond to different contexts (Aminoff et al., 2007), although the nature of the contextual learning differs somewhat from that of the parahippocampal cortex. The hippocampus showed increased activation to previously seen isolated objects over novel objects, whereas the parahippocampus did not, indicating that the hippocampus is sensitive to familiarity, consistent with its role in incidental learning (Doeller et al., 2008). The hippocampus has also been implicated as part of a general match–mismatch detection system (Kumaran & Maguire, 2006, 2009), with the hippocampus binding objects and backgrounds, and the parahippocampal cortex being sensitive only to background scene contexts (Howard, Kumaran, Olafsdottir, & Spiers, 2011). Similarly, Rauchs et al. (2008) found that hippocampal activation was correlated with spatial components of learning, whereas activity in the parahippocampal gyrus was correlated with contextual learning during a navigation task. Lesions to the perirhinal and postrhinal regions of the parahippocampal area in rats prevent the acquisition of contextual associations regarding fear (Burwell, Bucci, Sanborn, & Jutras, 2004), but they do not prevent place learning in the Morris water maze, whereas lesions to the hippocampus proper do affect place learning (Burwell, Saddoris, Bucci, & Wiig, 2004). Increased activity in the hippocampus and parahippocampal gyrus during initial navigation to a target relative to three landmarks was correlated with more accurate subsequent retrieval of the target location (Baumann et al., 2010), suggesting that these areas are important for encoding the spatial relationships at a particular location.
Finally, the retrosplenial cortex also appears to be involved with learning both spatial and nonspatial context (Bar & Aminoff, 2003). Epstein and Higgins (2007) found that the parahippocampal gyrus had greater BOLD signal in response to isolated scenes, while the retrosplenial cortex had preferential activation for a scene in a larger context, such as when the location was labeled. The retrosplenial cortex may also treat different views of a scene as being the same when viewed with continuous movement, as evidenced by fMRI attenuation (Park & Chun, 2009). Interestingly, Dilks, Julian, Kubilius, Spelke, and Kanwisher (2011) found that the retrosplenial cortex was sensitive to mirror reversals of scenes, measured by fMRI adaptation, while the parahippocampal cortex was not. One interpretation of these results is that the retrosplenial cortex is sensitive to the egocentric directions in the scene, while the parahippocampus is simply sensitive to the scene itself.
Thus, the hippocampus, parahippocampal cortex, and retrosplenial cortex all appear to be important in making associations between a location and the larger spatial context, which is important for place learning within a larger environment. The distinctions in the exact nature of the contextual associations are still somewhat murky; clarifying these distinctions will be important in understanding how these various brain regions work and how they contribute to place learning. One possibility is that the hippocampus binds landmarks to the scene and geometrical features, while the parahippocampal cortex relates more to the visual information about the scenes, consistent with its role in general scene recognition. The retrosplenial cortex may be involved in contextual learning as far as it relates to movement of the navigator (see the “Transformation Between Allocentric and Egocentric Perspectives” section). Thus, place recognition within a larger environment appears to have multiple neural correlates, suggesting that an even finer-grained breakdown of this process may be necessary as more evidence is uncovered.
Sequence learning (“Down this street, first I will encounter the bank, then the market, and then the school”)
Sequence learning is an important part of route and graph knowledge, because the ordinal relationship between landmarks provides important clues about where to turn, the proximity to the goal, and monitoring for errors. If the next expected landmark in a sequence does not appear, it is possible that the navigator made a wrong turn at the previous location. In route knowledge, the sequence of landmarks is fixed, whereas in graph knowledge, the exact sequence may depend on which path the navigator takes. Sequence learning can also be considered an associative process (see the “Forming Associations” section), and thus may activate some general association areas.
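The error-monitoring role of a learned sequence can be illustrated with a minimal Python sketch. The function name `first_mismatch` and the landmark names are hypothetical, chosen only to mirror the example in the section heading; the sketch shows the logic of comparing expected with observed landmarks and makes no claim about how the brain implements that comparison.

```python
def first_mismatch(expected, observed):
    """Return the index of the first observed landmark that differs from the
    expected route sequence, or None if everything seen so far matches."""
    for i, seen in enumerate(observed):
        # A landmark beyond the end of the learned sequence also counts as a mismatch.
        if i >= len(expected) or seen != expected[i]:
            return i
    return None


# Hypothetical learned route: bank, then market, then school.
route = ["bank", "market", "school"]

print(first_mismatch(route, ["bank", "market"]))   # None (on track so far)
print(first_mismatch(route, ["bank", "library"]))  # 1 (possible wrong turn after the bank)
```

A mismatch at index 1 suggests that the error occurred after the last correctly matched landmark, which is the inference described in the paragraph above.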
Maguire, Frackowiak, and Frith (1997) examined some of the sequential aspects of spatial navigation in a study of London taxi drivers. They identified regions involved in recall of spatial and temporal information using four tasks: describing the shortest route between two locations (spatial and temporal: a graph task); describing the appearance of famous landmarks (spatial: place recognition); recalling the plots of familiar films (temporal); and describing individual frames from familiar films (neither). The route description task increased activity in the hippocampus and parahippocampal gyrus, and also produced more activation in the hippocampus than did the landmark task, whereas the film tasks did not. These results suggest that the hippocampus is involved in the temporal sequencing of routes, route planning, or the recall of spatial episodic memories, while the parahippocampal gyrus may play a bigger role in view-based recall of places.
In a similar graph task with London taxi drivers, Spiers and Maguire (2006) asked the drivers to navigate to different goal locations using a desktop VR setup. These authors broke the task down into different increments, depending on what the drivers reported thinking about during the ride. The hippocampus only showed activity when the target location was first named, during the time when the driver was planning the overall route. Interestingly, hippocampal activity did not increase in any of the other sections of the ride, including anticipation of landmarks, spontaneous changes in the route independent of the goal, or negotiating traffic. This finding suggests that the hippocampus may be involved in identifying the location of the goal and the general sequence of the route, but does not play a role in the “details” of when and where to turn or recognizing landmarks. It is difficult, however, to determine whether the increased activity in the hippocampus was related to locating the goal, to the general sequencing, or to both.
As was suggested by the results of Maguire et al. (1997), the hippocampus has also been associated with both the encoding and retrieval of nonspatial sequences, and it tends to be correlated with accuracy in learning sequences (Fortin, Agster, & Eichenbaum, 2002; Lehn et al., 2009; Ross, Brown, & Stern, 2009; Schendan, Searl, Melrose, & Stern, 2003), and especially with distinguishing between overlapping sequences (Brown, Ross, Keller, Hasselmo, & Stern, 2010). Increased theta rhythm activity was observed when learning longer routes through a maze than when learning shorter routes (Kahana, Sekuler, Caplan, Kirschen, & Madsen, 1999), possibly indicating a role in sequence learning, although increased memory load could also account for those results. Kumaran and Maguire (2006) found increased hippocampal activity when expectations were violated about the next object in a learned sequence of objects, suggesting that the hippocampus might play a role in predicting sequential events and detecting mismatches. It should be noted that learning the temporal order of the route is not necessarily the same as learning the associated actions, which may have other neural correlates. The increased activity in the hippocampus when recalling routes could also be related to route planning, which involves determining the location of the goal and the sequence of places to get there.
In sum, the hippocampus appears to be heavily involved in learning sequences and the temporal order of locations. A hippocampal contribution to sequence learning may explain its activity during route and graph tasks. The interpretations of some of the studies presented here, however, are often ambiguous as to whether hippocampal activity is related to sequence learning or to thinking about the target.
Identifying decision points (“I will have to make a turn at the bank”)
When learning route or graph knowledge of an environment, a subtle transition occurs in the navigator’s understanding of the environment, which converts some locations from simple places into places that are relevant for navigation. In essence, these locations become landmarks. Often, these navigationally relevant landmarks are located at decision points, where a navigator must turn right or left or go straight on. Arguably, the recognition of decision points could occur as a subprocess of place learning; however, this process requires the additional step of associating some navigational relevance to the location, which goes beyond a sense of place. On a typical route, many places will be irrelevant for navigation, such as unremarkable houses in the middle of a block. Thus, the neural correlates for learning which places hold navigational significance are the focus of this section.
Patients with damage to the parahippocampus or the nearby lingual and fusiform gyri have difficulty identifying landmarks suitable for navigation, as well as difficulty learning new locations from visual memory (Aguirre & D’Esposito, 1999; Habib & Sirigu, 1987). When intact participants first learn an environment, parahippocampal cortex activation occurs during free exploration of a maze and subsequent retrieval of routes between locations, with similar levels of activation throughout the learning and retrieval processes (Aguirre et al., 1996). Shelton and Gabrieli (2002) showed videos of a route through a novel environment from a ground-level perspective. During encoding, they found increased BOLD signal bilaterally in the medial temporal lobes, including both the parahippocampal cortex and the posterior hippocampus. If the hippocampus is involved in sequential learning, it may have been activated by the particular order of places that defined a route in Shelton and Gabrieli (2002), but not by free exploration in Aguirre et al.’s study, when there was no demand to learn a particular sequence or route. It is difficult to determine, however, whether the parahippocampal activity observed in these experiments was related to learning the specific decision points in the environment, to general place recognition, or to other spatial processing.
After learning a route, the parahippocampal gyrus shows greater global activation when viewing objects located at decision points on the route than when viewing other objects along the route (Janzen & Jansen, 2010; Janzen & van Turennout, 2004; Janzen & Weststeijn, 2007). However, this result only holds for routes in unfamiliar areas; routes in well-known areas produce no such decision point effects (Schinazi & Epstein, 2010). Similarly, Weniger et al. (2010) found that the activity in the parahippocampal gyrus at decision points decreases as the route becomes more familiar. These findings suggest that the parahippocampal gyrus plays a role in associating views at a choice point with the place or action, but only while learning a route or recalling a route traversed a few times. The role of the parahippocampus in the view-based recall of places (see the “Place Recognition” section above) seems to be contradictory to the result that the parahippocampal gyrus becomes less active as routes are learned (Schinazi & Epstein, 2010; Weniger et al., 2010). However, Janzen and colleagues also found parahippocampal activation with view-based recall in newly learned routes (Janzen & Jansen, 2010; Janzen & van Turennout, 2004; Janzen & Weststeijn, 2007), and the parahippocampus may play other roles in learning the correct responses to the tasks (see the “Forming Associations” section below), which might be a factor in the activity in that region.
Thus, identifying landmarks that are useful for navigation is an important component of the route- and graph-learning processes that appears to be supported in the parahippocampal gyrus. As the environment becomes more familiar and place or route knowledge is consolidated, other processes may take over to support navigation. It is also important to note that several of the experiments described in this section specifically related to the encoding processes of spatial learning. In general, few studies have examined neural activation during encoding, and it is possible that the neural correlates during encoding are different from those at retrieval.
Response learning (“Turn left at the bank”)
Response learning is knowledge of the necessary action to take at a location, which has typically been studied in contrast with place learning. This classification is related to the longstanding distinction in the animal-learning literature between place learning and response learning, based on a conception of multiple parallel systems for learning and memory (O’Keefe & Nadel, 1978; Packard & McGaugh, 1996; Restle, 1957; White & McDonald, 2002), and the two systems may indeed involve distinct learning mechanisms (Simon & Daw, 2011). Place learning, as it is understood in these studies, involves acquiring the spatial relationships between places in an environment and may depend on an understanding of the overall spatial layout (see the “Place Recognition” section). Response learning is less flexible and involves associating each place or view with a particular response in order to follow a known route to a goal.
Iaria, Petrides, Dagher, Pike, and Bohbot (2003) had participants learn object locations in a radial arm maze using desktop VR. Individuals who relied on spatial relations between the object and distal landmarks had an increase in the BOLD signal in the hippocampus during learning, whereas those who relied on an ordinal or response strategy (e.g., “take the third arm from the starting position”) showed activity in the caudate nucleus. Ordinal learners also made fewer errors, suggesting that a nonspatial strategy may be more reliable in this situation. Spatial learners tended to have a greater gray matter density in the hippocampus, whereas ordinal learners had a greater density in the caudate nucleus (Bohbot, Lerch, Thorndycraft, Iaria, & Zijdenbos, 2007). Participants who learned to navigate to a location on the basis of actions at a local landmark showed increased activity in the dorsal striatum, in particular at the head of the caudate (Doeller et al., 2008). Marchette, Bakker, and Shelton (2011) found that some navigators preferred to use place-learning strategies in a more complex maze by taking novel shortcuts to a target, while others used response strategies by following a known route. Those who preferred place learning exhibited greater activation in the hippocampus during encoding, while those who used a response strategy showed more activation in the caudate, regardless of their accuracy in the task. These results suggest that some individuals rely more on place learning and others on response learning, while the most efficient navigators are able to flexibly switch between these strategies (Bohbot et al., 2007; Hartley et al., 2003; Wolbers & Hegarty, 2010). These systems may be complementary and noncompetitive. For example, Huntington’s patients with selective damage to the caudate nucleus do not show impaired performance on a route task, for they are apparently able to compensate with increased activity in the hippocampus (Voermans et al., 2004).
Consistent with this work, Hartley et al. (2003) reported that finding novel paths between known locations in a virtual town leads to an increased BOLD response in the parahippocampal cortex, posterior parietal cortex, and caudate body, whereas repeating a well-known route yields an increase in the caudate body only. Participants who were more accurate at wayfinding also showed increased activity in the hippocampus, while those who were less accurate at wayfinding had more activity in the head of the caudate during the route-following task. Hartley et al. (2003) suggested that these response patterns are indicative of distinct mechanisms for place and response learning. Although the wayfinding task in this experiment may have been too complex to sort out the roles of the hippocampus and parahippocampus, the clear association of the caudate nucleus with route following is consistent with a neural basis for response learning. This study stands in contrast to studies that have shown parahippocampal gyrus activity during route learning that decreased as the route became more familiar (e.g., Janzen & van Turennout, 2004; Janzen & Weststeijn, 2007; Schinazi & Epstein, 2010; Weniger et al., 2010). While participants in those experiments were led on the route a few times, Hartley et al. (2003) trained their participants through the route many times and found activity in the caudate during retrieval. Thus, it seems likely that as route knowledge becomes consolidated into a more purely response-based process, the parietal cortex and caudate become more involved.
A recent study found a somewhat different pattern of activation with regard to place and response learning. Igloi, Doeller, Berthoz, Rondi-Reig, and Burgess (2010) trained participants on a route to a target location in a star-shaped radial maze and then probed the participants’ responses when starting from a different location in the maze. The experimenters manipulated the views in the different arms of the maze in such a way that on some trials, participants tended to make more response-based movements, while on other trials, they made place-based movements. Igloi et al. found that the caudate was active in both the place and response trials, but that the left hippocampus was active during response-based trials, whereas the right hippocampus was responsive during place trials. Hippocampal activity in the response-based trials is consistent with a role for the hippocampus in sequence learning, as noted earlier (e.g., Fortin et al., 2002), but it is not commonly seen in other tests of response learning. However, activity in the left hippocampus decreased during the course of the test trials, suggesting that its contribution to response learning may occur early during learning. Meanwhile, activation in the right hippocampus seemed to decrease only during training, not during the place-based test trials. The activity of the caudate in both trial types is rather surprising, but it is possible that participants were learning the appropriate response to a place-based trial, since all of the allocentric place-based trials actually required the same movement response.
The neostriatum, including the caudate nucleus, has been implicated in habitual response learning of this type, which is considered a form of procedural memory (Knowlton, Mangels, & Squire, 1996). The parietal cortex may also play a role in response-based learning, for it has been associated with the intent to perform an action as part of a cortical network for reaching movements (Snyder, Batista, & Andersen, 1997, 2000). Medial parietal cortex is often associated with the egocentric direction of spatial targets and a self-reference system (Lou et al., 2004; Pesaran, Nelson, & Andersen, 2006), while lateral parietal cortex is associated with processing spatial attention (Goldberg, Bisley, Powell, & Gottlieb, 2006). Activity in the parietal cortex has been documented in numerous navigation tasks (Brown et al., 2010; Rauchs et al., 2008; Shelton & Gabrieli, 2002, 2004; Wolbers et al., 2004) and has been linked with the representation of space in an egocentric coordinate frame (Andersen, Snyder, Bradley, & Xing, 1997; Galati, Pelle, Berthoz, & Committeri, 2010), including the representation of heading direction (Baumann & Mattingley, 2010). Patients with damage to the parietal cortex have difficulty understanding the egocentric locations of objects (Aguirre & D’Esposito, 1999). Activity in the parietal cortex may, then, be particularly relevant for route-learning tasks (Sato, Sakata, Tanaka, & Taira, 2006; Shelton & Gabrieli, 2004).
Thus, learning the correct response action at a given location appears to have neural correlates in the caudate and parietal cortex, which may play slightly different roles in response learning. The parietal cortex appears to be responsible for egocentric movements, while the precise contribution of the caudate in these experiments has been ambiguous; it could be involved in the formation, storage, or execution of procedural memory. The interaction between these two structures during response learning needs to be resolved. Finally, there also appear to be large individual differences regarding navigational preference for response learning.
Forming associations (“Turning right at the bank takes me to the market, turning left takes me home”)
Forming associations is a key part of spatial navigation. Associative learning occurs at many points in navigation tasks, but with differing purposes and outcomes. For example, associating a place with the other nearby landmarks or the boundaries of the environment is important for place recognition (see the “Place Recognition Within the Larger Environment” section). Forming associations between a place and an action is key to response learning (see the “Response Learning” section). Finally, graph learning may require a navigator to form multiple associations between a location and different actions, depending on the desired goal.
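The difference between a single place–action association and the goal-dependent associations needed for graph learning can be sketched as two lookup tables. This is an illustrative data-structure analogy only; the names `route_actions`, `graph_actions`, and `next_action`, and the specific places and actions, are hypothetical and follow the example in the section heading.

```python
# Route knowledge (illustrative): each place maps to exactly one action.
route_actions = {"bank": "turn left"}

# Graph knowledge (illustrative): the action at a place depends on the goal,
# so the same place can carry several associations.
graph_actions = {
    ("bank", "market"): "turn right",
    ("bank", "home"): "turn left",
}


def next_action(place, goal):
    """Look up the action for a place given the current goal.

    Returns None when no association has been learned for that pair."""
    return graph_actions.get((place, goal))


print(next_action("bank", "market"))  # turn right
print(next_action("bank", "home"))    # turn left
```

In the route table the place alone determines the action, whereas in the graph table the navigator must also hold the goal in mind to select among competing actions at the same place.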
Brown, Ross, Keller, Hasselmo, and Stern (2010) found increased activity in both the hippocampus and parahippocampal cortex while participants made decisions about goals and the associated actions in partially overlapping, as compared to nonoverlapping, routes. In contrast, the researchers found no increased activity in sections of the routes that did not require discrimination. The shift from learning one route to learning multiple, partially overlapping routes could signal the beginning of the transition from route to graph knowledge: Rather than simply associating each landmark with an action, a navigator must now understand how the paths intersect and learn to distinguish between several possible actions at each landmark, depending on the goal. This finding suggests that both the hippocampus and parahippocampal cortex are important to this process. Similarly, Brown, Ross, Tobyne, and Stern (2012) found positively correlated functional connectivity between the hippocampus and caudate when participants distinguished between overlapping routes, suggesting that these regions are both important for graph learning.
Making active decisions about where and how to explore may be a part of forming associations. For example, Voss, Gonsalves, Federmeier, Tranel, and Cohen (2011) found that making decisions about where to explore in a 2-D spatial memory task improved subsequent retrieval and was correlated with activity in the hippocampus. Volitional control of movement and exploratory behavior is also associated with increased hippocampal theta rhythm activity (Caplan et al., 2003; Ekstrom et al., 2005; Kaplan et al., 2012). These results suggest a role for the hippocampus in self-directed learning, such as when a navigator is free to explore and test predictions about spatial information, possibly drawing on reinforcement-learning mechanisms (e.g., Simon & Daw, 2011).
A broader reading of the parahippocampal, hippocampal, and retrosplenial cortices focuses on their role in forming associations. Whether these regions are general association areas or are attuned to specific types of associations is the focus of much debate, with varying results providing subtle differences in the interpretations (e.g., Bird & Burgess, 2008; Buckner, 2010; P. Byrne et al., 2007; Squire et al., 2004). For example, the hippocampus has been implicated in associative learning and declarative memory as serving to synthesize different episodic memories into one larger, declarative system (Eichenbaum, 2000).
Beyond the spatial domain, another interpretation of the roles of hippocampal and parahippocampal associations emphasizes their similarities and has broader implications for learning and memory. Bar (2007) hypothesizes that these regions may be part of a larger associative network responsible for making predictions about scenes and situations (see also Schacter, Addis, & Buckner, 2007). The hippocampus appears to be preferentially sensitive to geometric features and to locations in these associations, whereas the parahippocampal cortex may be more relevant to views and to other contextual information, but both areas appear to be generally related to forming associations. These associations allow the brain to make analogies between the objects and events currently being perceived and those in memory. Associations between objects and context not only allow a person to recognize where he or she is, but also allow that person to recognize what is likely to happen next. It must be noted, however, that while these regions have many nonspatial functions, they are also the primary regions associated with spatial learning and memory, and so need to be considered in a spatial context as well (McNamara & Shelton, 2003). One possible way of reconciling the spatial and nonspatial functions would be to argue that these regions are general association areas, but considering that most human activity occurs in some sort of spatial location, these regions have become particularly attuned to spatial learning, with slight divergence on different aspects of spatial learning in certain regions (O’Keefe & Nadel, 1978). Spatial and nonspatial contexts appear to be dissociated to some extent in the hippocampus and parahippocampal cortex, and while the debate is still unresolved about whether the associations are linked to the spatial context or imagery of a scene, general mechanisms do seem to explain why these areas are so prominent in navigational tasks.
Forming associations is a key part of spatial navigation. The hippocampus and parahippocampal cortex are broadly involved in forming associations, especially when the navigator must learn to distinguish overlapping routes or to test predictions during self-directed exploration. The specific nature of the associations may provide information regarding how differing brain regions process associations.
Locating the goal (“City hall is at the corner of Main and Broadway”)
When traveling to a specified goal, the navigator must have some knowledge of where that goal is located within the larger environment. This process is similar to place recognition, but it has the added component of spatially relating the goal to the present location. Knowledge of the goal location is most important for graph and survey knowledge. While navigators following a route may intend to reach a certain goal, they do not need to know where the goal is located within the larger environment; instead, they simply follow the series of place–action associations until they reach the desired location.
Relation between the goal and the current location
In addition to knowledge of the goal location, the navigator must know how that location relates to the current position. This relationship may be expressed in broadly different terms, depending on whether the navigator relies on graph or survey knowledge. With graph knowledge, the navigator needs to know the connections of roads between the current location and goal (“I am on 5th St., which intersects with Broadway”), but not any metric information. In contrast, survey knowledge of the relationship between the current location and the goal requires a qualitatively different type of understanding (“City hall is 1 km northeast of here”).
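The contrast between the two kinds of knowledge can be made concrete with a short Python sketch. It assumes, purely for illustration, that graph knowledge is stored as a table of connections with no distances and that survey knowledge is stored as metric coordinates; the function names, place names, and coordinates are all hypothetical.

```python
from collections import deque
import math


def graph_route(adjacency, start, goal):
    """Breadth-first search over connections only (no metric information).

    Returns the sequence of places with the fewest connections, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in adjacency.get(path[-1], []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


def survey_vector(coords, start, goal):
    """Straight-line distance and compass bearing (0 = north, 90 = east).

    Coordinates are assumed to be (km east, km north)."""
    dx = coords[goal][0] - coords[start][0]
    dy = coords[goal][1] - coords[start][1]
    return math.hypot(dx, dy), math.degrees(math.atan2(dx, dy)) % 360.0


# Hypothetical graph knowledge: which places connect to which.
adjacency = {
    "here": ["5th & Main"],
    "5th & Main": ["here", "5th & Broadway"],
    "5th & Broadway": ["5th & Main", "city hall"],
    "city hall": ["5th & Broadway"],
}

# Hypothetical survey knowledge: metric positions of the same places.
d = math.sqrt(0.5)
coords = {"here": (0.0, 0.0), "city hall": (d, d)}

print(graph_route(adjacency, "here", "city hall"))
print(survey_vector(coords, "here", "city hall"))
```

With these assumed values, the graph query returns a chain of connected places, while the survey query returns roughly 1 km at a bearing of about 45° (northeast), which is enough to support a novel shortcut that the connection table alone cannot provide.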
In many studies of graph and survey knowledge, it is difficult to determine which aspects of the task are specific to identifying the goal, and which relate the goal and the current location. Spiers and Maguire (2006) observed increased activity, relative to a coasting baseline, in the hippocampus bilaterally, as well as in the left parahippocampal cortex, the retrosplenial cortex, and the lateral and medial prefrontal cortex during sections of the taxi ride in which drivers planned the route to the goal, suggesting that these regions may be important for identifying the location of the goal and the general sequence of the route. Taxi drivers also have larger posterior hippocampal areas, correlated with their numbers of years on the job, than do either controls (Maguire et al., 2000) or bus drivers—who follow defined routes—with equivalent experience (Maguire, Woollett, & Spiers, 2006). Microstructural integrity in the hippocampus is correlated with faster times navigating to a goal (Iaria, Lanyon, Fox, Giaschi, & Barton, 2008). It is difficult, however, to determine whether these results stem from the navigator’s ability to relate the goal with the current location, or rather from the transformation between allocentric and egocentric perspectives, the use of metric distance information, or other navigational processes.
PET imaging has shown greater regional cerebral blood flow in the right hippocampus when navigating toward a destination in a familiar environment than when simply following a path (Maguire et al., 1998), while BOLD hippocampal activity increases with proximity to the goal (Viard, Doeller, Hartley, Bird, & Burgess, 2011). In the human posterior hippocampus, some cells appear to selectively respond to the navigator’s goal (Ekstrom et al., 2003). The hippocampal theta rhythm may also be important for aspects of navigational planning, although it has also been implicated in non-goal-related exploration (de Araujo, Baffa, & Wakai, 2002; Caplan et al., 2003; Ekstrom et al., 2005). Again, these results could stem either from locating the goal or from other aspects of navigation. More precise experimental paradigms will be needed to discriminate between identifying the goal and other aspects of navigational planning. Nevertheless, consistent activation in the hippocampus has been observed during this process.
Transformation between allocentric and egocentric perspectives (“I turn right on 5th St., then left on Broadway, then go four blocks, and city hall is on the left”)
In addition to understanding where the goal is located relative to the current position, a navigator must transform that information into useable actions in order to reach the goal. In other words, he or she must transform the information from an allocentric reference frame, containing the connections between locations or metric information, into an egocentric reference frame, containing information about how the navigator must turn and move in order to reach the goal.
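As a geometric illustration of this transformation, the following Python sketch converts an allocentric goal position into an egocentric direction relative to the navigator's heading. The coordinate conventions, the function names, and the 45°/135° category thresholds are assumptions made for the example, not values taken from the studies reviewed here.

```python
import math


def egocentric_bearing(position, heading_deg, goal):
    """Signed angle in degrees from the navigator's heading to the goal.

    Assumed conventions: coordinates are (east, north); heading 0 = north,
    increasing clockwise; a positive result means the goal is to the right.
    The result lies in the range [-180, 180)."""
    dx = goal[0] - position[0]
    dy = goal[1] - position[1]
    allocentric = math.degrees(math.atan2(dx, dy))  # compass bearing to the goal
    return (allocentric - heading_deg + 180.0) % 360.0 - 180.0


def relative_direction(angle_deg):
    """Coarse egocentric label; the thresholds are arbitrary illustrative choices."""
    if abs(angle_deg) <= 45.0:
        return "ahead"
    if abs(angle_deg) >= 135.0:
        return "behind"
    return "right" if angle_deg > 0 else "left"


# Facing north at the origin, with the goal due east: the goal is to the right.
print(relative_direction(egocentric_bearing((0.0, 0.0), 0.0, (1.0, 0.0))))   # right
# Facing east at the origin, with the goal due north: the goal is to the left.
print(relative_direction(egocentric_bearing((0.0, 0.0), 90.0, (0.0, 1.0))))  # left
```

The same allocentric goal location yields different egocentric answers as the heading changes, which is why this step requires information about the navigator's current orientation in addition to knowledge of the layout.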
In a route-learning task, Wolbers et al. (2004) found increased fMRI activity in the retrosplenial cortex during encoding as compared to baseline, but this level of neural activity in the retrosplenial cortex remained steady over the encoding sessions. Wolbers and Büchel (2005) then asked participants to judge the direction (right, left, or behind) of a target object relative to the current location, thus encouraging learning of the graph structure of the environment over several encoding sessions. This task required knowledge of both the goal’s location and the direction to the goal in egocentric terms. The retrosplenial cortex showed performance-related increases in activity with each encoding session, such that the more information the participant acquired about the maze, the greater the retrosplenial activity. Hippocampal activity was correlated with the amount of spatial layout learning that took place on a given pass through the route: Participants who learned graph information during the first few passes on the route showed hippocampal activity early on, while those who learned later had increasing hippocampal activity during the last few passes on the route. Thus, the hippocampus was more important for the relative amount of encoding, while the retrosplenial cortex was sensitive to the overall amount of spatial knowledge. The retrosplenial cortex thus may play a role in integrating egocentric spatial information with information about self-motion, which is important for both graph and survey knowledge.
The findings regarding how the retrosplenial cortex relates to navigation are somewhat unclear and inconsistent. Damage to this area has been associated with confusion regarding heading direction (Aguirre & D’Esposito, 1999). Some researchers have suggested that the retrosplenial cortex serves a translation function between egocentric representations in the parietal cortex and allocentric representations in the hippocampus (P. Byrne et al., 2007; Ino et al., 2002), while others have suggested that the retrosplenial cortex may be involved in survey learning and in putting scenes into context (Epstein, 2008; Galati et al., 2010; Wolbers & Büchel, 2005). This region appears to be active in both place and route learning, and it could be a means for place and response learning to interact, as a region where the transformation between learning a location as a place (thus, in more allocentric terms) and as a response (thus, in egocentric terms) occurs. These various functions may be compatible, but more direct work on the role of the retrosplenial cortex will be needed to fully clarify its involvement in spatial navigation (Vann, Aggleton, & Maguire, 2009).
In sum, goal-directed navigation involves several subprocesses, of which the neural correlates are not always easily distinguished. Experimental paradigms involving navigation to a goal have rarely examined the subprocesses involved, and thus it is unclear whether neural activation in these tasks is responding to locating the goal, sequencing the route, relating the goal to the present location, or movement toward the goal. Although these experiments have been unable to parse the mechanisms, consistent activation in the hippocampus has been observed during goal-directed navigation, while the retrosplenial cortex may be important for transitioning between different spatial reference frames to move toward the goal.
Path integration (“I have turned 30° and walked 5 m from my initial heading and location”)
Survey knowledge of an environment relies on knowledge of the metric distances and angles between locations. This knowledge requires a means to gauge metric information, likely via a mechanism for keeping track of the distances and angles traversed by the navigator. Path integration is the constant updating of the navigator’s position and orientation within the environment during movement. The related concept of spatial updating is the ability to keep track of other locations on the basis of idiothetic information from self-motion (P. Byrne et al., 2007). Metric survey knowledge could be formed through the process of path integration, in which information about self-motion is combined to form an allocentric metric map of distances and the angles between locations (Gallistel, 1990).
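The updating computation described here can be sketched as simple dead reckoning in Python. The function name `path_integrate` and the conventions (heading 0° along the +y axis, positive turns clockwise, noise-free self-motion estimates) are assumptions made for illustration; biological path integration is noisy and is not claimed to use this arithmetic.

```python
import math


def path_integrate(steps, start=(0.0, 0.0), heading_deg=0.0):
    """Accumulate (turn_deg, distance_m) steps into a position and heading.

    Assumed conventions: heading 0 points along +y, positive turns are
    clockwise, and each step is a turn in place followed by a straight walk."""
    x, y = start
    heading = heading_deg
    for turn_deg, distance_m in steps:
        heading = (heading + turn_deg) % 360.0
        x += distance_m * math.sin(math.radians(heading))
        y += distance_m * math.cos(math.radians(heading))
    return x, y, heading


# The example from the section heading: turn 30 degrees, then walk 5 m.
x, y, heading = path_integrate([(30.0, 5.0)])
home_distance = math.hypot(x, y)  # straight-line distance back to the start
print(round(x, 3), round(y, 3), heading, round(home_distance, 3))
```

Because each update builds on the previous estimate, any error in a sensed turn or distance carries forward into every later position estimate in this scheme.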
Single-cell recording has been informative about the types of cells that might support the encoding of metric information. Cells in the rat postsubiculum and entorhinal cortex are sensitive to heading direction (Sargolini et al., 2006; Taube, Muller, & Ranck, 1990). Grid cells in the rat entorhinal cortex appear to signal the animal’s metric position and may represent the process of path integration (Fyhn, Molden, Witter, Moser, & Moser, 2004). Unlike place cells, the structure of the grid “map” is independent of environmental features, but it appears to be aligned with external boundaries and landmarks (Fyhn, Hafting, Treves, Moser, & Moser, 2007; Hafting, Fyhn, Molden, Moser, & Moser, 2005). In humans, regions that respond to particular heading directions are found in parietal cortex (Baumann & Mattingley, 2010), while grid cells (Doeller, Barry, & Burgess, 2010) and cells sensitive to heading direction (Jacobs, Kahana, Ekstrom, Mollison, & Fried, 2010) are found in the entorhinal cortex. The hippocampal theta rhythm is modulated by the visual speed of the navigator and may be relevant for spatial updating relative to the other objects in the environment (Watrous, Fried, & Ekstrom, 2011).
The hippocampus has also been implicated in some aspects of metric distance. For example, Morgan, MacEvoy, Aguirre, and Epstein (2011) found that the hippocampus showed greater fMRI adaptation for locations that were near each other in physical space than for locations that were farther apart. This result suggests that the hippocampus is sensitive to the shared properties of these locations—namely, their proximity in physical space. Viard, Doeller, Hartley, Bird, and Burgess (2011) found increased activity in the hippocampus with proximity to the goal, while Spiers and Maguire (2007a) found that activity in the entorhinal cortex and subiculum was correlated with goal proximity. In another study, the accuracy on relative distance judgments between objects was correlated with global neural activity in both the hippocampus and parahippocampal gyrus (Mellet et al., 2010). Interestingly, Shelton and Gabrieli (2002) found that survey encoding from a bird’s-eye perspective activated the bilateral fusiform and inferior temporal gyri and the posterior superior parietal cortex, but they found no activation in the hippocampus or parahippocampal cortex as compared to fixation. It is difficult to interpret these null results, since the visual information was quite different from experiments that have used a first-person perspective. It is possible that perceiving metric information visually is different from measuring through self-motion, or that neural activity differs between encoding and recall. In general, the results of these experiments suggest that the entorhinal cortex responds to lower-level and metric features, such as distance or heading and turning direction, which may be used downstream by the hippocampus to develop location information, including some metric properties (Jacobs et al., 2010).
Finally, the hippocampus has also been implicated in a path integration task in which participants had to point directly to their starting location after an outbound path (Wolbers, Wiener, Mallot, & Büchel, 2007). In this case, however, the researchers used purely visual input and trained their participants with feedback on the paths, potentially confounding path integration with other forms of learning. On the other hand, hippocampal activity has not always been found consistently in path integration. Wolbers, Hegarty, Büchel, and Loomis (2008) found no additional activity in the hippocampus or the parahippocampus during spatial updating, relative to a control task; however, this null finding may have been due to a difficult control task or to incidental encoding during control. In addition, patients with damage to the hippocampus and entorhinal cortex are able to perform most path integration tasks at the same level as controls (Shrager, Kirwan, & Squire, 2008). These findings suggest that hippocampal mechanisms may be more involved in other aspects of navigation than in spatial updating and path integration, in conflict with results indicating that grid and place cells in the hippocampus and entorhinal cortex are integral for path integration. These equivocal findings suggest that path integration is not necessarily equivalent to measuring metric distances and angles. One possibility is that integrating distance and angle information in a homing task requires additional neural resources, which these experiments have not yet fully isolated. Another possibility is that other strategies, which do not rely on metric information, could be used during these path integration tasks, potentially confounding the results. More detailed investigation into these contrasting results will be needed to resolve this contradiction.
In sum, path integration can serve two functions: It allows a navigator to determine the metric distances and angles between locations, and it is an instrument for a navigator to gauge the distance and direction of his or her own travel. Small errors in path integration can accumulate and lead to large distortions in survey knowledge. Several brain regions, as well as specific types of cells, have been implicated in this process. Cells in the entorhinal cortex and subiculum are sensitive to heading direction and distance; these regions feed into the hippocampus (P. Byrne et al., 2007), which also appears to be sensitive to some metric distance information. The findings are conflicting, however, regarding the contribution of the hippocampus to more complex path integration tasks, which suggests that there might not be a simple relationship between encoding metric information and path integration. Despite these complex findings, path integration appears to be a primary means of incorporating the metric information needed for survey knowledge.
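The point about error accumulation can be illustrated with a small numerical sketch. The error model here is an assumption made only for the example: the navigator walks in a straight line but misperceives a constant small turn (`heading_bias_deg`, a hypothetical parameter) at every step. The function names are likewise hypothetical, and nothing in this sketch is drawn from the studies reviewed above.

```python
import math


def dead_reckon(steps, heading_bias_deg=0.0):
    """Integrate (turn_deg, distance_m) steps, adding a constant bias to every perceived turn."""
    x = y = 0.0
    heading = 0.0
    for turn_deg, distance_m in steps:
        # Assumed error model: the same small bias is added at each step.
        heading += math.radians(turn_deg + heading_bias_deg)
        x += distance_m * math.cos(heading)
        y += distance_m * math.sin(heading)
    return x, y


def position_error(n_steps, heading_bias_deg=1.0, step_m=1.0):
    """Distance between the true and the biased position estimate after n straight steps."""
    steps = [(0.0, step_m)] * n_steps
    true_x, true_y = dead_reckon(steps)                   # unbiased integration
    est_x, est_y = dead_reckon(steps, heading_bias_deg)   # biased integration
    return math.hypot(est_x - true_x, est_y - true_y)
```

Under this assumed model, a bias of only 1° per step yields a position error that more than doubles when the distance walked doubles from 10 m to 20 m, which is one way in which small local errors could compound into large distortions of the metric relations between distant locations.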
In addition to the processes specific to the types of spatial navigation discussed here, some additional processes might contribute throughout the taxonomy. One such process is mental imagery. Mental imagery contributes to many facets of perception and cognition (Kosslyn, Behrmann, & Jeannerod, 1995) and might be important to landmark, route, graph, and survey knowledge. Path integration tasks rely on mental imagery (Tcheang, Bülthoff, & Burgess, 2011), which is also supported in the hippocampus (P. Byrne et al., 2007). Patients with hippocampal damage have difficulty producing descriptions of both objects and spatial relations in a novel imagined experience (Hassabis, Kumaran, Vann, & Maguire, 2007). The parahippocampal gyrus also shows an increased BOLD signal when making spatial decisions from memory rather than from direct perception (Viard et al., 2011), suggesting a possible role in performing mental imagery. It is possible that some of the observed activity of the hippocampus and parahippocampus during navigational tasks is related to mental imagery of the environment, which might encompass the spatial relationships between locations.
Working memory may also be an important component of spatial navigation. While working memory has likely been involved in many of the experimental paradigms discussed here, it has not explicitly been manipulated to determine how it contributes to navigational processes. Encoding spatial information may be verbally mediated, for example, by encoding a route as a series of left or right turns. Visual–spatial working memory can be broken into sequential and simultaneous components, where sequential visual–spatial working memory relates to a sequence of spatial locations presented one at a time, while simultaneous visual–spatial working memory is important for understanding the spatial relationships among locations in a single image. Verbal and sequential spatial working memory are implicated in encoding routes, while sequential and simultaneous visual–spatial working memory are likely to be relevant to graph or survey knowledge (Garden, Cornoldi, & Logie, 2002; Meilinger, Knauff, & Bülthoff, 2008; Pazzaglia & Cornoldi, 1999; Pazzaglia, De Beni, & Meneghetti, 2007). It is also known that frontal areas, especially the dorsolateral prefrontal cortex, are associated with working memory (Cohen et al., 1997; Courtney, Ungerleider, Keil, & Haxby, 1997; Zimmer, 2008), and some studies have found activity in frontal regions during spatial learning (e.g., Iaria et al., 2003; Shelton & Gabrieli, 2002, 2004). Yet there is limited understanding of how this structure relates to the spatial-encoding process. A case study has revealed that a patient with damage to the ventral medial prefrontal cortex had difficulty keeping navigational goals in mind (Ciaramelli, 2008; Spiers, 2008). Medial prefrontal cortex also shows increased activity with closer proximity to the navigational goal (Spiers & Maguire, 2007a) and with taking detours to a goal (Viard et al., 2011), indicating that the prefrontal cortex may mediate goal-oriented navigation. 
Still, few functional imaging experiments have attempted to interfere with specific components of working memory during encoding or sought to correlate activity in hippocampal and parahippocampal areas with that in frontal areas (but see Voss et al., 2011).
Gaps in the literature and conclusions
The research examined in this review suggests that there is no direct one-to-one mapping for the neural correlates of landmark, route, graph, and survey knowledge. For example, the neural correlates of route knowledge may actually depend on whether the navigator is already familiar with the environment (Schinazi & Epstein, 2010) and whether he or she is attending to landmarks (Doeller et al., 2008), place–action associations (Hartley et al., 2003), or the spatial relations between objects (Mellet et al., 2010). The neural correlates supporting route knowledge change as a navigator repeats the route and learns the key landmarks at choice points, and as the associated actions become engrained (Hartley et al., 2003; Weniger et al., 2010).
These findings do, however, suggest that the neural correlates of some of the cognitive processes and subprocesses involved in spatial learning are identifiable. First, the parahippocampus is active when learning views, which is a key component of all spatial knowledge. Second, identifying important landmarks for route learning appears to be supported in the parahippocampus, at least during initial learning. However, placing those landmarks within the larger spatial environment involves activity from the hippocampus, parahippocampus, and retrosplenial cortex. These brain regions might each be making different types of associations to facilitate this process. The hippocampus has been implicated in learning both spatial and nonspatial sequences, which is a large component of route and graph knowledge. Habitual actions in a well-learned route appear to be supported by the caudate and parietal regions. Finally, allocentric metric information is most likely supported in entorhinal cortex and hippocampus, whereas parietal regions appear to support egocentric coordinates, with retrosplenial cortex acting as a go-between. Thus, while several of the subcomponents involved in landmark, route, graph, and survey knowledge do have distinct neural correlates, the entirety of one type of spatial knowledge is not isolated in a single brain region.
Thus, when a navigator performs a task that relies on survey knowledge, it is not surprising that a large network of brain regions is active. In some cases, however, a single brain region appears to support multiple processes or subprocesses. It remains to be determined whether the region in question is truly sensitive to multiple cognitive processes involved in spatial navigation, whether subregions of brain areas (e.g., CA1, CA3, and dentate gyrus of the hippocampus) are responsible for the particular activity observed, or whether a more general function—superordinate to the processes laid out here—can be ascribed to these brain regions. The somewhat orthogonal question of the function of each brain region requires a different approach than the one taken here, although the neural correlates of spatial navigation can certainly be informative to that pursuit.
A number of limitations remain in the current literature on the neural correlates of spatial navigation. One limitation exposed here is that researchers often confound place knowledge with survey knowledge. The ability to identify and situate locations may be necessary for survey knowledge, but it might not be sufficient. The brain must integrate the place knowledge from every location into an allocentric survey map, an additional step that may require a number of inferences and that has not been fully explored in the research discussed here. This process likely involves the hippocampus, with the aid of other brain regions and circuits (P. Byrne et al., 2007). Notably absent from these fMRI experiments are the canonical behavioral tests of survey knowledge, in which a participant stands (or imagines standing) at one location, and then walks (or points) to another location. This test requires knowledge of the allocentric distances and angles between locations, which requires more than place information. Only a few imaging experiments have used tasks like this (e.g., Mellet et al., 2010), but the pointing has often been limited to quadrants or hemispheres (e.g., Latini-Corazzini et al., 2010; Wolbers & Büchel, 2005). While many of these studies found hippocampal and parahippocampal activity during the test, the tests did not distinguish or control for imagining the view of the target location, knowledge of the larger environment, thinking about routes to get to the target, or the metric information involved. The evidence certainly supports the idea that place information could be integrated with metric input from the entorhinal cortex to form survey knowledge, but this inference is not unequivocal.
Confusion between place learning and survey knowledge also makes the relationship between place learning and the acquisition of knowledge in Siegel and White’s (1975) framework unclear. Most of the paradigms investigating the topic have revealed that place-based learning occurs first with corresponding activity in the hippocampus, while response learning appears in the caudate after several repetitions (Hartley et al., 2003; Packard & McGaugh, 1996). These findings suggest that place learning occurs before consolidated route knowledge. If place learning is indeed reflective of allocentric or survey knowledge, these findings are in direct contrast to Siegel and White’s theory. There are several possible reasons for this discrepancy: Place knowledge is not equivalent to survey knowledge, route knowledge has been conflated with graph knowledge, or the three-stage incremental acquisition of spatial knowledge is inaccurate. These explanations are not necessarily mutually exclusive, and the framework presented here suggests that all may play a role. As discussed above, place learning is necessary for survey knowledge, but other components are also involved. A logical extension of the framework proposed here could suggest that several of the component processes of spatial knowledge could be acquired simultaneously. For example, Roche et al. (2005) proposed that place and response learning occur in the early stages of spatial encoding and are integrated when a motor response is required, indicative of parallel processing. In addition, some behavioral experiments (e.g., Ishikawa & Montello, 2006) have demonstrated that the acquisition of spatial knowledge varies between individuals, possibly reflecting individual differences in the navigational processes in the taxonomy. 
Some individuals may be better at sequence learning, while others may be better at path integration; the best navigators are likely able to perform all processes well and can switch between spatial strategies as needed (Bohbot et al., 2007; Hartley et al., 2003; Wolbers & Hegarty, 2010). More work will be needed to determine whether some components of the framework can be acquired simultaneously, or whether the acquisition of spatial knowledge proceeds in a strict trajectory.
Another limitation is the lack of research on how the neural correlates change as a navigator becomes more familiar with the environment. This approach could help elucidate the transitions between types of spatial knowledge, which are important to understand from both a behavioral and a neural perspective. While several experiments have been based on neural data acquired during the encoding process of complex environments (Aguirre et al., 1996; Bohbot et al., 2007; Caplan et al., 2003; Cornwell et al., 2008; Doeller et al., 2008; Ekstrom et al., 2005; Ekstrom et al., 2003; Iaria et al., 2003; Igloi et al., 2010; Kahana et al., 1999; Marchette et al., 2011; Shelton & Gabrieli, 2002, 2004; Viard et al., 2011; Weniger et al., 2010; Wolbers & Büchel, 2005; Wolbers et al., 2004), most have not explicitly compared encoding and retrieval of spatial information or examined change over time. A few studies have looked at environments that the navigator was highly familiar with (Maguire et al., 1997; Morgan et al., 2011; Rosenbaum et al., 2004; Schinazi & Epstein, 2010; Spiers & Maguire, 2006, 2007a, 2007b), but the majority of tasks have relied on environments learned just prior to testing. The differences in neural activation between encoding and retrieval need to be examined in more detail. In addition, the short-term topographical knowledge required to participate in an experiment may be fundamentally different from that of places where the navigator has lived for years and that he or she will “always know,” such as the town in which he or she grew up. For example, Rosenbaum, Ziegler, Winocur, Grady, and Moscovitch (2004) found that the parahippocampus was selectively active in many aspects of recalling a well-known area, including judgments of proximity, distance judgments, landmark sequencing, and blocked-route navigation, as well as landmark identification, but surprisingly, they did not see similar activity in the hippocampus. 
In addition, several case studies of patients with hippocampal damage have shown that they are as accurate as controls on tests of both route and survey knowledge of regions that they knew well before the damage occurred (Spiers & Maguire, 2007b). Both of these results are inconsistent with other findings in the neuroimaging literature.
Finally, it is important to point out that little neuroimaging research has investigated how these forms of spatial learning are modulated by other factors, such as active or passive learning or attention. One challenge in studying spatial navigation is controlling the cognitive processes while maintaining ecological validity. On that front, research on spatial memory has compared favorably with many domains of cognitive science and neuroscience, by presenting environments that are similar to real-life situations. However, due to the technological limitations of fMRI, imaging studies of navigation necessarily minimize the contribution of idiothetic information—that is, body-based information from self-motion—both during learning and during test, and both are important for survey knowledge (see Chrastil & Warren, 2012, for a review). While some aspects of path integration can be performed with vision alone, the majority of the information comes from idiothetic sources (Kearns, 2003; Tcheang et al., 2011). Thus, imaging studies have relied on passive image sequences or desktop VR, which may lead to the incomplete formation of survey knowledge and inconsistent neural activity. One method that researchers have used to circumvent this difficulty has been to use environments that are well-known to the participants, where they had walked frequently in the past (e.g., Maguire et al., 1997; Morgan et al., 2011; Rosenbaum et al., 2004; Schinazi & Epstein, 2010; Spiers & Maguire, 2006, 2007a, 2007b). The drawback is that this technique cannot examine encoding, and, as was mentioned above, long-term topographical knowledge may differ from short-term knowledge. Another possibility would be to learn by walking in a real or immersive virtual environment and then to test in the scanner, but again, this approach would ignore the question of encoding. 
More creative solutions will be needed to fully address the problem of encoding metric survey information while simultaneously recording neural activity.
Thus, the neuroimaging research has been limited by confusion between place and survey knowledge, lack of research on encoding and the relative order of acquisition of spatial knowledge, and a lack of idiothetic information in fMRI. Despite these limitations, the current balance of the literature suggests that some mapping exists between neural activity and landmark, route, graph, and survey knowledge, but that a simple three-part neural system is inadequate to fully understand how the brain supports spatial navigation. Research on the neural correlates of human navigation must be conscious of the specific task demands of a particular experimental paradigm and avoid lumping subtly different tasks into broad categories that may ultimately lead to inconsistent results. While the classification into landmark, route, and survey knowledge has proven useful to understanding spatial navigation, it is important to be aware of its limitations, for both brain and behavior. This classification remains useful for understanding navigation behavior, but it requires elaboration when considering the brain. The taxonomy presented here attempts to clarify the cognitive processes and subprocesses involved in spatial navigation, which have more distinct neural correlates than do the broader categories. The taxonomy can be used to develop new paradigms that aim to test specific processes of spatial navigation on both the behavioral and neural levels. Appreciation of both the behavioral and neural contributions of landmark, route, graph, and survey knowledge can lead to a deeper understanding of spatial learning.
The bird’s-eye perspective is also considered egocentric, since locations are encoded relative to the viewer and the perspective has a viewpoint. However, relative distances and angles between locations can also be observed from a bird’s-eye perspective, providing the viewer access to survey information that is not available from a ground-level viewpoint.
Others have found distinct neural correlates between route and survey knowledge (e.g., Shelton & Gabrieli, 2002; Wolbers & Büchel, 2005). The framework proposed here, however, suggests that differential neural activity in route and survey tasks may be the result of the specific cognitive processes involved.
London taxi drivers are required to train for 2–3 years and to pass stringent examinations. When driving a fare, they are required to take the straightest line between two points. This system is unusual compared with those for other taxi drivers, but it makes London taxi drivers ideal for studying spatial navigation, and they are the most studied group of professional drivers (Maguire et al., 1997).
Preparation of the manuscript was supported by National Science Foundation awards BCS-0214383 and BCS-0843940, as well as by the National Aeronautics and Space Administration/Rhode Island Space Grant Consortium. The author also thanks William H. Warren and Mike Tarr for their helpful comments on an earlier version of the manuscript, and Rebecca Burwell for her comments and assistance with the figure.