Odor supported place cell model and goal navigation in rodents
Experiments with rodents demonstrate that visual cues play an important role in the control of hippocampal place cells and spatial navigation. Nevertheless, rats may also rely on auditory, olfactory and somatosensory stimuli for orientation. It is also known that rats can track odors or self-generated scent marks to find a food source. Here we model odor supported place cells by using a simple feed-forward network and analyze the impact of olfactory cues on place cell formation and spatial navigation. The obtained place cells are used to solve a goal navigation task by a novel mechanism based on self-marking by odor patches combined with a Q-learning algorithm. We also analyze the impact of place cell remapping on goal directed behavior when switching between two environments. We emphasize the importance of olfactory cues in place cell formation and show that the utility of environmental and self-generated olfactory cues, together with a mixed navigation strategy, improves goal directed navigation.
Keywords: Self-marking navigation, Reinforcement learning, Q-learning, Place cell directionality, Remapping
Place cells are principal neurons in the hippocampus which respond maximally when the animal is in a specific location in an environment. They were discovered in the rat hippocampus by O’Keefe & Dostrovsky in 1971 (O’Keefe and Dostrovsky 1971; O’Keefe and Nadel 1978) and have been investigated in numerous studies (for reviews see Eichenbaum et al. 1999; Hölscher 2003). Place fields (PF) form from environmental cues and play an important role in spatial navigation. Cells with properties similar to those of rat place cells have also been found in humans using extracellular recordings from epileptic children (Ekstrom et al. 2003). Thus, the formation of PFs and their influence on navigation remain an important experimental and theoretical question. In particular, little is known about how different sensory cues contribute to PF formation and spatial navigation. The goal of the first part of this study is therefore to investigate how PFs are formed under visual as well as olfactory influences. In the second part, we address the question of how PFs can be used in navigation, and compare this to olfactory navigation based on self-laid scent marks.
1.1 PF formation and their relations to other hippocampal subsystems
Different models have been proposed for hippocampal place cell formation including Gaussian functions (O’Keefe and Burgess 1996; Touretzky and Redish 1996; Hartley et al. 2000; Foster et al. 2000), back-propagation algorithm (Shapiro and Hetherington 1993), auto-associative memory (Recce and Harris 1996), competitive learning (Sharp 1991; Brown and Sharp 1995), neural architecture based on landmark recognition (Gaussier et al. 2002), neuronal plasticity (Arleo and Gerstner 2000; Arleo et al. 2004; Strösslin et al. 2005; Sheynikhovich et al. 2005; Krichmar et al. 2005), independent component analysis (Takács and Lőrincz 2006; Franzius et al. 2007), self organizing map (Chokshi et al. 2003; Ollington and Vamplew 2004) or Kalman filter (Bousquet et al. 1998; Balakrishnan et al. 1999). None of these, however, addresses the question of how multiple sensory inputs might affect PF formation. Experiments with rodents demonstrate that visual cues play an important role for the control of place cells (Muller and Kubie 1987; Knierim et al. 1995; Collett et al. 1986; O’Keefe and Speakman 1987; Maaswinkel and Whishaw 1999; Dudchenko 2001). On the other hand, in the absence of visual cues rats can rely on other cues such as olfactory, auditory or somatosensory stimuli (Hill and Best 1981; Carvell and Simons 1990; Maaswinkel and Whishaw 1999; Wallace et al. 2002a). Thus, it seems reasonable to consider the influence of such cues also on the formation of PFs. This view is supported by the observation that PFs become unstable when olfactory cues are removed, suggesting that olfactory cues are important in the formation and stability of PFs (Markus et al. 1994; Save et al. 2000).
Other types of cells related to hippocampal place cells and spatial navigation are head direction cells and grid cells. Head direction cells are found in many brain areas including the postsubiculum, the thalamus, the lateral mammillary nucleus, the dorsal tegmental nucleus, and the striatum (Taube et al. 1990a, b; Muller et al. 1996; Knierim et al. 1998). Head direction cells respond maximally when the animal’s head is oriented in a preferred direction in the horizontal plane. Like place cells, head direction cells are under the control of distal stimuli, and have different preferred directions in different environments. Experimental data suggest that the head direction cell system may orient the place cell system (Jeffery and O’Keefe 1999; Calton et al. 2003; Yoganarasimha and Knierim 2005).
Grid cells are found in entorhinal cortex (Hafting et al. 2005; Sargolini et al. 2006; Barry et al. 2007). Grid cells, like place cells, fire strongly when an animal is in specific locations in an environment, but differ from place cells in that they have multi-peak firing fields which are organized into a hexagonal grid. It has been suggested that grid cells may make associations between places and events, which are needed for the formation of memories (Hafting et al. 2005).
1.2 Navigation guided by PFs and other influences
Many experimental studies have been performed on goal directed learning in rodents (Barnes et al. 1980; Morris 1984; Prados and Trobalon 1998; Lavenex and Schenk 1998; Maaswinkel and Whishaw 1999; Wallace et al. 2002a; Etienne and Jeffery 2004; Jeffery et al. 2003; Hines and Whishaw 2005). Navigation models based on place cells usually address goal learning by using reinforcement learning algorithms (Arleo and Gerstner 2000; Arleo et al. 2004; Strösslin et al. 2005; Sheynikhovich et al. 2005; Krichmar et al. 2005), where the place cell representation is based on a combination of visual information and information provided by head direction cells or path integration.
Path integration has been considered by many researchers as evidence for an additional mechanism when navigating in the absence of visual cues (for a review see Etienne and Jeffery 2004). Experimental data suggest that grid cells may be related to the path integration system (Hafting et al. 2005; Sargolini et al. 2006; McNaughton et al. 2006). However, Save et al. (2000) have shown that path integration alone is not sufficient to maintain stable receptive fields of place cells when rats navigate in the dark. Without additional cues, path integration leads to an accumulation of errors in direction and distance, and it thus needs to be reset through position information from stable cues (Etienne et al. 1996, 2004). Strösslin et al. (2005) claim that their model is able to work in the dark based on self-motion cues (visual cues together with path integration were used), yet it is unclear how the model can succeed if the visual cues used for recalibration are not available while navigating for a longer time in the dark.
Thus, for navigation in natural environments it seems reasonable to consider other sensory inputs, and it is known from the literature that rodents can form spatial representations based on olfactory cues and use this information for spatial orientation and navigation (Tomlinson and Johnston 1991; Lavenex and Schenk 1995, 1996, 1998). Experiments show that rats can track odors or self-generated scent marks to find a food source (Wallace et al. 2002a, 2003). To accommodate these findings, we propose a novel navigation mechanism based on self-marking by odor patches combined with a Q-learning algorithm based on (multi-sensory formed) place cells in order to improve spatial navigation.
Studies show that rats use visual and/or olfactory cues when available, and that such allothetic cues dominate over path integration information (idiothetic components) (Maaswinkel and Whishaw 1999; Whishaw et al. 2001). Therefore, the focus of the current study is on place cell formation and spatial navigation in cue-rich, illuminated environments, where path integration would be extraneous.
Another interesting consideration concerns the question of how navigation is affected by remapping. It is known that PFs change very quickly when the rat is confronted with a new environment and that many PFs will re-obtain their former properties as soon as the animal returns to the initial environment (Muller and Kubie 1987; Wilson and McNaughton 1993; Shapiro et al. 1997; Tanila et al. 1997; Knierim et al. 1995, 1998). It is, however, an unresolved question how remapping affects navigation and navigation (re-)learning (Jeffery et al. 2003).
1.3 Specific questions addressed
What is the contribution of olfactory cues to the formation of place cells and goal navigation?
Can goal navigation based on place cells be improved by additional navigation mechanisms?
How does the remapping of PFs influence goal navigation when switching between different environments?
The paper is organized as follows. First we describe the sensory inputs and the model system. Then we present different goal navigation strategies and thereafter we show the results of place cell analysis, and a comparison of the presented navigation algorithms. Finally, we discuss our results and relate them to other studies and biological data.
2.1 Sensory inputs
2.2 Place cell model
2.3 Navigation strategies
2.3.1 Closed loop context
2.3.2 Goal navigation task
The rat has to learn to navigate from its home location to the goal, i.e. the food source. The rat can use the allothetic visual and olfactory cues described above, but it cannot see or smell the food source (similar to the Morris water-maze task; Morris 1984). The rat gets a reward only when it approaches the goal location. The setup for such a spatial task is shown in Fig. 2(b). We use the same discrete environment (square box) as described above, where we have different landmarks on all four walls (see Fig. 1(a)). The home location of the rat is in the bottom-left corner, 1000 points from both walls, and is marked by a gray dot. The dimensions of the food source, marked by a square, are 2000×2000 points and it is located 3000 points from the left wall and 2000 points from the upper wall. At the beginning, the rat explores the environment randomly and finds the goal just by chance (dashed line), whereas after a few learning runs the rat finds a more or less direct path to the food source. Whenever the rat finds the food location we start a new run from the start position (home location). A maximum of 200 steps is allowed for one run, with a step size in the range of 400 to 600 points. In our model, in most cases (80%) the rat finds the goal in fewer than 200 steps during the first run, so the rat has enough time to find the goal even when navigating randomly. Another reason for the 200 step limit is related to the frustration phenomenon observed in animals, which return to “home-base” if the goal is not found within an expected time (Eilam and Golani 1989; Whishaw et al. 2001; Wallace et al. 2002b; Hines and Whishaw 2005; Nemati and Whishaw 2007).
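The task geometry described above can be sketched as a small simulation of one random exploration run. This is only an illustrative sketch: the side length of the box (`BOX`), the uniform sampling of heading and step length, and the clamping at the walls are assumptions that do not come from the text, and the names `in_goal` and `random_run` are hypothetical.

```python
import math
import random

BOX = 10_000.0                 # ASSUMED side length of the square box, in points
HOME = (1_000.0, 1_000.0)      # 1000 points from the left and the bottom wall
GOAL_X = (3_000.0, 5_000.0)    # 2000-point-wide square, 3000 points from the left wall
GOAL_Y = (BOX - 4_000.0, BOX - 2_000.0)  # upper edge 2000 points below the upper wall
MAX_STEPS = 200                # step limit per run, as in the text


def in_goal(x, y):
    """Return True if (x, y) lies inside the food-source square."""
    return GOAL_X[0] <= x <= GOAL_X[1] and GOAL_Y[0] <= y <= GOAL_Y[1]


def random_run(rng):
    """Random exploration from HOME; returns the step at which the goal
    was reached, or None if the 200-step limit was hit first."""
    x, y = HOME
    for step in range(1, MAX_STEPS + 1):
        heading = rng.uniform(0.0, 2.0 * math.pi)   # assumed: uniform heading
        length = rng.uniform(400.0, 600.0)          # step size of 400 to 600 points
        # Assumed wall handling: clamp the position to the box.
        x = min(max(x + length * math.cos(heading), 0.0), BOX)
        y = min(max(y + length * math.sin(heading), 0.0), BOX)
        if in_goal(x, y):
            return step
    return None


rng = random.Random(0)
results = [random_run(rng) for _ in range(1000)]
found = sum(r is not None for r in results)
print(f"goal found in {found} of {len(results)} random runs")
```

The fraction of successful runs printed here depends on the assumed box size and movement statistics, so it should not be expected to reproduce the 80% figure reported for the model.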
2.3.3 Q-learning with function approximation
2.3.4 Self-marking navigation
2.3.5 Combining Q-learning with self-marking navigation
2.4 Remapping and navigation
It is known from the literature that PFs can change in firing rate, position, or shape, or turn on or off, when the animal is exposed to different environments, a phenomenon which is called remapping (Muller and Kubie 1987; Wilson and McNaughton 1993; Shapiro et al. 1997; Tanila et al. 1997; Knierim et al. 1995; Knierim et al. 1998). Fundamental changes occur within 5 to 10 minutes of exploration in a new environment, whereas the firing rate can change even within the first second (Wilson and McNaughton 1993). In this study we also investigate how remapping of place cells affects the goal navigation task when the rat switches between different environments. We compare different navigation strategies with respect to a change of environmental cues, as well as to a change of the goal location.
To compare Q-learning based on PFs obtained from combined visual and olfactory stimuli with the combination of Q-learning and navigation based on self-generated odor marks, we perform different sets of experiments. In the first set of experiments, we switch between two environments “A” and “B”, changing only the environmental cues and keeping the location of the goal unchanged (see Fig. 3(a)). In the second set of experiments, we switch between the environments “A” and “C”, and in “C” the environmental cues as well as the location of the food source are changed.
3.1 Place cell analysis
Before looking at the comparison of goal navigation strategies we would like to investigate the contribution of the olfactory input to place cell formation. This influence can be assessed by measuring the directionality of place cells. For this investigation, we let the rat explore the environment randomly, as shown in Fig. 5(e), for 5000 time steps (development phase). We used a relatively low rate factor (μ = 0.01) to develop the connection weights between the input and the output layer (see Fig. 1(c)), because weights oscillate and do not converge when a high rate factor (μ = 0.1) is used, and this does not lead to the final stabilization of place cells. For a comparison of weight development for different rate factors see Fig. 5(g). After the development phase we let the rat move in the environment for another 5000 time steps to create test data. To evaluate the directionality of place cells we looked at the locations which had been passed by the rat in different directions. We say that a cell is omnidirectional, i.e. independent of the movement direction, if at a given location the cell fires with its highest firing rate regardless of the direction in which the location is crossed. Averaged results of 20 experiments are presented in Fig. 5(f), where we compare the directionality of place cells obtained from visual cues alone with that obtained from both visual and olfactory stimuli. The white bars show the control case, with place cell directionality before the development phase (i.e. before learning). We can see that we obtain more omnidirectional cells when we use combined stimuli compared to visual stimuli alone, and that more omnidirectional cells develop during the development phase compared to the control case. The improvement in omni-directionality when using olfactory cues can be explained by the fact that perception of olfactory cues is direction independent whereas perception of visual cues depends on local views.
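The omnidirectionality criterion above can be illustrated with a short sketch. It reads the criterion as "the same cell is the maximally firing cell at a location for every direction in which that location was crossed"; this reading, the data layout `(location, direction, rates)`, and the function name `omnidirectional_cells` are assumptions for illustration, not the analysis code used in the study.

```python
from collections import defaultdict


def omnidirectional_cells(samples):
    """samples: iterable of (location, direction, rates), where `rates`
    is a list of firing rates, one per place cell.

    Returns the set of cell indices that are the maximally firing cell
    at some location for every direction observed at that location
    (at least two distinct directions are required).
    """
    winners = defaultdict(dict)   # location -> {direction: winning cell index}
    for location, direction, rates in samples:
        # If a (location, direction) pair repeats, the last sample is kept.
        winners[location][direction] = max(range(len(rates)), key=rates.__getitem__)

    omni = set()
    for by_direction in winners.values():
        cells = set(by_direction.values())
        if len(by_direction) >= 2 and len(cells) == 1:
            omni.add(cells.pop())
    return omni


test_samples = [
    ((0, 0), "N", [0.9, 0.1]),   # cell 0 wins when heading north
    ((0, 0), "E", [0.8, 0.2]),   # cell 0 also wins when heading east
    ((1, 0), "N", [0.2, 0.7]),   # cell 1 wins heading north ...
    ((1, 0), "S", [0.6, 0.3]),   # ... but cell 0 wins heading south
]
print(omnidirectional_cells(test_samples))
```

In this toy input only cell 0 satisfies the criterion, because at location (1, 0) the winning cell depends on the movement direction.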
Note that the view-field influences the directionality of PFs: the larger the view-field, the fewer directional cells are obtained. Since rats do not have an omnidirectional view, we would still obtain more directional cells from visual information alone than from combined stimuli (visual and olfactory cues) or olfactory cues alone. Our results on place cell directionality are qualitatively similar to the experimental data of Battaglia et al. (2004). For further discussion on place cell directionality see Section 4.
3.2 Goal navigation
3.2.1 Comparison of different navigation strategies
3.2.2 Hierarchical input preference in spatial navigation
3.3.1 Remapping of PFs
3.3.2 Remapping and goal navigation
In the following subsection we present results on spatial navigation with respect to the remapping of PFs when switching between two different environments. For the environmental setup see Fig. 3. The results of goal navigation while switching between environments “A” and “B” are shown in Fig. 11(d–g), where the average number of steps needed to find the food source is plotted versus the number of runs for 200 experiments. Navigation results obtained by using Q-learning based on PCs obtained from visual and olfactory stimuli (VOQ) are presented in panel d, and results of the combined method (VOQS) are shown in panel e. Note that here we used a combined strategy without hierarchical input preference, i.e. the rat would still follow a scent trail after learning. We can see that by using both navigation strategies the rat can learn to find the goal in the two environments “A” and “B”, provided the location of the food source is the same in both environments, and it goes directly to the goal after returning to the previous environment. It is worthwhile to note that in our model we do not introduce unfamiliar cues to the rat in the new environment, but we just “fool” the rat by switching visual cues and changing the position and shape of olfactory cues. That is why we also observe that the rat uses some information (i.e. learned Q-values) from the previous environment, and it does not have to relearn from scratch when moved to the new environment. In panel d, for comparison, we show the control case where in environments “A” and “B” we initialize Q-values randomly from a uniform distribution within the interval [0;1]. The results for goal navigation while switching between environments “A” and “C” (the location of the goal is also changed) for the cases VOQ and VOQS are presented in Fig. 11(f, g), respectively. Here we found that the rat has to relearn the food location all the time, even if returned to the previously visited environment.
However, by employing the combined strategy (see panel g), the rat can easily find the food source in both environments even if the location of the goal is changed, because the rat just follows the trail of scent marks. Note that if we used the combined strategy with hierarchical input preference we would have obtained results similar to the case VOQ (see panel f), since after learning the rat would prefer environmental cues and navigate according to Q-values. In general, we observed that the rat can learn both environments when the location of the goal is unchanged but has to relearn the route in case of changes in both environmental cues and location of the goal. For further discussion on remapping results see Section 4.
In the following we compare our place cell model and goal navigation strategies with other approaches. We also discuss our results in relation to biological data.
A starting point for this study was experimental data which show that olfactory cues play an important role for the stability of PFs (Markus et al. 1994; Save et al. 2000) and navigation of rodents (Tomlinson and Johnston 1991; Lavenex and Schenk 1995, 1996, 1998; Wallace et al. 2002a, 2003). We have for the first time, to our knowledge, implemented an odor supported place cell model and applied it to goal navigation learning. Based on self-marking behavior in rodents (Harley and Martin 1999), we proposed a novel navigation mechanism which allows better performance in goal directed navigation. We predict that the use of environmental odor cues improves omni-directionality of place cells, which as a consequence results in faster goal directed learning, whereas the use of self-generated scent marks results in even faster learning, and could serve as additional information for path finding when environmental cues are not available.
4.1 Place cell model
We modeled place cells from visual and olfactory cues using a feed-forward network based on radial basis functions (RBF). Here we used an abstract model excluding interactions between hippocampal layers. This is justified as we did not focus on the place cell model itself but rather on the contribution of sensory inputs to the formation of place cells and on the utilization of place cells in spatial navigation. Our approach is similar to the model of O’Keefe and Burgess (1996) or Hartley et al. (2000), but we use n-dimensional RBFs instead of calculating the thresholded sum of the Gaussian tuning-curves of the rat’s distance from each box wall (O’Keefe and Burgess 1996). Our model differs from the augmented model of Hartley et al. (2000), where the firing rate of a place cell is modeled as the thresholded sum of boundary vector cells (BVCs). The response of a BVC is the product of two Gaussian tuning curves, where one is a function of the distance from the rat to the wall and the second is a function of the rat’s head direction (Hartley et al. 2000). In these models, the amplitude and the width of the PF depend on the distance to the wall: the larger the distance, the lower the amplitude and the broader the field, and vice versa. In our model we keep the width of the PF, σ_f, fixed, and the obtained PFs vary in shape and amplitude because of the combination of different sensory inputs (see Fig. 4(c)). We use a winner-takes-all mechanism for PF formation, which means that we do not change the weights of neighboring neurons as in self-organizing map (SOM) approaches (Chokshi et al. 2003; Ollington and Vamplew 2004), as there are no obvious topographical relations between the positions of the PFs and the anatomical locations of the place cells relative to each other within the hippocampus (O’Keefe 1999).
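A minimal sketch of an RBF layer with winner-takes-all learning is given below to make the mechanism concrete. The Gaussian activation with a fixed width `sigma_f` and the rate factor `mu` follow the description in the text, but the exact activation formula, the update rule (moving only the winner's weights toward the input), and the function names are assumptions in the style of standard competitive learning, not the equations of the model itself.

```python
import math


def rbf_activations(x, weights, sigma_f):
    """Gaussian RBF response of each place cell to the sensory input x.
    weights[j] is the (assumed) preferred input vector of cell j."""
    return [
        math.exp(-sum((xi - wi) ** 2 for xi, wi in zip(x, w)) / (2.0 * sigma_f ** 2))
        for w in weights
    ]


def wta_update(x, weights, sigma_f, mu):
    """Winner-takes-all step: only the most active cell adapts.
    Neighboring cells are left unchanged, unlike in a self-organizing map."""
    acts = rbf_activations(x, weights, sigma_f)
    winner = max(range(len(acts)), key=acts.__getitem__)
    # Assumed rule: move the winner's weights toward the input by a fraction mu.
    weights[winner] = [wi + mu * (xi - wi) for xi, wi in zip(x, weights[winner])]
    return winner


cells = [[0.0, 0.0], [1.0, 1.0]]   # two cells, two-dimensional input
winner = wta_update([0.9, 0.8], cells, sigma_f=0.5, mu=0.1)
print(winner, cells)
```

With a small `mu` each input shifts the winning cell's weights only slightly, which is consistent with the observation in Section 3.1 that a low rate factor lets the weights settle while a high one makes them oscillate.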
In several studies (Arleo and Gerstner 2000; Arleo et al. 2004; Sheynikhovich et al. 2005; Strösslin et al. 2005) self-motion cues have been used as an additional input to hippocampus to create place cells. The disadvantage of self-motion cues is that path integration leads to an accumulation of errors in direction and distance, and needs to be re-calibrated according to position estimation from stable cues (Etienne et al. 1996, 2004). Save et al. (2000) have shown that path integration alone is insufficient to maintain the stability of PFs. If visual or olfactory sensory cues are available then these cues dominate over path integration information (Maaswinkel and Whishaw 1999; Whishaw et al. 2001). In contrast to other models we use odor cues as an additional input to form place cells. For the sake of simplicity we model static odors. Models of dynamic odors are quite complex and include many parameters (Boeker et al. 2000). By using static odors we ignore odor patch development, and effects that might be induced by changes of odors in time. Here we concentrate only on an odor function as a reference cue that is sensed unambiguously by the rat, as opposed to visual cues, which might be mismatched, misinterpreted or not seen at all. Obtained PFs capture similar properties to those that were found in the rats’ hippocampus (Muller and Kubie 1987; Muller et al. 1994; Wilson and McNaughton 1993; O’Keefe 1999).
Place cells tend to be less directional when navigating in an open environment as compared to navigation where the rat is forced to move along a specific direction (McNaughton et al. 1983; Muller et al. 1994; Markus et al. 1995). These properties have also been captured by the models of Sharp (1991) and Brunel and Trullier (1998). In this study, we have investigated the contribution of olfactory input to the directionality of place cells. From our analysis, we found that if olfactory cues are available for the formation of place cells, more omnidirectional fields develop. This agrees with observations of PFs by Battaglia et al. (2004) on cue-rich and cue-poor linear tracks. The proportion of omnidirectional cells over total spatially selective cells was ≈ 43% in a cue-rich environment vs. ≈ 30% in a cue-poor environment. We obtained more omnidirectional cells because cells tend to be more directional in eight-arm mazes or T-mazes compared to open environments (Muller et al. 1994; Markus et al. 1995). Our results support the notion that place cell directionality should influence goal directed behavior, as we obtained better performance in a goal navigation task when using place cells formed from combined visual and olfactory stimuli than when using place cells formed from visual cues alone.
4.2 Goal navigation learning
In the second part of our study we presented different navigation strategies and compared them in a goal navigation task and in a remapping situation. Goal navigation based on place cells has previously been addressed by implementing reinforcement learning algorithms (Arleo and Gerstner 2000; Arleo et al. 2004; Foster et al. 2000; Strösslin et al. 2005; Sheynikhovich et al. 2005; Krichmar et al. 2005). We presented a new navigation mechanism that combines Q-learning with navigation based on self-generated odor patches in order to achieve better performance in goal directed navigation. Our approach differs from that of Russell (1995), who developed a robotic system where the robot is able to lay an odor trail on the ground and to follow the trail afterward. In his approach the robot does not use odor marking to find a goal, whereas in our approach the rat lays scent marks in order to find a goal and to create a trail which leads to the food source. The proposed mechanism, based on self-marking, propagates scent marks backwards from the location of the reward as in reinforcement learning, but here we do not have predefined features, but rather create them “on the fly”, and we do not directly memorize action values associated with states. The mechanism of RBF-like features created on-line in action learning was used in several other studies (Kretchmar and Anderson 1997; Atkeson et al. 1997). The method of updating odor marks resembles a TD(0) approach with function approximation (Sutton and Barto 1998), where the weights towards the value function are increased if the following states have high values. The update rule in our study is, however, different from the one used in TD: updates of odor marks are made by a fixed amount based on the binary decision whether some odor is sensed at the current location or not.
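The contrast between the two update rules can be sketched as follows. `q_update` is the standard Q-learning step with linear function approximation over place cell activities (Sutton and Barto 1998). `mark_update` is only one possible reading of the fixed-amount, binary self-marking rule: the choice to deposit the mark at the previously visited location, the dictionary representation of marks, and all parameter names and default values are assumptions for illustration.

```python
def q_update(w, phi, action, reward, phi_next, alpha=0.1, gamma=0.9, terminal=False):
    """One Q-learning step with linear function approximation.
    w[a][i] is the weight from place cell i to action a;
    phi and phi_next are place cell activities before and after the step."""
    q_sa = sum(wi * p for wi, p in zip(w[action], phi))
    q_next = 0.0 if terminal else max(
        sum(wi * p for wi, p in zip(wa, phi_next)) for wa in w
    )
    delta = reward + gamma * q_next - q_sa            # TD error
    # The size of the change scales with the TD error and the cell activity.
    w[action] = [wi + alpha * delta * p for wi, p in zip(w[action], phi)]
    return delta


def mark_update(marks, prev_location, odor_sensed, amount=1.0):
    """Assumed self-marking step: if any odor (or the reward) is sensed at the
    current location, a FIXED amount is deposited at the previous location,
    so that marks spread backwards from the goal over repeated runs."""
    if odor_sensed:
        marks[prev_location] = marks.get(prev_location, 0.0) + amount
    return marks


w = [[0.0, 0.0], [0.0, 0.0]]                          # 2 actions x 2 place cells
d1 = q_update(w, [1.0, 0.0], 0, 1.0, [0.0, 1.0], alpha=0.5, terminal=True)
d2 = q_update(w, [0.0, 1.0], 1, 0.0, [1.0, 0.0], alpha=0.5)
marks = mark_update({}, (2, 3), odor_sensed=True)
print(d1, d2, w, marks)
```

The design difference is visible in the two functions: the Q-learning weight change is graded by the TD error, whereas the mark update ignores the magnitude of what is sensed and depends only on a yes/no decision.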
Experimental data show that rats perform better in cue-rich environments than in cue-poor environments. Barnes et al. (1980) showed that if all of the extra-maze cues surrounding a circular maze were removed, rats made many more errors finding a goal location. Morris (1984) demonstrated that rats performed worse when he obscured some of the cues around the water maze by pulling the curtains 1/4 of the way around. When he obscured all of the extra-maze cues by pulling the curtains fully around, the rats performed very badly. Prados and Trobalon (1998) showed that rats could learn the platform location in a water maze if 4 or 2 extra-maze cues were available, but they were much worse if only 1 cue was present. We addressed these findings by testing the performance of our model rat with and without olfactory input, where we observed that the model rat performed significantly better with both visual and olfactory cues compared to visual stimuli alone.
The experiments of Maaswinkel and Whishaw (1999) suggest that rats have a hierarchical preference in using sensory cues. In their experiments, rats ignored distortion in self-motion cues when they were moved to a new starting position, or ignored distortion in odor cues (scent marks) when the apparatus was rotated, suggesting that visual cues dominate over other cues whenever they are available. However, when blindfolded, the rats still performed well, suggesting that they were using odor cues when available, and path integration when odor cues were disrupted. To address these findings we modified our combined navigation strategy by adding an input preference component where the rat uses both environmental and self-generated cues for learning. After learning, the rat prefers environmental cues if they are available and uses self-generated olfactory cues when visual cues are not available. By using such a modified strategy, we have demonstrated that the model rat succeeds in faster goal directed learning, showing unaffected performance when environmental cues are changed. This is supported by the finding that rats can find a goal when the scent trail is distorted or removed, or can find the route to the goal using self-laid odor cues when environmental cues are unavailable.
4.3 Remapping and goal navigation
The results for goal navigation with respect to remapping of place cells show that the rat can learn to find a goal in two environments, “A” and “B”, by using Q-learning or combined navigation when the location of the goal is unchanged, but environmental cues are switched. Note that the rat can learn both environments only as long as different, partially overlapping subsets of place cells fire in the environments “A” and “B”, i.e. most of the cells which do not fire in the environment “A” fire in the environment “B”. In case of cue rotation the rat would need to relearn the task all the time if the location of the goal is not rotated together with the landmarks, because in both environments the same subset of place cells would be used. This is equivalent to leaving the environment the same, but changing the location of the goal. Similarly, in the Morris water-maze experiment (Morris 1981) the rat has to relearn the location of the platform whenever it is moved to another location. When environments are substantially different and the cells remap, in our experiments the rat can easily find the food source in both environments, even if the location of the goal is changed, by employing the combined strategy, because the rat can use the trail of scent marks.
Our model predicts that the remapping of PFs would disrupt a previously learned route to a goal. The closest empirical data addressing this prediction is a study by Jeffery et al. (2003), who examined the relationship between remapping and performance of a spatial navigation task. In their experiment, rats were trained to search for a food source in a black box, and subsequently tested in a white box. Jeffery et al. (2003) found that place cells remapped between the two boxes, and although the rats were slightly worse in the second environment, they still performed well. This finding suggests that, although the place cells may encode spatial contexts, they do not directly guide behavior. One difference between the experimental situation of Jeffery et al. (2003) and that of the current model is that in the experimental situation there were no landmarks within the square apparatus. Instead, rats relied on spatial landmarks (posters on the curtains surrounding the apparatus) for orientation. So, in the Jeffery et al. (2003) experiment, unlike in our model, cues outside the immediate environment were the only way in which the animal could distinguish the correct corner. The results of Yoganarasimha and Knierim (2005) suggest that head direction cells are influenced by distal landmarks, whereas some place cells are influenced by local landmarks. Thus it may be that the Jeffery et al. (2003) task was one that could not be solved using place cells: with no local cues available within the square, there was no way of distinguishing one corner of the apparatus from the other. Rats may have used a non-place cell representation, such as the head direction cell system, to solve the task. Had there been local cues inside the square enclosure and no cues outside the enclosure, a stronger link between remapping and disrupted navigation may have been observed. An acknowledged difficulty with this account, however, is that Jeffery et al. (2003) also show that this task is impaired by lesions of the hippocampus.
4.4 Predictions and suggested experiments
Present experimental studies on spatial learning in cue-rich versus cue-poor environments are still based on visual cues alone (Barnes et al. 1980; Morris 1984; Prados and Trobalon 1998). They also test the performance of the rat only after learning. It would thus be interesting to test whether real animals learn the task faster in environments with additional olfactory cues than with visual stimuli alone, as our model predicts.
Experiments on self-marking behavior in the process of learning would be useful to prove or disprove the proposed setup and the hypothesis that self-marking behavior speeds up learning.
In the Jeffery et al. (2003) experiment on place cell remapping and goal navigation, it may be that the task was one that could not be solved using place cells: with no local cues available within the square, there was no way of distinguishing one corner of the apparatus from the other. It would be interesting to conduct more experiments in order to test the hypothesis that remapping of place cells influences goal directed learning, as our model predicts.
By using a combined strategy with hierarchical input preference, the model rat creates two representations of the route to the goal: one is based on environmental cues, while the other is based on self-generated scent marks. Our model predicts that in the case of remapping, when the goal is at different locations in the two environments, the rat would fail when moved back to the previous environment, since it would prefer environmental cues. We hypothesize that the rat could use the scent trail in the trial following a failure to find the goal using environmental cues. Experiments to test this hypothesis would also be of great interest.
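The hierarchical preference described above can be illustrated with a minimal sketch. It is not the implementation used in our simulations. It assumes, for illustration only, that each representation has its own linear Q-function over its input activities, and that the fallback to the scent trail is triggered by a failed trial. The function names, the array shapes and the switching rule are hypothetical.

```python
import numpy as np


def choose_representation(env_failed_last_trial, scent_trail_available):
    """Hypothetical switching rule: environmental cues are preferred;
    the scent trail is used only after a failed trial, if a trail exists."""
    if env_failed_last_trial and scent_trail_available:
        return "scent"
    return "env"


def greedy_action(weights, activity):
    """Greedy action under an assumed linear Q-function:
    Q(s, a) = weights[a] . activity(s)."""
    return int(np.argmax(weights @ activity))


def select_action(q_env, q_scent, pc_activity, scent_activity,
                  env_failed_last_trial, scent_trail_available):
    """Return the representation used and the action it selects."""
    rep = choose_representation(env_failed_last_trial, scent_trail_available)
    if rep == "scent":
        return rep, greedy_action(q_scent, scent_activity)
    return rep, greedy_action(q_env, pc_activity)


# Toy values (made up): 4 actions, 3 place cells, 2 scent inputs.
q_env = np.array([[0.1, 0.0, 0.0],
                  [0.9, 0.0, 0.0],
                  [0.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])
q_scent = np.array([[0.0, 0.0],
                    [0.0, 0.0],
                    [0.0, 0.8],
                    [0.2, 0.0]])
pc_activity = np.array([1.0, 0.0, 0.0])
scent_activity = np.array([0.0, 1.0])

first_trial = select_action(q_env, q_scent, pc_activity, scent_activity,
                            env_failed_last_trial=False,
                            scent_trail_available=True)
after_failure = select_action(q_env, q_scent, pc_activity, scent_activity,
                              env_failed_last_trial=True,
                              scent_trail_available=True)
```

With these toy values, the first trial follows the environmental-cue representation, and the trial after a failure follows the scent-based one. This corresponds to the predicted behavior after the rat is moved back to the previous environment.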
RBF – radial basis function.
We thank Alexander Wolf for helpful comments.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
- Boeker, P., Wallenfang, O., Koster, F., Croce, R., Diekmann, B., Griebel, M., et al. (2000). The modelling of odour dispersion with time-resolved models. Agrartechnische Forschung, 4, E84–E89.
- Bousquet, O., Balakrishnan, K., & Honavar, V. (1998). Is the hippocampus a Kalman filter? In Proceedings of the Pacific symposium on biocomputing (pp. 655–666).
- Chokshi, K., Wermter, S., & Weber, C. (2003). Learning localisation based on landmarks using self-organisation. In ICANN (pp. 504–514).
- Kretchmar, R., & Anderson, C. (1997). Comparison of CMACs and radial basis functions for local function approximators in reinforcement learning. In Proceedings of the IEEE international conference on neural networks (pp. 834–837). Houston, TX.
- O’Keefe, J., & Nadel, L. (1978). The hippocampus as a cognitive map. Oxford: Oxford University Press.
- O’Keefe, J., & Speakman, A. (1987). Single unit activity in the rat hippocampus during a spatial memory task. Experimental Brain Research, 68(1), 1–27.
- Ollington, R., & Vamplew, P. (2004). Learning place cells from sonar data. In AISAT2004: International conference on artificial intelligence in science and technology (pp. 126–131).
- Prados, J., & Trobalon, J. (1998). The location of an invisible goal requires the presence of at least two landmarks. Psychobiology, 26, 42–48.
- Reynolds, S. I. (2002). The stability of general discounted reinforcement learning with linear function approximation. In UK workshop on computational intelligence (UKCI-02) (pp. 139–146). Birmingham, UK.
- Sharp, P. E. (1991). Computer simulation of hippocampal place cells. Psychobiology, 19(2), 103–115.
- Sheynikhovich, D., Chavarriaga, R., Strösslin, T., & Gerstner, W. (2005). Spatial representation and navigation in a bio-inspired robot. In Biomimetic neural learning for intelligent robots: Intelligent systems, cognitive robotics, and neuroscience (pp. 245–264).
- Sutton, R., & Barto, A. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT.
- Tomlinson, W. T., & Johnston, T. D. (1991). Hamsters remember spatial information derived from olfactory cues. Animal Learning and Behavior, 19, 185–190.
- Whishaw, I. Q., Hines, D. J., & Wallace, D. G. (2001). Dead reckoning (path integration) requires the hippocampal formation: Evidence from spontaneous exploration and spatial learning tasks in light (allothetic) and dark (idiothetic) tests. Behavioural Brain Research, 127(1–2), 49–69.