Abstract
Artificial intelligence is currently achieving impressive success in many fields. However, autonomous navigation remains a major challenge for AI. Reinforcement learning is used for target navigation to simulate the interaction between the brain and the environment at the behavioral level, but artificial neural networks trained by reinforcement learning cannot match the autonomous mobility of humans and animals. The hippocampus–striatum circuits are considered key circuits for target navigation planning and decision-making. This paper aims to construct a bionic reinforcement learning navigation model corresponding to this nervous system to improve the autonomous navigation performance of a robot. The ventral striatum is considered the behavioral evaluation region, and the hippocampus–striatum circuit constitutes the position–reward association. In this paper, an episode cognition and reinforcement learning system simulating the mechanisms of the hippocampus and ventral striatum is constructed and used to provide target guidance for the robot while performing autonomous tasks. Compared with traditional methods, this system achieves higher learning efficiency and better environmental adaptability. Our research is an exploration of the intersection and fusion of artificial intelligence and neuroscience, which is conducive both to the development of artificial intelligence and to the understanding of the nervous system.
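The hippocampus–striatum division described in the abstract (place/episode codes feeding a value-learning critic that trains a policy) is commonly formalized as an actor–critic architecture. The sketch below is a minimal illustration of that idea, not the paper's actual model: a tabular actor–critic on a small grid arena, where the state table stands in for hippocampal place codes, the critic's value table for the ventral-striatal position–reward association, and the TD error for the dopaminergic teaching signal. All names, grid sizes, and hyperparameters here are illustrative assumptions.

```python
import numpy as np

def train_actor_critic(size=5, goal=(4, 4), episodes=400,
                       alpha=0.1, gamma=0.95, seed=0):
    """Tabular actor-critic navigation on a size x size grid arena.

    States play the role of hippocampal place/episode codes; the
    critic's value table V stands in for the ventral-striatal
    position-reward association; the actor's preferences H define
    the policy trained by the critic's TD error.
    """
    rng = np.random.default_rng(seed)
    n = size * size
    actions = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    V = np.zeros(n)            # critic: state values
    H = np.zeros((n, 4))       # actor: action preferences
    idx = lambda r, c: r * size + c

    for _ in range(episodes):
        r, c = 0, 0                            # start in one corner
        for _ in range(200):                   # step cap per episode
            s = idx(r, c)
            p = np.exp(H[s] - H[s].max())
            p /= p.sum()                       # softmax policy
            a = rng.choice(4, p=p)
            dr, dc = actions[a]
            r2 = min(max(r + dr, 0), size - 1)
            c2 = min(max(c + dc, 0), size - 1)
            done = (r2, c2) == goal
            reward = 1.0 if done else -0.01    # goal reward, small step cost
            s2 = idx(r2, c2)
            # TD error: the dopamine-like teaching signal
            td = reward + (0.0 if done else gamma * V[s2]) - V[s]
            V[s] += alpha * td                 # critic update
            H[s, a] += alpha * td * (1 - p[a])  # actor update (policy-gradient style)
            r, c = r2, c2
            if done:
                break
    return V, H

def greedy_path_length(H, size=5, goal=(4, 4)):
    """Roll out the learned greedy policy from the start corner."""
    actions = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    r, c = 0, 0
    for t in range(100):
        if (r, c) == goal:
            return t
        dr, dc = actions[int(np.argmax(H[r * size + c]))]
        r = min(max(r + dr, 0), size - 1)
        c = min(max(c + dc, 0), size - 1)
    return None  # policy loops without reaching the goal
```

After training, the greedy policy typically reaches the goal in close to the Manhattan-optimal number of steps; the learned value table rises monotonically toward the goal, which is the "position–reward association" the abstract attributes to the hippocampus–striatum circuit.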
Data availability
The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.
Funding
This work was funded by the National Key R&D Program of China to Fusheng Zha under Grant No. 2020YFB13134 and by the Natural Science Foundation of China to Fusheng Zha under Grant Nos. U2013602, 52075115, 51521003, and 61911530250.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yuan, J., Guo, W., Hou, Z. et al. Reinforcement Learning Navigation for Robots Based on Hippocampus Episode Cognition. J Bionic Eng 21, 288–302 (2024). https://doi.org/10.1007/s42235-023-00454-7