Abstract
Artificial intelligence is currently achieving impressive success in many fields. However, autonomous navigation remains a major challenge for AI. Reinforcement learning is used for target navigation to simulate the interaction between the brain and the environment at the behavioral level, but artificial neural networks trained by reinforcement learning cannot match the autonomous mobility of humans and animals. The hippocampus–striatum circuits are considered key circuits for target navigation planning and decision-making. This paper aims to construct a bionic reinforcement learning navigation model corresponding to this nervous system to improve the autonomous navigation performance of a robot. The ventral striatum is considered the behavioral evaluation region, and the hippocampus–striatum circuit constitutes the position–reward association. In this paper, an episode cognition and reinforcement learning system simulating the mechanisms of the hippocampus and ventral striatum is constructed and used to provide target guidance for the robot while performing autonomous tasks. Compared with traditional methods, this system achieves higher learning efficiency and better environmental adaptability. Our research is an exploration of the intersection and fusion of artificial intelligence and neuroscience, which is conducive both to the development of artificial intelligence and to the understanding of the nervous system.
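The hippocampus–striatum division described in the abstract (place/episode codes feeding a value-learning critic that trains a policy) is commonly formalized as an actor–critic architecture. The sketch below is a minimal illustration of that idea, not the paper's actual model: a tabular actor–critic on a small grid arena, where the state table stands in for hippocampal place codes, the critic's value table for the ventral-striatal position–reward association, and the TD error for the dopaminergic teaching signal. All names, grid sizes, and hyperparameters here are illustrative assumptions.

```python
import numpy as np

def train_actor_critic(size=5, goal=(4, 4), episodes=400,
                       alpha=0.1, gamma=0.95, seed=0):
    """Tabular actor-critic navigation on a size x size grid arena.

    States play the role of hippocampal place/episode codes; the
    critic's value table V stands in for the ventral-striatal
    position-reward association; the actor's preferences H define
    the policy trained by the critic's TD error.
    """
    rng = np.random.default_rng(seed)
    n = size * size
    actions = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    V = np.zeros(n)            # critic: state values
    H = np.zeros((n, 4))       # actor: action preferences
    idx = lambda r, c: r * size + c

    for _ in range(episodes):
        r, c = 0, 0                            # start in one corner
        for _ in range(200):                   # step cap per episode
            s = idx(r, c)
            p = np.exp(H[s] - H[s].max())
            p /= p.sum()                       # softmax policy
            a = rng.choice(4, p=p)
            dr, dc = actions[a]
            r2 = min(max(r + dr, 0), size - 1)
            c2 = min(max(c + dc, 0), size - 1)
            done = (r2, c2) == goal
            reward = 1.0 if done else -0.01    # goal reward, small step cost
            s2 = idx(r2, c2)
            # TD error: the dopamine-like teaching signal
            td = reward + (0.0 if done else gamma * V[s2]) - V[s]
            V[s] += alpha * td                 # critic update
            H[s, a] += alpha * td * (1 - p[a])  # actor update (policy-gradient style)
            r, c = r2, c2
            if done:
                break
    return V, H

def greedy_path_length(H, size=5, goal=(4, 4)):
    """Roll out the learned greedy policy from the start corner."""
    actions = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    r, c = 0, 0
    for t in range(100):
        if (r, c) == goal:
            return t
        dr, dc = actions[int(np.argmax(H[r * size + c]))]
        r = min(max(r + dr, 0), size - 1)
        c = min(max(c + dc, 0), size - 1)
    return None  # policy loops without reaching the goal
```

After training, the greedy policy typically reaches the goal in close to the Manhattan-optimal number of steps; the learned value table rises monotonically toward the goal, which is the "position–reward association" the abstract attributes to the hippocampus–striatum circuit.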
Data availability
The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.
Funding
This work was funded by the National Key R&D Program of China to Fusheng Zha under Grant No. 2020YFB13134 and by the Natural Science Foundation of China to Fusheng Zha under Grant Nos. U2013602, 52075115, 51521003, and 61911530250.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yuan, J., Guo, W., Hou, Z. et al. Reinforcement Learning Navigation for Robots Based on Hippocampus Episode Cognition. J Bionic Eng 21, 288–302 (2024). https://doi.org/10.1007/s42235-023-00454-7