Abstract
Flexible behavioral decision making is an important prerequisite for mobile robots that perform varied tasks. To address the poor real-time performance and adaptability of traditional methods, this paper proposes a method that simulates cerebellar function with a developmental network. The method also simulates the “what” and “where” channels of the visual system and the neuromodulatory mechanisms of dopamine and serotonin, which improves the adaptability of the cerebellar model to behavioral decision making under a supervised learning strategy. The paper pays particular attention to simulating cerebellar reinforcement learning. By simulating the sleep recall mechanism of the hippocampus and the neuromodulatory mechanisms of acetylcholine and norepinephrine, a mobile robot can learn continuously and stably in unfamiliar environments, which improves the real-time performance and adaptability of its behavioral decision making. Simulation results in both static and dynamic environments, together with results in a static physical environment, validate the potential of the model and indicate that a cerebellar model based on reinforcement learning plays an important role in the behavioral decision making of mobile robots.
Acknowledgements
This work was financially supported by the National Natural Science Foundation of China under Grant No. 62173309 and by the Major Science and Technology Projects of Longmen Laboratory under Grant No. 231100220200.
Author information
Contributions
YZ proposed the research idea, conducted the experiments, and wrote the manuscript; DW discussed the work with the first author and revised the manuscript; LL checked the paper and corrected writing errors.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no competing interest.
Ethics approval
Not applicable.
About this article
Cite this article
Zhou, Y., Wang, D. & Liu, L. Exploring unknown environments: motivated developmental learning for autonomous navigation of mobile robots. Intel Serv Robotics 17, 197–219 (2024). https://doi.org/10.1007/s11370-023-00504-3
DOI: https://doi.org/10.1007/s11370-023-00504-3