Particle Filter on Episode for Learning Decision Making Rule

Ueda, Ryuichi; Mizuta, Kotaro; Yamakawa, Hiroshi; Okada, Hiroyuki

doi:10.1007/978-3-319-48036-7_54

Conference paper
First Online: 11 February 2017

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 531)

Abstract

We propose a novel method, a particle filter on episode, for decision making by agents in the real world. The method serves both as a workable model for simulating behavioral experiments on rodents and as a decision-making algorithm for actual robots. Recent studies in neuroscience suggest that the hippocampus and its surrounding regions in the mammalian brain are involved in solving navigation problems, which are also essential in robotics. The hippocampus also handles memories, and some parts of the brain utilize them for decision making. The particle filter gives a computational model of decision making based on memories. In this paper, we verify that the method learns two kinds of tasks that have been frequently examined in behavioral experiments on rodents. Although the tasks differ in character, the algorithm made an actual robot behave appropriately in both tasks with an identical parameter set.

Author information

Authors and Affiliations

Chiba Institute of Technology, 2-17-1 Tsudanuma, Narashino, Chiba, Japan
Ryuichi Ueda
Riken Brain Science Institute, 2-1 Hirosawa, Saitama, Wako, Japan
Kotaro Mizuta
Dowango Artificial Intelligence Laboratory, Kabukiza Tower, 4-12-15 Ginza, Chuo-ku, Tokyo, Japan
Hiroshi Yamakawa
Graduate School of Brain Sciences, Tamagawa University, 6-1-1 Tamagawa-gakuen, Machida, Tokyo, Japan
Hiroyuki Okada

Corresponding author

Correspondence to Ryuichi Ueda.

Editor information

Editors and Affiliations

Shanghai Jiao Tong University, Shanghai, China
Weidong Chen
Osaka University, Osaka, Japan
Koh Hosoda
University of Padua, Padua, Italy
Emanuele Menegatti
Osaka University, Osaka, Japan
Masahiro Shimizu
Shanghai Jiao Tong University, Shanghai, China
Hesheng Wang

A Appendix

A.1 The Character of the Range Sensor

We measured the relation between sensor readings and the distance from the sensor to a wall in the environment. The result is shown in Fig. 9. Note that sensor readings shift easily with differences in conditions.

Table 4 A part of an episode

A.2 Actual Episode on the Experiment

Table 4 shows the first eight events of an experimental set, namely the first set of the experiment in Sect. 5.7.1. In the first trial, we set a reward of one at the left arm, and the robot obtained it. In the second trial, the robot did not obtain the reward placed at the right arm because it chose the action “left” at \(t=5\).

Cite this paper

Ueda, R., Mizuta, K., Yamakawa, H., Okada, H. (2017). Particle Filter on Episode for Learning Decision Making Rule. In: Chen, W., Hosoda, K., Menegatti, E., Shimizu, M., Wang, H. (eds) Intelligent Autonomous Systems 14. IAS 2016. Advances in Intelligent Systems and Computing, vol 531. Springer, Cham. https://doi.org/10.1007/978-3-319-48036-7_54

DOI: https://doi.org/10.1007/978-3-319-48036-7_54
Published: 11 February 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48035-0
Online ISBN: 978-3-319-48036-7
eBook Packages: Engineering, Engineering (R0)
