Effect of an Ensemble Algorithm in Reinforcement Learning for Garbage-Collection Sailing

  • Conference paper
  • Robotic Sailing 2016 (WRSC/IRSC 2016)

Abstract

A robot sailor can acquire its behaviour autonomously through reinforcement learning. Reinforcement learning, however, suffers from the curse of dimensionality: as state variables are added, the number of states needed to realize fine control grows exponentially. This paper introduces an ensemble algorithm into Q-learning so that a robot sailor can collect garbage while sailing, and discusses the effect of that ensemble algorithm. In particular, it investigates whether ensemble decision-making lets the robot reach the target position faster while keeping both the number of state variables and the number of states small. Numerical experiments show a statistically significant improvement from the proposed ensemble decision-making algorithm across diverse numbers of agents, state variables, and learning parameters.

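The paper itself ships no source code, but the kind of ensemble decision-making the abstract describes can be sketched concisely. The Python sketch below is a minimal, assumption-laden illustration: several independent tabular Q-learning agents, each with its own learning parameters, vote on the greedy action by simple majority. All names, signatures, and parameter values here are invented for illustration; they are not the authors' implementation.

    import random
    from collections import defaultdict, Counter

    class QLearner:
        # One tabular Q-learning agent. Each ensemble member may use its own
        # learning rate and exploration rate, and (in the paper's setting)
        # its own coarse discretization of the boat's state.
        def __init__(self, n_actions, alpha=0.1, gamma=0.95, epsilon=0.1):
            self.q = defaultdict(float)   # maps (state, action) -> value
            self.n_actions = n_actions
            self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

        def greedy(self, state):
            # Action with the highest Q-value for this agent's view of the state.
            return max(range(self.n_actions), key=lambda a: self.q[(state, a)])

        def act(self, state):
            # Epsilon-greedy exploration while learning.
            if random.random() < self.epsilon:
                return random.randrange(self.n_actions)
            return self.greedy(state)

        def update(self, s, a, r, s_next):
            # Standard one-step Q-learning update.
            best_next = max(self.q[(s_next, a2)] for a2 in range(self.n_actions))
            self.q[(s, a)] += self.alpha * (r + self.gamma * best_next - self.q[(s, a)])

    def ensemble_action(agents, state):
        # Majority vote over the agents' greedy actions; ties broken at random.
        votes = Counter(agent.greedy(state) for agent in agents)
        top = max(votes.values())
        return random.choice([a for a, v in votes.items() if v == top])

    # Hypothetical usage: three agents with diverse learning parameters.
    agents = [QLearner(n_actions=5, alpha=a, epsilon=e)
              for a, e in [(0.05, 0.05), (0.1, 0.1), (0.2, 0.15)]]
    # action = ensemble_action(agents, current_state)

In the experiments the abstract summarizes, diversity could equally come from giving each learner a different state discretization rather than different parameters; either way, the voting step is what allows each agent's state space to stay small while the joint decision is sharpened.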

Author information

Correspondence to Kanta Tachibana.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Tachibana, K., Fukazawa, R. (2017). Effect of an Ensemble Algorithm in Reinforcement Learning for Garbage-Collection Sailing. In: Alves, J., Cruz, N. (eds) Robotic Sailing 2016. WRSC/IRSC 2016. Springer, Cham. https://doi.org/10.1007/978-3-319-45453-5_7

  • DOI: https://doi.org/10.1007/978-3-319-45453-5_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-45452-8

  • Online ISBN: 978-3-319-45453-5

  • eBook Packages: Engineering (R0)
