Abstract
A robot sailor can acquire its behaviour autonomously through reinforcement learning. However, reinforcement learning suffers from the curse of dimensionality: the number of states needed for fine control grows exponentially with the number of state variables. This paper introduces an ensemble algorithm in Q-learning to allow robot sailors to collect garbage while sailing, and discusses the effect of the ensemble algorithm. In particular, it investigates whether ensemble decision-making lets the robot sail faster to the target position while keeping both the number of state variables and the number of states small. Numerical experiments show a statistically significant improvement from the proposed ensemble decision-making algorithm across varied numbers of agents, state variables, and learning parameters.
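The idea of ensemble decision-making over several small Q-tables can be illustrated with a minimal sketch. The abstract does not state how the agents' decisions are combined, so the majority vote below is an assumption, as are the names `ensemble_action` and `q_update` and the default values of `alpha` and `gamma`. Each agent is assumed to hold its own coarse Q-table over its own small set of state variables, which keeps every table small.

```python
import numpy as np

def ensemble_action(q_tables, states):
    """Pick an action by majority vote over each agent's greedy choice.

    Assumption: majority voting is used here for illustration only;
    the abstract does not specify the combination rule.

    q_tables: list of arrays, q_tables[i] has shape (n_states_i, n_actions)
    states:   list of ints, states[i] is agent i's discretised state index
    """
    n_actions = q_tables[0].shape[1]
    votes = np.zeros(n_actions, dtype=int)
    for q, s in zip(q_tables, states):
        # Each agent votes for the action with the highest Q-value in its state.
        votes[int(np.argmax(q[s]))] += 1
    # Ties go to the lowest action index (np.argmax returns the first maximum).
    return int(np.argmax(votes)), votes

def q_update(q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update, applied in place to one agent's table."""
    q[s, a] += alpha * (reward + gamma * np.max(q[s_next]) - q[s, a])
```

With this layout, adding an agent adds one more small table rather than multiplying the size of a single joint table, which is one way to read the abstract's goal of keeping the number of states small.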
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Tachibana, K., Fukazawa, R. (2017). Effect of an Ensemble Algorithm in Reinforcement Learning for Garbage-Collection Sailing. In: Alves, J., Cruz, N. (eds) Robotic Sailing 2016. WRSC/IRSC 2016. Springer, Cham. https://doi.org/10.1007/978-3-319-45453-5_7
DOI: https://doi.org/10.1007/978-3-319-45453-5_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45452-8
Online ISBN: 978-3-319-45453-5
eBook Packages: Engineering, Engineering (R0)