Abstract
Path planning is a primary task for robotic fish, especially when the underwater environment is unknown. Conventional reinforcement learning algorithms usually exhibit poor convergence in unknown environments. To find the optimal path and increase convergence speed in an unknown environment, an improved reinforcement learning method using a simulated annealing approach is proposed for robotic fish navigation. A simulated annealing policy with a novel cooling method, rather than the usual ε-greedy policy, is adopted for action selection. The convergence speed of the algorithm is improved by a novel reward function with a goal-oriented strategy, and the stopping condition of the proposed reinforcement learning algorithm is revised as well. In this work, a robotic fish is designed and its prototype fabricated by 3D printing. The proposed algorithm is then examined in a 2D unpredictable environment to obtain greedy actions. Experimental results show that the proposed algorithm can generate an optimal path in an unknown environment for the robotic fish, increase the convergence speed, and balance exploration and exploitation.
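The action-selection idea described in the abstract can be sketched in Python. This is an illustrative reconstruction under stated assumptions, not the paper's exact method: a Metropolis-style acceptance rule decides between a random exploratory action and the current greedy action, a standard exponential cooling schedule stands in for the paper's (unspecified here) novel cooling method, and `goal_oriented_reward` is a generic distance-based shaping function of the kind the abstract alludes to. All function names and parameters are hypothetical.

```python
import math
import random

def sa_action(q_row, temperature, rng=random):
    """Metropolis-style action choice over one row of the Q-table:
    accept a random action over the greedy one with probability
    exp(dQ / T); behaves greedily as T -> 0, randomly as T -> inf."""
    n = len(q_row)
    greedy = max(range(n), key=lambda a: q_row[a])
    candidate = rng.randrange(n)
    delta = q_row[candidate] - q_row[greedy]  # <= 0 by construction
    if delta >= 0 or rng.random() < math.exp(delta / temperature):
        return candidate
    return greedy

def cooled(t0, alpha, episode):
    """Exponential cooling schedule T_k = T0 * alpha^k (an assumption,
    standing in for the paper's novel cooling method)."""
    return t0 * alpha ** episode

def goal_oriented_reward(state, next_state, goal, step_cost=-1.0, bonus=100.0):
    """Hypothetical goal-oriented shaping: a large bonus at the goal,
    otherwise a step cost plus credit for reducing distance to the goal."""
    if next_state == goal:
        return bonus
    return step_cost + (math.dist(state, goal) - math.dist(next_state, goal))
```

At a near-zero temperature `sa_action` reduces to pure exploitation (the greedy action), while at a high temperature it explores nearly uniformly, which is how the annealing policy trades off exploration against exploitation over training episodes.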
Acknowledgement
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Natural Science Foundation of Hubei Province (20181j001: Interfacial Defects Initiation Mechanism of Flexible Laminated Thin Film Energy Harvester and its Fabrication Process).
Copyright information
© 2018 Springer Nature Switzerland AG
Cite this paper
Hu, J., Mei, J., Chen, D., Li, L., Cheng, Z. (2018). Path Planning of Robotic Fish in Unknown Environment with Improved Reinforcement Learning Algorithm. In: Xiang, Y., Sun, J., Fortino, G., Guerrieri, A., Jung, J. (eds) Internet and Distributed Computing Systems. IDCS 2018. Lecture Notes in Computer Science(), vol 11226. Springer, Cham. https://doi.org/10.1007/978-3-030-02738-4_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02737-7
Online ISBN: 978-3-030-02738-4
eBook Packages: Computer Science (R0)