
Optimal path planning method based on epsilon-greedy Q-learning algorithm

  • Technical Paper
  • Published in: Journal of the Brazilian Society of Mechanical Sciences and Engineering

Abstract

Path planning in environments with obstacles is an ongoing problem for mobile robots. The Q-learning algorithm has gained importance because it lets a robot learn through interaction with its environment; however, the size of the state space and the computational cost remain the main aspects to be improved. Hence, this paper proposes an improved epsilon-greedy Q-learning (IEGQL) algorithm to enhance efficiency and productivity with respect to path length and computational cost. It is important to determine an effective reward function and to adjust the agent's next action so as to balance exploitation and exploration. We present a new reward function that gives the mobile robot advance knowledge of the environment. Additionally, a novel mathematical model is proposed to provide optimal action selection while ensuring rapid convergence. Since a mobile robot has difficulty traversing a path with sharp corners, a smooth path is formed after the optimal skeleton path is obtained. Furthermore, a real-world experiment based on a multi-objective function is presented, together with a benchmark of the proposed IEGQL algorithm against the classical EGQL and A-star algorithms. The experimental results and performance analysis indicate that the IEGQL algorithm generates the optimal path in terms of path length, computation time, low jerk, and closeness to the optimal skeleton path.
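The classical epsilon-greedy Q-learning (EGQL) baseline that IEGQL improves upon can be sketched as follows. This is a minimal illustration of the standard algorithm only, not the paper's IEGQL: the reward values (+100 goal, -100 obstacle, -1 per step), the 4-connected grid encoding, and all hyperparameters are illustrative assumptions, not the reward function or model proposed in the paper.

```python
import random

def epsilon_greedy_q_learning(grid, start, goal, episodes=500,
                              alpha=0.5, gamma=0.9, epsilon=0.1):
    """Classical epsilon-greedy Q-learning on a 4-connected grid.

    grid: 2D list, 0 = free cell, 1 = obstacle.
    Rewards (illustrative): +100 at the goal, -100 for a blocked
    move, -1 per ordinary step to favour short paths.
    """
    rows, cols = len(grid), len(grid[0])
    actions = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    q = {}  # (state, action_index) -> Q-value

    def step(state, a):
        r, c = state[0] + actions[a][0], state[1] + actions[a][1]
        if not (0 <= r < rows and 0 <= c < cols) or grid[r][c]:
            return state, -100.0, False          # blocked: stay put, penalty
        if (r, c) == goal:
            return (r, c), 100.0, True           # goal reached
        return (r, c), -1.0, False               # ordinary move

    for _ in range(episodes):
        state, done = start, False
        for _ in range(4 * rows * cols):         # cap episode length
            if random.random() < epsilon:        # explore
                a = random.randrange(4)
            else:                                # exploit best known action
                a = max(range(4), key=lambda i: q.get((state, i), 0.0))
            nxt, reward, done = step(state, a)
            best_next = max(q.get((nxt, i), 0.0) for i in range(4))
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if done:
                break

    # Greedy rollout of the learned policy to extract a path
    path, state = [start], start
    for _ in range(rows * cols):
        if state == goal:
            break
        a = max(range(4), key=lambda i: q.get((state, i), 0.0))
        state, _, _ = step(state, a)
        if state == path[-1]:                    # stuck against a wall
            break
        path.append(state)
    return path
```

On a small grid with a central obstacle, e.g. `epsilon_greedy_q_learning([[0,0,0],[0,1,0],[0,0,0]], (0,0), (2,2))`, the learned greedy policy routes around the blocked cell. The skeleton path produced this way still has right-angle corners, which is why the paper applies a smoothing step afterwards.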



Author information


Contributions

This article was written entirely by Vahide Bulut.

Corresponding author

Correspondence to Vahide Bulut.

Ethics declarations

Conflict of interest

The author declares that she has no conflict of interest.

Ethics approval

The submitted work is original and has not been published elsewhere in any form or language.

Additional information

Technical Editor: Rogério Sales Gonçalves.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Bulut, V. Optimal path planning method based on epsilon-greedy Q-learning algorithm. J Braz. Soc. Mech. Sci. Eng. 44, 106 (2022). https://doi.org/10.1007/s40430-022-03399-w

