Progress in Artificial Intelligence, Volume 8, Issue 1, pp 133–142

Path planning of a mobile robot in a free-space environment using Q-learning

  • Jianxun Jiang
  • Jianbin Xin
Regular Paper


This paper proposes an improved Q-learning algorithm for the path planning of a mobile robot in a free-space environment. Existing Q-learning methods for path planning focus on the mesh routing environment; therefore, new methods must be developed for free-space environments in which robots move continuously. For the free-space environment, we construct fuzzified state variables for dividing the continuous space to avoid the curse of dimensionality. The state variables include the distances to the target point and obstacles and the heading of the robot. Based on the defined state variables, we propose an integrated learning strategy on the basis of the space allocation to accelerate the convergence during the learning process. Simulation experiments show that the path planning of mobile robots can be realized quickly, and the probability of obstacle collisions can be reduced. The results of the experiments also demonstrate the considerable advantages of the proposed learning algorithm compared to two commonly used methods.
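The state construction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no membership functions, partition boundaries, action set, or reward, so the bin edges, the 8-sector heading split, the `ACTIONS` tuple, and all function names below are hypothetical. Crisp bins stand in for the fuzzified variables, and the update is the standard Watkins Q-learning rule rather than the paper's integrated learning strategy.

```python
import math
import random
from collections import defaultdict

# Hypothetical partitions; the abstract does not specify the actual ones.
TARGET_DIST_EDGES = [0.5, 2.0, 5.0]   # metres, gives 4 distance labels
OBSTACLE_DIST_EDGES = [0.3, 1.0]      # metres, gives 3 distance labels
HEADING_SECTORS = 8                   # 45-degree sectors
ACTIONS = ("left", "straight", "right")  # assumed action set


def bin_index(value, edges):
    """Return the index of the first interval [.., edge) that contains value."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)


def heading_sector(angle_rad, sectors=HEADING_SECTORS):
    """Map an angle in radians to one of `sectors` equal-width sectors."""
    width = 2 * math.pi / sectors
    # The final modulo guards against floating-point wrap-around at 2*pi.
    return int((angle_rad % (2 * math.pi)) // width) % sectors


def encode_state(target_dist, obstacle_dist, heading):
    """Discretize the three continuous variables into one small state tuple."""
    return (
        bin_index(target_dist, TARGET_DIST_EDGES),
        bin_index(obstacle_dist, OBSTACLE_DIST_EDGES),
        heading_sector(heading),
    )


def epsilon_greedy(q, state, epsilon=0.1, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update on a tabular Q function."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error


# Usage: a single update on an initially empty Q table.
q_table = defaultdict(float)
s = encode_state(target_dist=1.2, obstacle_dist=0.2, heading=0.1)
s_next = encode_state(target_dist=0.4, obstacle_dist=0.8, heading=0.1)
q_update(q_table, s, "straight", reward=1.0, next_state=s_next)
```

With these assumed partitions the table has at most 4 × 3 × 8 = 96 states, which shows how coarse partitioning of a continuous space keeps the Q table small.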


Keywords: Q-learning, Free state space, Mobile robot, Path planning



This research is supported by the China Postdoctoral Science Foundation under Grant 2016M592311, the National Natural Science Foundation of China under Grants 61703372 and 61603345, the Key Scientific Research Project of Henan Higher Education under Grants 18A413012 and 17A413003, and the Science & Technology Innovation Team Project of Henan Province under Grant 17IRTSTHN013.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. School of Electrical Engineering, Zhengzhou University, Zhengzhou, China
