Fast-Maneuvering Target Seeking Based on Double-Action Q-Learning

  • Daniel C. K. Ngai
  • Nelson H. C. Yung
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4571)


In this paper, a reinforcement learning method called double-action Q-learning (DAQL) is proposed to solve the problem of seeking and homing onto a fast-maneuvering target in the context of mobile robots. This Q-learning-based method considers both target and obstacle actions when determining its own action decisions, which enables the agent to learn more effectively in a dynamically changing environment. It is particularly suited to fast-maneuvering targets, whose maneuvers are unknown a priori. Simulation results show that the proposed method chooses a less convoluted path to the target than the ideal proportional navigation (IPN) method when handling fast-maneuvering, randomly moving targets. Furthermore, it can learn to adapt to the physical limitations of the system and does not require specific initial conditions to be satisfied for successful navigation toward the moving target.
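The core idea described above, learning Q-values that condition on the other agent's action as well as the robot's own, can be sketched as a tabular update. This is a minimal illustration based only on the abstract's description; the function and parameter names are hypothetical, and the paper's exact update rule may differ.

```python
from collections import defaultdict

# Assumed learning rate and discount factor; the paper's values may differ.
ALPHA, GAMMA = 0.1, 0.9

# Q-table indexed by (state, own_action, other_action): the learner accounts
# for the observed action of the target/obstacle in addition to its own.
Q = defaultdict(float)

def update(state, my_action, other_action, reward, next_state,
           my_actions, other_actions):
    """One double-action Q-value update (hypothetical sketch)."""
    # Best attainable value in the next state, maximizing over joint actions.
    best_next = max(Q[(next_state, a, b)]
                    for a in my_actions for b in other_actions)
    key = (state, my_action, other_action)
    # Standard temporal-difference correction toward the bootstrapped target.
    Q[key] += ALPHA * (reward + GAMMA * best_next - Q[key])
    return Q[key]
```

Conditioning on the other agent's action lets the learner distinguish otherwise identical states in which the target maneuvers differently, which is what makes the approach suited to dynamically changing environments.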


Keywords: Moving object navigation · Reinforcement learning · Q-learning




References

  1. Ge, S.S., Cui, Y.J.: New potential functions for mobile robot path planning. IEEE Transactions on Robotics and Automation 16, 615–620 (2000)
  2. Conte, G., Zulli, R.: Hierarchical path planning in a multirobot environment with a simple navigation function. IEEE Transactions on Systems, Man, and Cybernetics 25, 651–654 (1995)
  3. Li, T.H.S., Chang, S.J., Tong, W.: Fuzzy target tracking control of autonomous mobile robots by using infrared sensors. IEEE Transactions on Fuzzy Systems 12, 491–501 (2004)
  4. Luo, R.C., Chen, T.M., Su, K.L.: Target tracking using hierarchical grey-fuzzy motion decision-making method. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans 31, 179–186 (2001)
  5. Dias, J., Paredes, C., Fonseca, I., Araujo, H., Batista, J., Almeida, A.T.: Simulating pursuit with machine experiments with robots and artificial vision. IEEE Transactions on Robotics and Automation 14, 1–18 (1998)
  6. Adams, M.D.: High speed target pursuit and asymptotic stability in mobile robotics. IEEE Transactions on Robotics and Automation 15, 230–237 (1999)
  7. Ge, S.S., Cui, Y.J.: Dynamic motion planning for mobile robots using potential field method. Autonomous Robots 13, 207–222 (2002)
  8. Belkhouche, F., Belkhouche, B., Rastgoufard, P.: Line of sight robot navigation toward a moving goal. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 36, 255–267 (2006)
  9. Yang, C.D., Yang, C.C.: A unified approach to proportional navigation. IEEE Transactions on Aerospace and Electronic Systems 33, 557–567 (1997)
  10. Shukla, U.S., Mahapatra, P.R.: The proportional navigation dilemma – pure or true? IEEE Transactions on Aerospace and Electronic Systems 26, 382–392 (1990)
  11. Yuan, P.J., Chern, J.S.: Ideal proportional navigation. Journal of Guidance, Control, and Dynamics 15, 1161–1165 (1992)
  12. Borg, J.M., Mehrandezh, M., Fenton, R.G., Benhabib, B.: An ideal proportional navigation guidance system for moving object interception – robotic experiments. In: Proceedings of the 2000 IEEE International Conference on Systems, Man, and Cybernetics, vol. 5, pp. 3247–3252 (2000)
  13. Mehrandezh, M., Sela, N.M., Fenton, R.G., Benhabib, B.: Robotic interception of moving objects using an augmented ideal proportional navigation guidance technique. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans 30, 238–250 (2000)
  14. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)
  15. Kaelbling, L.P.: Learning in Embedded Systems. MIT Press, Cambridge (1993)
  16. Kaelbling, L.P., Littman, M.L., Moore, A.W.: Reinforcement learning: a survey. Journal of Artificial Intelligence Research 4, 237–285 (1996)
  17. Sutton, R.S.: Reinforcement Learning. The International Series in Engineering and Computer Science, vol. 173. Kluwer Academic Publishers, Dordrecht (1992)
  18. Ngai, D.C.K., Yung, N.H.C.: Double action Q-learning for obstacle avoidance in a dynamically changing environment. In: Proceedings of the 2005 IEEE Intelligent Vehicles Symposium, Las Vegas, pp. 211–216 (2005)
  19. Ngai, D.C.K., Yung, N.H.C.: Performance evaluation of double action Q-learning in moving obstacle avoidance problem. In: Proceedings of the 2005 IEEE International Conference on Systems, Man, and Cybernetics, Hawaii, pp. 865–870 (2005)
  20. Watkins, C.J.C.H., Dayan, P.: Technical note: Q-learning. Machine Learning 8, 279–292 (1992)

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Daniel C. K. Ngai (1)
  • Nelson H. C. Yung (1)

  1. Department of Electrical & Electronic Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong
