A Method to Effectively Detect Vulnerabilities on Path Planning of VIN

  • Jingjing Liu
  • Wenjia Niu
  • Jiqiang Liu
  • Jia Zhao
  • Tong Chen
  • Yinqi Yang
  • Yingxiao Xiang
  • Lei Han
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10631)


Reinforcement learning has long been used for path planning and is considered very effective, particularly Value Iteration Networks (VIN), which have strong generalization ability. In this paper, we analyze path planning in VIN and propose a method that effectively finds vulnerable points in VIN. We build a 2D navigation task to test our method. This is the first time an experiment on interfering with VIN has been conducted. The experimental results show that our method performs well at finding vulnerabilities and can automatically add obstacles to obstruct VIN path planning.
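
The abstract does not spell out the detection method, so the following is only an illustrative sketch of the general idea of a "vulnerable point" on a 2D grid, not the authors' algorithm. It assumes a classical value iteration planner with unit step cost as a stand-in for a learned VIN, and it scores each free cell by how much the start-to-goal cost grows when an obstacle is placed there. The names `cost_to_go` and `rank_vulnerable_cells`, the grid, and the scoring rule are all hypothetical.

```python
import math


def cost_to_go(grid, goal):
    """Classical value iteration on a 4-connected grid (1 = obstacle).

    Stand-in for a learned VIN planner; returns the step cost to the goal
    for every cell (math.inf where the goal is unreachable).
    """
    rows, cols = len(grid), len(grid[0])
    cost = [[math.inf] * cols for _ in range(rows)]
    cost[goal[0]][goal[1]] = 0
    changed = True
    while changed:  # sweep until no cell improves
        changed = False
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] == 1:
                    continue
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                        if cost[nr][nc] + 1 < cost[r][c]:
                            cost[r][c] = cost[nr][nc] + 1
                            changed = True
    return cost


def rank_vulnerable_cells(grid, start, goal):
    """Rank free cells by the extra path cost caused by blocking them.

    Hypothetical scoring rule; assumes the goal is reachable from the
    start in the unmodified grid.
    """
    baseline = cost_to_go(grid, goal)[start[0]][start[1]]
    scores = []
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell == 1 or (r, c) in (start, goal):
                continue
            trial = [list(line) for line in grid]  # copy, then add one obstacle
            trial[r][c] = 1
            blocked = cost_to_go(trial, goal)[start[0]][start[1]]
            scores.append(((r, c), blocked - baseline))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy map: a wall with a narrow gap at (2, 2) and a longer detour via (2, 4).
grid = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
start, goal = (1, 2), (3, 2)
ranking = rank_vulnerable_cells(grid, start, goal)
print(ranking[0])  # ((2, 2), 4)
```

In this toy map, blocking the gap cell lengthens the path from 2 steps to 6, while blocking any other single cell changes nothing. Exhaustive re-planning is used here only because the grid is tiny; it says nothing about how the paper's method searches for such points.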


Path planning · Reinforcement learning · VIN · Vulnerable points



This material is based upon work supported by the National Natural Science Foundation of China (Grant No. 61672092, No. 61502030), Science and Technology on Information Assurance Laboratory (No. 614200103011711), BM-IIE Project (No. BMK2017B02-2), and the Fundamental Research Funds for the Central Universities (Grant No. 2016JBM020, No. 2017RC016).



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Jingjing Liu
    • 1
  • Wenjia Niu
    • 1
  • Jiqiang Liu
    • 1
  • Jia Zhao
    • 1
  • Tong Chen
    • 1
  • Yinqi Yang
    • 1
  • Yingxiao Xiang
    • 1
  • Lei Han
    • 2
  1. Beijing Key Laboratory of Security and Privacy in Intelligent Transportation, Beijing Jiaotong University, Beijing, China
  2. Science and Technology on Information Assurance Laboratory, Beijing, China
