Learning to Plan with Uncertain Topological Maps

  • Conference paper
Computer Vision – ECCV 2020 (ECCV 2020)

Abstract

We train an agent to navigate in 3D environments using a hierarchical strategy that combines a high-level graph-based planner with a local policy. Our main contribution is a data-driven, learning-based approach for planning under uncertainty in topological maps, which requires estimating shortest paths in valued graphs with a probabilistic structure. Whereas classical symbolic algorithms achieve optimal results on noiseless topologies, or optimal results in a probabilistic sense on graphs with probabilistic structure, we aim to show that machine learning can overcome missing information in the graph by taking into account rich high-dimensional node features, for instance visual information available at each location of the map. Compared to purely learned neural white box algorithms, we structure our neural model with an inductive bias for dynamic-programming-based shortest path algorithms, and we show that a particular parameterization of our neural model corresponds to the Bellman-Ford algorithm. Through an empirical analysis of our method in simulated photo-realistic 3D environments, we demonstrate that including visual features in the learned neural planner outperforms classical symbolic solutions for graph-based planning.

Project page: https://edbeeching.github.io/papers/learning_to_plan
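For readers unfamiliar with the Bellman-Ford algorithm named in the abstract, the following is a minimal sketch of the classical symbolic version on a deterministic weighted graph. It is not the authors' neural planner; the function name `bellman_ford`, the edge-list representation, and the toy graph are assumptions chosen for illustration only.

```python
import math


def bellman_ford(num_nodes, edges, source):
    """Classical Bellman-Ford shortest distances from `source`.

    `edges` is a list of directed (u, v, cost) tuples (illustrative format).
    """
    dist = [math.inf] * num_nodes
    dist[source] = 0.0

    # Without negative cycles, num_nodes - 1 relaxation sweeps suffice.
    for _ in range(num_nodes - 1):
        updated = False
        for u, v, cost in edges:
            # Dynamic-programming update: keep the cheaper of the known
            # distance to v and the route that passes through u.
            if dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost
                updated = True
        if not updated:
            break  # distances have converged early

    # Any further improvement means a negative cycle is reachable.
    for u, v, cost in edges:
        if dist[u] + cost < dist[v]:
            raise ValueError("graph contains a negative-cost cycle")
    return dist


# Toy 4-node graph (hypothetical data).
toy_edges = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 5.0), (2, 3, 1.0)]
distances = bellman_ford(4, toy_edges, source=0)
print(distances)
```

Each sweep applies the same local update at every edge, which is the kind of iterative structure the abstract refers to as an inductive bias for dynamic programming.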

Acknowledgements

This work was funded by grant Deepvision (ANR-15-CE23-0029, STPGP479356-15), a joint French/Canadian call by ANR & NSERC. Compute was provided by the CNRS/IN2P3 Computing Center (Lyon, France), and by GENCI-IDRIS (Grant 2019-100964).

Author information

Corresponding author

Correspondence to Edward Beeching.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 1167 KB)

Supplementary material 2 (mp4 34819 KB)

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Beeching, E., Dibangoye, J., Simonin, O., Wolf, C. (2020). Learning to Plan with Uncertain Topological Maps. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12348. Springer, Cham. https://doi.org/10.1007/978-3-030-58580-8_28

  • DOI: https://doi.org/10.1007/978-3-030-58580-8_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58579-2

  • Online ISBN: 978-3-030-58580-8

  • eBook Packages: Computer Science, Computer Science (R0)
