Out of the Box: Embodied Navigation in the Real World

  • Conference paper
Computer Analysis of Images and Patterns (CAIP 2021)

Abstract

Embodied AI research has made substantial progress in visual navigation and exploration, thanks to powerful simulation platforms and the availability of 3D data of photorealistic indoor environments. These two factors have opened the door to a new generation of intelligent agents capable of nearly perfect PointGoal Navigation. However, such architectures are commonly trained on millions, if not billions, of frames and tested in simulation. Alongside great enthusiasm, these results raise a question: how many researchers will actually benefit from these advances? In this work, we detail how to transfer the knowledge acquired in simulation to the real world. To that end, we describe the architectural discrepancies that harm the Sim2Real adaptation ability of models trained on the Habitat simulator and propose a novel solution tailored to deployment in real-world scenarios. We then deploy our models on a LoCoBot, a Low-Cost Robot equipped with a single Intel RealSense camera. Unlike previous work, our testing scene is not available to the agent in simulation. The environment is also inaccessible to the agent beforehand, so it cannot rely on scene-specific semantic priors. In this way, we reproduce a setting in which a research group (potentially from another field) needs to use the agent's visual navigation capabilities as-a-Service. Our experiments indicate that it is possible to achieve satisfactory results when deploying the obtained model in the real world. Our code and models are available at https://github.com/aimagelab/LoCoNav.

Acknowledgment

This work has been supported by “Fondazione di Modena” under the project “AI for Digital Humanities” and by the national project “IDEHA” (PON ARS01_00421), cofunded by the Italian Ministry of University and Research.

Author information

Corresponding author

Correspondence to Roberto Bigazzi.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Bigazzi, R., Landi, F., Cornia, M., Cascianelli, S., Baraldi, L., Cucchiara, R. (2021). Out of the Box: Embodied Navigation in the Real World. In: Tsapatsoulis, N., Panayides, A., Theocharides, T., Lanitis, A., Pattichis, C., Vento, M. (eds) Computer Analysis of Images and Patterns. CAIP 2021. Lecture Notes in Computer Science, vol 13052. Springer, Cham. https://doi.org/10.1007/978-3-030-89128-2_5

  • DOI: https://doi.org/10.1007/978-3-030-89128-2_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-89127-5

  • Online ISBN: 978-3-030-89128-2

  • eBook Packages: Computer Science (R0)