Learning to explore by reinforcement over high-level options

Liu, Juncheng; McCane, Brendan; Mills, Steven

doi:10.1007/s00138-023-01492-1

Original Paper
Published: 30 November 2023

Machine Vision and Applications, volume 35, article number 6 (2024)

Abstract

Autonomous 3D environment exploration is a fundamental task for various applications such as navigation and object searching. The goal of exploration is to investigate a new environment and build a map efficiently. In this paper, we propose a new method which grants an agent two intertwined options of behaviors: “look-around” and “frontier navigation.” This is implemented by an option-critic architecture and trained by reinforcement learning algorithms. In each time step, an agent produces an option and a corresponding action according to the policy. We also take advantage of macro-actions by incorporating classic path-planning techniques to increase training efficiency. We demonstrate the effectiveness of the proposed method on two publicly available 3D environment datasets, and the results show our method achieves higher coverage than competing techniques with better efficiency. We also show that our method can be transferred and applied on a rover robot in real-world environments.
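
The two-level decision described in the abstract (first an option, then an action under that option) can be illustrated with a minimal sketch. This is not the authors' implementation: the grid encoding, the `find_frontiers` and `step` helpers, the fixed option probabilities, and the nearest-frontier rule are all hypothetical stand-ins for the learned option-critic policy. Frontier cells are assumed here to be free cells adjacent to unknown cells.

```python
import random

# Hypothetical occupancy-grid encoding (an assumption, not from the paper).
FREE, OCCUPIED, UNKNOWN = 0, 1, -1
OPTIONS = ("look_around", "frontier_navigation")


def find_frontiers(grid):
    """Return free cells that touch at least one unknown cell (4-neighbourhood)."""
    rows, cols = len(grid), len(grid[0])
    frontiers = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != FREE:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break
    return frontiers


def step(grid, agent_pos, option_probs, rng):
    """One decision step: sample an option, then choose an action under it.

    option_probs stands in for the output of a learned policy over options.
    """
    index = rng.choices(range(len(OPTIONS)), weights=option_probs, k=1)[0]
    option = OPTIONS[index]
    frontiers = find_frontiers(grid)
    if option == "look_around" or not frontiers:
        # Placeholder action: rotate in place to observe the surroundings.
        return "look_around", "rotate_in_place"
    # Placeholder for the learned intra-option policy: pick the nearest
    # frontier (Manhattan distance). A path planner would then turn this
    # goal into a macro-action, i.e. a sequence of low-level moves.
    goal = min(
        frontiers,
        key=lambda f: abs(f[0] - agent_pos[0]) + abs(f[1] - agent_pos[1]),
    )
    return "frontier_navigation", goal


grid = [
    [FREE, FREE, UNKNOWN],
    [FREE, OCCUPIED, UNKNOWN],
    [FREE, FREE, FREE],
]
rng = random.Random(0)
print(find_frontiers(grid))
print(step(grid, (2, 0), [0.0, 1.0], rng))
```

In the actual method the option probabilities and the choice of goal would come from trained networks; the sketch only shows how an option choice gates which kind of action is produced.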

Acknowledgements

This project was funded by Science for Technological Innovation (https://www.sftichallenge.govt.nz/) under the spearhead project “Adaptive learning robots to complement the human workforce.”

Author information

Authors and Affiliations

Department of Computer Science, University of Otago, 133 Union St East, Dunedin, Otago, 9016, New Zealand
Juncheng Liu, Brendan McCane & Steven Mills

Corresponding author

Correspondence to Juncheng Liu.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (mp4 52391 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, J., McCane, B. & Mills, S. Learning to explore by reinforcement over high-level options. Machine Vision and Applications 35, 6 (2024). https://doi.org/10.1007/s00138-023-01492-1

Received: 26 August 2022
Revised: 28 July 2023
Accepted: 01 November 2023
Published: 30 November 2023
DOI: https://doi.org/10.1007/s00138-023-01492-1
