
Leaving the NavMesh: An Ablative Analysis of Deep Reinforcement Learning for Complex Navigation in 3D Virtual Environments

  • Conference paper
  • First Online:
AI 2023: Advances in Artificial Intelligence (AI 2023)

Abstract

Expanding non-player character (NPC) navigation behavior in video games has the potential to induce novel player experiences. Current industry standards represent traversable world geometry with a Navigation Mesh (NavMesh); however, NavMesh complexity scales poorly with additional navigation abilities (e.g. jumping, wall-running, jet-packs) and increasing world scale. Deep Reinforcement Learning (DRL) allows an NPC agent to learn to navigate environmental obstacles with any navigation ability, without depending on a NavMesh. Despite the promise of DRL navigation, adoption in industry remains low due to the expert knowledge required for agent design and the poor training efficiency of DRL algorithms. In this work, we use the off-policy Soft Actor-Critic (SAC) DRL algorithm to investigate how different local observation types and agent scalar information affect agent performance across three topologically distinct environments. We implement a truncated n-step returns method for minibatch sampling, which improves early training efficiency by up to 75% by reducing inaccurate off-policy bias. We empirically evaluate environment partial observability with observation stacking, finding that stacks of 4–8 observations render the environments sufficiently Markovian.
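To make the truncated n-step returns idea concrete, the sketch below shows one common way such a target can be computed for a SAC-style critic update: rewards from a sampled segment are accumulated for up to n steps, and the discounted bootstrap from the target critic is added only if the episode did not terminate within the horizon. This is a minimal illustration under generic assumptions, not the authors' implementation; the names compute_nstep_target and bootstrap_value are hypothetical.

```python
def compute_nstep_target(rewards, dones, bootstrap_value, gamma=0.99, n=5):
    """Truncated n-step return target for the first transition of a sampled segment.

    rewards, dones: per-step lists for the segment (length >= 1).
    bootstrap_value: soft value estimate of the state n steps ahead, e.g.
        Q_target(s_{t+n}, a') - alpha * log pi(a' | s_{t+n}) in a SAC-style update;
        it is ignored if the episode terminates inside the horizon.
    """
    target, discount = 0.0, 1.0
    steps = min(n, len(rewards))          # truncate at the chosen horizon
    for k in range(steps):
        target += discount * rewards[k]
        discount *= gamma
        if dones[k]:                      # truncate at episode termination: no bootstrap
            return target
    return target + discount * bootstrap_value

# Usage: a 3-step segment that does not terminate, bootstrapped from the target critic.
print(compute_nstep_target(rewards=[1.0, 0.0, 0.5],
                           dones=[False, False, False],
                           bootstrap_value=2.0, gamma=0.99, n=3))
```

Observation stacking, as evaluated in the paper, would correspondingly concatenate the most recent 4–8 observations into a single network input so that the policy and critics can recover short-term temporal context.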



Author information


Corresponding author

Correspondence to Dale Grant.



Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Grant, D., Garcia, J., Raffe, W. (2024). Leaving the NavMesh: An Ablative Analysis of Deep Reinforcement Learning for Complex Navigation in 3D Virtual Environments. In: Liu, T., Webb, G., Yue, L., Wang, D. (eds) AI 2023: Advances in Artificial Intelligence. AI 2023. Lecture Notes in Computer Science, vol. 14472. Springer, Singapore. https://doi.org/10.1007/978-981-99-8391-9_23


  • DOI: https://doi.org/10.1007/978-981-99-8391-9_23

  • Published:

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8390-2

  • Online ISBN: 978-981-99-8391-9

  • eBook Packages: Computer Science, Computer Science (R0)
