Understanding Structure of Concurrent Actions

  • Conference paper
Artificial Intelligence XXXVI (SGAI 2019)

Abstract

Whereas most work in reinforcement learning (RL) ignores the structure of, and relationships between, actions, in this paper we show that exploiting structure in the action space can improve sample efficiency during exploration. To show this, we focus on concurrent action spaces, in which the RL agent selects multiple actions per timestep. Concurrent action spaces are challenging to learn in, especially when the number of actions is large, as this can lead to a combinatorial explosion of the joint action space.
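To make the combinatorial explosion concrete: if the agent chooses an on/off value for each of n sub-actions at every timestep, the joint action space contains 2^n elements. The following is a minimal illustrative sketch; the binary sub-action setting is our own assumption, not an environment from the paper.

```python
from itertools import product

def joint_actions(n_sub_actions):
    """Yield every joint action when each of n sub-actions is binary (on/off)."""
    return product((0, 1), repeat=n_sub_actions)

# The joint action space doubles with every sub-action added.
for n in (4, 8, 12, 16):
    size = sum(1 for _ in joint_actions(n))
    print(f"{n} sub-actions -> {size} joint actions")
# 4 sub-actions -> 16 joint actions
# 8 sub-actions -> 256 joint actions
# 12 sub-actions -> 4096 joint actions
# 16 sub-actions -> 65536 joint actions
```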

This paper proposes two methods: the first uses implicit structure to perform high-level action elimination based on task-invariant actions; the second looks for more explicit structure in the form of action clusters. Both methods are context-free, relying only on an analysis of the action space, and both significantly improve policy convergence times.
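The abstract does not spell out either method, so the sketch below is only one plausible reading of the two ideas: sub-actions that are rarely selected in any task are eliminated (a crude stand-in for the task-invariant analysis), and the survivors are grouped by spectral clustering over a co-occurrence affinity matrix. All data, thresholds, and the use of scikit-learn's SpectralClustering are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Hypothetical per-task selection frequencies for six sub-actions
# across four tasks (rows: tasks, columns: sub-actions).
freq = np.array([
    [0.02, 0.60, 0.55, 0.03, 0.40, 0.45],
    [0.01, 0.58, 0.57, 0.02, 0.42, 0.44],
    [0.03, 0.61, 0.54, 0.02, 0.41, 0.46],
    [0.02, 0.59, 0.56, 0.03, 0.39, 0.43],
])

# Sketch of method 1: a sub-action whose selection frequency stays low
# in every task is treated as eliminable at a high level.
kept = np.flatnonzero(freq.max(axis=0) > 0.1)
print("kept sub-actions:", kept)  # [1 2 4 5]

# Sketch of method 2: cluster the surviving sub-actions using a
# co-occurrence affinity matrix (synthetic counts; entry (i, j) is
# how often sub-actions i and j were selected together).
cooc = np.array([
    [0.0, 9.0, 1.0, 1.0],
    [9.0, 0.0, 1.0, 2.0],
    [1.0, 1.0, 0.0, 8.0],
    [1.0, 2.0, 8.0, 0.0],
])
labels = SpectralClustering(n_clusters=2, affinity="precomputed",
                            random_state=0).fit_predict(cooc)
print("cluster labels:", labels)  # e.g. [0 0 1 1]
```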



Author information


Corresponding author

Correspondence to Perusha Moodley.



Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Moodley, P., Rosman, B., Hong, X. (2019). Understanding Structure of Concurrent Actions. In: Bramer, M., Petridis, M. (eds.) Artificial Intelligence XXXVI. SGAI 2019. Lecture Notes in Computer Science, vol. 11927. Springer, Cham. https://doi.org/10.1007/978-3-030-34885-4_6

  • DOI: https://doi.org/10.1007/978-3-030-34885-4_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34884-7

  • Online ISBN: 978-3-030-34885-4

  • eBook Packages: Computer Science (R0)
