A Proposal to Integrate Deep Q-Learning with Automated Planning to Improve the Performance of a Planning-Based Agent

Núñez-Molina, Carlos; Vellido, Ignacio; Nikolov-Vasilev, Vladislav; Pérez, Raúl; Fdez-Olivares, Juan

doi:10.1007/978-3-030-85713-4_3

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12882)

Included in the following conference series:

Conference of the Spanish Association for Artificial Intelligence

Abstract

In this work we propose an architecture that learns to select subgoals with Deep Q-Learning in order to reduce the load on a planner in scenarios with tight time restrictions, such as online execution systems. We have trained this architecture on a video game environment used as a standard testbed for intelligent systems applications. We experiment with different values of the discount rate \(\gamma\) and show the importance of long-term thinking when selecting subgoals. We also compare our approach against a classical planner and show that it greatly reduces time requirements, although it obtains plans with 25% more actions on average. We conclude that our approach is competitive with a classical planner and presents better generalization properties than most Reinforcement Learning algorithms when applied to new levels of the same game.
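As a rough illustration of why the discount rate matters when selecting subgoals, the sketch below compares two candidate subgoal orderings by their discounted return. It is not the authors' implementation: the subgoal names, the reward values, and the choice of rewarding each subgoal with the negative number of actions needed to reach it are all assumptions made for this example.

```python
def discounted_return(rewards, gamma):
    """Return the sum of gamma**t * r_t over a sequence of per-subgoal rewards."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


# Hypothetical data: each entry is the reward sequence obtained after choosing
# that subgoal first. Rewards are assumed to be the negative number of actions
# the planner needs to reach each successive subgoal.
CANDIDATES = {
    "near_subgoal_first": [-2, -10, -10],  # cheap now, expensive afterwards
    "far_subgoal_first": [-5, -3, -3],     # costlier now, cheaper overall
}


def best_first_subgoal(gamma):
    """Pick the candidate whose reward sequence has the highest discounted return."""
    return max(
        CANDIDATES,
        key=lambda name: discounted_return(CANDIDATES[name], gamma),
    )


if __name__ == "__main__":
    for gamma in (0.0, 0.9):
        print(gamma, best_first_subgoal(gamma))
```

With \(\gamma = 0\) only the first reward counts, so the myopic choice is the nearby subgoal; with \(\gamma = 0.9\) the later costs are weighted in and the other ordering wins under these made-up numbers.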

This work has been partially supported by the Spanish Government Project MINECO RTI2018-098460-B-I00 and EU FEDER.

Author information

Authors and Affiliations

Department of Computer Science and AI, University of Granada, Granada, Spain
Carlos Núñez-Molina, Ignacio Vellido, Vladislav Nikolov-Vasilev, Raúl Pérez & Juan Fdez-Olivares

Corresponding author

Correspondence to Juan Fdez-Olivares.

Editor information

Editors and Affiliations

University of Malaga, Málaga, Spain
Enrique Alba
University of Malaga, Málaga, Spain
Gabriel Luque
University of Malaga, Málaga, Spain
Francisco Chicano
University of Malaga, Málaga, Spain
Carlos Cotta
Technical University of Madrid, Madrid, Spain
David Camacho
University of Malaga, Málaga, Spain
Manuel Ojeda-Aciego
University of Oviedo, Oviedo, Spain
Susana Montes
Pablo de Olavide University, Seville, Spain
Alicia Troncoso
University of Seville, Seville, Spain
José Riquelme
University of Malaga, Málaga, Spain
Rodrigo Gil-Merino

About this paper

Cite this paper

Núñez-Molina, C., Vellido, I., Nikolov-Vasilev, V., Pérez, R., Fdez-Olivares, J. (2021). A Proposal to Integrate Deep Q-Learning with Automated Planning to Improve the Performance of a Planning-Based Agent. In: Alba, E., et al. Advances in Artificial Intelligence. CAEPIA 2021. Lecture Notes in Computer Science, vol 12882. Springer, Cham. https://doi.org/10.1007/978-3-030-85713-4_3

DOI: https://doi.org/10.1007/978-3-030-85713-4_3
Published: 13 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-85712-7
Online ISBN: 978-3-030-85713-4
eBook Packages: Computer Science, Computer Science (R0)
