Wide and Deep Reinforcement Learning Extended for Grid-Based Action Games

  • Juan M. Montoya
  • Christoph Doell
  • Christian Borgelt
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11978)


Over the last decade, Deep Reinforcement Learning (DRL) has developed very rapidly. However, little has been done to integrate linear methods into it. Our research aims at a simple and practical Wide and Deep Reinforcement Learning framework that extends DRL algorithms by combining linear (wide) and non-linear (deep) methods. This framework can help to integrate expert knowledge or to fuse sensor information while at the same time improving the performance of existing DRL algorithms. To test this framework, we have developed an extension of the popular Deep Q-Networks algorithm, which we call Wide Deep Q-Networks. We compare its performance with that of Deep Q-Networks, linear agents, and human agents by applying our new algorithm to Berkeley’s Pac-Man environment. Our algorithm considerably outperforms Deep Q-Networks in terms of both learning speed and ultimate performance, showing its potential for boosting existing algorithms. Furthermore, it is robust to the failure of one of its components.
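The combination of a linear (wide) and a non-linear (deep) value estimate can be illustrated with a minimal sketch. It assumes that the two components each produce one Q-value per action and that these are simply summed, in the spirit of wide and deep learning for recommender systems; the abstract does not specify how Wide Deep Q-Networks merges its components, so the merge rule, the function names, and the toy network below are all hypothetical.

```python
import math

def linear_q(weights, features):
    """Wide part: one weight vector per action, dotted with hand-crafted features."""
    return [sum(w * f for w, f in zip(w_a, features)) for w_a in weights]

def mlp_q(params, state):
    """Deep part: a toy one-hidden-layer network with tanh units (no biases)."""
    w1, w2 = params
    hidden = [math.tanh(sum(w * s for w, s in zip(row, state))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

def wide_deep_q(wide_weights, deep_params, features, state):
    """Combined estimate: element-wise sum of both parts (an assumed merge rule)."""
    wide = linear_q(wide_weights, features)
    deep = mlp_q(deep_params, state)
    return [qw + qd for qw, qd in zip(wide, deep)]

def greedy_action(q_values):
    """Index of the action with the highest estimated Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

With this additive form, a deep component that outputs only zeros leaves the linear estimate untouched, which gives one intuition for why such a combination could tolerate the failure of a single component.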


Keywords: Wide and deep reinforcement learning · Wide deep Q-networks · Value function approximation · Reinforcement learning agents · Model fusion reinforcement learning



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Juan M. Montoya (1), email author
  • Christoph Doell (1)
  • Christian Borgelt (2)

  1. University of Konstanz, Konstanz, Germany
  2. University of Salzburg, Salzburg, Austria
