Abstracting Reinforcement Learning Agents with Prior Knowledge

• Conference paper in PRIMA 2018: Principles and Practice of Multi-Agent Systems (PRIMA 2018)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11224)

Abstract

Recent breakthroughs in reinforcement learning have enabled the creation of learning agents that solve a wide variety of sequential decision problems. However, these methods require a large number of iterations in complex environments. A standard paradigm for tackling this challenge is to extend reinforcement learning with function approximation based on deep learning. Yet the lack of interpretability and the difficulty of incorporating background knowledge limit the usability of such methods in many safety-critical real-world scenarios. In this paper, we propose a new agent architecture that combines reinforcement learning with external knowledge. We derive a rule-based variant of the Sarsa(\(\lambda \)) algorithm, which we call Sarsa-rb(\(\lambda \)), that augments data with complex knowledge and exploits similarities among states. We apply our method to a trading task in a stock market environment and show that the resulting agent not only achieves much better performance but also trains faster than the Deep Q-learning (DQN) and Deep Deterministic Policy Gradients (DDPG) algorithms.
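The abstract is concrete enough to sketch the core idea. Below is a minimal, hypothetical Python sketch of a tabular Sarsa(\(\lambda \)) learner with replacing eligibility traces, where raw market observations are first mapped to abstract states by hand-written rules standing in for the paper's external knowledge. Everything here (the encode_state rules, the observation fields, the hyperparameters) is an illustrative assumption, not the authors' Sarsa-rb(\(\lambda \)) implementation.

```python
import numpy as np

# Hypothetical sketch of a rule-based Sarsa(lambda) agent. The rules,
# observation fields, and hyperparameters are illustrative assumptions;
# they are not taken from the paper.

N_STATES, N_ACTIONS = 64, 3            # 3 actions: e.g. buy / hold / sell
ALPHA, GAMMA, LAM, EPS = 0.1, 0.99, 0.9, 0.1

Q = np.zeros((N_STATES, N_ACTIONS))    # action values over abstract states
E = np.zeros_like(Q)                   # eligibility traces

def encode_state(obs):
    """Map a raw market observation to an abstract state id using
    hand-written rules (the 'prior knowledge' of this sketch)."""
    trend_up = obs["price"] > obs["moving_avg"]     # rule 1: trend direction
    high_vol = obs["volatility"] > 0.02             # rule 2: volatility regime
    rsi_bucket = min(int(obs["rsi"] // 7), 15)      # rule 3: coarse RSI bucket
    return (int(trend_up) * 2 + int(high_vol)) * 16 + rsi_bucket  # 0..63

def act(s, rng):
    """Epsilon-greedy action selection over the abstract state."""
    if rng.random() < EPS:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(Q[s]))

def update(s, a, r, s_next, a_next, done):
    """One Sarsa(lambda) backup with replacing eligibility traces."""
    target = r + (0.0 if done else GAMMA * Q[s_next, a_next])
    delta = target - Q[s, a]
    E[s, :] = 0.0
    E[s, a] = 1.0                      # replacing trace
    Q[...] += ALPHA * delta * E        # update every eligible entry at once
    E[...] *= GAMMA * LAM              # decay all traces
```

A driver loop would call encode_state on each raw observation, select an action with act, and feed the resulting transition to update. Because many raw observations collapse onto one abstract state, each reward propagates to a whole family of similar market situations at once, which is the kind of state similarity the abstract says Sarsa-rb(\(\lambda \)) exploits.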

Author information

Authors: Nicolas Bougie and Ryutaro Ichise

Correspondence to Nicolas Bougie.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Bougie, N., Ichise, R. (2018). Abstracting Reinforcement Learning Agents with Prior Knowledge. In: Miller, T., Oren, N., Sakurai, Y., Noda, I., Savarimuthu, B.T.R., Cao Son, T. (eds) PRIMA 2018: Principles and Practice of Multi-Agent Systems. PRIMA 2018. Lecture Notes in Computer Science (LNAI), vol. 11224. Springer, Cham. https://doi.org/10.1007/978-3-030-03098-8_27

  • DOI: https://doi.org/10.1007/978-3-030-03098-8_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-03097-1

  • Online ISBN: 978-3-030-03098-8

  • eBook Packages: Computer Science (R0)
