Using Localization and Factorization to Reduce the Complexity of Reinforcement Learning
General reinforcement learning is a powerful framework for artificial intelligence that has seen much theoretical progress since it was introduced fifteen years ago. We have previously provided guarantees for cases with finitely many possible environments. Although these results are the best possible in general, their linear dependence on the size of the hypothesis class renders them impractical. However, we dramatically improved on them by introducing the concept of environments generated by combining laws. The bounds are then linear in the number of laws needed to generate the environment class, and this number is identified as a natural complexity measure for classes of environments. An individual law might predict only some feature (factorization) and only in some contexts (localization). Here we extend the previous deterministic results to the important stochastic setting.
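As a rough illustration of why counting laws rather than environments can help, the following toy sketch builds a class of deterministic environments by combining laws. It is not the paper's construction: the `Law` class, the `combine` helper, the two binary features, and the use of the current observation as the "context" are all assumptions made for this example, and the stochastic setting treated in the paper is not modeled.

```python
from itertools import product


class Law:
    """Hypothetical toy law (not the paper's formal definition)."""

    def __init__(self, feature, applies, predict):
        self.feature = feature  # index of the one feature predicted (factorization)
        self.applies = applies  # context -> bool, where the law is active (localization)
        self.predict = predict  # context -> predicted next value of that feature


def combine(laws, n_features):
    """Build a deterministic environment (context -> next observation) from laws."""
    def environment(context):
        nxt = [None] * n_features
        for law in laws:
            if law.applies(context):
                nxt[law.feature] = law.predict(context)
        return tuple(nxt)
    return environment


# Assumed contexts: the current observation, a pair of binary features.
always = lambda c: True
f0_is_0 = lambda c: c[0] == 0
f0_is_1 = lambda c: c[0] == 1

# Each slot holds competing candidate laws for one (feature, region) pair.
slots = [
    [Law(0, always, lambda c: c[0]), Law(0, always, lambda c: 1 - c[0])],
    [Law(1, f0_is_0, lambda c: 0), Law(1, f0_is_0, lambda c: 1)],
    [Law(1, f0_is_1, lambda c: c[1]), Law(1, f0_is_1, lambda c: 1 - c[1])],
]

n_laws = sum(len(slot) for slot in slots)
# One environment per way of picking a single law from every slot.
environments = [combine(choice, 2) for choice in product(*slots)]

print(f"{n_laws} laws generate {len(environments)} environments")
```

In this sketch the number of laws grows additively with the number of slots while the number of generated environments grows multiplicatively, which is the gap that a bound linear in the number of laws would exploit.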
Keywords: Reinforcement learning, Laws, Optimism, Bounds