We consider extending the AIXI agent by using multiple priors, or even a compact class of priors. This weakens the conditions on the true environment that are needed to prove asymptotic optimality. It also makes the choice of prior or reference machine less arbitrary. We connect this extension to the rationality axiomatization of AIXI: we remove the symmetry between accepting and rejecting bets and replace it with optimism. Optimism is often used to encourage exploration in the more restrictive Markov Decision Process setting, and it alleviates the problem that AIXI (with geometric discounting) stops exploring prematurely.
Keywords: AIXI, Reinforcement Learning, Optimism, Optimality
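To make the idea of optimism over several priors concrete, the following is a minimal sketch and not the construction used in the paper. It assumes a finite class of priors and that the expected discounted value of each action under each prior has already been computed; the function names `optimistic_action` and `single_prior_action`, the action labels, and the numbers are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Sequence

# Hypothetical type: maps an action name to its expected discounted value
# under one particular prior.
ActionValues = Dict[str, float]


def single_prior_action(values: ActionValues) -> str:
    """Pick the action with the highest value under a single fixed prior."""
    return max(values, key=lambda a: values[a])


def optimistic_action(values_per_prior: Sequence[ActionValues]) -> str:
    """Pick the action whose best-case value across all priors is highest.

    Assumes every prior assigns a value to the same set of actions.
    """
    actions = values_per_prior[0].keys()
    return max(actions, key=lambda a: max(v[a] for v in values_per_prior))


# Illustrative numbers only: prior A considers exploring unpromising,
# while prior B considers it very valuable.
prior_a = {"stay": 0.5, "explore": 0.4}
prior_b = {"stay": 0.5, "explore": 0.9}

choice_single = single_prior_action(prior_a)
choice_optimistic = optimistic_action([prior_a, prior_b])
```

In this toy setting an agent committed to `prior_a` alone never tries `explore`, whereas the optimistic agent does, because at least one prior in the class rates it highly. This mirrors, in a very reduced form, the claim that optimism counteracts premature loss of exploration.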