On Ensemble Techniques for AIXI Approximation

  • Joel Veness
  • Peter Sunehag
  • Marcus Hutter
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7716)

Abstract

One of the key challenges in AIXI approximation is model class approximation, i.e. how to meaningfully approximate Solomonoff induction without requiring an infeasible amount of computation. This paper advocates a bottom-up approach to this problem and describes a number of principled ensemble techniques for approximate AIXI agents. Each technique works by efficiently combining a set of existing environment models into a single, more powerful model. These techniques have the potential to play an important role in future AIXI approximations.
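As a generic illustration of what combining a set of models into a single model can look like, the sketch below implements a plain Bayesian mixture over binary predictors. It is not taken from the paper and is not claimed to be one of its techniques. The class names `ConstantModel` and `BayesMixture`, the uniform prior, and the binary alphabet are all assumptions made for this example.

```python
import math


class ConstantModel:
    """Hypothetical base model: predicts bit 1 with a fixed probability.

    p_one must lie strictly between 0 and 1 so that log-probabilities stay finite.
    """

    def __init__(self, p_one):
        self.p_one = p_one

    def prob(self, bit):
        # Probability this model assigns to the next bit.
        return self.p_one if bit == 1 else 1.0 - self.p_one

    def update(self, bit):
        # A constant model has nothing to learn; real models would adapt here.
        pass


class BayesMixture:
    """Illustrative ensemble: posterior-weighted average of the base models."""

    def __init__(self, models):
        self.models = list(models)
        # Assumed uniform prior over the models, stored in log space.
        self.log_w = [-math.log(len(self.models))] * len(self.models)

    def weights(self):
        return [math.exp(lw) for lw in self.log_w]

    def prob(self, bit):
        # Mixture prediction: sum over models of weight * model probability.
        return sum(w * m.prob(bit) for w, m in zip(self.weights(), self.models))

    def update(self, bit):
        # Bayes rule: each weight is scaled by how well its model predicted the bit.
        log_mix = math.log(self.prob(bit))
        for i, m in enumerate(self.models):
            self.log_w[i] += math.log(m.prob(bit)) - log_mix
            m.update(bit)


mixture = BayesMixture([ConstantModel(0.2), ConstantModel(0.8)])
initial_prob_one = mixture.prob(1)
for _ in range(10):
    mixture.update(1)  # feed a run of ones
final_weights = mixture.weights()
```

After a run of ones, nearly all posterior weight moves to the base model that predicts ones with high probability, so the mixture tracks the better of its components without knowing in advance which one that is.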

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Joel Veness (1)
  • Peter Sunehag (2)
  • Marcus Hutter (2)

  1. University of Alberta, Canada
  2. Australian National University, Australia
