Universal Knowledge-Seeking Agents for Stochastic Environments

  • Laurent Orseau
  • Tor Lattimore
  • Marcus Hutter
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8139)

Abstract

We define an optimal Bayesian knowledge-seeking agent, KL-KSA, designed for countable hypothesis classes of stochastic environments and whose goal is to gather as much information about the unknown world as possible. Although this agent works for arbitrary countable classes and priors, we focus on the especially interesting case where all stochastic computable environments are considered and the prior is based on Solomonoff’s universal prior. Among other properties, we show that KL-KSA learns the true environment in the sense that it learns to predict the consequences of actions it does not take. We show that it does not consider noise to be information and avoids taking actions leading to inescapable traps. We also present a variety of toy experiments demonstrating that KL-KSA behaves according to expectation.
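
The claim that noise is not treated as information can be illustrated with a toy calculation. The sketch below is not the paper's definition of KL-KSA. It assumes a finite hypothesis class and a one-step horizon, and it scores an action by the expected Kullback-Leibler divergence of the posterior from the prior, which is one common way to formalize information gain. The function name `expected_information_gain`, the two hypothetical environments, and the action names `coin` and `probe` are all invented for illustration.

```python
import math


def expected_information_gain(prior, hypotheses, action):
    """One-step expected KL divergence of the posterior from the prior, in bits.

    prior: list of weights summing to 1 (one per hypothesis).
    hypotheses: list of dicts mapping action -> {observation: probability}.
    """
    # Bayesian mixture over observations for this action.
    mixture = {}
    for w, nu in zip(prior, hypotheses):
        for obs, p in nu[action].items():
            mixture[obs] = mixture.get(obs, 0.0) + w * p

    # sum_nu w_nu * KL(nu(.|a) || mixture(.|a)); this equals the expected
    # KL divergence of the posterior weights from the prior weights.
    gain = 0.0
    for w, nu in zip(prior, hypotheses):
        for obs, p in nu[action].items():
            if w > 0 and p > 0:
                gain += w * p * math.log2(p / mixture[obs])
    return gain


# Two hypothetical environments (assumptions for this sketch only).
# "coin" yields a fair coin flip in both, so it is pure noise.
# "probe" yields a deterministic observation that differs between them.
fair_coin = {0: 0.5, 1: 0.5}
hypotheses = [
    {"coin": fair_coin, "probe": {0: 1.0}},
    {"coin": fair_coin, "probe": {1: 1.0}},
]
prior = [0.5, 0.5]

gains = {a: expected_information_gain(prior, hypotheses, a) for a in ("coin", "probe")}
best_action = max(gains, key=gains.get)
```

Under these assumptions the noisy action has zero expected gain, because both hypotheses predict it identically, while the distinguishing action is worth one bit, so an agent maximizing this score prefers `probe`.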

Keywords

  • Universal artificial intelligence
  • Exploration
  • Reinforcement learning
  • Algorithmic information theory
  • Solomonoff induction

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Laurent Orseau (1, 2)
  • Tor Lattimore (3)
  • Marcus Hutter (3)

  1. UMR 518 MIA, AgroParisTech, Paris, France
  2. UMR 518 MIA, INRA, Paris, France
  3. RSCS, Australian National University, Canberra, Australia