Evaluating a Reinforcement Learning Algorithm with a General Intelligence Test

  • Javier Insa-Cabrera
  • David L. Dowe
  • José Hernández-Orallo
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7023)

Abstract

In this paper we apply the recent notion of anytime universal intelligence tests to the evaluation of a popular reinforcement learning algorithm, Q-learning. We show that a general approach to the intelligence evaluation of AI algorithms is feasible. This top-down (theory-derived) approach generates environments under a Solomonoff universal distribution instead of using a pre-defined set of specific tasks, such as mazes or problem repositories. This first application of a general intelligence test to a reinforcement learning algorithm raises the issue of task-specific vs. general AI agents. This, in turn, suggests new avenues for AI agent evaluation and AI competitions, and also conveys further insights into the performance of specific algorithms.
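
As a rough illustration of the kind of evaluation the abstract describes, the following sketch scores a tabular Q-learning agent by its average reward over a set of generated environments rather than over one hand-picked task. It is a minimal sketch under stated assumptions, not the paper's test: the environment generator, the uniform sampling (a simple stand-in for a Solomonoff universal distribution, which is not computable), the reward range [-1, 1], and all function names and parameter values are hypothetical.

```python
import random


def make_random_env(n_states, n_actions, rng):
    """Hypothetical toy environment: deterministic transitions, rewards in [-1, 1]."""
    transitions = [[rng.randrange(n_states) for _ in range(n_actions)]
                   for _ in range(n_states)]
    rewards = [[rng.uniform(-1.0, 1.0) for _ in range(n_actions)]
               for _ in range(n_states)]
    return transitions, rewards


def q_learning_average_reward(env, steps, rng, alpha=0.1, gamma=0.9, epsilon=0.1):
    """Run epsilon-greedy tabular Q-learning and return the average reward per step."""
    transitions, rewards = env
    n_states, n_actions = len(transitions), len(transitions[0])
    q = [[0.0] * n_actions for _ in range(n_states)]
    state, total = 0, 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore
        else:
            action = max(range(n_actions), key=lambda a: q[state][a])  # exploit
        reward = rewards[state][action]
        next_state = transitions[state][action]
        # Standard Q-learning update: move Q(s, a) towards r + gamma * max_a' Q(s', a').
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        total += reward
        state = next_state
    return total / steps


def evaluate(n_envs=20, steps=2000, seed=0):
    """Average the agent's per-environment score over randomly generated environments.

    Assumption: environments are drawn uniformly over small sizes; a universal
    distribution would instead weight environments by their complexity.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_envs):
        env = make_random_env(rng.randint(2, 6), rng.randint(2, 4), rng)
        scores.append(q_learning_average_reward(env, steps, rng))
    return sum(scores) / len(scores)
```

Calling `evaluate()` returns a single aggregate score in [-1, 1]. Averaging over many generated environments, rather than reporting performance on one maze or benchmark, is the design choice this sketch is meant to convey.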

Keywords

Reinforcement Learning; Intelligence Test; General Intelligence; Kolmogorov Complexity; Average Reward

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Javier Insa-Cabrera (1)
  • David L. Dowe (2)
  • José Hernández-Orallo (1)

  1. DSIC, Universitat Politècnica de València, Spain
  2. Clayton School of Information Technology, Monash University, Australia