Towards Flexible Task Environments for Comprehensive Evaluation of Artificial Intelligent Systems and Automatic Learners

Thórisson, Kristinn R.; Bieger, Jordi; Schiffel, Stephan; Garrett, Deon

doi:10.1007/978-3-319-21365-1_20

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9205)

Included in the following conference series:

International Conference on Artificial General Intelligence

Abstract

Evaluation of artificial intelligence (AI) systems is a prerequisite for comparing them on the many dimensions along which they are intended to perform. Design of task-environments for this purpose is often ad hoc, focusing on a limited set of aspects of the systems under evaluation. Testing on a wide range of tasks and environments would better facilitate comparisons and understanding of a system’s performance, but this requires that manipulation of relevant dimensions cause predictable changes in the structure, behavior, and nature of the task-environments. What is needed is a framework that enables easy composition, decomposition, scaling, and configuration of task-environments. Such a framework would not only facilitate evaluation of the performance of current and future AI systems, but also go beyond it by allowing evaluation of knowledge acquisition, cognitive growth, lifelong learning, and transfer learning. In this paper we list requirements that we think such a framework should meet to facilitate the evaluation of intelligence, and present preliminary ideas on how this could be realized.

Author information

Authors and Affiliations

Center for Analysis and Design of Intelligent Agents / School of Computer Science, Reykjavik University, Menntavegur 1, 101, Reykjavik, Iceland
Kristinn R. Thórisson, Jordi Bieger, Stephan Schiffel & Deon Garrett
Icelandic Institute for Intelligent Machines, Uranus, Menntavegur 1, 101, Reykjavik, Iceland
Kristinn R. Thórisson & Deon Garrett

Corresponding author

Correspondence to Jordi Bieger.

Editor information

Editors and Affiliations

Reykjavik University, Reykjavik, Iceland
Jordi Bieger
Hong Kong Polytechnic University, Hong Kong, Hong Kong SAR
Ben Goertzel
Mechanics and Optics, Saint Petersburg State University of Information Technologies, St. Petersburg, Russia
Alexey Potapov

About this paper

Cite this paper

Thórisson, K.R., Bieger, J., Schiffel, S., Garrett, D. (2015). Towards Flexible Task Environments for Comprehensive Evaluation of Artificial Intelligent Systems and Automatic Learners. In: Bieger, J., Goertzel, B., Potapov, A. (eds) Artificial General Intelligence. AGI 2015. Lecture Notes in Computer Science, vol 9205. Springer, Cham. https://doi.org/10.1007/978-3-319-21365-1_20

DOI: https://doi.org/10.1007/978-3-319-21365-1_20
Published: 15 July 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21364-4
Online ISBN: 978-3-319-21365-1
eBook Packages: Computer Science, Computer Science (R0)
