Abstract
One of the key problems in model-based reinforcement learning is balancing exploration and exploitation. Another is learning and acting in large relational domains, in which there is a varying number of objects and relations between them. We provide one of the first solutions to exploring large relational Markov decision processes by developing relational extensions of the concepts of the Explicit Explore or Exploit (E³) algorithm. A key insight is that the inherent generalization of learnt knowledge in the relational representation also has profound implications for the exploration strategy: what in a propositional setting would be considered a novel situation worth exploring may in the relational setting be an instance of a well-known context in which exploitation is promising. Our experimental evaluation shows the effectiveness and benefit of relational exploration over several propositional benchmark approaches on noisy 3D simulated robot manipulation problems.
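To make the contrast concrete, below is a minimal sketch of how an E³-style "known state" test changes once experience is aggregated by relational context rather than by ground state. It is not the authors' algorithm: the state encoding, the visit-count threshold, and the crude `lift` abstraction are illustrative assumptions only.

```python
from collections import defaultdict

# Assumed for illustration: a state is a set of (predicate, args) tuples,
# and a context counts as "known" after this many visits.
KNOWN_THRESHOLD = 5


def lift(state):
    """Crude relational abstraction: keep which predicates hold, drop the
    identities of the objects they mention (on(a,b) -> on(X,Y))."""
    return frozenset(pred for pred, _args in state)


class E3StyleCounter:
    """Tracks visit counts to decide explore vs. exploit, E^3-style."""

    def __init__(self, relational=True):
        self.relational = relational
        self.visits = defaultdict(int)

    def _key(self, state):
        # Propositional: every distinct ground state is a separate key.
        # Relational: ground states sharing a lifted context share one key,
        # so experience with on(a,b) also makes on(c,d) look familiar.
        return lift(state) if self.relational else frozenset(state)

    def observe(self, state):
        self.visits[self._key(state)] += 1

    def is_known(self, state):
        return self.visits[self._key(state)] >= KNOWN_THRESHOLD


# Usage: a ground state the propositional learner has never visited can
# already be "known" to the relational learner.
seen = {("on", ("a", "b")), ("clear", ("a",))}
new = {("on", ("c", "d")), ("clear", ("c",))}

rel, prop = E3StyleCounter(relational=True), E3StyleCounter(relational=False)
for _ in range(KNOWN_THRESHOLD):
    rel.observe(seen)
    prop.observe(seen)

print(rel.is_known(new))   # True: same lifted context -> exploit
print(prop.is_known(new))  # False: unseen ground state -> explore
```

In this toy setting the relational learner treats the new block configuration as an instance of a familiar context and exploits, while the propositional learner would flag it as novel and explore.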
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
Cite this paper
Lang, T., Toussaint, M., Kersting, K. (2010). Exploration in Relational Worlds. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds.) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2010. Lecture Notes in Computer Science, vol. 6322. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15883-4_12
DOI: https://doi.org/10.1007/978-3-642-15883-4_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15882-7
Online ISBN: 978-3-642-15883-4