Invited Talk: UCRL and Autonomous Exploration

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNAI, volume 7188)

Abstract

After reviewing the main ingredients of the UCRL algorithm and its analysis for online reinforcement learning — exploration vs. exploitation, optimism in the face of uncertainty, consistency with observations and upper confidence bounds, regret analysis — I show how these techniques can also be used to derive PAC-MDP bounds which match the best currently available bounds for the discounted and the undiscounted setting. As typical for reinforcement learning, the analysis for the undiscounted setting is significantly more involved.
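The optimism principle mentioned above can be illustrated with a small sketch. The code below is an illustrative, simplified rendering of UCRL-style extended value iteration, not the algorithm analysed in the talk: it evaluates each state-action pair with an upper-confidence reward and the most favourable transition distribution inside an L1 confidence ball, and uses a discounted criterion for simplicity (the undiscounted analysis is, as the abstract notes, more involved). All names and the toy setting are assumptions for illustration.

```python
import numpy as np

def optimistic_value_iteration(p_hat, r_hat, p_conf, r_conf,
                               n_iters=200, gamma=0.95):
    """Sketch of extended value iteration in the spirit of UCRL's
    'optimism in the face of uncertainty':
      p_hat:  (S, A, S) empirical transition probabilities
      r_hat:  (S, A)    empirical mean rewards
      p_conf: (S, A)    L1 confidence radii on transitions
      r_conf: (S, A)    confidence radii on rewards
    For each (s, a) the update uses the upper-confidence reward and the
    transition vector inside the L1 ball that maximises expected value."""
    S, A = r_hat.shape
    v = np.zeros(S)
    for _ in range(n_iters):
        q = np.empty((S, A))
        order = np.argsort(-v)  # states sorted by decreasing value
        for s in range(S):
            for a in range(A):
                # Optimistic transition: shift up to p_conf/2 probability
                # mass onto the best state, removing it from the
                # worst-valued states first (standard inner maximisation).
                p = p_hat[s, a].copy()
                best = order[0]
                p[best] = min(1.0, p[best] + p_conf[s, a] / 2)
                excess = p.sum() - 1.0
                for worst in order[::-1]:
                    if excess <= 0:
                        break
                    cut = min(p[worst], excess)
                    p[worst] -= cut
                    excess -= cut
                q[s, a] = (r_hat[s, a] + r_conf[s, a]) + gamma * (p @ v)
        v = q.max(axis=1)
    return v, q.argmax(axis=1)
```

By construction the optimistic values dominate the values computed from the empirical model alone, which is the property the regret and PAC-MDP analyses build on.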

In the second part of my talk I consider a model for autonomous exploration, in which an agent learns about its environment and how to navigate in it. While evaluating autonomous exploration is typically difficult, rigorous performance bounds can be derived in the presented setting. For that we present an algorithm that explores optimistically, repeatedly choosing for further exploration the apparently closest unknown state, as indicated by an optimistic policy.
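The "closest unknown state" idea can be illustrated in a deliberately simplified form. The sketch below replaces the stochastic environment and optimistic policy of the talk with a known deterministic graph, so that "closest" reduces to breadth-first search distance through well-understood states; the function name, the graph representation, and this deterministic simplification are all assumptions for illustration.

```python
from collections import deque

def nearest_unknown(start, neighbors, known):
    """Toy stand-in for optimistic exploration: starting from `start`,
    search outward through states whose dynamics are already well
    understood (`known`) and return the first state reached that is not
    yet known, together with the path leading to it.
      neighbors: dict mapping a state to its successor states
      known:     set of states considered sufficiently explored
    Returns (None, []) if every reachable state is already known."""
    frontier = deque([(start, [start])])
    visited = {start}
    while frontier:
        s, path = frontier.popleft()
        for t in neighbors.get(s, []):
            if t in visited:
                continue
            if t not in known:
                return t, path + [t]  # closest unknown state found
            visited.add(t)
            frontier.append((t, path + [t]))
    return None, []
```

In the talk's stochastic setting the shortest path through known states is replaced by an optimistic navigation policy, but the greedy structure, namely always targeting the apparently closest unknown state, is the same.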

Acknowledgements. This is joint work with Shiau Hong Lim. The research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/2007-2013) under grant agreement 231495 (CompLACS).



Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Auer, P. (2012). Invited Talk: UCRL and Autonomous Exploration. In: Sanner, S., Hutter, M. (eds) Recent Advances in Reinforcement Learning. EWRL 2011. Lecture Notes in Computer Science, vol 7188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29946-9_1

  • DOI: https://doi.org/10.1007/978-3-642-29946-9_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29945-2

  • Online ISBN: 978-3-642-29946-9

  • eBook Packages: Computer Science (R0)