Options with Exceptions
An option is a policy fragment that represents a solution to a frequently encountered subproblem in a domain. Options may be treated as temporally extended actions, which allows the solution to be reused when solving larger problems. In practice, it is often hard to find subproblems that are exactly the same, and the differences between them, however small, need to be accounted for in the reused policy. In this paper, the notion of options with exceptions is introduced to address such scenarios. It is inspired by the Ripple Down Rules approach used in the data mining and knowledge representation communities. The goal is to develop an option representation in which small changes in the subproblem solutions can be accommodated without losing the original solution. We empirically validate the proposed framework on a simulated game domain.
Keywords: Options framework, Transfer learning, Maintenance of skills
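
The idea of patching a reused policy without overwriting it can be illustrated with a minimal Python sketch. This is not the representation proposed in the paper: the names `ExceptionRule` and `OptionWithExceptions`, the dictionary-based policy, and the toy integer states are all assumptions made for illustration, loosely following the ripple-down pattern of a default conclusion refined by nested exceptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Hashable, List, Optional

State = Hashable
Action = str


@dataclass
class ExceptionRule:
    """Hypothetical ripple-down-style patch: overrides the action where `applies` holds."""
    applies: Callable[[State], bool]
    action: Action
    refinements: List["ExceptionRule"] = field(default_factory=list)

    def resolve(self, state: State) -> Optional[Action]:
        if not self.applies(state):
            return None
        # A more specific nested refinement wins over this rule's own action.
        for rule in self.refinements:
            refined = rule.resolve(state)
            if refined is not None:
                return refined
        return self.action


@dataclass
class OptionWithExceptions:
    """Hypothetical option: a base policy plus exceptions layered on top of it."""
    base_policy: Dict[State, Action]  # the original solution, never modified
    exceptions: List[ExceptionRule] = field(default_factory=list)

    def act(self, state: State) -> Action:
        # Exceptions are consulted first; the base policy is the default.
        for rule in self.exceptions:
            action = rule.resolve(state)
            if action is not None:
                return action
        return self.base_policy[state]


# Toy example (assumed): a three-state corridor where state 1 is now blocked.
base = {0: "right", 1: "right", 2: "right"}
option = OptionWithExceptions(base_policy=base)
option.exceptions.append(ExceptionRule(applies=lambda s: s == 1, action="up"))
```

In this sketch the exception changes the behaviour only in state 1, while `base` still holds the original solution and can be reused unchanged in a subproblem where the exception does not apply.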