Automatic Discovery of Subgoals in Reinforcement Learning Using Strongly Connected Components

  • Seyed Jalal Kazemitabar
  • Hamid Beigy
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5506)

Abstract

The hierarchical structure of real-world problems has led to a focus on hierarchical frameworks in the reinforcement learning paradigm. Work on the automatic discovery of macro-actions has concentrated mainly on subgoal discovery methods. Among the proposed algorithms, those based on graph partitioning have achieved precise results. However, few methods have proven successful in both performance and efficiency in terms of time complexity. In this paper, we present an SCC-based subgoal discovery algorithm: a graph-theoretic approach that automatically detects subgoals in linear time. We also propose a tuning method for setting the algorithm's single parameter.
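
The abstract stops short of the algorithm itself, so below is a minimal sketch of the core idea as it reads here: build a directed graph from the agent's observed state transitions, find its strongly connected components (SCCs) in linear time, and treat states whose edges cross component boundaries as candidate subgoals. Tarjan's algorithm is one standard linear-time SCC method; the function names, the doorway heuristic, and the example graph are illustrative assumptions, not the paper's exact formulation.

    # Illustrative sketch: SCC-based subgoal candidates on a transition graph.
    # The graph maps each state to the list of successor states observed
    # during exploration.
    from collections import defaultdict

    def strongly_connected_components(graph):
        """Tarjan's algorithm: all SCCs of a directed graph in O(V + E) time."""
        index_of, lowlink, on_stack = {}, {}, set()
        stack, sccs, counter = [], [], [0]

        def visit(v):
            index_of[v] = lowlink[v] = counter[0]
            counter[0] += 1
            stack.append(v)
            on_stack.add(v)
            for w in graph[v]:
                if w not in index_of:            # tree edge: recurse
                    visit(w)
                    lowlink[v] = min(lowlink[v], lowlink[w])
                elif w in on_stack:              # edge back into the stack
                    lowlink[v] = min(lowlink[v], index_of[w])
            if lowlink[v] == index_of[v]:        # v roots an SCC: pop it off
                scc = set()
                while True:
                    w = stack.pop()
                    on_stack.discard(w)
                    scc.add(w)
                    if w == v:
                        break
                sccs.append(scc)

        for v in list(graph):
            if v not in index_of:
                visit(v)
        return sccs

    def candidate_subgoals(graph, sccs):
        """States with an edge leaving their own SCC act as 'doorways'
        between regions of the state space -- natural subgoal candidates."""
        component = {v: i for i, scc in enumerate(sccs) for v in scc}
        return {v for v in graph for w in graph[v]
                if component.get(w) != component[v]}

    # Example: two three-state "rooms" joined by a one-way doorway 2 -> 3.
    transitions = defaultdict(list,
                              {0: [1], 1: [2], 2: [0, 3],
                               3: [4], 4: [5], 5: [3]})
    sccs = strongly_connected_components(transitions)
    print(candidate_subgoals(transitions, sccs))  # -> {2}

Tarjan's algorithm runs in O(V + E), matching the abstract's linear-time claim. Note that in a fully reversible environment every state would fall into a single SCC, so the choice of which transitions enter the graph matters; the abstract's single tunable parameter is not modeled in this sketch.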

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Seyed Jalal Kazemitabar ¹
  • Hamid Beigy ¹
  1. Computer Engineering Department, Sharif University of Technology, Tehran, Iran
