Stochastic Abstract Policies for Knowledge Transfer in Robotic Navigation Tasks

  • Tiago Matos
  • Yannick Plaino Bergamo
  • Valdinei Freire da Silva
  • Anna Helena Reali Costa
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7094)


Most navigation approaches for mobile robots ignore existing solutions to similar problems when learning a policy for a new problem, and consequently solve each navigation problem from scratch. In this article we investigate a knowledge transfer technique that allows a previously learned policy from one or more related source tasks to be reused in a new task. We represent the learned knowledge as a stochastic abstract policy, which can be induced from a training set of navigation examples: state-action sequences successfully executed by a robot to reach a specific goal in a given environment. We propose both a probabilistic and a nondeterministic abstract policy, so that every action identified in the inductive process is preserved. Experiments attest to the effectiveness and efficiency of our proposal.
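
The distinction between the two kinds of abstract policy can be illustrated with a minimal sketch. It assumes the navigation examples have already been mapped to (abstract state, abstract action) pairs and estimates action probabilities by simple frequency counting; the function names, the state and action labels, and the counting scheme are all hypothetical and are not taken from the authors' induction procedure.

```python
import random
from collections import Counter, defaultdict


def induce_abstract_policies(examples):
    """Build two illustrative policies from (abstract_state, abstract_action) pairs.

    Assumption: frequency counting stands in for the paper's inductive process.
    """
    counts = defaultdict(Counter)
    for state, action in examples:
        counts[state][action] += 1

    # Probabilistic policy: relative frequency of each action per abstract state.
    probabilistic = {
        state: {a: n / sum(c.values()) for a, n in c.items()}
        for state, c in counts.items()
    }
    # Nondeterministic policy: the set of actions observed per abstract state.
    nondeterministic = {state: set(c) for state, c in counts.items()}
    return probabilistic, nondeterministic


def sample_action(probabilistic, state, rng=random):
    """Draw one action for `state` according to the probabilistic policy."""
    actions, weights = zip(*probabilistic[state].items())
    return rng.choices(actions, weights=weights, k=1)[0]


# Hypothetical training examples from successful navigation runs.
examples = [
    ("in_room_near_door", "go_to_door"),
    ("in_room_near_door", "go_to_door"),
    ("in_room_near_door", "go_to_corridor"),
    ("in_corridor", "go_to_room"),
]
prob_policy, nondet_policy = induce_abstract_policies(examples)
```

In this sketch both policies keep every action seen in the examples. The probabilistic one additionally records how often each action was chosen, while the nondeterministic one records only that it occurred.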


Keywords (machine-generated, not supplied by the authors): Knowledge Transfer, Goal State, Abstract Action, Ground Action, Atomic Sentence





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Tiago Matos (1)
  • Yannick Plaino Bergamo (1)
  • Valdinei Freire da Silva (1)
  • Anna Helena Reali Costa (1)

  1. Laboratório de Técnicas Inteligentes (LTI/EPUSP), Escola Politécnica, Universidade de São Paulo, São Paulo, Brazil
