Helping an Agent Reach a Different Goal by Action Transfer in Reinforcement Learning

  • Yuchen Wang
  • Fenghui Ren
  • Minjie Zhang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11919)


Reinforcement learning agents can benefit from knowledge transferred by experienced agents. This paper studies how an experienced agent can use action transfer to help another agent learn when the two agents have different learning goals. The problem is motivated by common situations in which agents pursue different learning goals and action transfer is the only form of knowledge transfer available to them. To tackle the problem, we propose an approach that facilitates the transfer of actions that are appropriate for the learning agent’s goal. Experimental results show that the proposed approach is effective in transferring appropriate actions to an agent and in helping the agent learn to reach a different goal.


Different goals; Action transfer; Reinforcement learning



This research is supported by a DECRA Project (DP140100007) from the Australian Research Council (ARC), and by UPA and IPTA scholarships from the University of Wollongong, Australia.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. School of Computing and Information Technology, University of Wollongong, Wollongong, Australia
