Transfer in Reinforcement Learning: A Framework and a Survey

  • Chapter
Reinforcement Learning

Part of the book series: Adaptation, Learning, and Optimization (ALO, volume 12)

Abstract

Transfer in reinforcement learning is a novel research area that focuses on the development of methods to transfer knowledge from a set of source tasks to a target task. Whenever the tasks are similar, the transferred knowledge can be used by a learning algorithm to solve the target task and significantly improve its performance (e.g., by reducing the number of samples needed to achieve nearly optimal performance). In this chapter we formalize the general transfer problem, identify the main settings investigated so far, and review the most important approaches to transfer in reinforcement learning.
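To make the idea in the abstract concrete, the sketch below shows one of the simplest transfer mechanisms: warm-starting tabular Q-learning on a target task with a value function learned on a similar source task. This is only an illustration, not a method from the chapter; the `env` interface (`reset()`, `step(action)` returning `(next_state, reward, done)`, and an `actions` list) and all other names are hypothetical stand-ins.

```python
import random
from collections import defaultdict

def q_learning(env, episodes, q=None, alpha=0.1, gamma=0.95, epsilon=0.1):
    """Tabular Q-learning; pass a pre-filled `q` to start from transferred values."""
    if q is None:
        q = defaultdict(float)  # (state, action) -> estimated value, zero by default
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # epsilon-greedy action selection on the current estimates
            if random.random() < epsilon:
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: q[(state, a)])
            next_state, reward, done = env.step(action)
            # one-step Q-learning backup
            target = reward if done else reward + gamma * max(
                q[(next_state, a)] for a in env.actions)
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = next_state
    return q

# Transfer as warm-start: values learned on a similar source task seed
# learning on the target task, which can reduce the samples needed to
# reach near-optimal behavior when the two tasks are related.
# q_source = q_learning(source_env, episodes=5000)
# q_target = q_learning(target_env, episodes=1000, q=q_source.copy())
```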



Author information

Correspondence to Alessandro Lazaric.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Lazaric, A. (2012). Transfer in Reinforcement Learning: A Framework and a Survey. In: Wiering, M., van Otterlo, M. (eds) Reinforcement Learning. Adaptation, Learning, and Optimization, vol 12. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27645-3_5

  • DOI: https://doi.org/10.1007/978-3-642-27645-3_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-27644-6

  • Online ISBN: 978-3-642-27645-3

  • eBook Packages: Engineering (R0)
