
Accounting for Task-Difficulty in Active Multi-Task Robot Control Learning

  • Technical Contribution
  • Published in KI - Künstliche Intelligenz

Abstract

Contextual policy search is a reinforcement learning approach for multi-task learning in robot control. It can be used to learn versatile skills that generalize over a range of tasks specified by a context vector. In this work, we combine contextual policy search with ideas from active learning to select the task in which the next trial will be performed. Moreover, we use active training set selection to reduce the detrimental effects of exploration in the sampling policy. A core challenge in this approach is that the distributions of the obtained rewards may not be directly comparable between different tasks. We propose PUBSVE, a novel approach for estimating a reward baseline, and investigate empirically, on benchmark problems and simulated robotic tasks, to what extent this method can remedy the issue of non-comparable rewards.
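The abstract names three ingredients: a contextual upper-level policy that maps a context vector to skill parameters, an active-learning criterion for choosing the next training context, and a per-task reward baseline that makes rewards comparable across tasks of different difficulty. The Python sketch below illustrates how these pieces could fit together. It is a minimal illustration under simplifying assumptions (a linear-Gaussian upper-level policy, reward-weighted regression for the update, and a generic headroom-based selection rule), not the authors' PUBSVE method; names such as select_context are hypothetical.

    import numpy as np

    rng = np.random.default_rng(0)

    class ContextualPolicy:
        """Linear-Gaussian upper-level policy: theta ~ N(W s, Sigma)."""

        def __init__(self, context_dim, param_dim, exploration=1.0):
            self.W = np.zeros((param_dim, context_dim))
            self.Sigma = exploration * np.eye(param_dim)

        def sample(self, s):
            # Draw low-level skill parameters for context s.
            return rng.multivariate_normal(self.W @ s, self.Sigma)

        def update(self, contexts, params, advantages, beta=1.0):
            # Reward-weighted regression of parameters on contexts.
            # 'advantages' are rewards with a per-task baseline already
            # subtracted, so the weights are comparable across tasks.
            w = np.exp(beta * (advantages - np.max(advantages)))
            S = np.asarray(contexts)   # shape (n, context_dim)
            T = np.asarray(params)     # shape (n, param_dim)
            D = np.diag(w)
            reg = 1e-6 * np.eye(S.shape[1])
            self.W = np.linalg.solve(S.T @ D @ S + reg, S.T @ D @ T).T

    def select_context(candidates, predicted_reward, baseline):
        # Active task selection: score each candidate context by its
        # predicted reward relative to a task-difficulty baseline and
        # pick the most promising one (one plausible criterion of many).
        gaps = [predicted_reward(s) - baseline(s) for s in candidates]
        return candidates[int(np.argmax(gaps))]

    # Example: two 1-D candidate contexts, 2-D skill parameters.
    policy = ContextualPolicy(context_dim=1, param_dim=2)
    s = select_context([np.array([0.5]), np.array([1.5])],
                       predicted_reward=lambda s: -s[0] ** 2,
                       baseline=lambda s: -2.0 * s[0] ** 2)
    theta = policy.sample(s)

Without the baseline subtraction, easy tasks would dominate both the selection rule and the exponential weights in the update, which is exactly the non-comparability problem the paper addresses.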



Acknowledgments

This work was supported by two grants from the German Federal Ministry of Economics and Technology (BMWi, FKZ 50 RA 1216 and FKZ 50 RA 1217).

Author information


Corresponding author

Correspondence to Alexander Fabisch.

Additional information

A. Fabisch and J. H. Metzen contributed equally.


About this article


Cite this article

Fabisch, A., Metzen, J.H., Krell, M.M. et al. Accounting for Task-Difficulty in Active Multi-Task Robot Control Learning. Künstl Intell 29, 369–377 (2015). https://doi.org/10.1007/s13218-015-0363-2

