
A novel optimization perspective to the problem of designing sequences of tasks in a reinforcement learning framework

  • Research Article
  • Published in Optimization and Engineering

Abstract

Training agents over sequences of tasks is often employed in deep reinforcement learning to let agents progress more quickly towards better behaviours. This problem, known as curriculum learning, has mainly been tackled in the literature by numerical methods based on enumeration strategies, which, however, can handle only small-size problems. In this work, we define a new optimization perspective on the curriculum learning problem, with the aim of developing efficient solution methods for complex reinforcement learning tasks. Specifically, we show how the curriculum learning problem can be viewed as an optimization problem with a nonsmooth, nonconvex objective function and an integer feasible region. We reformulate it by defining a grey-box function that includes a suitable scheduling problem. Numerical results on a benchmark environment from the reinforcement learning community show the effectiveness of the proposed approaches in reaching better performance, even on large problems.
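To make the abstract's framing concrete, the task-sequencing problem it describes can be sketched as below. The notation is an illustrative assumption, not the paper's own formulation: $\mathcal{T}$ denotes a pool of candidate source tasks, $L$ a maximum curriculum length, and $P(c)$ the agent's final performance on the target task after training along the curriculum $c$.

```latex
% Hypothetical notation (not the authors' own):
%   T = {t_1, ..., t_n}  pool of candidate source tasks,
%   c = (c_1, ..., c_l)  an ordered curriculum of length l <= L,
%   P(c)                 final performance on the target task after
%                        training along c; it can only be evaluated by
%                        running the agent, hence it is a grey-box,
%                        nonsmooth and nonconvex function in general.
\begin{equation*}
  \max_{c \in \mathcal{C}} \; P(c),
  \qquad
  \mathcal{C} \;=\; \bigl\{ (c_1, \dots, c_\ell) \;:\;
    0 \le \ell \le L,\;
    c_i \in \mathcal{T},\;
    c_i \neq c_j \text{ for } i \neq j \bigr\}.
\end{equation*}
```

Under this reading, the feasible region $\mathcal{C}$ is a finite set of ordered task selections, hence integer, while $P$ has no available derivatives and must be queried by training the agent, which is what makes grey-box and derivative-free strategies natural candidates for this problem.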



Author information


Corresponding author

Correspondence to Simone Sagratella.



About this article


Cite this article

Seccia, R., Foglino, F., Leonetti, M. et al. A novel optimization perspective to the problem of designing sequences of tasks in a reinforcement learning framework. Optim Eng 24, 831–846 (2023). https://doi.org/10.1007/s11081-021-09708-x


