A Meta-Reinforcement Learning Approach to Optimize Parameters and Hyper-parameters Simultaneously

Ali, Abbas Raza; Budka, Marcin; Gabrys, Bogdan

doi:10.1007/978-3-030-29911-8_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11671)

Included in the following conference series:

Pacific Rim International Conference on Artificial Intelligence


Abstract

In the last few years, we have witnessed a resurgence of interest in neural networks. State-of-the-art deep neural network architectures are, however, challenging to design from scratch and require computationally costly empirical evaluations. Hence, a lot of research effort has been dedicated to the effective utilisation and adaptation of previously proposed architectures, either by using transfer learning or by modifying the original architecture. The ultimate goal of designing a network architecture is to achieve the best possible accuracy for a given task or group of related tasks. Although there have been some efforts to automate the network architecture design process, most of the existing solutions are still very computationally intensive. This work presents a framework to automatically find a good set of hyper-parameters resulting in reasonably good accuracy, which at the same time is less computationally expensive than the existing approaches. The idea presented here is to frame hyper-parameter selection and tuning within the reinforcement learning regime. Thus, the parameters of a meta-learner, an RNN, and the hyper-parameters of the target network are tuned simultaneously. Our meta-learner is updated using a policy network and simultaneously generates a tuple of hyper-parameters, which are utilised by another network. That network is trained on a given task for a number of steps and produces a validation accuracy whose delta is used as the reward. The reward, along with the state of the network, comprising statistics of the network’s final-layer output and the training loss, is fed back to the meta-learner, which in turn generates a tuned tuple of hyper-parameters for the next time-step. Therefore, the effectiveness of a recommended tuple can be tested very quickly rather than waiting for the network to converge. This approach produces accuracy close to that of the state-of-the-art approach and is found to be comparatively less computationally intensive.
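
The feedback loop described in the abstract can be illustrated on a toy problem. The sketch below is not the authors' implementation and makes several simplifying assumptions: the RNN meta-learner is swapped for a linear softmax policy updated with a plain REINFORCE rule, the target network is a logistic regression on synthetic 2-D data, and the hyper-parameter tuple is reduced to a single learning rate. All names (`CANDIDATE_LRS`, `run_meta_loop`, and so on) are hypothetical. What it keeps from the abstract is the overall structure: the policy proposes a hyper-parameter, the target model trains for a few steps, the change in validation accuracy is the reward, and the training loss plus statistics of the model's outputs form the state for the next proposal.

```python
import math
import random

CANDIDATE_LRS = [0.001, 0.01, 0.1, 1.0]  # hypothetical search space (assumption)


def make_data(n, rng):
    """Two noisy 2-D classes separated by the line x0 + x1 = 0."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        y = 1 if x[0] + x[1] + rng.gauss(0, 0.1) > 0 else 0
        data.append((x, y))
    return data


def predict(w, x):
    """Logistic-regression output; z is clamped to keep exp() well-behaved."""
    z = max(-30.0, min(30.0, w[0] * x[0] + w[1] * x[1] + w[2]))
    return 1.0 / (1.0 + math.exp(-z))


def train_steps(w, train, lr, k):
    """Run k full-batch gradient steps on the log-loss, updating w in place.

    Returns the mean training loss measured during the last step.
    """
    loss = 0.0
    for _ in range(k):
        grad = [0.0, 0.0, 0.0]
        loss = 0.0
        for x, y in train:
            p = predict(w, x)
            loss -= math.log(p if y == 1 else 1.0 - p)
            for i, xi in enumerate((x[0], x[1], 1.0)):
                grad[i] += (p - y) * xi
        for i in range(3):
            w[i] -= lr * grad[i] / len(train)
        loss /= len(train)
    return loss


def accuracy(w, data):
    return sum((predict(w, x) > 0.5) == (y == 1) for x, y in data) / len(data)


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def run_meta_loop(meta_steps=30, k=5, meta_lr=0.5, seed=0):
    rng = random.Random(seed)
    train, val = make_data(200, rng), make_data(100, rng)
    w = [0.0, 0.0, 0.0]                    # target-model parameters
    # Linear softmax policy: one weight vector per candidate learning rate.
    theta = [[0.0] * 4 for _ in CANDIDATE_LRS]
    state = [0.0, 0.0, 0.0, 1.0]           # [loss, output mean, output std, bias]
    prev_acc = accuracy(w, val)
    history = []
    for _ in range(meta_steps):
        probs = softmax([sum(t * s for t, s in zip(th, state)) for th in theta])
        a = rng.choices(range(len(CANDIDATE_LRS)), weights=probs)[0]
        loss = train_steps(w, train, CANDIDATE_LRS[a], k)  # only k steps, no convergence
        acc = accuracy(w, val)
        reward = acc - prev_acc            # delta of validation accuracy
        # REINFORCE: grad of log pi(a|s) w.r.t. theta[j] is (1[j == a] - p_j) * s
        for j in range(len(theta)):
            coeff = meta_lr * reward * ((1.0 if j == a else 0.0) - probs[j])
            theta[j] = [t + coeff * s for t, s in zip(theta[j], state)]
        # Next state: training loss plus statistics of the model's outputs.
        outs = [predict(w, x) for x, _ in train]
        mean = sum(outs) / len(outs)
        std = math.sqrt(sum((o - mean) ** 2 for o in outs) / len(outs))
        state = [loss, mean, std, 1.0]
        history.append((CANDIDATE_LRS[a], acc, reward))
        prev_acc = acc
    return history
```

Calling `run_meta_loop()` returns one `(learning_rate, validation_accuracy, reward)` tuple per meta-step. Because each proposal is scored after only `k` training steps, a poor choice is penalised immediately instead of after a full training run, which is the source of the computational saving the abstract claims.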



Author information

Authors and Affiliations

Bournemouth University, Poole, BH12 5BB, UK
Abbas Raza Ali & Marcin Budka
University of Technology Sydney, Ultimo, NSW, 2007, Australia
Bogdan Gabrys


Corresponding author

Correspondence to Abbas Raza Ali.

Editor information

Editors and Affiliations

Department of Computing, Macquarie University, Sydney, NSW, Australia
Abhaya C. Nayak
RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
Alok Sharma


About this paper

Cite this paper

Ali, A.R., Budka, M., Gabrys, B. (2019). A Meta-Reinforcement Learning Approach to Optimize Parameters and Hyper-parameters Simultaneously. In: Nayak, A., Sharma, A. (eds) PRICAI 2019: Trends in Artificial Intelligence. PRICAI 2019. Lecture Notes in Computer Science, vol 11671. Springer, Cham. https://doi.org/10.1007/978-3-030-29911-8_8


DOI: https://doi.org/10.1007/978-3-030-29911-8_8
Published: 23 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29910-1
Online ISBN: 978-3-030-29911-8
eBook Packages: Computer Science, Computer Science (R0)
