Abstract
Deep Neural Networks (DNNs) have attracted enormous research attention because they consistently outperform other state-of-the-art methods across a plethora of machine learning tasks. Their performance, however, strongly depends on the DNN hyper-parameters, which are commonly tuned by experienced practitioners. Recently, we introduced Particle Swarm Optimization (PSO) and parallel PSO techniques to automate this process. In this work, we investigate, both theoretically and experimentally, the convergence capabilities of these algorithms. The experiments were performed for several DNN architectures (both gradually augmented and hand-crafted by a human) using two challenging multi-class benchmark datasets: MNIST and CIFAR-10.
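The canonical PSO update underlying the approach summarized above can be sketched as follows. This is a minimal, generic illustration (not the parallel variant evaluated in the paper): the function names `pso` and `val_error`, the toy two-dimensional objective standing in for validation error over (log learning rate, dropout rate), and all parameter values are illustrative assumptions, not the authors' actual setup.

```python
import random

def pso(objective, bounds, n_particles=10, n_iters=50,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` over the box `bounds` with canonical PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Initialise particle positions uniformly inside the bounds, velocities at zero.
    xs = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in xs]                 # per-particle best positions
    pbest_f = [objective(x) for x in xs]       # per-particle best values
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]   # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + cognitive + social components.
                vs[i][d] = (w * vs[i][d]
                            + c1 * r1 * (pbest[i][d] - xs[i][d])
                            + c2 * r2 * (gbest[d] - xs[i][d]))
                # Move the particle and clamp it back into the search box.
                xs[i][d] = min(max(xs[i][d] + vs[i][d], bounds[d][0]), bounds[d][1])
            f = objective(xs[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = xs[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = xs[i][:], f
    return gbest, gbest_f

# Toy stand-in for validation error over (log10 learning rate, dropout rate);
# in hyper-parameter selection this would be replaced by training a DNN and
# measuring its validation loss.
def val_error(x):
    log_lr, dropout = x
    return (log_lr + 3.0) ** 2 + (dropout - 0.5) ** 2

best, best_f = pso(val_error, bounds=[(-6.0, 0.0), (0.0, 0.9)])
```

In a real hyper-parameter search each objective evaluation is a full (or truncated) DNN training run, which is what motivates the parallel PSO variant the abstract mentions: particles can be evaluated concurrently because the update of each particle depends only on its own history and the current swarm-wide best.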
Notes
- 1. See: https://github.com/fchollet/keras; last access date: July 29, 2017.
Acknowledgements
This work has been supported by the Polish National Centre for Research and Development under the Innomed grant POIR.01.02.00-00-0030/15, and the Silesian University of Technology grant for young researchers (BKM-507/RAU2/2016).
Copyright information
© 2018 Springer International Publishing AG
Cite this paper
Nalepa, J., Lorenzo, P.R. (2018). Convergence Analysis of PSO for Hyper-Parameter Selection in Deep Neural Networks. In: Xhafa, F., Caballé, S., Barolli, L. (eds) Advances on P2P, Parallel, Grid, Cloud and Internet Computing. 3PGCIC 2017. Lecture Notes on Data Engineering and Communications Technologies, vol 13. Springer, Cham. https://doi.org/10.1007/978-3-319-69835-9_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69834-2
Online ISBN: 978-3-319-69835-9
eBook Packages: Engineering (R0)