Abstract
Deep learning (DL) algorithms are widely employed across a variety of applications and fields. The hyperparameters of a deep learning model must be tuned to suit each problem, and the chosen hyperparameter configuration directly influences the model's performance. Selecting a good configuration typically requires a thorough understanding of both the DL algorithms and the available hyperparameter optimization (HPO) techniques. Although various automatic optimization approaches exist, each has its own advantages and disadvantages when applied to different datasets and architectures. In this paper, we analyze which algorithm takes the longest time to optimize an architecture and whether the performance of HPO algorithms is consistent across different datasets and architectures. We compare six HPO algorithms, Grid search (GS), Random search (RS), Bayesian optimization (BO), Hyperband (HB), Genetic algorithm (GA), and Particle swarm optimization (PSO), on the VGG16 and ResNet50 architectures using the CIFAR10 and Intel Image Classification datasets. Because no clear pattern emerges, it is difficult to determine which approach achieves the best performance across datasets and architectures; the results do show, however, that all of the algorithms take similar optimization time. This research is expected to aid DL users, developers, data analysts, and researchers in applying and adapting DL models with appropriate HPO methodologies and frameworks. It should also clarify the challenges that currently exist in the HPO field, helping future research into HPO and DL applications to move forward.
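To make the comparison concrete, the sketch below illustrates the simplest of the compared approaches, random search, as a self-contained loop. The search space and the scoring function are illustrative assumptions, not the paper's actual experimental setup: in the study, each configuration would be scored by training VGG16 or ResNet50 on CIFAR10 or the Intel dataset, whereas here a toy objective stands in so the sketch runs.

```python
import random

# Hypothetical search space for illustration only; the hyperparameter
# names and values are assumptions, not taken from the paper.
SEARCH_SPACE = {
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "batch_size": [32, 64, 128],
    "dropout": [0.2, 0.3, 0.5],
}

def evaluate(config):
    """Stand-in for training a model and returning validation accuracy.

    In the real study this would be a full training run; here a toy
    score keeps the sketch runnable.
    """
    return 1.0 - abs(config["learning_rate"] - 1e-3) - config["dropout"] / config["batch_size"]

def random_search(space, n_trials=10, seed=0):
    """Sample n_trials random configurations and keep the best one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw one value uniformly at random for each hyperparameter.
        config = {name: rng.choice(values) for name, values in space.items()}
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

best, score = random_search(SEARCH_SPACE, n_trials=10)
```

Grid search would instead enumerate every combination in `SEARCH_SPACE` (27 here), which is why its cost grows exponentially with the number of hyperparameters, while random search lets the trial budget be fixed independently of the space size.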
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Chowdhury, A.A., Das, A., Hoque, K.K.S., Karmaker, D. (2022). A Comparative Study of Hyperparameter Optimization Techniques for Deep Learning. In: Uddin, M.S., Jamwal, P.K., Bansal, J.C. (eds) Proceedings of International Joint Conference on Advances in Computational Intelligence. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-19-0332-8_38
Print ISBN: 978-981-19-0331-1
Online ISBN: 978-981-19-0332-8