
A Comparative Study of Hyperparameter Optimization Techniques for Deep Learning

  • Conference paper
  • First Online:
Proceedings of International Joint Conference on Advances in Computational Intelligence

Abstract

Deep learning (DL) algorithms have been widely employed across a variety of applications and fields. The hyperparameters of a deep learning model must be tuned to match different challenges, and choosing the optimal hyperparameter configuration has a direct influence on the model's performance. Doing so typically requires a thorough understanding of deep learning algorithms and of hyperparameter optimization (HPO) techniques. Although various automatic optimization approaches are available, each has its own advantages and disadvantages when applied to different datasets and architectures. In this paper, we analyze which algorithm requires the longest time to optimize an architecture and whether the performance of HPO algorithms is consistent across different datasets and architectures. We compare six HPO algorithms, grid search (GS), random search (RS), Bayesian optimization (BO), Hyperband (HB), the genetic algorithm (GA), and particle swarm optimization (PSO), on the VGG16 and ResNet50 architectures using the CIFAR-10 and Intel Image Classification datasets. Because no clear pattern emerges, it is difficult to determine which approach achieves the best performance across datasets and architectures; the results do show, however, that all of the algorithms require similar optimization time. This research is expected to aid DL users, developers, data analysts, and researchers in applying and adapting DL models with appropriate HPO methodologies and frameworks. It should also clarify the challenges that currently exist in the HPO field, allowing future research into HPO and DL applications to move forward.
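The contrast between exhaustive grid search and budget-limited random search mentioned in the abstract can be sketched in a few lines. The sketch below is illustrative only: `val_accuracy` is a hypothetical analytic stand-in for actually training and validating a VGG16 or ResNet50 model, and the search space is an assumption, not the configuration used in the paper.

```python
import itertools
import math
import random

# Hypothetical surrogate for validation accuracy after training a model with
# the given hyperparameters; it peaks at lr=1e-3, batch_size=64 by construction.
def val_accuracy(lr, batch_size):
    return 1.0 / (1.0 + abs(math.log10(lr) + 3) + abs(batch_size - 64) / 64)

# Illustrative search space (not the one used in the paper).
search_space = {
    "lr": [1e-4, 1e-3, 1e-2, 1e-1],
    "batch_size": [16, 32, 64, 128],
}

def grid_search(space):
    """Evaluate every combination in the space (here, 4 x 4 = 16 trials)."""
    keys = list(space)
    return max(
        (dict(zip(keys, combo)) for combo in itertools.product(*space.values())),
        key=lambda cfg: val_accuracy(**cfg),
    )

def random_search(space, n_trials=8, seed=0):
    """Sample a fixed budget of configurations uniformly at random."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {k: rng.choice(v) for k, v in space.items()}
        score = val_accuracy(**cfg)
        if score > best_score:
            best, best_score = cfg, score
    return best

print(grid_search(search_space))    # 16 evaluations: {'lr': 0.001, 'batch_size': 64}
print(random_search(search_space))  # only 8 evaluations; result depends on the seed
```

Grid search always finds the optimum of the discretized space but its cost grows exponentially with the number of hyperparameters, whereas random search trades completeness for a fixed evaluation budget; this cost/quality trade-off is the kind of comparison the paper performs with real training runs.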



Author information

Corresponding author

Correspondence to Anjir Ahmed Chowdhury.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Chowdhury, A.A., Das, A., Hoque, K.K.S., Karmaker, D. (2022). A Comparative Study of Hyperparameter Optimization Techniques for Deep Learning. In: Uddin, M.S., Jamwal, P.K., Bansal, J.C. (eds) Proceedings of International Joint Conference on Advances in Computational Intelligence. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-19-0332-8_38

