
A Comparative Study of Hyperparameter Optimization Techniques for Deep Learning

  • Conference paper
  • First Online:
Proceedings of International Joint Conference on Advances in Computational Intelligence

Abstract

Deep learning (DL) algorithms have been widely employed across a variety of applications and fields. The hyperparameters of a deep learning model must be tuned to match different challenges, and choosing the optimal hyperparameter configuration has a direct influence on the model's performance. Doing so typically requires a thorough understanding of deep learning algorithms and of hyperparameter optimization (HPO) techniques. Although various automatic optimization approaches are available, each has its own advantages and disadvantages when applied to different datasets and architectures. In this paper, we analyze which algorithm requires the longest time to optimize an architecture and whether the performance of HPO algorithms is consistent across different datasets and architectures. We compare six HPO algorithms, grid search (GS), random search (RS), Bayesian optimization (BO), Hyperband (HB), the genetic algorithm (GA), and particle swarm optimization (PSO), on the VGG16 and ResNet50 architectures using the CIFAR-10 and Intel Image Classification datasets. Because no clear pattern emerges, it is difficult to determine which approach achieves the best performance across datasets and architectures; the results do show, however, that all of the algorithms require similar optimization time. This research is expected to aid DL users, developers, data analysts, and researchers in applying and adapting DL models with appropriate HPO methodologies and frameworks. It should also clarify the challenges that currently exist in the HPO field, allowing future research into HPO and DL applications to move forward.
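The contrast between exhaustive grid search and budget-limited random search mentioned in the abstract can be sketched in a few lines. The sketch below is illustrative only: `val_accuracy` is a hypothetical analytic stand-in for actually training and validating a VGG16 or ResNet50 model, and the search space is an assumption, not the configuration used in the paper.

```python
import itertools
import math
import random

# Hypothetical surrogate for validation accuracy after training a model with
# the given hyperparameters; it peaks at lr=1e-3, batch_size=64 by construction.
def val_accuracy(lr, batch_size):
    return 1.0 / (1.0 + abs(math.log10(lr) + 3) + abs(batch_size - 64) / 64)

# Illustrative search space (not the one used in the paper).
search_space = {
    "lr": [1e-4, 1e-3, 1e-2, 1e-1],
    "batch_size": [16, 32, 64, 128],
}

def grid_search(space):
    """Evaluate every combination in the space (here, 4 x 4 = 16 trials)."""
    keys = list(space)
    return max(
        (dict(zip(keys, combo)) for combo in itertools.product(*space.values())),
        key=lambda cfg: val_accuracy(**cfg),
    )

def random_search(space, n_trials=8, seed=0):
    """Sample a fixed budget of configurations uniformly at random."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {k: rng.choice(v) for k, v in space.items()}
        score = val_accuracy(**cfg)
        if score > best_score:
            best, best_score = cfg, score
    return best

print(grid_search(search_space))    # 16 evaluations: {'lr': 0.001, 'batch_size': 64}
print(random_search(search_space))  # only 8 evaluations; result depends on the seed
```

Grid search always finds the optimum of the discretized space but its cost grows exponentially with the number of hyperparameters, whereas random search trades completeness for a fixed evaluation budget; this cost/quality trade-off is the kind of comparison the paper performs with real training runs.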



Author information

Corresponding author

Correspondence to Anjir Ahmed Chowdhury.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Chowdhury, A.A., Das, A., Hoque, K.K.S., Karmaker, D. (2022). A Comparative Study of Hyperparameter Optimization Techniques for Deep Learning. In: Uddin, M.S., Jamwal, P.K., Bansal, J.C. (eds) Proceedings of International Joint Conference on Advances in Computational Intelligence. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-19-0332-8_38

