Abstract
In this study, we compare and contrast seven of the most widely used first-order gradient-based optimization algorithms for machine learning problems. These methods are Stochastic Gradient Descent with momentum (SGD), Adaptive Gradient (AdaGrad), Adaptive Delta (AdaDelta), Root Mean Square Propagation (RMSProp), Adaptive Moment Estimation (Adam), Nesterov-accelerated Adaptive Moment Estimation (Nadam) and Adamax (Adam based on the infinity norm). For model creation and comparison, three test problems are addressed: regression, binary classification and multi-class classification. Using three randomly selected datasets, we trained models and evaluated each optimization strategy in terms of accuracy and loss. The overall experimental results demonstrate that Nadam outperformed the other optimization approaches across these datasets in terms of accuracy, but not in terms of training time. When both time and accuracy are considered, the Adam optimizer performed best.
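To make the comparison concrete, the following is a minimal sketch of the Adam update rule applied to a toy one-dimensional objective. The hyperparameter values (learning rate, beta coefficients, epsilon) are the common defaults from Kingma and Ba's paper, not values taken from this study's experiments, and the quadratic objective is purely illustrative.

```python
import math

def adam_minimize(grad_fn, x0, lr=0.001, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=5000):
    """Minimize a scalar function via Adam, given its gradient grad_fn."""
    x = x0
    m = 0.0  # first-moment (mean) estimate of the gradient
    v = 0.0  # second-moment (uncentered variance) estimate
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g        # update biased first moment
        v = beta2 * v + (1 - beta2) * g * g    # update biased second moment
        m_hat = m / (1 - beta1 ** t)           # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# Illustrative objective: f(x) = (x - 3)^2, whose gradient is 2(x - 3).
x_min = adam_minimize(lambda x: 2 * (x - 3), x0=0.0, lr=0.05)
```

Nadam modifies the same update by applying the Nesterov look-ahead to the bias-corrected first moment; the rest of the loop is unchanged.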
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Maurya, M., Yadav, N. (2023). A Comparative Analysis of Gradient-Based Optimization Methods for Machine Learning Problems. In: Yadav, A., Gupta, G., Rana, P., Kim, J.H. (eds) Proceedings on International Conference on Data Analytics and Computing. ICDAC 2022. Lecture Notes on Data Engineering and Communications Technologies, vol 175. Springer, Singapore. https://doi.org/10.1007/978-981-99-3432-4_7
DOI: https://doi.org/10.1007/978-981-99-3432-4_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-3431-7
Online ISBN: 978-981-99-3432-4
eBook Packages: Intelligent Technologies and Robotics (R0)