Combination of Optimization Methods in a Multistage Approach for a Deep Neural Network Model

  • Original Research
International Journal of Information Technology

Abstract

Gradient descent (GD) lies at the heart of neural network training, and the development of GD optimization algorithms has significantly sped up the advancement of deep learning. GD methods are a focus of deep learning research, and some studies have attempted to mix multiple training approaches to improve network performance; however, these approaches are primarily practical and lack theoretical guidance. This paper develops an architecture that demonstrates the combination of various GD optimization methods by analyzing different learning rates and several adaptive methods. The aim is to show how different GD optimization methods can be combined in a multistage approach to train a deep learning model. The research was motivated by the principles of SGDR (stochastic gradient descent with warm restarts), warm-up, and CLR (cyclical learning rates). The results of training tests with a large deep learning network validate the efficiency of the technique. The experiments were carried out in Python on Google Colab.
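The three schedules named in the abstract, and the idea of running them one after another in stages, can be sketched in plain Python. This is an illustrative sketch only, not the authors' implementation: the abstract gives no stage lengths, learning-rate values or switching rules, so every function name and number below is an assumption. It also stages learning-rate schedules only, whereas the paper combines optimization methods as well.

```python
import math

def warmup_lr(step, base_lr, warmup_steps):
    """Linear warm-up: ramp from base_lr / warmup_steps up to base_lr."""
    if step >= warmup_steps:
        return base_lr
    return base_lr * (step + 1) / warmup_steps

def triangular_clr(step, min_lr, max_lr, half_cycle):
    """Triangular cyclical learning rate: min_lr -> max_lr -> min_lr."""
    cycle = step // (2 * half_cycle)
    x = abs(step / half_cycle - 2 * cycle - 1)
    return min_lr + (max_lr - min_lr) * max(0.0, 1.0 - x)

def sgdr_lr(step, min_lr, max_lr, period):
    """Cosine annealing with warm restarts, using a fixed restart period."""
    t_cur = step % period  # steps since the last restart
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * t_cur / period))

def multistage_lr(step, stages):
    """Pick the schedule for the stage that contains `step`.

    `stages` is a list of (num_steps, schedule_fn) pairs; each schedule_fn
    receives the step counted from the start of its own stage.
    """
    start = 0
    for num_steps, schedule in stages:
        if step < start + num_steps:
            return schedule(step - start)
        start += num_steps
    # Past the last stage: hold the last value of the final stage.
    num_steps, schedule = stages[-1]
    return schedule(num_steps - 1)

# Hypothetical three-stage plan: warm-up, then CLR, then SGDR.
stages = [
    (5, lambda s: warmup_lr(s, base_lr=0.1, warmup_steps=5)),
    (20, lambda s: triangular_clr(s, min_lr=0.01, max_lr=0.1, half_cycle=5)),
    (20, lambda s: sgdr_lr(s, min_lr=0.001, max_lr=0.05, period=10)),
]
schedule = [multistage_lr(step, stages) for step in range(45)]
```

Each value in `schedule` would be handed to the optimizer before the corresponding training step. Keeping the stages as a list of (length, function) pairs makes it easy to reorder or swap schedules when comparing combinations.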

Availability of Data and Materials

The data used in this paper are available from the corresponding author upon request.

Author information

Corresponding author

Correspondence to Swaleha Zubair.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Singha, A.K., Zubair, S. Combination of Optimization Methods in a Multistage Approach for a Deep Neural Network Model. Int. j. inf. tecnol. 16, 1855–1861 (2024). https://doi.org/10.1007/s41870-023-01568-1

  • DOI: https://doi.org/10.1007/s41870-023-01568-1
