Optimization of GPU Memory Usage for Training Deep Neural Networks

  • Che-Lun Hung
  • Chine-fu Hsin
  • Hsiao-Hsi Wang
  • Chuan Yi Tang
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1080)


Recently, deep neural networks have been successfully applied in many domains, especially computer vision. Many well-known convolutional neural networks, such as VGG, ResNet, and Inception, are used for tasks such as image classification and object detection. The architectures of these state-of-the-art networks have become deeper and more complicated than ever. In this paper, we propose a method to address the large memory requirement of training a model. The experimental results show that the proposed algorithm reduces GPU memory usage significantly.


Keywords: Deep Neural Network, Convolutional Neural Networks, GPU



This research was partially supported by the Ministry of Science and Technology under the grants MOST 106-2221-E-126-001-MY2, MOST 108-2221-E-182-031-MY3 and MOST 108-2218-E-126-003.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Che-Lun Hung (1, 2), corresponding author
  • Chine-fu Hsin (1, 2)
  • Hsiao-Hsi Wang (1, 2)
  • Chuan Yi Tang (2)

  1. Chang Gung University, Taoyuan, Taiwan
  2. Providence University, Taichung, Taiwan
