Optimization of GPU Memory Usage for Training Deep Neural Networks
Recently, deep neural networks have been applied successfully in many domains, especially computer vision. Well-known convolutional neural networks such as VGG, ResNet, and Inception are used for tasks such as image classification and object detection. The architectures of these state-of-the-art networks have become deeper and more complex than ever, which increases the GPU memory required to train them. In this paper, we propose a method to reduce the large memory requirement of training a model. The experimental results show that the proposed algorithm significantly reduces GPU memory usage.
Keywords: Deep Neural Network, Convolutional Neural Networks, GPU
This research was partially supported by the Ministry of Science and Technology under the grants MOST 106-2221-E-126-001-MY2, MOST 108-2221-E-182-031-MY3 and MOST 108-2218-E-126-003.