Abstract
Thanks to their state-of-the-art performance, deep neural networks are increasingly used for object recognition. To achieve the best results, they rely on millions of trained parameters. However, when targeting embedded applications, the size of these models becomes problematic, making their deployment on smartphones and other resource-limited devices impractical. In this paper we introduce a novel compression method for deep neural networks that is performed during the learning phase. It consists in adding an extra regularization term to the cost function of fully-connected layers. We combine this method with Product Quantization (PQ) of the trained weights for higher savings in storage consumption. We evaluate our method on two data sets (MNIST and CIFAR10), on which we achieve significantly larger compression rates than state-of-the-art methods.
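The abstract names the two ingredients but gives neither the exact form of the regularization term nor the PQ configuration. As a rough illustration only, the NumPy sketch below assumes a penalty that pulls each fully-connected weight toward the nearest value of a small discrete set, followed by standard product quantization (independent k-means on sub-blocks) of the trained weight matrix. The function names, the level set {-1, 0, 1}, and all hyper-parameters are hypothetical, not taken from the paper.

```python
import numpy as np

def quantization_penalty(W, levels=np.array([-1.0, 0.0, 1.0]), lam=1e-4):
    # Hypothetical extra regularization term: squared distance of every
    # weight to the nearest value in a small discrete set, summed over
    # the layer. (The abstract does not specify the paper's exact term.)
    d2 = np.min((W[..., None] - levels) ** 2, axis=-1)
    return lam * d2.sum()

def product_quantize(W, n_blocks=4, n_centroids=16, n_iter=20, seed=0):
    # Standard PQ of a trained weight matrix: split each row of W into
    # n_blocks sub-vectors and run k-means independently on each block.
    rng = np.random.default_rng(seed)
    rows, cols = W.shape
    assert cols % n_blocks == 0, "columns must split evenly into blocks"
    d = cols // n_blocks
    codes = np.empty((rows, n_blocks), dtype=np.int64)
    codebooks = np.empty((n_blocks, n_centroids, d))
    for b in range(n_blocks):
        X = W[:, b * d:(b + 1) * d]
        C = X[rng.choice(rows, n_centroids, replace=False)].copy()
        for _ in range(n_iter):  # plain Lloyd iterations
            assign = ((X[:, None, :] - C[None]) ** 2).sum(-1).argmin(1)
            for k in range(n_centroids):
                if np.any(assign == k):
                    C[k] = X[assign == k].mean(0)
        codes[:, b], codebooks[b] = assign, C
    return codes, codebooks

def pq_reconstruct(codes, codebooks):
    # Rebuild a lossy approximation of W from codes and codebooks.
    return np.concatenate(
        [codebooks[b][codes[:, b]] for b in range(codes.shape[1])], axis=1)
```

With these illustrative settings, storing a layer costs rows x n_blocks x log2(n_centroids) bits of codes plus the codebooks: for a 1024 x 1024 float32 layer (4 MiB raw), 4 blocks of 16 centroids give roughly 2 KiB of codes and 64 KiB of codebooks, on the order of a 60x reduction before the regularization even makes the weights more quantization-friendly.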
Acknowledgments
This work was funded in part by the European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013) / ERC grant agreement number 290901.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Soulié, G., Gripon, V., Robert, M. (2016). Compression of Deep Neural Networks on the Fly. In: Villa, A., Masulli, P., Pons Rivero, A. (eds.) Artificial Neural Networks and Machine Learning – ICANN 2016. Lecture Notes in Computer Science, vol. 9887. Springer, Cham. https://doi.org/10.1007/978-3-319-44781-0_19
DOI: https://doi.org/10.1007/978-3-319-44781-0_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-44780-3
Online ISBN: 978-3-319-44781-0
eBook Packages: Computer Science (R0)