Merging Similar Neurons for Deep Networks Compression
Deep neural networks have achieved outstanding progress in many fields, such as computer vision, speech recognition, and natural language processing. However, large deep neural networks often need huge storage space and long training time, which makes them difficult to deploy on resource-restricted devices. In this paper, we propose a method for compressing the structure of deep neural networks. Specifically, we apply clustering analysis to find similar neurons in each layer of the original network, and then merge those neurons together with their corresponding connections. After compression, the number of parameters in the network is significantly reduced, and the required storage space and computation time are greatly reduced as well. We test our method on a deep belief network (DBN) and two convolutional neural networks. The experimental results demonstrate that the proposed method can greatly reduce the number of parameters of deep networks while preserving their classification accuracy. In particular, on the CIFAR-10 dataset, we compress VGGNet with a compression ratio of 92.96%, and the final model after fine-tuning obtains even higher accuracy than the original model.
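As a rough illustration of the merging idea described in the abstract, the following sketch groups the neurons of one fully connected hidden layer by the similarity of their incoming weights and collapses each group into a single neuron. It is not the authors' implementation. The abstract does not name the clustering algorithm or the merge rule, so the greedy distance-threshold clustering, the averaging of incoming weights, the summing of outgoing weights, the ReLU activation, and all function and parameter names (`cluster_neurons`, `merge_layer`, `forward`, `tol`) are assumptions made for this example.

```python
import math


def cluster_neurons(vectors, tol):
    """Greedy threshold clustering (an assumed stand-in for the paper's
    clustering step): a neuron joins the first cluster whose first member
    lies within Euclidean distance `tol`, otherwise it starts a new cluster."""
    clusters = []  # each cluster is a list of neuron indices
    for j, vec in enumerate(vectors):
        for members in clusters:
            if math.dist(vec, vectors[members[0]]) <= tol:
                members.append(j)
                break
        else:
            clusters.append([j])
    return clusters


def merge_layer(w_in, b, w_out, tol=1e-3):
    """Merge similar neurons of one hidden layer.

    w_in[j]     : incoming weights of hidden neuron j
    b[j]        : bias of hidden neuron j
    w_out[k][j] : weight from hidden neuron j to unit k of the next layer
    """
    # Cluster on incoming weights with the bias appended.
    augmented = [row + [bias] for row, bias in zip(w_in, b)]
    clusters = cluster_neurons(augmented, tol)

    n_inputs = len(w_in[0])
    new_w_in, new_b = [], []
    for members in clusters:
        n = len(members)
        # Assumed merge rule: average the incoming weights and biases.
        new_w_in.append(
            [sum(w_in[j][i] for j in members) / n for i in range(n_inputs)]
        )
        new_b.append(sum(b[j] for j in members) / n)

    # Assumed merge rule: sum the outgoing connections of merged neurons.
    new_w_out = [
        [sum(row[j] for j in members) for members in clusters] for row in w_out
    ]
    return new_w_in, new_b, new_w_out


def forward(x, w_in, b, w_out):
    """Hidden layer with ReLU followed by a linear output layer."""
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + bias)
        for row, bias in zip(w_in, b)
    ]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_out]


# Toy layer: neurons 0 and 1 have identical incoming weights and biases.
w_in = [[1.0, 2.0], [1.0, 2.0], [-1.0, 0.5]]
b = [0.1, 0.1, 0.0]
w_out = [[0.5, 0.25, 1.0]]
x = [1.0, 1.0]

new_w_in, new_b, new_w_out = merge_layer(w_in, b, w_out)
print(len(w_in), "hidden neurons before,", len(new_w_in), "after merging")
print(forward(x, w_in, b, w_out), forward(x, new_w_in, new_b, new_w_out))
```

In this toy case the merged neurons are exact duplicates, so the compressed layer reproduces the original output. When neurons are only approximately similar, the merged network deviates from the original, which is consistent with the fine-tuning step the abstract mentions.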
Keywords: Machine learning, Deep neural networks, Structure compression, Neurons, Clustering
This work was supported by the National Key R&D Program of China under Grant No. 2016YFC1401004, the National Natural Science Foundation of China (NSFC) under Grant No. 41706010, the Science and Technology Program of Qingdao under Grant No. 17-3-3-20-nsh, the CERNET Innovation Project under Grant No. NGII20170416, and the Fundamental Research Funds for the Central Universities of China. The Titan X GPU used for this research was donated by the NVIDIA Corporation.
Compliance with Ethical Standards
Conflict of interest
The authors declare that they have no conflict of interest.
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed consent was obtained from all individual participants included in the study.