
Coreset-Based Neural Network Compression

  • Abhimanyu Dubey
  • Moitreya Chatterjee
  • Narendra Ahuja
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11211)

Abstract

We propose a novel Convolutional Neural Network (CNN) compression algorithm based on coreset representations of filters. We exploit the redundancies present in the space of CNN weights and neuronal activations (across samples) to obtain compression. Our method requires no retraining, is easy to implement, and achieves state-of-the-art compression performance across a wide variety of CNN architectures. Coupled with quantization and Huffman coding, we create networks that provide AlexNet-like accuracy with a memory footprint 832× smaller than the original AlexNet, while also significantly reducing inference time. Additionally, these compressed networks, when fine-tuned, generalize successfully to other domains.
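To make the general idea concrete, below is a minimal, hypothetical sketch of filter compression in the spirit described by the abstract: the filters of a convolutional layer are flattened into rows, a small "coreset-like" subset of rows is chosen by norm-based importance sampling, every filter is re-expressed as least-squares coefficients over that subset, and the coefficients are uniformly quantized. All function names and parameters here (`compress_filters`, `k`, `n_bits`) are illustrative assumptions; this is not the paper's exact coreset construction or entropy-coding pipeline.

```python
# Hypothetical sketch: coreset-style summarization of a convolutional filter bank
# followed by uniform quantization. Illustration only, not the paper's algorithm.
import numpy as np

def compress_filters(filters, k=16, n_bits=8, seed=0):
    """filters: array of shape (num_filters, c, h, w).
    Returns (basis, quantized coefficients, metadata)."""
    rng = np.random.default_rng(seed)
    W = filters.reshape(filters.shape[0], -1)            # (N, d) flattened filters

    # Coreset-like selection: sample k rows with probability ~ squared norm.
    norms = np.sum(W ** 2, axis=1)
    probs = norms / norms.sum()
    idx = rng.choice(W.shape[0], size=min(k, W.shape[0]), replace=False, p=probs)
    basis = W[idx]                                        # (k, d) sampled summary rows

    # Represent every filter by least-squares coefficients over the sampled basis.
    coeffs, *_ = np.linalg.lstsq(basis.T, W.T, rcond=None)   # (k, N)

    # Uniform n-bit quantization of the coefficients.
    lo, hi = coeffs.min(), coeffs.max()
    levels = 2 ** n_bits - 1
    q = np.round((coeffs - lo) / (hi - lo + 1e-12) * levels).astype(np.uint8)

    meta = dict(lo=lo, hi=hi, levels=levels, shape=filters.shape)
    return basis, q, meta

def decompress_filters(basis, q, meta):
    """Dequantize the coefficients and reconstruct the filter bank."""
    coeffs = q.astype(np.float64) / meta["levels"] * (meta["hi"] - meta["lo"]) + meta["lo"]
    W_hat = (basis.T @ coeffs).T                          # (N, d) reconstructed filters
    return W_hat.reshape(meta["shape"])

if __name__ == "__main__":
    filters = np.random.randn(64, 3, 3, 3)                # e.g. a small conv layer
    basis, q, meta = compress_filters(filters, k=16, n_bits=8)
    rec = decompress_filters(basis, q, meta)
    err = np.linalg.norm(filters - rec) / np.linalg.norm(filters)
    print(f"relative reconstruction error: {err:.3f}")
```

In a full pipeline of the kind the abstract describes, the quantized coefficients would additionally be entropy-coded (e.g., Huffman coding) to reach the reported storage reductions.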

Notes

Acknowledgments

We are grateful to Prof. Ramesh Raskar for his insightful comments. MC additionally acknowledges Po-han Huang for helpful discussions and NVIDIA for providing the GPUs used for this research.

Supplementary material

Supplementary material 1: 474212_1_En_28_MOESM1_ESM.pdf (PDF, 9.5 MB)


Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Abhimanyu Dubey (1)
  • Moitreya Chatterjee (2)
  • Narendra Ahuja (2)
  1. Massachusetts Institute of Technology, Cambridge, USA
  2. University of Illinois at Urbana-Champaign, Champaign, USA
