
Neural Network Compression via Learnable Wavelet Transforms

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12397)

Abstract

Wavelets are well known for data compression, yet have rarely been applied to the compression of neural networks. This paper shows how the fast wavelet transform can be used to compress linear layers in neural networks. Linear layers still occupy a significant portion of the parameters in recurrent neural networks (RNNs). Through our method, we learn both the wavelet bases and the corresponding coefficients to efficiently represent the linear layers of RNNs. Our wavelet-compressed RNNs have significantly fewer parameters yet still perform competitively with the state of the art on synthetic and real-world RNN benchmarks (source code is available at https://github.com/v0lta/Wavelet-network-compression). Wavelet optimization adds basis flexibility without a large number of extra weights.
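To make the idea concrete, the following is a minimal sketch (not the authors' implementation) of a wavelet-parameterised linear layer in PyTorch. It assumes a square layer whose width is a power of two, uses a fixed orthonormal Haar transform, and learns only diagonal gains and a bias; in the paper the wavelet basis itself is also learned and is applied via the fast wavelet transform rather than a dense matrix product.

```python
# Minimal sketch of a wavelet-parameterised linear layer (not the authors'
# implementation). Assumptions: PyTorch, square layers whose width n is a
# power of two, and a fixed orthonormal Haar transform T. The dense weight
# matrix W is replaced by W ~= T^{-1} diag(g) T with a learnable gain vector g,
# so the layer stores O(n) parameters instead of O(n^2).

import torch
import torch.nn as nn


def haar_matrix(n: int) -> torch.Tensor:
    """Build the orthonormal Haar wavelet transform matrix for n = 2^k."""
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    h = torch.tensor([[1.0]])
    while h.shape[0] < n:
        top = torch.kron(h, torch.tensor([1.0, 1.0]))                           # averaging rows
        bottom = torch.kron(torch.eye(h.shape[0]), torch.tensor([1.0, -1.0]))   # detail rows
        h = torch.cat([top, bottom], dim=0)
    return h / h.norm(dim=1, keepdim=True)  # normalise rows so that h @ h.T == I


class WaveletLinear(nn.Module):
    """Linear map expressed as synthesis * learnable gains * analysis."""

    def __init__(self, n: int):
        super().__init__()
        self.register_buffer("t", haar_matrix(n))   # fixed transform (simplification of the paper)
        self.gain = nn.Parameter(torch.ones(n))     # learnable coefficients in the wavelet domain
        self.bias = nn.Parameter(torch.zeros(n))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        coeffs = x @ self.t.T               # analysis: project the input onto the wavelet basis
        scaled = coeffs * self.gain         # cheap elementwise reweighting, O(n) parameters
        return scaled @ self.t + self.bias  # synthesis: map back to the signal domain


if __name__ == "__main__":
    layer = WaveletLinear(64)       # replaces a 64x64 dense matrix (4096 weights) with ~128
    out = layer(torch.randn(8, 64))
    print(out.shape)                # torch.Size([8, 64])
```

The sketch illustrates why the parameterisation compresses: the transform is shared structure, so only the per-coefficient gains and bias are stored and trained, rather than a full weight matrix.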

Keywords

Wavelets · Network compression

Notes

Acknowledgements

Research was supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) project YA 447/2-1 (FOR 2535 Anticipating Human Behavior) and by the National Research Foundation Singapore under its NRF Fellowship Programme [NRF-NRFFAI1-2019-0001].


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Institute for Computer Science, University of Bonn, Bonn, Germany
  2. Fraunhofer Center for Machine Learning and SCAI, Sankt Augustin, Germany
  3. School of Computing, National University of Singapore, Singapore, Singapore
