Weighted Channel-Wise Decomposed Convolutional Neural Networks

  • Yao Lu
  • Guangming Lu
  • Yuanrong Xu


Currently, block term decomposition is widely used to factorize regular convolutional kernels into several groups in order to reduce parameters. However, networks designed with this method lack adequate information interaction between groups. Therefore, this paper proposes Weighted Channel-wise Decomposed Convolutions (WCDC), and the resulting networks are called WCDC-Nets. The WCDC kernel employs channel-wise decomposition to reduce parameters and computational complexity to a minimum. Furthermore, a tiny learnable weighting module is used to capture connections among the outputs of the channel-wise convolutions in the WCDC kernel. The WCDC filter can easily be applied in many popular networks and trained end to end, significantly improving a model's flexibility. Experimental results on benchmark datasets show that WCDC-Nets achieve better performance with far fewer parameters and floating-point operations (FLOPs).
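The two ingredients described above, a channel-wise (depthwise) convolution with one small kernel per channel, followed by a tiny learnable weighting that lets information flow across the otherwise isolated channels, can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation; the function names and the choice of a C×C mixing matrix as the "weighted module" are hypothetical.

```python
import numpy as np

def channelwise_conv2d(x, kernels):
    """Channel-wise (depthwise) convolution: each input channel is filtered
    by its own k x k kernel, with no cross-channel mixing.
    x: (C, H, W), kernels: (C, k, k). 'Valid' padding for brevity."""
    C, H, W = x.shape
    _, k, _ = kernels.shape
    out = np.empty((C, H - k + 1, W - k + 1))
    for c in range(C):
        for i in range(H - k + 1):
            for j in range(W - k + 1):
                out[c, i, j] = np.sum(x[c, i:i+k, j:j+k] * kernels[c])
    return out

def weighted_mix(y, mix):
    """Hypothetical form of the tiny learnable weighting module: a C x C
    matrix that recombines the per-channel outputs so information can
    interact across channels. y: (C, H', W'), mix: (C, C)."""
    return np.einsum('dc,chw->dhw', mix, y)

# Toy usage
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))        # 4 channels, 8x8 feature map
kernels = rng.standard_normal((4, 3, 3))  # one 3x3 kernel per channel
mix = np.eye(4) + 0.1 * rng.standard_normal((4, 4))
out = weighted_mix(channelwise_conv2d(x, kernels), mix)
print(out.shape)  # (4, 6, 6)
```

The parameter saving is visible even in this toy setting: the channel-wise kernels use C·k·k = 36 weights plus a C·C = 16 mixing matrix, versus C·C·k·k = 144 for a regular convolution with the same channel counts.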


Block term decomposition · Group convolutions · Channel-wise convolutions · Weighted channel-wise decomposed convolutions



This work is supported by NSFC fund (61332011), Shenzhen Fundamental Research fund (JCYJ20170811155442454, JCYJ20180306172023949), and Medical Biometrics Perception and Analysis Engineering Laboratory, Shenzhen, China.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Harbin Institute of Technology (Shenzhen), Shenzhen, China
