HetConv: Beyond Homogeneous Convolution Kernels for Deep CNNs

  • Pravendra Singh
  • Vinay Kumar Verma
  • Piyush Rai
  • Vinay P. Namboodiri
Part of the following topical collections:
  1. Special issue on Efficient Visual Recognition


Although convolutional neural networks (CNNs) are widely used, the methods proposed so far have always used homogeneous kernels for convolution. In this paper, we propose a new type of convolution operation that uses heterogeneous kernels. The proposed Heterogeneous Kernel-Based Convolution (HetConv) reduces the computation (FLOPs) and the number of parameters compared to the standard convolution operation while maintaining representational efficiency. To show the effectiveness of the proposed convolution, we present extensive experimental results on standard CNN architectures such as VGG, ResNet, Faster-RCNN, MobileNet, and SSD. After replacing the standard convolutional filters in these architectures with HetConv filters, we achieve a 1.5\(\times\) to 8\(\times\) FLOPs-based improvement in speed while maintaining (and sometimes improving) accuracy. We also compare the proposed convolution with group-wise and depth-wise convolution and show that it achieves a greater FLOPs reduction with significantly higher accuracy. Moreover, we demonstrate the efficacy of HetConv-based CNNs by showing that they also generalize to object detection and are not constrained to image classification tasks. We also show empirically that HetConv is more robust to over-fitting than standard convolution.
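As a rough illustration of where a FLOPs reduction of this size can come from, the sketch below counts multiply-accumulate operations for one layer. It rests on an assumption the abstract itself does not spell out: that in a heterogeneous filter a fraction 1/P of the input channels get a K x K kernel and the remaining channels get a 1 x 1 kernel. The function names and the parameter `part` (standing for P) are hypothetical, chosen only for this example.

```python
from fractions import Fraction


def standard_conv_flops(out_size, in_channels, out_channels, kernel):
    """Multiply-accumulates of a standard layer: every kernel is kernel x kernel."""
    return out_size * out_size * in_channels * out_channels * kernel * kernel


def hetconv_flops(out_size, in_channels, out_channels, kernel, part):
    """Multiply-accumulates under the ASSUMED heterogeneous layout.

    Assumption (not stated in the abstract): per filter, in_channels / part
    kernels are kernel x kernel and all remaining kernels are 1 x 1.
    """
    if part < 1 or in_channels % part != 0:
        raise ValueError("part must be a positive divisor of in_channels")
    large = in_channels // part      # channels that keep a kernel x kernel kernel
    small = in_channels - large      # channels reduced to a 1 x 1 kernel
    per_output_position = large * kernel * kernel + small
    return out_size * out_size * out_channels * per_output_position


def flops_speedup(out_size, in_channels, out_channels, kernel, part):
    """Exact ratio of standard to heterogeneous multiply-accumulates."""
    return Fraction(
        standard_conv_flops(out_size, in_channels, out_channels, kernel),
        hetconv_flops(out_size, in_channels, out_channels, kernel, part),
    )


if __name__ == "__main__":
    # Illustrative layer: 32 x 32 output, 64 input and 64 output channels, 3 x 3 kernels.
    for part in (1, 2, 4, 64):
        print(part, float(flops_speedup(32, 64, 64, 3, part)))
```

Under this assumed layout the ratio depends only on K and P, not on the spatial size or the number of output channels, which is why a single layer-level count is enough to compare settings.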


Efficient convolutional neural networks; Heterogeneous convolution; FLOPs compression; Model compression; Efficient visual recognition




Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Computer Science and Engineering, Indian Institute of Technology Kanpur, Kanpur, India
