Depthwise grouped convolution for object detection

Liao, Yongwei; Lu, Siwei; Yang, Zhenguo; Liu, Wenyin

doi:10.1007/s00138-021-01243-0

Original Paper
Published: 13 September 2021

Machine Vision and Applications, volume 32, article number 115 (2021)

Abstract

Object detection usually adopts two-stage end-to-end networks, which use a backbone network (such as VGG or ResNet) for feature extraction, combined with a region proposal network (RPN) for object localization and classification. In this paper, we explore a novel depthwise grouped convolution (DGC) in the backbone network that integrates channel grouping and depthwise separable convolution. DGC shares convolution parameters across channels, which reduces the number of parameters and speeds up training. In particular, channel split and shuffle strategies are introduced to enhance information exchange between different groups of channels in the DGC block, which can prevent the performance drop caused by insufficient object samples. Furthermore, a non-local block is adopted in the RPN to focus on small objects that are hard to identify. In addition, we introduce a margin-based loss to guide model training together with the classification and regression losses. Experiments conducted on the VOC2007, VOC2012 and COCO2017 datasets demonstrate the efficiency and effectiveness of our method for object detection.
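
The abstract names three generic building blocks: channel grouping, depthwise separable convolution, and channel shuffle. The following is a minimal sketch of why the first two reduce the parameter count and of how a ShuffleNet-style shuffle mixes channels across groups. It is not the authors' DGC block, whose exact layout the abstract does not specify; the function names, the 256-channel layer, and the 3 x 3 kernel are assumptions chosen for illustration only.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count (bias ignored) of a k x k convolution split into `groups` groups."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    return (c_in // groups) * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k convolution (one filter per input channel) plus a 1 x 1 pointwise convolution."""
    depthwise = conv_params(c_in, c_in, k, groups=c_in)
    pointwise = conv_params(c_in, c_out, 1)
    return depthwise + pointwise


def channel_shuffle(channels, groups):
    """ShuffleNet-style shuffle: view channels as (groups, n), transpose, then flatten."""
    if len(channels) % groups:
        raise ValueError("number of channels must be divisible by groups")
    n = len(channels) // groups
    return [channels[g * n + i] for i in range(n) for g in range(groups)]


if __name__ == "__main__":
    # Hypothetical layer: 256 input channels, 256 output channels, 3 x 3 kernel.
    print("standard conv:      ", conv_params(256, 256, 3))
    print("grouped conv (g=4): ", conv_params(256, 256, 3, groups=4))
    print("depthwise separable:", depthwise_separable_params(256, 256, 3))
    # Channels 0-3 belong to group 0 and channels 4-7 to group 1 before shuffling.
    print("shuffled channels:  ", channel_shuffle(list(range(8)), groups=2))
```

For this hypothetical layer the standard convolution has 589824 weights, the 4-group convolution 147456, and the depthwise separable version 67840. After the shuffle, each consecutive pair of channels contains one channel from each original group, so a following grouped convolution sees information from both groups.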

Acknowledgements

This work is supported by the Guangdong Basic and Applied Basic Research Foundation (No. 2020A1515010616), the Science and Technology Program of Guangzhou (No. 202102020524), the Guangdong Innovative Research Team Program (No. 2014ZT05G157), the Key-Area Research and Development Program of Guangdong Province (No. 2019B010136001), and the Science and Technology Planning Project of Guangdong Province (No. LZC0023).

Author information

Authors and Affiliations

School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, China
Yongwei Liao, Siwei Lu, Zhenguo Yang & Wenyin Liu
Cyberspace Security Research Center, Peng Cheng Laboratory, Shenzhen, China
Wenyin Liu

Corresponding authors

Correspondence to Zhenguo Yang or Wenyin Liu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liao, Y., Lu, S., Yang, Z. et al. Depthwise grouped convolution for object detection. Machine Vision and Applications 32, 115 (2021). https://doi.org/10.1007/s00138-021-01243-0

Received: 04 January 2021
Revised: 21 July 2021
Accepted: 23 August 2021
Published: 13 September 2021
DOI: https://doi.org/10.1007/s00138-021-01243-0
