
A Dual-Branch CNN Structure for Deformable Object Detection

  • Jianjun Li
  • Kai Zheng
  • Xin Zhang
  • Zhenxing Luo
  • Zhuo Tang
  • Ching-Chun Chang
  • Yuqi Lin
  • Peiqi Tang
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 895)

Abstract

CNN-based object detectors now achieve satisfactory accuracy, but they often handle targets with geometric deformation or occlusion poorly. This is largely due to the fixed geometric structure of the convolution kernel and the single, inflexible network structure. In our work, we use dual-branch parallel processing to extract different features of the target region and coordinate the predictions of the two branches. To further enhance the performance of the network, this study also rebuilds the feature extraction module. As a result, our detector learns to adapt to targets of various shapes and sizes. The proposed method achieves up to 81.76% mAP on the Pascal VOC 2007 dataset and 79.6% mAP on the Pascal VOC 2012 dataset.
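The paper does not include implementation details here, so the following is a minimal sketch of the general idea, assuming a PyTorch implementation in which one branch uses regular convolutions and the other uses deformable convolutions (torchvision.ops.DeformConv2d), with the two per-class score maps coordinated by simple averaging. The module name DualBranchHead, the channel counts, and the fusion rule are illustrative assumptions, not the authors' design.

```python
# Minimal sketch (not the authors' code) of a dual-branch detection head.
# Branch A applies fixed-geometry convolutions; Branch B predicts sampling
# offsets and applies a deformable convolution; the class-score maps of the
# two branches are averaged to form the final prediction.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d


class DualBranchHead(nn.Module):
    def __init__(self, in_channels=256, num_classes=21, k=3):
        super().__init__()
        # Branch A: regular convolutions with a fixed sampling grid.
        self.regular = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, k, padding=k // 2),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, num_classes, 1),
        )
        # Branch B: offsets (2 values per kernel location) predicted from the
        # input feature map, then a deformable convolution that samples at
        # the shifted locations.
        self.offset = nn.Conv2d(in_channels, 2 * k * k, k, padding=k // 2)
        self.deform = DeformConv2d(in_channels, in_channels, k, padding=k // 2)
        self.relu = nn.ReLU(inplace=True)
        self.cls = nn.Conv2d(in_channels, num_classes, 1)

    def forward(self, feat):
        a = self.regular(feat)
        b = self.cls(self.relu(self.deform(feat, self.offset(feat))))
        # Coordinate the two branches' predictions (averaging is an assumption).
        return (a + b) / 2


if __name__ == "__main__":
    head = DualBranchHead()
    scores = head(torch.randn(1, 256, 32, 32))
    print(scores.shape)  # torch.Size([1, 21, 32, 32])
```

In this sketch the regular branch preserves the stable, grid-aligned response while the deformable branch adapts its receptive field to the target's shape; any learned fusion (e.g., weighted summation) could replace the plain average.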

Keywords

Dual-branch structure · Convolutional neural network · Deformable object detection


Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 61871170) and the National Equipment Development Pre-research Fund (No. 6140137050202).


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Jianjun Li (2)
  • Kai Zheng (2)
  • Xin Zhang (2)
  • Zhenxing Luo (1)
  • Zhuo Tang (1)
  • Ching-Chun Chang (3)
  • Yuqi Lin (4)
  • Peiqi Tang (2)

  1. Science and Technology on Communication and Information Security Control Laboratory, the 36th Institute of China Electronics Technology Group Corporation, Jiaxing, China
  2. School of Computer Science and Engineering, Hangzhou Dianzi University, Hangzhou, China
  3. Department of Computer Science, University of Warwick, Coventry, UK
  4. Yunnan Key Laboratory of Computer Technology Application / Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
