PackDet: Packed Long-Head Object Detector

  • Conference paper
Computer Vision – ECCV 2020 (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12358)

Abstract

State-of-the-art object detectors exploit a multi-branch structure and predict objects at several different scales. Although this substantially boosts accuracy, it comes with low efficiency, because the fragmented structure is hardware-unfriendly. To solve this issue, we propose a packing operator (PackOp) that combines all head branches together spatially. Packed features are computationally more efficient and make it easy to use cross-head group normalization (GN), leading to a notable accuracy improvement over the common head-separate GN. All of this comes at the cost of only a less than 5.7% relative increase in runtime memory and the introduction of a few noisy training samples, whose side effects can be diminished by good packing pattern design. With PackOp, we propose a new anchor-free one-stage detector, PackDet, which features a single deeper (longer) but narrower head, in contrast to the multiple shallow but wide heads of existing methods. Our best models on COCO test-dev achieve a better speed-accuracy balance: 35.1%, 42.3%, 44.0%, and 47.4% AP at 22.6, 16.9, 12.4, and 4.7 FPS using MobileNet-v2, ResNet-50, ResNet-101, and ResNeXt-101-DCN backbones, respectively. Code will be released (https://github.com/kding1225/PackDet).
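To make the idea of combining head branches spatially more concrete, the following is a minimal sketch of how feature maps from several pyramid levels could be laid out on one shared canvas. It is an illustration under stated assumptions, not the authors' implementation: the strides, the 512x512 input size, the function names, and the "side column" pattern are all hypothetical, and the padding ratio it reports is not the runtime-memory figure quoted in the abstract.

```python
from typing import List, Tuple

Size = Tuple[int, int]  # (height, width)


def level_sizes(height: int, width: int, strides: List[int]) -> List[Size]:
    """Feature-map size of each pyramid level (ceil division by the stride)."""
    return [(-(-height // s), -(-width // s)) for s in strides]


def pack_side_column(sizes: List[Size]) -> Tuple[Size, List[Size]]:
    """Hypothetical packing pattern (an assumption, not taken from the paper):
    the largest map sits at the top-left, and the remaining maps are stacked
    in a single column to its right.

    Returns the canvas size and the top-left (y, x) offset of every level.
    """
    (h0, w0), rest = sizes[0], sizes[1:]
    offsets = [(0, 0)]
    y = 0
    for h, _ in rest:
        offsets.append((y, w0))
        y += h
    canvas_h = max(h0, y)
    canvas_w = w0 + max((w for _, w in rest), default=0)
    return (canvas_h, canvas_w), offsets


def padding_ratio(canvas: Size, sizes: List[Size]) -> float:
    """Relative extra area of the canvas compared with the unpacked maps."""
    used = sum(h * w for h, w in sizes)
    return canvas[0] * canvas[1] / used - 1.0


# Assumed setup: five levels with strides 8..128 on a 512x512 input.
sizes = level_sizes(512, 512, [8, 16, 32, 64, 128])
canvas, offsets = pack_side_column(sizes)
ratio = padding_ratio(canvas, sizes)
print(canvas, offsets, round(ratio, 4))
```

Once all levels share one canvas, a single head can process them in one pass instead of once per branch, and a normalization layer applied to the canvas computes its statistics across all levels. The unused canvas cells are the price of this layout, which is why the choice of packing pattern matters.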

Notes

  1. https://github.com/facebookresearch/maskrcnn-benchmark

Acknowledgement

This research was financially supported by the National Natural Science Foundation of China (61731022, 91646207) and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA19090300). We would like to thank Rui Yang and Chaoyi Liu from EvaVisdom Tech for the inspiring discussions. We also thank the anonymous reviewers for their valuable suggestions.

Author information

Corresponding author

Correspondence to Kun Ding.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 18546 KB)

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Ding, K., He, G., Gu, H., Zhong, Z., Xiang, S., Pan, C. (2020). PackDet: Packed Long-Head Object Detector. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12358. Springer, Cham. https://doi.org/10.1007/978-3-030-58601-0_11

  • DOI: https://doi.org/10.1007/978-3-030-58601-0_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58600-3

  • Online ISBN: 978-3-030-58601-0

  • eBook Packages: Computer Science, Computer Science (R0)
