An Improved Algorithm for Dense Object Detection Based on YOLO

  • Jiyang Ruan
  • Zhili Wang
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 905)

Abstract

The YOLO v3 (You Only Look Once) algorithm, which is based on a convolutional neural network (CNN), is currently the state-of-the-art algorithm for real-time object detection. However, it still produces large detection errors in dense object scenes. This paper analyses the cause of these errors and proposes an improved algorithm that optimizes the confidence adjustment strategy for overlapping boxes and sets the overlap threshold dynamically. Experiments show that the improved algorithm performs better in dense scenes while differing little from the original algorithm in other scenarios.
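The abstract does not give the exact adjustment rule or threshold schedule, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a Soft-NMS-style Gaussian decay of confidence for boxes that overlap an already selected box, and a hypothetical `dynamic_threshold` rule that tolerates more overlap as the number of candidate boxes grows. All function names and parameter values (`base`, `step`, `cap`, `sigma`, `min_score`) are illustrative assumptions.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def dynamic_threshold(n_boxes, base=0.3, step=0.02, cap=0.6):
    """Hypothetical rule: allow more overlap as the scene gets more crowded."""
    return min(cap, base + step * n_boxes)


def soft_suppress(boxes, scores, sigma=0.5, min_score=0.05):
    """Greedy selection in which overlapping boxes are down-weighted
    instead of being discarded outright (assumed, Soft-NMS-like behaviour)."""
    thresh = dynamic_threshold(len(boxes))
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        # Select the highest-scoring box that is still a candidate.
        best = max(range(len(remaining)), key=lambda i: remaining[i][1])
        best_box, best_score = remaining.pop(best)
        kept.append((best_box, best_score))
        rescored = []
        for box, score in remaining:
            overlap = iou(best_box, box)
            if overlap > thresh:
                # Gaussian confidence decay, applied only above the threshold.
                score *= math.exp(-(overlap ** 2) / sigma)
            if score >= min_score:
                rescored.append((box, score))
        remaining = rescored
    return kept
```

With hard suppression, a heavily overlapping neighbour would be dropped entirely. In this sketch it survives with a reduced confidence, which is the kind of behaviour that can help when distinct objects in a dense scene genuinely overlap.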

Keywords

Object detection, Confidence adjustment, Dynamic threshold

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Beijing University of Posts and Telecommunications, Beijing, China
