New Default Box Strategy of SSD for Small Target Detection

  • Yuyao He
  • Baoqi Li
  • Yaohua Zhao
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11304)

Abstract

SSD, which combines the advantages of Faster R-CNN and YOLO, achieves excellent performance in both detection speed and precision by merging the default boxes of six different layers. However, because the original default box strategy cannot accurately capture small-target information, the detection precision of SSD on small-target images is lower than on normal-size targets. In this paper, a new default box strategy, which provides an appropriate size and number of default boxes, is proposed to improve the performance of SSD for small target detection. The new strategy consists of new scales and new aspect ratios. The new scales, which provide the basic scales for the six layers, are defined by the size ratio of the kernel to its convolutional layer, and the scale range is reduced from [20, 90] to [20, 60]. The new aspect ratios, which determine the size and number of default boxes in the six layers, are defined as [[1.1], [1.1], [1.1], [1.1], [0.8, 1.2], [1.1]]. Experimental results on a small ground-target dataset show that the detection precision of SSD with the new strategy is 99.5 mAP, which is 4.6 mAP higher than that of the original SSD. More importantly, the training time of SSD with the new strategy is 963 s, which is 326 s less than that of the original SSD.
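The abstract specifies the configuration concretely enough to sketch. The following minimal Python sketch illustrates the new default-box strategy as described above; it is not the authors' code. The linearly interpolated scale rule is borrowed from the original SSD paper (the kernel-to-layer size-ratio definition the authors actually use is only summarized in the abstract), and the 300 × 300 input size and all function names are assumptions.

```python
# Illustrative sketch of the modified SSD default-box configuration described
# in the abstract. The linear scale rule follows the original SSD paper; the
# paper's own kernel-ratio scale definition is not fully specified here.

def layer_scales(s_min=0.20, s_max=0.60, num_layers=6):
    """Per-layer scales, with the range reduced from [0.20, 0.90] to [0.20, 0.60]."""
    step = (s_max - s_min) / (num_layers - 1)
    return [s_min + k * step for k in range(num_layers)]

# New per-layer aspect ratios from the abstract: one default box per feature-map
# location on five of the layers, two on the fifth.
ASPECT_RATIOS = [[1.1], [1.1], [1.1], [1.1], [0.8, 1.2], [1.1]]

def default_box_shapes(img_size=300):
    """Pixel (width, height) of each default box per layer, using the SSD rule
    w = s * sqrt(ar) * img_size, h = s / sqrt(ar) * img_size."""
    shapes = []
    for s, ratios in zip(layer_scales(), ASPECT_RATIOS):
        shapes.append([(img_size * s * ar ** 0.5,
                        img_size * s / ar ** 0.5) for ar in ratios])
    return shapes

if __name__ == "__main__":
    print([round(s, 2) for s in layer_scales()])  # [0.2, 0.28, 0.36, 0.44, 0.52, 0.6]
    print([len(r) for r in ASPECT_RATIOS])        # boxes per location: [1, 1, 1, 1, 2, 1]
```

With only one aspect ratio on most layers, each feature-map location proposes a single default box, which is what reduces both the number of boxes to match during training and, in the reported experiments, the training time.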

Keywords

Ground small target detection · SSD · Default boxes · Scales · Aspect ratios

References

  1. Abdullah, R.S.A.R., Salah, A.A., Ismail, A., Hashim, F., Rashid, N.E.A., Aziz, N.H.A.: LTE-based passive bistatic radar system for detection of ground moving targets. ETRI J. 38, 302–313 (2016)
  2. Kim, S., Song, W.J., Kim, S.H.: Robust ground target detection by SAR and IR sensor fusion using AdaBoost-based feature selection. Sensors 16, 1117–1134 (2016)
  3. Xu, H., Yang, Z., Chen, G., Liao, G., Tian, M.: A ground moving target detection approach based on shadow feature with multichannel high-resolution synthetic aperture radar. IEEE Geosci. Remote Sens. Lett. 13, 1572–1576 (2016)
  4. Breitenstein, M.D., Reichlin, F., Leibe, B., Koller-Meier, E., Van Gool, L.: Online multiperson tracking-by-detection from a single, uncalibrated camera. IEEE Trans. Pattern Anal. Mach. Intell. 33, 1820–1833 (2011)
  5. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778. IEEE Press, Las Vegas (2016)
  6. Shrivastava, A., Gupta, A., Girshick, R.: Training region-based object detectors with online hard example mining. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 761–769. IEEE Press, Las Vegas (2016)
  7. Felzenszwalb, P.F., Girshick, R.B., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32, 1627–1645 (2010)
  8. Najibi, M., Rastegari, M., Davis, L.S.: G-CNN: an iterative grid based object detector. IEEE Access 5, 24023–24031 (2017)
  9. Erhan, D., Szegedy, C., Toshev, A., Anguelov, D.: Scalable object detection using deep neural networks. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2155–2162. IEEE Press, Columbus (2014)
  10. Hinton, G.: Where do features come from? Cogn. Sci. 38, 1078–1101 (2014)
  11. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: 26th International Conference on Neural Information Processing Systems, pp. 1097–1105. Curran Associates Inc. (2012)
  12. Szegedy, C., Toshev, A., Erhan, D.: Deep neural networks for object detection. Adv. Neural Inf. Process. Syst. 26, 2553–2561 (2013)
  13. Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural Comput. 18, 1527–1554 (2006)
  14. Bengio, Y.: Learning deep architectures for AI. Found. Trends Mach. Learn. 2, 1–127 (2009)
  15. Cao, X., Zhang, X., Yu, Y., Niu, L.: Deep learning-based recognition of underwater target. In: IEEE International Conference on Digital Signal Processing, pp. 89–93. IEEE Press, Beijing (2017)
  16. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 580–587. IEEE Press, Columbus (2014)
  17. Girshick, R.: Fast R-CNN. In: IEEE International Conference on Computer Vision, pp. 1440–1448. IEEE Press, Santiago (2015)
  18. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39, 1137–1149 (2017)
  19. He, K., Zhang, X., Ren, S., Sun, J.: Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 37, 1904–1916 (2015)
  20. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788. IEEE Press, Las Vegas (2016)
  21. Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., Berg, A.C.: SSD: single shot multibox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016)
  22. Cai, Z., Fan, Q., Feris, R.S., Vasconcelos, N.: A unified multi-scale deep convolutional neural network for fast object detection. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9908, pp. 354–370. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46493-0_22

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. School of Marine Science and Technology, Northwestern Polytechnical University, Xi’an, People’s Republic of China