Pedestrian detection via deep segmentation and context network

  • Original Article
Neural Computing and Applications

Abstract

Many deep learning approaches to pedestrian detection are effective, but they do not localize occluded pedestrians accurately enough. This paper proposes a segmentation and context network (SCN) that combines segmentation and context information to improve the accuracy of bounding-box regression for pedestrian detection. The SCN model consists of a segmentation sub-model and a context sub-model. To separate pedestrian instances from the background and to address pedestrian occlusion, the segmentation sub-model extracts pedestrian segmentation information, which yields more accurate pedestrian regions. Because different pedestrian instances need different context information, this paper extracts context information from context regions of different scales. To improve detection performance, the context sub-model uses the hole algorithm (atrous, or dilated, convolution) to expand the receptive field of the output feature maps and connects the multi-channel features through a skip layer. Finally, the loss functions of the two sub-models' outputs are fused. Experimental results on several datasets validate the effectiveness of the SCN model and show that the deeply supervised algorithm achieves a good trade-off between accuracy and complexity.
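Two of the ideas in the abstract can be illustrated with a small sketch: enlarging a pedestrian box into context regions of several scales, and the way the hole algorithm widens the receptive field without adding layers. This is only an illustration under stated assumptions, not the authors' implementation. The function names, the scale factors, and the layer configurations are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def context_regions(box: Box, scales: List[float],
                    img_w: float, img_h: float) -> List[Box]:
    """Enlarge `box` about its centre by each scale factor, clipped to the image.

    The scale factors are illustrative assumptions, not the paper's values.
    """
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    w, h = x2 - x1, y2 - y1
    regions = []
    for s in scales:
        half_w, half_h = s * w / 2.0, s * h / 2.0
        regions.append((max(0.0, cx - half_w), max(0.0, cy - half_h),
                        min(img_w, cx + half_w), min(img_h, cy + half_h)))
    return regions


def receptive_field(layers: List[Tuple[int, int, int]]) -> int:
    """Receptive field of stacked conv layers given (kernel, stride, dilation)."""
    rf, jump = 1, 1
    for kernel, stride, dilation in layers:
        # A dilated kernel covers dilation * (kernel - 1) + 1 input positions.
        effective_kernel = dilation * (kernel - 1) + 1
        rf += (effective_kernel - 1) * jump
        jump *= stride
    return rf


# Hypothetical pedestrian box in a 100 x 100 image.
regions = context_regions((40.0, 20.0, 60.0, 80.0), [1.0, 1.5, 2.0], 100.0, 100.0)

# Two 3x3 layers, first with ordinary convolution, then with dilation 2.
plain_rf = receptive_field([(3, 1, 1), (3, 1, 1)])
dilated_rf = receptive_field([(3, 1, 2), (3, 1, 2)])
```

With these assumed inputs, the two dilated 3x3 layers see a 9-pixel-wide window where the undilated pair sees 5, with the same number of weights. The largest context region is clipped at the image border, which is one simple way to handle boxes near the edge of the frame.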

References

  1. Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: IEEE computer society conference on computer vision and pattern recognition

  2. Felzenszwalb P, McAllester D, Ramanan D (2008) A discriminatively trained, multiscale, deformable part model. In: 2008 IEEE conference on computer vision and pattern recognition

  3. Ouyang W, Wang X (2014) Joint deep learning for pedestrian detection. In: IEEE international conference on computer vision. IEEE, pp 2056–2063

  4. Yang B, Yan J, Lei Z, Li SZ (2015) Convolutional channel features. In: 2015 IEEE international conference on computer vision (ICCV), pp 82–90

  5. Cai Z, Saberian M, Vasconcelos N (2015) Learning complexity-aware cascades for deep pedestrian detection. In: 2015 IEEE international conference on computer vision (ICCV), Santiago, Chile, pp 3361–3369

  6. Liang X, He K, Zhang L, Lin L (2016) Is faster R-CNN doing well for pedestrian detection? In: European conference on computer vision, pp 443–457

  7. Su Y, Colombo A, Ghorban F, Marín J, Kummert A (2018) Aggregated channels network for real-time pedestrian detection. arXiv: 1801.00476v1

  8. Zhang H, Cao X, Ho JKL et al (2017) Object-level video advertising: an optimization framework. IEEE Trans Ind Inf 13(2):520–531

  9. Zhang H, Ji Y, Huang W et al (2018) Sitcom-star-based clothing retrieval for video advertising: a deep learning framework. Neural Comput Appl, pp 1–20

  10. Du X, El-Khamy M, Lee J, Davis L (2017) Fused DNN: a deep neural network fusion approach to fast and robust pedestrian detection. In: 2017 IEEE winter conference on applications of computer vision (WACV), pp 953–961

  11. Wang S, Liu J, Zhang S, Metaxas DN (2016) Multispectral deep neural networks for pedestrian detection. In: Computer vision and pattern recognition

  12. Mao J, Xiao T, Jiang Y, Cao Z (2017) What can help pedestrian detection? In: 2017 IEEE conference on computer vision and pattern recognition, pp 6034–6043

  13. Li G, Yu Y (2016) Deep contrast learning for salient object detection. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 478–487

  14. Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: IEEE computer society conference on computer vision and pattern recognition, pp 580–587

  15. Girshick R (2015) Fast R-CNN. In: 2015 IEEE international conference on computer vision (ICCV), pp 1440–1448

  16. Ren S, He K, Girshick R, Sun J (2017) Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137–1149

  17. Redmon J, Farhadi A (2016) YOLO9000: better, faster, stronger. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 6517–6525

  18. Erhan D, Szegedy C, Reed S, Fu CY, Liu W, Anguelov D, Berg AC (2016) SSD: single shot multibox detector. In: European conference computer vision (ECCV), pp 21–37

  19. Li G, Yu Y (2015) Visual saliency based on multiscale deep features. In: 2015 IEEE conference on computer vision and pattern recognition, pp 5455–5463

  20. Chen LC, Papandreou G, Kokkinos I, Murphy K, Yuille AL (2018) DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans Pattern Anal Mach Intell 40(4):834–848

  21. Ranga A, Tyagi A, Fu CY, Liu W, Berg AC (2017) DSSD: deconvolutional single shot detector. In: Computer vision and pattern recognition

  22. Shen Z, Liu Z, Li J, Jiang Y, Chen Y, Xue X (2017) DSOD: learning deeply supervised object detectors from scratch. In: 2017 IEEE international conference on computer vision (ICCV), pp 1937–1945

  23. Leibe B, Seemann E, Schiele B (2005) Pedestrian detection in crowded scenes. In: 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR’05), pp 878–885

  24. Zhao X, Liang S, Wei Y (2018) Pseudo mask augmented object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition

  25. Cho M, Laptev I, Kantorov V, Oquab M (2016) ContextLocNet: context-aware deep network models for weakly supervised localization. In: European conference on computer vision, Springer, Berlin, pp 350–365

  26. Liang X, Yu Y, Cheng H, Li Z, Gan Y, Lin L (2016) LSTM-CF: unifying context modeling and fusion with LSTMs for RGB-D scene labeling. In: European conference on computer vision, pp 541–557

  27. Zhang H, Li J, Ji Y et al (2017) Understanding subtitles by character-level sequence-to-sequence learning. IEEE Trans Ind Inf 13(2):616–624

  28. Li X, Liu Z, Luo P, Loy CC, Tang X (2017) Not all pixels are equal: difficulty-aware semantic segmentation via deep layer cascade. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 6459–6468

  29. Roh B, Cheon Y, Kim KH, Hong S, Park M (2016) PVANET: deep but lightweight neural networks for real-time object detection. arXiv: 1608.08021v1

  30. Cao J, Pang Y, Li X (2018) Exploring multi-branch and high-level semantic networks for improving pedestrian detection. In: IEEE conference on computer vision and pattern recognition

  31. Dollar P, Appel R, Belongie S, Perona P (2014) Fast feature pyramids for object detection. IEEE Trans Pattern Anal Mach Intell 36(8):1532–1545

  32. Lin T, Dollar P, Girshick R, He K, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection. In: IEEE computer society conference on computer vision and pattern recognition, pp 936–944

  33. Tian Y, Luo P, Wang X, Tang X (2015) Pedestrian detection aided by deep learning semantic tasks. In: 2015 IEEE conference on computer vision and pattern recognition (CVPR), pp 5079–5087

  34. Liu H, Wang S, Cheng J, Tang M (2018) PCN: part and context information for pedestrian detection with CNNs. arXiv: 1804.04483v1

  35. Jia Y, Shelhamer E, Donahue J et al (2014) Caffe: convolutional architecture for fast feature embedding. arXiv preprint, arXiv: 1408.5093

  36. Tomè D, Monti F, Baroffio L, Bondi L, Tagliasacchi M, Tubaro S (2016) Deep convolutional neural networks for pedestrian detection. Sig Process Image Commun 47(C):482–489

  37. Ess A, Leibe B, Van Gool L (2007) Depth and appearance for mobile scene analysis. In: 2007 IEEE international conference on computer vision (ICCV)

  38. Schiele B, Dollar P, Wojek C, Perona P (2012) Pedestrian detection: an evaluation of the state of the art. IEEE Trans Pattern Anal Mach Intell 34(4):743–761

  39. Paisitkriangkrai S, Shen C, van den Hengel A (2014) Strengthening the effectiveness of pedestrian detection with spatially pooled features. Computer Vision–ECCV 2014. Springer International Publishing, pp 546–561

  40. Zhang X, Cheng L, Li B, Hu HM (2018) Too far to see? Not really!—pedestrian detection with scale-aware localization policy. IEEE Trans Image Process 27(8):3703–3715

  41. Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Russakovsky O, Deng J, Bernstein M (2015) ImageNet large scale visual recognition challenge. Int J Comput Vis 115(3):211–252

  42. Hosang J, Benenson R, Omran M, Schiele B (2014) Ten years of pedestrian detection, what have we learned? Computer vision-ECCV 2014 workshops. Springer, Berlin, pp 613–627

  43. Viola P, Jones MJ, Snow D (2005) Detecting pedestrians using patterns of motion and appearance. Int J Comput Vis 63(2):153–161

  44. Maji S, Berg AC, Malik J (2008) Classification using intersection kernel support vector machines is efficient. In: 2008 IEEE conference on computer vision and pattern recognition, pp 1–8

  45. Davis LS, Lin Z (2008) A pose-invariant descriptor for human detection and segmentation. In: European conference on computer vision, pp 423–436

  46. Dollar P, Tu Z, Tao H, Belongie S (2007) Feature mining for image classification. In: 2007 conference on computer vision and pattern recognition, pp 1–8

  47. Wojek C, Schiele B (2008) A performance evaluation of single and multi-feature people detection. In: Rigoll G (ed) Pattern recognition. DAGM 2008. Lecture notes in computer science, vol 5096. Springer, Berlin, pp 82–91

  48. Dollar P, Nam W, Han JH (2014) Local decorrelation for improved pedestrian detection. In: NIPS’14 proceedings of the 27th international conference on neural information processing systems, pp 424–432

  49. Dollar P, Wojek C, Schiele B, Perona P (2009) Pedestrian detection: a benchmark. In: 2009 IEEE conference on computer vision and pattern recognition (CVPR), pp 304–311

  50. Perona P, Dollar P, Schiele B, Wojek C http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians

Acknowledgements

The authors would like to thank Dollar et al. for sharing the Caltech dataset and MATLAB toolbox, Ess et al. for sharing the ETH dataset, and Dalal et al. for sharing the INRIA dataset. This work was supported in part by the National Natural Science Foundation of China under Grants 61203261, 61876099 and U1613223, in part by the China Post-Doctoral Science Foundation under Grant 2012M521335, in part by the Research Fund of Guangxi Key Lab of Multi-source Information Mining and Security under Grant MIMS16-02, in part by the Shenzhen Science and Technology Research and Development Funds under Grant JCYJ20170307093018753, and in part by the Fundamental Research Funds of Shandong University under Grant 2018JCG07.

Author information

Corresponding author

Correspondence to Zhenxue Chen.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, Z., Chen, Z., Jonathan Wu, Q.M. et al. Pedestrian detection via deep segmentation and context network. Neural Comput & Applic 32, 5845–5857 (2020). https://doi.org/10.1007/s00521-019-04057-4

  • DOI: https://doi.org/10.1007/s00521-019-04057-4
