
PerSnake: a real-time pedestrian instance segmentation network using contour representation

  • ORIGINAL PAPER
  • Published:
Machine Vision and Applications

Abstract

In intelligent transportation systems, pedestrian identification is an indispensable element of safety. This paper explores a high-precision, real-time pedestrian recognition method. Most existing pedestrian recognition methods operate at the pixel level or rely on bounding boxes, yet they cannot judge pedestrian position both accurately and in real time. To this end, this paper introduces a novel contour-based segmentation network (PerSnake) for real-time pedestrian detection in autonomous driving. We design a pedestrian-specific octagon contour, initialized from YOLO-V4 detections, and propose a contour feature aggregation module that aggregates multi-level pedestrian contour features. To construct contour labels, we annotate ground-truth pedestrian contours on the Penn-Fudan and CityPersons datasets using edge detection. Extensive experiments are conducted on the CityPersons and Penn-Fudan datasets. The results demonstrate that PerSnake achieves real-time pedestrian identification at 37.8 frames per second with an average precision of 36.4%. Compared with existing methods, PerSnake offers competitive segmentation speed and precision.
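The contour pipeline summarized above can be illustrated in code. The snippet below is a minimal sketch, not the authors' implementation: it builds an 8-vertex initial contour by cutting the corners of a detector box (in the spirit of contour-based methods such as Deep Snake [14]) and resamples the boundary to a fixed number of points for a contour-deformation network. The corner-cut ratio, point count, and all function names are assumptions for illustration.

    import numpy as np

    def octagon_from_box(x1, y1, x2, y2, cut=0.25):
        """Build an 8-vertex initial contour by cutting the four corners
        of a detector box (illustrative; the paper's exact octagon
        construction may differ)."""
        w, h = x2 - x1, y2 - y1
        dx, dy = cut * w, cut * h
        # Vertices listed clockwise in image coordinates, starting on the top edge.
        return np.array([
            [x1 + dx, y1], [x2 - dx, y1],  # top edge
            [x2, y1 + dy], [x2, y2 - dy],  # right edge
            [x2 - dx, y2], [x1 + dx, y2],  # bottom edge
            [x1, y2 - dy], [x1, y1 + dy],  # left edge
        ], dtype=np.float32)

    def sample_contour(vertices, n_points=128):
        """Uniformly resample a closed polygon boundary to a fixed
        number of points by arc length."""
        closed = np.vstack([vertices, vertices[:1]])
        seg = np.linalg.norm(np.diff(closed, axis=0), axis=1)
        cum = np.concatenate([[0.0], np.cumsum(seg)])
        t = np.linspace(0.0, cum[-1], n_points, endpoint=False)
        pts = np.empty((n_points, 2), dtype=np.float32)
        for i, ti in enumerate(t):
            # Locate the segment containing arc-length position ti,
            # then interpolate linearly within it.
            k = np.searchsorted(cum, ti, side="right") - 1
            a = (ti - cum[k]) / max(seg[k], 1e-8)
            pts[i] = (1 - a) * closed[k] + a * closed[k + 1]
        return pts

The edge-detection annotation step can be sketched the same way, assuming a binary pedestrian mask is available; here cv2.findContours stands in for whichever edge detector the authors actually used, and the fixed-length resampling reuses sample_contour from above.

    import cv2

    def contour_label_from_mask(mask, n_points=128):
        """Derive a fixed-length contour label from a binary pedestrian
        mask (hypothetical stand-in for the paper's annotation step)."""
        contours, _ = cv2.findContours(mask.astype(np.uint8),
                                       cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_NONE)
        # Keep the largest connected boundary as the pedestrian contour.
        boundary = max(contours, key=cv2.contourArea).squeeze(1)  # (M, 2)
        return sample_contour(boundary.astype(np.float32), n_points)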


Data availability

The data and material used during the current study are available from the corresponding author on reasonable request.

Code availability

The code used during the current study is available from the corresponding author on reasonable request.

References

  1. Wang, X., Han, T.-X., Yan, S.: An HOG-LBP human detector with partial occlusion handling. In: Proceedings of the IEEE international conference on computer vision (ICCV), pp. 32–39. (2009)

  2. Viola, P., Jones, M.-J., Snow, D.: Detecting pedestrians using patterns of motion and appearance. Int. J. Comput. Vis. 63, 153–161 (2005)

  3. Jacques, J., Musse, S.-R.: Shape-based pedestrian segmentation in still images. In: IEEE international symposium on multimedia (ISM), vol. 10, pp. 53–71. (2016)

  4. Redmon, J., Farhadi, A.: YOLO9000: better, faster, stronger. In: IEEE conference on computer vision and pattern recognition, pp. 7263–7271. (2017)

  5. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 580–587. (2014)

  6. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39, 1137–1149 (2016)

  7. Chen, L-C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European conference on computer vision (ECCV), pp. 518–534. (2018)

  8. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 3431–3440. (2015)

  9. He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: Proceedings of the IEEE international conference on computer vision (ICCV), pp. 2980–2988. (2017)

  10. Liu, S., Qi, L., Qin, H., Jia, J.: Path aggregation network for instance segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 8759–8768. (2018)

  11. Xu, W., Wang, H., Qi, F., Lu, C.: Explicit shape encoding for real-time instance segmentation. In: Proceedings of the IEEE/CVF international conference on computer vision (ICCV), pp. 5167–5176. (2019)

  12. Castrejon, L., Kundu, K., Urtasun, R., Fidler, S.: Annotating object instances with a Polygon-RNN. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). (2017)

  13. Ling, H., Gao, J., Kar, A., Chen, W., Fidler, S.: Fast interactive object annotation with curve-GCN. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). (2019)

  14. Peng, S., Jiang, W., Pi, H., Li, X., Zhou, X.: Deep snake for real-time instance segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). (2020)

  15. Wang, L., Shi, J., Song, G., Shen, I.-F.: Object detection combining recognition and segmentation. In: Proceedings of the Asian conference on computer vision (ACCV), pp. 189–199. (2007)

  16. Zhang, S., Benenson, R., Schiele, B.: CityPersons: a diverse dataset for pedestrian detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 3213–3221. (2017)

  17. Radeva, P., Serrat, J., Martí, E.: A snake for model-based segmentation. In: Proceedings of the IEEE conference on computer vision, pp. 816–821. (1997)

  18. Flohr, F., Gavrila, D.: PedCut: an iterative framework for pedestrian segmentation combining shape models and multiple data cues. In: British machine vision conference, (2013). https://doi.org/10.5244/C.27.66

  19. Heess, N., Eslami, S., Winn, J.: The shape Boltzmann machine: a strong model of object shape. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). (2012)

  20. Eslami, S., Williams, C.: A generative model for parts-based object segmentation. In: Proceedings of the international conference on neural information processing systems, pp. 100–107. (2014)

  21. Li, Y., Zhong, Z., Wei, W.: Combining shape and appearance for automatic pedestrian segmentation. In: IEEE international conference on tools with artificial intelligence. (2011). https://doi.org/10.1109/ICTAI.2011.61

  22. Zhang, L., Lin, L., Liang, X., He, K.: Is faster R-CNN doing well for pedestrian detection? In: Proceedings of the European conference on computer vision (ECCV), pp. 418–434. (2016)

  23. Gao, Z., Li, S., Chen, J., Li, Z.: Pedestrian detection method based on YOLO network. Comput. Eng. 44, 215–219 (2018)


  24. Zhou, C., Yuan, Y.: Bi-box regression for pedestrian detection and occlusion estimation. In: Proceedings of the European conference on computer vision (ECCV), pp. 135–151. (2018)

  25. Ullah, M., Mohammed, A., Cheikh, F.A.: PedNet: a spatio-temporal deep convolutional neural network for pedestrian segmentation. J. Imaging 4, 107–118 (2018)


  26. Xu, X., Ma, M., Thompson, S.-G., Li, Z.: Intelligent co-detection of cyclists and motorcyclists based on an improved deep learning method. Meas. Sci. Technol. 32, 025402 (2021)


  27. Lee, Y., Park, J.: CenterMask: real-time anchor-free instance segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR). (2020)

  28. Dai, J., He, K., Sun, J.: Instance-aware semantic segmentation via multi-task network cascades. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 3150–3158. (2016)

  29. Bochkovskiy, A., Wang, C.-Y., Liao, H.: YOLOv4: optimal speed and accuracy of object detection. arXiv preprint (2020). https://arxiv.org/abs/2004.10934

  30. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. IEEE Trans. Pattern Anal. Mach. Intell. (2017)

  31. Zhou, X., Zhuo, J., Krähenbühl, P.: Bottom-up object detection by grouping extreme and center points. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR). (2019)


Funding

This research was funded by the National Natural Science Foundation of China (grant number 61374197), the Shanghai Natural Science Foundation of the Shanghai Science and Technology Commission, China (grant number 20ZR1437900), and the Natural Science Project of Jiangsu Province Colleges and Universities, China (grant number 22KJB580003).

Author information


Contributions

ZG developed the theoretical methodology and performed the experimental analysis. YH provided project administration and funding acquisition. ZG, XH, LM, and WZ carried out the investigation and wrote the original draft.

Corresponding author

Correspondence to Zhiyang Guo.

Ethics declarations

Conflict of interest

The authors have no conflict of interest to declare.

Ethical approval

All results presented in this study were developed under the strictest ethical guidelines.

Consent to participate

All authors participated in developing both the theoretical and simulation outcomes.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Guo, Z., Wang, X., Zhang, Z. et al. PerSnake: a real-time pedestrian instance segmentation network using contour representation. Machine Vision and Applications 34, 78 (2023). https://doi.org/10.1007/s00138-023-01419-w


  • Received:

  • Revised:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s00138-023-01419-w

Keywords
