
PerSnake: a real-time pedestrian instance segmentation network using contour representation

  • ORIGINAL PAPER
  • Published:
Machine Vision and Applications

Abstract

In intelligent transportation systems, pedestrian identification is an indispensable element of safety. This paper explores a high-precision, real-time pedestrian recognition method. Most existing pedestrian recognition methods operate at the pixel level or rely on bounding boxes, yet they cannot judge pedestrian position both accurately and in real time. To this end, this paper introduces a novel contour-based segmentation network (PerSnake) for real-time pedestrian detection in autonomous driving. We design a pedestrian-specific octagon contour, initialized from YOLO-V4 detections, and propose a contour feature aggregation module that aggregates multi-level pedestrian contour features. To construct contour labels, we annotate ground-truth pedestrian contours on the Penn-Fudan and CityPersons datasets using edge detection. Extensive experiments are conducted on the CityPersons and Penn-Fudan datasets. The results demonstrate that PerSnake achieves real-time pedestrian identification at 37.8 frames per second with an average precision of 36.4%. Compared with existing methods, PerSnake offers competitive segmentation speed and precision.
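The contour pipeline summarized above can be illustrated in code. The snippet below is a minimal sketch, not the authors' implementation: it builds an 8-vertex initial contour by cutting the corners of a detector box (in the spirit of contour-based methods such as Deep Snake [14]) and resamples the boundary to a fixed number of points for a contour-deformation network. The corner-cut ratio, point count, and all function names are assumptions for illustration.

    import numpy as np

    def octagon_from_box(x1, y1, x2, y2, cut=0.25):
        """Build an 8-vertex initial contour by cutting the four corners
        of a detector box (illustrative; the paper's exact octagon
        construction may differ)."""
        w, h = x2 - x1, y2 - y1
        dx, dy = cut * w, cut * h
        # Vertices listed clockwise in image coordinates, starting on the top edge.
        return np.array([
            [x1 + dx, y1], [x2 - dx, y1],  # top edge
            [x2, y1 + dy], [x2, y2 - dy],  # right edge
            [x2 - dx, y2], [x1 + dx, y2],  # bottom edge
            [x1, y2 - dy], [x1, y1 + dy],  # left edge
        ], dtype=np.float32)

    def sample_contour(vertices, n_points=128):
        """Uniformly resample a closed polygon boundary to a fixed
        number of points by arc length."""
        closed = np.vstack([vertices, vertices[:1]])
        seg = np.linalg.norm(np.diff(closed, axis=0), axis=1)
        cum = np.concatenate([[0.0], np.cumsum(seg)])
        t = np.linspace(0.0, cum[-1], n_points, endpoint=False)
        pts = np.empty((n_points, 2), dtype=np.float32)
        for i, ti in enumerate(t):
            # Locate the segment containing arc-length position ti,
            # then interpolate linearly within it.
            k = np.searchsorted(cum, ti, side="right") - 1
            a = (ti - cum[k]) / max(seg[k], 1e-8)
            pts[i] = (1 - a) * closed[k] + a * closed[k + 1]
        return pts

The edge-detection annotation step can be sketched the same way, assuming a binary pedestrian mask is available; here cv2.findContours stands in for whichever edge detector the authors actually used, and the fixed-length resampling reuses sample_contour from above.

    import cv2

    def contour_label_from_mask(mask, n_points=128):
        """Derive a fixed-length contour label from a binary pedestrian
        mask (hypothetical stand-in for the paper's annotation step)."""
        contours, _ = cv2.findContours(mask.astype(np.uint8),
                                       cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_NONE)
        # Keep the largest connected boundary as the pedestrian contour.
        boundary = max(contours, key=cv2.contourArea).squeeze(1)  # (M, 2)
        return sample_contour(boundary.astype(np.float32), n_points)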


Data availability

The data and material used during the current study are available from the corresponding author on reasonable request.

Code availability

The code used during the current study is available from the corresponding author on reasonable request.

References

  1. Wang, X., Han, T.-X., Yan, S.: An HOG-LBP human detector with partial occlusion handling. In: Proceedings of the IEEE international conference on computer vision (ICCV), pp. 32–39. (2009)

  2. Viola, P., Jones, M.-J., Snow, D.: Detecting pedestrians using patterns of motion and appearance. Int. J. Comput. Vis. 63, 153–161 (2005)

  3. Jacques, J., Musse, S.-R.: Shape-based pedestrian segmentation in still images. In: IEEE international symposium on multimedia (ISM), vol. 10, pp. 53–71. (2016)

  4. Redmon, J., Farhadi, A.: YOLO9000: better, faster, stronger. In: IEEE conference on computer vision and pattern recognition, pp. 7263–7271. (2017)

  5. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 580–587. (2014)

  6. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39, 1137–1149 (2016)

  7. Chen, L-C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European conference on computer vision (ECCV), pp. 518–534. (2018)

  8. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 3431–3440. (2015)

  9. He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: Proceedings of the IEEE international conference on computer vision (ICCV), pp. 2980–2988. (2017)

  10. Liu, S., Qi, L., Qin, H., Jia, J.: Path aggregation network for instance segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 8759–8768. (2018)

  11. Xu, W., Wang, H., Qi, F., Lu, C.: Explicit shape encoding for real-time instance segmentation. In: Proceedings of the IEEE/CVF international conference on computer vision (ICCV), pp. 5167–5176. (2019)

  12. Castrejon, L., Kundu, K., Urtasun, R., Fidler, S.: Annotating object instances with a Polygon-RNN. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). (2017)

  13. Ling, H., Gao, J., Kar, A., Chen, W., Fidler, S.: Fast interactive object annotation with curve-GCN. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). (2019)

  14. Peng, S., Jiang, W., Pi, H., Li, X., Zhou, X.: Deep snake for real-time instance segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). (2020)

  15. Wang, L., Shi, J., Song, G., Shen, I.-F.: Object detection combining recognition and segmentation. In: Proceedings of the Asian conference on computer vision (ACCV), pp. 189–199. (2007)

  16. Zhang, S., Benenson, R., Schiele, B.: CityPersons: a diverse dataset for pedestrian detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 3213–3221. (2017)

  17. Radeva, P., Serrat, J., Martí, E.: A snake for model-based segmentation. In: Proceedings of the IEEE conference on computer vision, pp. 816–821. (1997)

  18. Flohr, F., Gavrila, D.: PedCut: an iterative framework for pedestrian segmentation combining shape models and multiple data cues. In: British machine vision conference, (2013). https://doi.org/10.5244/C.27.66

  19. Heess, N., Eslami, S., Winn, J.: The shape Boltzmann machine: a strong model of object shape. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). (2012)

  20. Eslami, S., Williams, C.: A generative model for parts-based object segmentation. In: Proceedings of the international conference on neural information processing systems, pp. 100–107. (2014)

  21. Li, Y., Zhong, Z., Wei, W.: Combining shape and appearance for automatic pedestrian segmentation. In: IEEE international conference on tools with artificial intelligence. (2011). https://doi.org/10.1109/ICTAI.2011.61

  22. Zhang, L., Lin, L., Liang, X., He, K.: Is faster R-CNN doing well for pedestrian detection? In: Proceedings of the European conference on computer vision (ECCV), pp. 418–434. (2016)

  23. Gao, Z., Li, S., Chen, J., Li, Z.: Pedestrian detection method based on YOLO network. Comput. Eng. 44, 215–219 (2018)


  24. Zhou, C., Yuan, Y.: Bi-box regression for pedestrian detection and occlusion estimation. In: Proceedings of the European conference on computer vision (ECCV), pp. 135–151. (2018)

  25. Ullah, M., Mohammed, A., Cheikh, F.A.: PedNet: a spatio-temporal deep convolutional neural network for pedestrian segmentation. J. Imaging 4, 107–118 (2018)


  26. Xu, X., Ma, M., Thompson, S.-G., Li, Z.: Intelligent co-detection of cyclists and motorcyclists based on an improved deep learning method. Meas. Sci. Technol. 32, 025402 (2021)


  27. Lee, Y., Park, J.: CenterMask: real-time anchor-free instance segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR). (2020)

  28. Dai, J., He, K., Sun, J.: Instance-aware semantic segmentation via multi-task network cascades. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 3150–3158. (2016)

  29. Bochkovskiy, A., Wang, C.-Y., Liao, H.: YOLOv4: optimal speed and accuracy of object detection. arXiv preprint (2020). https://arxiv.org/abs/2004.10934

  30. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. IEEE Trans. Pattern Anal. Mach. Intell. (2017)

  31. Zhou, X., Zhuo, J., Krähenbühl, P.: Bottom-up object detection by grouping extreme and center points. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR). (2019)


Funding

This research was funded by the National Natural Science Foundation of China (grant number 61374197), the Shanghai Natural Science Foundation of the Shanghai Science and Technology Commission, China (grant number 20ZR1437900), and the Natural Science Project of Jiangsu Province Colleges and Universities, China (grant number 22KJB580003).

Author information


Contributions

ZG developed the theoretical methodology and performed the experimental analysis. YH provided project administration and funding acquisition. ZG, XH, LM, and WZ carried out the investigation and wrote the original draft.

Corresponding author

Correspondence to Zhiyang Guo.

Ethics declarations

Conflict of interest

The authors have no conflict of interest to declare.

Ethical approval

All results presented in this study were developed under the strictest ethical guidelines.

Consent to participate

All authors participated in developing both the theoretical and simulation outcomes.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Guo, Z., Wang, X., Zhang, Z. et al. PerSnake: a real-time pedestrian instance segmentation network using contour representation. Machine Vision and Applications 34, 78 (2023). https://doi.org/10.1007/s00138-023-01419-w


  • Received:

  • Revised:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s00138-023-01419-w

Keywords
