Abstract
In intelligent transportation systems, pedestrian identification is an indispensable safety component. This paper explores a high-precision, real-time pedestrian recognition method. Most existing pedestrian recognition methods are either pixel-level or bounding-box based. However, these methods cannot make an accurate, real-time judgment of the pedestrian position. To this end, this paper introduces a novel contour-based segmentation network (PerSnake) for real-time pedestrian detection in autonomous driving. We design an octagon contour specifically for pedestrians, built from the output of a YOLO-V4 detector and used as the initial pedestrian contour, and we propose a contour feature aggregation module to aggregate multi-level pedestrian contour features. To construct pedestrian contour labels, we annotate the ground-truth pedestrian contours on the Penn-Fudan and CityPersons datasets using edge detection. Extensive experiments are conducted on the CityPersons and Penn-Fudan datasets. The results demonstrate that PerSnake achieves real-time pedestrian identification at 37.8 frames per second with an average precision of 36.4%. Compared with existing methods, PerSnake exhibits competitive advantages in segmentation speed and precision.
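The abstract describes an octagon built from a detector box and used as the initial contour, but it does not give the construction. The following is a minimal sketch of one plausible construction, not the authors' implementation: it assumes the detector returns an axis-aligned box and that the octagon is obtained by trimming each corner of that box. The function name octagon_from_box and the cut parameter are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def octagon_from_box(
    x_min: float, y_min: float, x_max: float, y_max: float, cut: float = 0.25
) -> List[Point]:
    """Turn an axis-aligned box into an 8-vertex initial contour.

    Assumption (not from the paper): each corner is trimmed by `cut`
    times the box width horizontally and `cut` times the box height
    vertically. Vertices are listed clockwise in image coordinates
    (y grows downward), starting on the top edge.
    """
    if x_max <= x_min or y_max <= y_min:
        raise ValueError("box must have positive width and height")
    if not 0.0 < cut < 0.5:
        raise ValueError("cut must lie strictly between 0 and 0.5")

    dx = cut * (x_max - x_min)  # horizontal trim at each corner
    dy = cut * (y_max - y_min)  # vertical trim at each corner

    return [
        (x_min + dx, y_min),  # top edge, left end
        (x_max - dx, y_min),  # top edge, right end
        (x_max, y_min + dy),  # right edge, upper end
        (x_max, y_max - dy),  # right edge, lower end
        (x_max - dx, y_max),  # bottom edge, right end
        (x_min + dx, y_max),  # bottom edge, left end
        (x_min, y_max - dy),  # left edge, lower end
        (x_min, y_min + dy),  # left edge, upper end
    ]


if __name__ == "__main__":
    # A tall, narrow box, typical of a standing pedestrian.
    for vertex in octagon_from_box(0.0, 0.0, 40.0, 100.0):
        print(vertex)
```

In a contour-based pipeline, a polygon like this would typically be resampled into a fixed number of points and then deformed toward the object boundary by the network; the trimming ratio here is only an illustrative default.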
Data availability
The data and material used during the current study are available from the corresponding author on reasonable request.
Code availability
The code used during the current study is available from the corresponding author on reasonable request.
Funding
This research was funded by the National Natural Science Foundation of China, grant number 61374197; the Shanghai Natural Science Foundation of the Shanghai Science and Technology Commission, China, grant number 20ZR1437900; and the Natural Science Project of Jiangsu Province Colleges and Universities, China, grant number 22KJB580003.
Author information
Contributions
ZG developed the theoretical methodology and performed the experimental analysis. YH handled project administration and funding acquisition. ZG, XH, LM and WZ carried out the investigation and prepared the original draft.
Ethics declarations
Conflict of interest
The authors have no conflict of interest to declare.
Ethical approval
All results presented in this study were obtained under the strictest ethical guidelines.
Consent to participate
All authors participated in the development of both theoretical and simulated outcomes.
About this article
Cite this article
Guo, Z., Wang, X., Zhang, Z. et al. PerSnake: a real-time pedestrian instance segmentation network using contour representation. Machine Vision and Applications 34, 78 (2023). https://doi.org/10.1007/s00138-023-01419-w
DOI: https://doi.org/10.1007/s00138-023-01419-w