Abstract
Instance-level human parsing in real-world human analysis scenarios remains underexplored, owing to the scarcity of suitable data and the technical difficulty of parsing multiple instances in a single pass. In this chapter, we make the first attempt to explore a detection-free part grouping network (PGN) that efficiently parses multiple people in an image in a single pass. PGN reformulates instance-level human parsing as two twinned subtasks that can be jointly learned and mutually refined within a unified network: (1) semantic part segmentation, which assigns each pixel to a human part, and (2) instance-aware edge detection, which groups semantic parts into distinct person instances. The shared intermediate representation is thus endowed with the ability both to characterize fine-grained parts and to infer the instance to which each part belongs. Finally, during inference, a simple instance partition process is applied to obtain the final results.
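The abstract does not spell out the instance partition process, so the following is only an illustrative sketch of the general idea, not the chapter's actual procedure. It assumes the two subtask outputs are already available as small grids: a part label map (0 for background) and a binary instance-edge map. Person instances are then approximated as 4-connected regions of foreground pixels that are not cut by a predicted edge. The function name `partition_instances` and the toy inputs are hypothetical.

```python
from collections import deque


def partition_instances(parts, edges):
    """Group foreground pixels into instances with a 4-connected flood fill.

    parts: 2D list of ints, 0 = background, >0 = semantic part label (assumed format)
    edges: 2D list of 0/1, 1 = predicted instance boundary (assumed format)
    Returns (instance_map, number_of_instances); edge and background pixels get id 0.
    """
    height, width = len(parts), len(parts[0])
    instance = [[0] * width for _ in range(height)]
    next_id = 0
    for r in range(height):
        for c in range(width):
            # Skip background, boundary pixels, and pixels already assigned.
            if parts[r][c] == 0 or edges[r][c] or instance[r][c]:
                continue
            next_id += 1
            instance[r][c] = next_id
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and parts[ny][nx] != 0
                            and not edges[ny][nx]
                            and instance[ny][nx] == 0):
                        instance[ny][nx] = next_id
                        queue.append((ny, nx))
    return instance, next_id


# Toy example: two people side by side, separated by a one-pixel edge column.
parts = [[1, 1, 1, 2, 2],
         [3, 3, 3, 4, 4]]
edges = [[0, 0, 1, 0, 0],
         [0, 0, 1, 0, 0]]
instance_map, num_people = partition_instances(parts, edges)
print(num_people)
print(instance_map)
```

In this toy case the different part labels on each side of the edge column are merged into one instance per side, which is the grouping role the abstract attributes to the instance-aware edges.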
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Lin, L., Zhang, D., Luo, P., Zuo, W. (2020). Instance-Level Human Parsing. In: Human Centric Visual Analysis with Deep Learning. Springer, Singapore. https://doi.org/10.1007/978-981-13-2387-4_6
DOI: https://doi.org/10.1007/978-981-13-2387-4_6
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-2386-7
Online ISBN: 978-981-13-2387-4
eBook Packages: Computer Science (R0)