Instance-Level Human Parsing

Human Centric Visual Analysis with Deep Learning

Abstract

Instance-level human parsing in real-world human analysis scenarios remains underexplored, owing to the lack of sufficient data resources and the technical difficulty of parsing multiple instances in a single pass. In this chapter, we make the first attempt to explore a detection-free part grouping network (PGN) that efficiently parses multiple people in an image in a single pass. PGN reformulates instance-level human parsing as two twinned subtasks that can be jointly learned and mutually refined within a unified network: (1) semantic part segmentation, which assigns each pixel to a human part, and (2) instance-aware edge detection, which groups semantic parts into distinct person instances. The shared intermediate representation is thus able both to characterize fine-grained parts and to infer the instance to which each part belongs. Finally, during inference, a simple instance partition process produces the final results.
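
To make the role of the two outputs concrete, the following is a minimal sketch of how a part-segmentation map and an instance-edge map could be combined into person instances. It is an illustrative stand-in and not the partition process used in the chapter: the function name `partition_instances`, the array layout (label 0 as background, a boolean edge mask), and the use of 4-connected flood fill are all assumptions made for this example.

```python
from collections import deque

import numpy as np


def partition_instances(part_labels, edge_mask):
    """Hypothetical helper: group foreground pixels into instances.

    part_labels: 2-D int array, 0 = background, >0 = human part class.
    edge_mask:   2-D bool array, True where an instance edge is predicted.
    Returns a 2-D int array of instance ids (0 = background or edge pixel).
    """
    h, w = part_labels.shape
    # Pixels that belong to some part and are not blocked by an instance edge.
    free = (part_labels > 0) & ~edge_mask
    instance_ids = np.zeros((h, w), dtype=np.int32)
    next_id = 0
    for sy in range(h):
        for sx in range(w):
            if not free[sy, sx] or instance_ids[sy, sx]:
                continue
            # Start a new instance and flood-fill its 4-connected region.
            next_id += 1
            instance_ids[sy, sx] = next_id
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and free[ny, nx] and not instance_ids[ny, nx]):
                        instance_ids[ny, nx] = next_id
                        queue.append((ny, nx))
    return instance_ids


# Toy input: two touching people that share the same part labels,
# separated only by a predicted edge in the middle column.
toy_parts = np.array([[1, 1, 1, 1, 1],
                      [2, 2, 2, 2, 2]])
toy_edges = np.zeros_like(toy_parts, dtype=bool)
toy_edges[:, 2] = True
toy_instances = partition_instances(toy_parts, toy_edges)
```

In this toy case the part labels alone cannot separate the two people, while the edge column splits the foreground into two regions. Edge pixels are left unassigned in this sketch; a fuller procedure would attach them to a neighbouring instance.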



Author information

Corresponding author

Correspondence to Liang Lin.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Lin, L., Zhang, D., Luo, P., Zuo, W. (2020). Instance-Level Human Parsing. In: Human Centric Visual Analysis with Deep Learning. Springer, Singapore. https://doi.org/10.1007/978-981-13-2387-4_6

  • DOI: https://doi.org/10.1007/978-981-13-2387-4_6

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-2386-7

  • Online ISBN: 978-981-13-2387-4

  • eBook Packages: Computer Science (R0)
