Graph-Based Scale-Aware Network for Human Parsing

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2019)

Abstract

Recent work has made considerable progress in exploring contextual information for human parsing within the Fully Convolutional Network framework. However, two challenges remain: (1) modeling the inherent relative relationships between parts; (2) handling the scale variation of human parts. To tackle both problems, we propose a Graph-Based Scale-Aware Network for human parsing. First, we embed a Graph-Based Part Reasoning Layer into the backbone network to reason about the relative relationships between human parts. Then we construct a Scale-Aware Context Embedding Layer, which consists of two branches that capture scale-specific contextual information through different receptive fields and scale-specific supervision. In addition, we adopt edge supervision to further improve performance. Extensive experimental evaluations demonstrate that the proposed model performs favorably against state-of-the-art human parsing methods. Specifically, our algorithm achieves 53.32% mIoU on the LIP dataset.
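
The mIoU figure quoted above is the mean intersection-over-union metric. As a rough illustration of how such a score is commonly computed, the sketch below averages per-class IoU over flat label sequences. It is a generic example, not the authors' evaluation code: the function name `mean_iou`, the flat-list input format, and the choice to skip classes absent from both prediction and ground truth are all assumptions made here.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over the classes that occur in pred or target.

    pred and target are equal-length flat sequences of integer class labels
    (for example, all pixels of a label map flattened into one list).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # pixels where both maps agree on class c
    pred_count = [0] * num_classes     # pixels predicted as class c
    target_count = [0] * num_classes   # pixels labeled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: classes absent from both maps are skipped
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    predicted = [0, 0, 1, 1, 2, 2]
    labeled = [0, 1, 1, 1, 2, 0]
    print(mean_iou(predicted, labeled, num_classes=3))
```

In this toy input the per-class IoUs are 1/3, 2/3, and 1/2. Benchmark protocols usually accumulate the counts over the whole evaluation set before averaging, and they may treat absent classes differently.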

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61876210) and the Natural Science Foundation of Hubei Province (No. 2018CFB426).

Corresponding author

Correspondence to Changxin Gao.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Yang, B., Yu, C., Liu, J., Gao, C., Sang, N. (2019). Graph-Based Scale-Aware Network for Human Parsing. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2019. Lecture Notes in Computer Science, vol 11858. Springer, Cham. https://doi.org/10.1007/978-3-030-31723-2_24

  • DOI: https://doi.org/10.1007/978-3-030-31723-2_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-31722-5

  • Online ISBN: 978-3-030-31723-2

  • eBook Packages: Computer Science, Computer Science (R0)
