Abstract
Recent work has made considerable progress in exploring contextual information for human parsing with the Fully Convolutional Network framework. However, there still exist two challenges: (1) inherent relative relationships between parts; (2) scale variation of human parts. To tackle both problems, we propose a Graph-Based Scale-Aware Network for human parsing. First, we embed a Graph-Based Part Reasoning Layer into the backbone network to reason the relative relationship between human parts. Then we construct a Scale-Aware Context Embedding Layer, which consists of two branches to capture scale-specific contextual information, with different receptive fields and scale-specific supervisions. In addition, we adopt an edge supervision to further improve the performance. Extensive experimental evaluations demonstrate that the proposed model performs favorably against the state-of-the-art human parsing methods. More specifically, our algorithm achieves 53.32% (mIoU) on the LIP dataset.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. TPAMI 39(12), 2481–2495 (2017)
Buades, A., Coll, B., Morel, J.-M.: A non-local algorithm for image denoising. In: CVPR (2005)
Chen, L.-C., Barron, J.T., Papandreou, G., Murphy, K., Yuille, A.L.: Semantic image segmentation with task-specific edge detection using CNNs and a discriminatively trained domain transform. In: CVPR (2016)
Chen, L.-C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: Semantic image segmentation with deep convolutional nets and fully connected CRFs. In: ICLR (2015)
Chen, L.-C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. TPAMI 40(4), 834–848 (2017)
Chen, L.-C., Yang, Y., Wang, J., Xu, W., Yuille, A.L.: Attention to scale: scale-aware semantic image segmentation. In CVPR (2016)
Chen, Y., Rohrbach, M., Yan, Z., Yan, S., Kalantidis, Y.: Graph-based global reasoning networks. In: CVPR (2019)
Fu, J., Zheng, H., Mei, T.: Look closer to see better: recurrent attention convolutional neural network for fine-grained image recognition. In: CVPR (2017)
Fu, J., Liu, J., Tian, H., Fang, Z., Lu, H.: Dual attention network for scene segmentation. In: CVPR (2019)
Gan, C., Lin, M., Yang, Y., de Melo, G., Hauptmann, A.G.: Concepts not alone: exploring pairwise relationships for zero-shot video activity recognition. AAAI Press (2016)
Gong, K., Liang, X., Zhang, D., Shen, X., Lin, L.: Look into person: self-supervised structure-sensitive learning and a new benchmark for human parsing. In: CVPR (2017)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)
Jégou, S., Drozdzal, M., Vazquez, D., Romero, A., Bengio, Y.: The one hundred layers tiramisu: fully convolutional densenets for semantic segmentation. In: CVPR (2017)
Liang, X., Gong, K., Shen, X., Lin, L.: Look into person: joint body parsing & pose estimation network and a new benchmark. TPAMI 41(4), 871–885 (2018)
Liang, X., Lin, L., Wei, Y., Shen, X., Yang, J., Yan, S.: Proposal-free network for instance-level object segmentation. TPAMI 40(12), 2978–2991 (2017)
Liang, X., et al.: Human parsing with contextualized convolutional neural network. In: ICCV (2015)
Lin, G., Milan, A., Shen, C., Reid, I.: RefineNet: multi-path refinement networks for high-resolution semantic segmentation. In: CVPR (2017)
Liu, T., et al.: Devil in the details: towards accurate single and multiple human parsing. In: AAAI (2019)
Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: CVPR (2015)
Mostajabi, M., Yadollahpour, P., Shakhnarovich, G.: Feedforward semantic segmentation with zoom-out features. In: CVPR (2015)
Park, S., Nie, B.X., Zhu, S.-C.: Attribute and-or grammar for joint parsing of human pose, parts and attributes. TPAMI 40(7), 1555–1569 (2017)
Paszke, A., et al.: Automatic differentiation in PyTorch. In: NIPS (2017)
Wang, X., Girshick, R., Gupta, A., He, K.: Non-local neural networks. In: CVPR (2018)
Xia, F., Wang, P., Chen, X., Yuille, A.L.: Joint multi-person pose estimation and semantic part segmentation. In: CVPR (2017)
Xia, F., Zhu, J., Wang, P., Yuille, A.L.: Pose-guided human parsing by an and/or graph using pose-context features. In: AAAI (2016)
Yu, C., Wang, J., Peng, C., Gao, C., Yu, G., Sang, N.: BiSeNet: bilateral segmentation network for real-time semantic segmentation. In: ECCV (2018)
Yu, C., Wang, J., Peng, C., Gao, C., Yu, G., Sang, N.: Learning a discriminative feature network for semantic segmentation. In: CVPR (2018)
Zhao, H., Shi, J., Qi, X., Wang, X., Jia, J.: Pyramid scene parsing network. In: CVPR (2017)
Zhao, J., et al.: Self-supervised neural aggregation networks for human parsing. In: CVPR (2017)
Zhu, S., Urtasun, R., Fidler, S., Lin, D., Change Loy, C.: Be your own prada: fashion synthesis with structural coherence. In ICCV (2017)
Acknowledgements
This work was supported by the Project of the National Natural Science Foundation of China (No. 61876210), and Natural Science Foundation of Hubei Province (No. 2018CFB426).
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Yang, B., Yu, C., Liu, J., Gao, C., Sang, N. (2019). Graph-Based Scale-Aware Network for Human Parsing. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2019. Lecture Notes in Computer Science(), vol 11858. Springer, Cham. https://doi.org/10.1007/978-3-030-31723-2_24
Download citation
DOI: https://doi.org/10.1007/978-3-030-31723-2_24
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-31722-5
Online ISBN: 978-3-030-31723-2
eBook Packages: Computer ScienceComputer Science (R0)