Learning deep representations for semantic image parsing: a comprehensive overview

Huang, Lili; Peng, Jiefeng; Zhang, Ruimao; Li, Guanbin; Lin, Liang

doi:10.1007/s11704-018-7195-8

Learning deep representations for semantic image parsing: a comprehensive overview

Research Article
Published: 30 August 2018

Volume 12, pages 840–857, (2018)
Cite this article

Frontiers of Computer Science Aims and scope Submit manuscript

Lili Huang¹,
Jiefeng Peng¹,
Ruimao Zhang¹,
Guanbin Li¹ &
…
Liang Lin¹

124 Accesses
10 Citations
6 Altmetric
Explore all metrics

Abstract

Semantic image parsing, which refers to the process of decomposing images into semantic regions and constructing the structure representation of the input, has recently aroused widespread interest in the field of computer vision. The recent application of deep representation learning has driven this field into a new stage of development. In this paper, we summarize three aspects of the progress of research on semantic image parsing, i.e., category-level semantic segmentation, instance-level semantic segmentation, and beyond segmentation. Specifically, we first review the general frameworks for each task and introduce the relevant variants. The advantages and limitations of each method are also discussed. Moreover, we present a comprehensive comparison of different benchmark datasets and evaluation metrics. Finally, we explore the future trends and challenges of semantic image parsing.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Supervised semantic segmentation based on deep learning: a survey

Article 02 April 2022

Survey of recent progress in semantic image segmentation with CNNs

Article 17 November 2017

Semantic Object Parsing with Graph LSTM

References

Zhao H S, Shi J P, Qi X J, Wang X G, Jia J Y. Pyramid scene parsing network. In: Proceedings of International Conference on Computer Vision and Pattern Recognition. 2017, 2881–2890
Google Scholar
He K, Gkioxari G, Dollár P, Girshick R. Mask R-CNN. In: Proceedings of IEEE International Conference on Computation Vision. 2017, 2980–2988
Google Scholar
Tu Z, Chen X, Yuille A L, Zhu S C. Image parsing: unifying segmentation, detection, and recognition. International Journal of Computer Vision, 2005, 63(2): 113–140
Google Scholar
Tu Z, Zhu S C. Parsing images into region and curve processes. In: Proceedings of European Conference on Computer Vision. 2002, 393–407
Google Scholar
Han F, Zhu S C. Bottom-up/top-down image parsing with attribute grammar. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009, 31(1): 59–73
Google Scholar
Lin L, Wang G, Zhang R, Zhang R, Liang X, Zuo W. Deep structured scene parsing by learning with image descriptions. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2016, 2276–2284
Google Scholar
Lowe D G. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 2004, 60(2): 91–110
MathSciNet Google Scholar
Dalal N, Triggs B. Histograms of oriented gradients for human detection. In: Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2005, 886–893
Google Scholar
Ahonen T, Hadid A, Pietikäinen M. Face recognition with local binary patterns. In: Proceedings of European Conference on Computer Vision. 2004, 469–481
Google Scholar
Liu Z, Li X, Luo P, Loy C C, Tang X. Semantic image segmentation via deep parsing network. In: Proceedings of IEEE International Conference on Computer Vision. 2015, 1377–1385
Google Scholar
Chen L C, Papandreou G, Kokkinos I, Murphy K, Yuille A L. Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4): 834–848
Google Scholar
Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2015, 3431–3440
Google Scholar
Chen L C, Papandreou G, Kokkinos I, Murphy K, Yuille A L. Semantic image segmentation with deep convolutional nets and fully connected crfs. 2014, arXiv preprint arXiv:14127062
Google Scholar
Peng C, Zhang X, Yu G, Luo G, Sun J. Large kernel matters-improve semantic segmentation by global convolutional network. 2017, arXiv preprint arXiv:170302719
Google Scholar
Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural networks. In: Proceedings of Advances in Neural Information Processing Systems. 2012, 1097–1105
Google Scholar
Socher R, Manning C D, Ng A Y. Learning continuous phrase representations and syntactic parsing with recursive neural networks. In: Proceedings of NIPS-2010 Deep Learning and Unsupervised Feature Learning Workshop. 2010, 1–9
Google Scholar
Li Y, Qi H, Dai J, Ji X, Wei Y. Fully convolutional instance-aware semantic segmentation. 2016, arXiv preprint arXiv:161107709
Google Scholar
Bengio Y. Deep learning of representations: looking forward. In: Proceedings of International Conference on Statistical Language and Speech Processing. 2013, 1–37
Google Scholar
Bengio Y. Deep learning of representations for unsupervised and transfer learning. In: Proceedings of ICML Workshop on Unsupervised and Transfer Learning. 2012, 17–36
Google Scholar
Bengio Y, Courville A, Vincent P. Representation learning: a review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(8): 1798–1828
Google Scholar
LeCun Y, Boser B, Denker J S, Henderson D, Howard R E, Hubbard W, Jackel L D. Backpropagation applied to handwritten zip code recognition. Neural Computation, 1989, 1(4): 541–551
Google Scholar
Dai J, He K, Li Y, Ren S, Sun J. Instance-sensitive fully convolutional networks. In: Proceedings of European Conference on Computer Vision. 2016, 534–549
Google Scholar
Islam M A, Naha S, Rochan M, Bruce N, Wang Y. Label refinement network for coarse-to-fine semantic segmentation. 2017, arXiv preprint arXiv:170300551
Google Scholar
Lipton Z C, Berkowitz J, Elkan C. A critical review of recurrent neural networks for sequence learning. 2015, arXiv preprint arXiv:150600019
Google Scholar
LeCun Y, Bengio Y, Hinton G. Deep learning. Nature, 2015, 521(7553): 436–444
Google Scholar
Liang X, Shen X, Xiang D, Feng J, Lin L, Yan S. Semantic object parsing with local-global long short-term memory. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2016, 3185–3193
Google Scholar
Karpathy A, Li F F. Deep visual-semantic alignments for generating image descriptions. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2015, 3128–3137
Google Scholar
Sutskever I, Vinyals O, Le Q V. Sequence to sequence learning with neural networks. In: Proceedings of Advances in Neural Information Processing Systems. 2014, 3104–3112
Google Scholar
Li Z, Gan Y, Liang X, Yu Y, Cheng H, Lin L. LSTM-CF: unifying context modeling and fusion with LSTMS for RGB-D scene labeling. In: Proceedings of European Conference on Computer Vision. 2016, 541–557
Google Scholar
Peng Z, Zhang R, Liang X, Liu X, Lin L. Geometric scene parsing with hierarchical LSTM. 2016, arXiv preprint arXiv:160401931
Google Scholar
Byeon W, Breuel TM, Raue F, Liwicki M. Scene labeling with LSTM recurrent neural networks. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2015, 3547–3555
Google Scholar
Liang X, Shen X, Feng J, Lin L, Yan S. Semantic object parsing with graph LSTM. In: Proceedings of European Conference on Computer Vision. 2016, 125–143
Google Scholar
Liang X, Lin L, Shen X, Feng J, Yan S, Xing E P. Interpretable structure-evolving LSTM. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2017, 2175–2184
Google Scholar
Zhang R, Yang W, Peng Z, Wang X, Lin L. Progressively diffused networks for semantic image segmentation. 2017, arXiv preprint arXiv:170205839
Google Scholar
Elman J L. Distributed representations, simple recurrent networks, and grammatical structure. Machine Learning, 1991, 7(2-3): 195–225
Google Scholar
Liu W, Rabinovich A, Berg A C. Parsenet: looking wider to see better. 2015, arXiv preprint arXiv:150604579
Google Scholar
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A. Going deeper with convolutions. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2015, 1–9
Google Scholar
He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2016, 770–778
Google Scholar
Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. 2014, arXiv preprint arXiv:14091556
Google Scholar
Pinheiro P H O, Collobert R. Recurrent convolutional neural networks for scene labeling. In: Proceedings of International Conference on Machine Learning. 2014, 82–90
Google Scholar
Graves A, Fernández S, Schmidhuber J. Multi-dimensional recurrent neural networks. In: Proceedings of the International Conference on Artificial Neural Networks. 2007, 549–558
Google Scholar
Lin L, Huang L, Chen T, Gan Y, Cheng H. Knowledge-guided recurrent neural network learning for task-oriented action prediction. In: Proceedings of IEEE International Conference on Multimedia and Expo. 2017, 625–630
Google Scholar
Farabet C, Couprie C, Najman L, LeCun Y. Learning hierarchical features for scene labeling. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(8): 1915–1929
Google Scholar
Gupta S, Girshick R, Arbeláez P, Malik J. Learning rich features from RGB-D images for object detection and segmentation. In: Proceedings of European Conference on Computer Vision. 2014, 345–360
Google Scholar
Ning F, Delhomme D, LeCun Y, Piano F, Bottou L, Barbano P E. Toward automatic phenotyping of developing embryos from videos. IEEE Transactions on Image Processing, 2005, 14(9): 1360–1371
Google Scholar
Liang X, Liu S, Shen X, Yang J, Liu L, Dong J, Lin L, Yan S. Deep human parsing with active template regression. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(12): 2402–2414
Google Scholar
Liang X, Xu C, Shen X, Yang J, Liu S, Tang J, Lin L, Yan S. Human parsing with contextualized convolutional neural network. In: Proceedings of IEEE International Conference on Computer Vision. 2015, 1386–1394
Google Scholar
Krähenbühl P, Koltun V. Efficientcient inference in fully connected CRFS with gaussian edge potentials. In: Proceedings of Advances in Neural Information Processing Systems. 2011, 109–117
Google Scholar
Noh H, Hong S, Han B. Learning deconvolution network for semantic segmentation. In: Proceedings of IEEE International Conference on Computer Vision. 2015, 1520–1528
Google Scholar
Badrinarayanan V, Handa A, Cipolla R. Segnet: a deep convolutional encoder-decoder architecture for robust semantic pixel-wise labelling. 2015, arXiv preprint arXiv:150507293
Google Scholar
Ronneberger O, Fischer P, Brox T. U-net: convolutional networks for biomedical image segmentation. In: Proceedings of International Conference on Medical Image Computing and Computer-Assisted Intervention. 2015, 234–241
Google Scholar
Lin G, Milan A, Shen C, Reid I. Refinenet: multi-path refinement networks with identity mappings for high-resolution semantic segmentation. 2016, arXiv preprint arXiv:161106612
Google Scholar
Chen L C, Papandreou G, Schroff F, Adam H. Rethinking atrous convolution for semantic image segmentation. 2017, arXiv preprint arXiv:170605587
Google Scholar
Yu F, Koltun V. Multi-scale context aggregation by dilated convolutions. 2015, arXiv preprint arXiv:151107122
Google Scholar
Li X, Liu Z, Luo P, Loy C C, Tang X. Not all pixels are equal: difficulty-aware semantic segmentation via deep layer cascade. 2017, arXiv preprint arXiv:170401344
Google Scholar
Zhou Y, Xie L, Shen W, Wang Y, Fishman E K, Yuille A L. A fixedpoint model for pancreas segmentation in abdominal ct scans. In: Proceedings of International Conference on Medical Image Computing and Computer-Assisted Intervention. 2017, 693–701
Google Scholar
Li Q, Wang J, Wipf D, Tu Z. Fixed-point model for structured labeling. In: Proceedings of International Conference on Machine Learning. 2013, 214–221
Google Scholar
Wang G, Luo P, Lin L, Wang X. Learning object interactions and descriptions for semantic image segmentation. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2017, 5859–5867
Google Scholar
Luo P, Wang G, Lin L, Wang X. Deep dual learning for semantic image segmentation. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2017, 2718–2726
Google Scholar
Schwing A G, Urtasun R. Fully connected deep structured networks. 2015, arXiv preprint arXiv:150302351
Google Scholar
Yang W, Luo P, Lin L. Clothing co-parsing by joint image segmentation and labeling. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2014, 3182–3189
Google Scholar
Byeon W, Liwicki M, Breuel T M. Texture classification using 2D LSTM networks. In: Proceedings of International Conference on Pattern Recognition. 2014, 1144–1149
Google Scholar
Eigen D, Fergus R. Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture. In: Proceedings of IEEE International Conference on Computer Vision. 2015, 2650–2658
Google Scholar
Girshick R, Donahue J, Darrell T, Malik J. Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2014, 580–587
Google Scholar
Reza M, Kosecka J. Reinforcement learning for semantic segmentation in indoor scenes. 2016, arXiv preprint arXiv:160601178
Google Scholar
Van Oord A, Kalchbrenner N, Kavukcuoglu K. Pixel recurrent neural networks. In: Proceedings of International Conference on Machine Learning. 2016, 1747–1756
Google Scholar
Kalchbrenner N, Danihelka I, Graves A. Grid long short-term memory. 2015, arXiv preprint arXiv:150701526
Google Scholar
Hariharan B, Arbeláez P, Girshick R, Malik J. Simultaneous detection and segmentation. In: Proceedings of European Conference on Computer Vision. 2014, 297–312
Google Scholar
Liang X, Wei Y, Shen X, Jie Z, Feng J, Lin L, Yan S. Reversible recursive instance-level object segmentation. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2016, 633–641
Google Scholar
Liang X, Wei Y, Shen X, Yang J, Lin L, Yan S. Proposal-free network for instance-level object segmentation. 2015, arXiv preprint arXiv:150902636
Google Scholar
Abtahi F, Zhu Z, Burry AM. A deep reinforcement learning approach to character segmentation of license plate images. In: Proceedings of International Conference on Machine Vision Applications. 2015, 539–542
Google Scholar
Lin L, Wang K, Zuo W, Wang M, Luo J, Zhang L. A deep structured model with radius–margin bound for 3D human activity recognition. International Journal of Computer Vision, 2016, 118(2): 256–273
MathSciNet Google Scholar
Hariharan B, Arbeláez P, Girshick R, Malik J. Hypercolumns for object segmentation and fine-grained localization. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2015, 447–456
Google Scholar
Chen Y T, Liu X, Yang M H. Multi-instance object segmentation with occlusion handling. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2015, 3470–3478
Google Scholar
Arbeláez P, Pont-Tuset J, Barron J T, Marques F, Malik J. Multiscale combinatorial grouping. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2014, 328–335
Google Scholar
Li G, Xie Y, Lin L, Yu Y. Instance-level salient object segmentation. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2017, 247–256
Google Scholar
Dai J, He K, Sun J. Instance-aware semantic segmentation via multitask network cascades. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2016, 3150–3158
Google Scholar
Girshick R. Fast r-cnn. In: Proceedings of IEEE International Conference on Computer Vision. 2015, 1440–1448
Google Scholar
He K, Zhang X, Ren S, Sun J. Spatial pyramid pooling in deep convolutional networks for visual recognition. In: Proceedings of European Conference on Computer Vision. 2014, 346–361
Google Scholar
Ren S, He K, Girshick R, Sun J. Faster R-CNN: towards real-time object detection with region proposal networks. In: Proceedings of Advances in Neural Information Processing Systems. 2015, 91–99
Google Scholar
Newell A, Huang Z, Deng J. Associative embedding: end-to-end learning for joint detection and grouping. In: Proceedings of Advances in Neural Information Processing Systems. 2017, 2274–2284
Google Scholar
Harley A W, Derpanis K G, Kokkinos I. Learning dense convolutional embeddings for semantic segmentation. 2015, arXiv preprint arXiv:151104377
Google Scholar
Fathi A, Wojna Z, Rathod V, Wang P, Song H O, Guadarrama S, Murphy K P. Semantic instance segmentation via deep metric learning. 2017, arXiv preprint arXiv:170310277
Google Scholar
Yang L, Jin R. Distance metric learning: a comprehensive survey. Michigan State Universiy, 2006, 2(2): 1–51
MathSciNet Google Scholar
Xu J, Schwing A G, Urtasun R. Tell me what you see and I will show you where it is. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2014, 3190–3197
Google Scholar
Miller G A, Beckwith R, Fellbaum C, Gross D, Miller K J. Introduction to wordnet: an on-line lexical database. International Journal of Lexicography, 1990, 3(4): 235–244
Google Scholar
Socher R, Bauer J, Manning C D, Ng A Y. Parsing with compositional vector grammars. In: Proceedings of Annual Meeting of the Association for Computational Linguistics. 2013, 455–465
Google Scholar
Everingham M, Van Gool L, Williams C K, Winn J, Zisserman A. The pascal visual object classes (VOC) challenge. International Journal of Computer Vision, 2010, 88(2): 303–338
Google Scholar
Chen X, Mottaghi R, Liu X, Fidler S, Urtasun R, Yuille A L. Detect what you can: detecting and representing objects using holistic models and body parts. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2014, 1971–1978
Google Scholar
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M. Imagenet large scale visual recognition challenge. International Journal of Computer Vision, 2015, 115(3): 211–252
MathSciNet Google Scholar
Zhou B, Zhao H, Puig X, Fidler S, Barriuso A, Torralba A. Semantic understanding of scenes through the ADE20K dataset. 2016, arXiv preprint arXiv:160805442
Google Scholar
Lin T Y, Maire M, Belongie S, Bourdev L, Girshick R, Hays J, Perona P, Ramanan D, Zitnick C L, Dollár P. Microsoft COCO: common objects in context. 2015, arXiv preprint arXiv:14050312v3
Google Scholar
Lin T Y, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick C L. Microsoft COCO: common objects in context. In: Proceedings of European Conference on Computer Vision. 2014, 740–755
Google Scholar
Liu C, Yuen J, Torralba A. Nonparametric scene parsing: label transfer via dense scene alignment. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2009, 1972–1979
Google Scholar
Silberman N, Hoiem D, Kohli P, Fergus R. Indoor segmentation and support inference from RGBD images. In: Proceedings of European Conference on Computer Vision. 2012, 746–760
Google Scholar
Gupta S, Arbelaez P, Malik J. Perceptual organization and recognition of indoor scenes from RGB-D images. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2013, 564–571
Google Scholar
Song S, Lichtenberg S P, Xiao J. Sun RGB-D: a RGB-D scene understanding benchmark suite. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2015, 567–576
Google Scholar
Janoch A, Karayev S, Jia Y, Barron J T, Fritz M, Saenko K, Darrell T. A category-level 3D object dataset: putting the kinect to work. Consumer Depth Cameras for Computer Vision. London: Springer, 2013
Google Scholar
Xiao J, Owens A, Torralba A. SUN3D: a database of big spaces reconstructed using SFM and object labels. In: Proceedings of IEEE International Conference on Computer Vision. 2013, 1625–1632
Google Scholar
Yamaguchi K, Kiapour M H, Ortiz L E, Berg T L. Parsing clothing in fashion photographs. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2012, 3570–3577
Google Scholar
Liu S, Feng J, Domokos C, Xu H, Huang J, Hu Z, Yan S. Fashion parsing with weak color-category labels. IEEE Transactions on Multimedia, 2014, 16(1): 253–265
Google Scholar
Dong J, Chen Q, XiaW, Huang Z, Yan S. A deformable mixture parsing model with parselets. In: Proceedings of IEEE International Conference on Computer Vision. 2013, 3408–3415
Google Scholar
Cordts M, Omran M, Ramos S, Rehfeld T, Enzweiler M, Benenson R, Franke U, Roth S, Schiele B. The cityscapes dataset for semantic urban scene understanding. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2016, 3213–3223
Google Scholar

Download references

Acknowledgements

This work was supported by the National Science Fund for Excellent Young Scholars (61622214), the National Natural Science Foundation of China (Grant Nos. 61702565 and 61622214), Guangdong Natural Science Foundation Project for Research Teams (2017A030312006), and was also sponsored by CCF-Tencent Open Research Fund.

Author information

Authors and Affiliations

School of Data and Computer Science, Sun Yat-sen University, Guangzhou, 510006, China
Lili Huang, Jiefeng Peng, Ruimao Zhang, Guanbin Li & Liang Lin

Authors

Lili Huang
View author publications
You can also search for this author in PubMed Google Scholar
Jiefeng Peng
View author publications
You can also search for this author in PubMed Google Scholar
Ruimao Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Guanbin Li
View author publications
You can also search for this author in PubMed Google Scholar
Liang Lin
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Liang Lin.

Additional information

Lili Huang is a PhD student in Computer Science with the School of Data and Computer Science, Sun Yat-sen University, China. Before that, she received her BE and master degrees from School of Computer Science and Technology, Anhui University, China. Her current research interests lie in the fields of computer vision, machine learning, and cognitive science.

Jiefeng Peng received the BS degree from the school of mathematics, Sun Yat-sen University, China in 2016, where he is currently pursuing the Master degree in computer science at the School of data and computer science. His research interests include deep learning and computer vision.

Ruimao Zhang received the BE and PhD degrees from Sun Yat-sen University (SYSU), China in 2011 and 2016, respectively. From 2013 to 2014, he was a visiting PhD student with the Department of Computing, Hong Kong Polytechnic University (PolyU), Hong Kong, China. His research interests include computer vision, deep learning and related multimedia applications. He currently serves as a reviewer of several academic journals, including IEEE Trans. on Image Processing, IEEE Trans. on Circuits and Systems for Video Technology, IEEE Trans. on Information Forensics and Security, Pattern Recognition and Neurocomputing.

Liang Lin is a full Professor of Sun Yatsen University. He is the Excellent Young Scientist of the National Natural Science Foundation of China. He received his BS and PhD degrees from the Beijing Institute of Technology (BIT), China in 2003 and 2008, respectively, and was a joint PhD student with the Department of Statistics, University of California, Los Angeles (UCLA). From 2008 to 2010, he was a post-doctoral fellow at UCLA. From 2014 to 2015, as a senior visiting scholar he was with The Hong Kong Polytechnic University, China and The Chinese University of Hong Kong, China. His research interests include Computer Vision, Data Analysis and Mining, and Intelligent Robotic Systems, etc. Dr. Lin has authorized and co-authorized on more than 100 papers in top-tier academic journals and conferences. He has been serving as an associate editor of IEEE Trans. Human-Machine Systems. He was the recipient of the Best Paper Runners-Up Award in ACM NPAR 2010, Google Faculty Award in 2012, Best Student Paper Award in IEEE ICME 2014, Hong Kong Scholars Award in 2014 and Best Paper Award in IEEE ICME 2017.

Electronic supplementary material

Supplementary material, approximately 600 KB.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Huang, L., Peng, J., Zhang, R. et al. Learning deep representations for semantic image parsing: a comprehensive overview. Front. Comput. Sci. 12, 840–857 (2018). https://doi.org/10.1007/s11704-018-7195-8

Download citation

Received: 08 June 2017
Accepted: 08 March 2018
Published: 30 August 2018
Issue Date: October 2018
DOI: https://doi.org/10.1007/s11704-018-7195-8

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Learning deep representations for semantic image parsing: a comprehensive overview

Abstract

Access this article

Similar content being viewed by others

Supervised semantic segmentation based on deep learning: a survey

Survey of recent progress in semantic image segmentation with CNNs

Semantic Object Parsing with Graph LSTM

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Electronic supplementary material

Supplementary material, approximately 600 KB.

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Learning deep representations for semantic image parsing: a comprehensive overview

Abstract

Access this article

Similar content being viewed by others

Supervised semantic segmentation based on deep learning: a survey

Survey of recent progress in semantic image segmentation with CNNs

Semantic Object Parsing with Graph LSTM

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Electronic supplementary material

Supplementary material, approximately 600 KB.

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation