2D freehand sketch labeling using CNN and CRF

Zhu, Xianyi; Xiao, Yi; Zheng, Yan

doi:10.1007/s11042-019-08158-z

Published: 05 November 2019

Volume 79, pages 1585–1602, (2020)

Multimedia Tools and Applications

Abstract

Accurate and fast sketch segmentation and labeling is difficult because sketches carry far fewer features than natural images. This paper proposes a novel hybrid approach to fast automatic sketch labeling based on a convolutional neural network (CNN) and a conditional random field (CRF). First, we design a CNN for stroke classification. The CNN uses larger first-layer filters and larger pooling, which suits the extraction of descriptive features from strokes. Second, we combine each stroke with its host sketch to construct a more informative input for the CNN. Finally, we use the spatio-temporal relations among the strokes of a sketch to build a connected graph, on which a CRF refines the output of the CNN. We evaluate the method on two public benchmark datasets. Experimental results show that it reaches state-of-the-art levels of both accuracy and runtime.
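
The refinement step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the graph construction, the pairwise potential, or the inference procedure. The code below assumes, purely for illustration, that strokes are linked when they are consecutive in drawing order or when their centroids lie within a hypothetical `radius`, that the pairwise term is a simple Potts penalty weighted by a hypothetical `smooth` parameter, and that inference uses iterated conditional modes. The per-stroke probabilities stand in for the CNN output.

```python
import math

def build_graph(centroids, radius):
    """Link strokes by temporal order and by spatial proximity.

    Assumption: strokes are listed in drawing order, and a centroid
    distance threshold is a reasonable stand-in for "spatially related".
    """
    n = len(centroids)
    edges = set()
    for i in range(n - 1):                      # temporal neighbors
        edges.add((i, i + 1))
    for i in range(n):                          # spatial neighbors
        for j in range(i + 1, n):
            if math.dist(centroids[i], centroids[j]) <= radius:
                edges.add((i, j))
    return sorted(edges)

def refine_labels(probs, edges, smooth=1.0, max_iters=10):
    """Refine per-stroke labels with a Potts CRF solved by ICM.

    Energy of a labeling: sum of -log(prob) unary terms plus `smooth`
    for every edge whose endpoints disagree (assumed Potts model).
    """
    n = len(probs)
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Start from the classifier's most likely label for each stroke.
    labels = [max(range(len(p)), key=p.__getitem__) for p in probs]
    eps = 1e-9                                  # avoids log(0)

    for _ in range(max_iters):
        changed = False
        for i in range(n):
            def energy(k):
                unary = -math.log(probs[i][k] + eps)
                disagreements = sum(1 for j in neighbors[i] if labels[j] != k)
                return unary + smooth * disagreements
            best = min(range(len(probs[i])), key=energy)
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:                         # local minimum reached
            break
    return labels

# Four strokes, two labels; stroke 2 is weakly misclassified (0.45 vs 0.55)
# but sits next to two strokes confidently labeled 0.
centroids = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.8), (10.0, 10.0)]
probs = [[0.9, 0.1], [0.8, 0.2], [0.45, 0.55], [0.1, 0.9]]
edges = build_graph(centroids, radius=1.5)
refined = refine_labels(probs, edges, smooth=1.0)
print(refined)  # [0, 0, 0, 1]
```

In this toy case the pairwise term flips the uncertain stroke to agree with its close neighbors, while the distant, confidently classified stroke keeps its label. Setting `smooth` to 0 disables the graph term and returns the classifier's own argmax labels.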

Acknowledgements

This work is supported by the National Key R&D Program of China (2018YFB0203904), the NSFC (61872137, 61502158, 61803150), and the Hunan NSF (2017JJ3042, 2018JJ3067).

Author information

Authors and Affiliations

College of Computer Science and Electronic Engineering, Hunan University, Changsha, People’s Republic of China
Xianyi Zhu & Yi Xiao
College of Electrical and Information Engineering, Hunan University, Changsha, People’s Republic of China
Yan Zheng

Corresponding author

Correspondence to Yi Xiao.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhu, X., Xiao, Y. & Zheng, Y. 2D freehand sketch labeling using CNN and CRF. Multimed Tools Appl 79, 1585–1602 (2020). https://doi.org/10.1007/s11042-019-08158-z

Received: 24 September 2018
Revised: 20 June 2019
Accepted: 02 September 2019
Published: 05 November 2019
Issue Date: January 2020
DOI: https://doi.org/10.1007/s11042-019-08158-z
