Abstract
Deep Neural Networks (DNNs) have recently achieved impressive performance on many recognition tasks across different disciplines, including image recognition. However, most existing work on deep learning for image recognition focuses on natural (photo-based) images rather than sketches. Moreover, most existing work on sketch classification relies on hand-crafted feature representations. In this paper, we propose training a convolutional neural network (CNN) for sketch recognition on the TU-Berlin sketch dataset, which comprises 250 object categories with 80 images each. We find that a CNN trained with proper data augmentation and combined with a multi-scale, multi-angle voting technique achieves an accuracy of 75.43%, which surpasses human-level performance on the standard sketch classification benchmark and significantly outperforms state-of-the-art sketch recognition methods.
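As a rough illustration of what multi-scale, multi-angle voting at test time could look like, the sketch below classifies several rescaled and rotated views of one sketch and averages the class scores. The abstract does not give the scales, angles, or voting rule, so the values and the score-averaging rule here are assumptions, and `transform`, `vote`, and `predict_proba` are hypothetical names standing in for the trained CNN and its preprocessing.

```python
import math

# Dataset size stated in the abstract: 250 categories x 80 sketches each.
N_CATEGORIES = 250
IMAGES_PER_CATEGORY = 80
TOTAL_SKETCHES = N_CATEGORIES * IMAGES_PER_CATEGORY  # 20,000 sketches


def transform(image, scale, angle_deg, fill=0.0):
    """Rotate by angle_deg and zoom by scale about the centre (nearest neighbour)."""
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    cos_t = math.cos(math.radians(angle_deg))
    sin_t = math.sin(math.radians(angle_deg))
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = y - cy, x - cx
            # Inverse mapping: locate the source pixel for each output pixel.
            sx = round((cos_t * dx + sin_t * dy) / scale + cx)
            sy = round((-sin_t * dx + cos_t * dy) / scale + cy)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = image[sy][sx]
    return out


def vote(image, predict_proba, scales=(0.8, 1.0, 1.2), angles=(-10, 0, 10)):
    """Average class scores over all (scale, angle) views and return the winner.

    The scales, angles, and averaging rule are illustrative assumptions,
    not values taken from the paper.
    """
    views = [transform(image, s, a) for s in scales for a in angles]
    scores = [predict_proba(v) for v in views]
    n_classes = len(scores[0])
    mean = [sum(s[c] for s in scores) / len(scores) for c in range(n_classes)]
    best = max(range(n_classes), key=mean.__getitem__)
    return best, mean


if __name__ == "__main__":
    # Toy 5x5 "sketch" and a stand-in classifier that ignores its input.
    sketch = [[float(5 * r + c) for c in range(5)] for r in range(5)]
    label, mean_scores = vote(sketch, lambda view: [0.1, 0.7, 0.2])
    print(label, mean_scores)
```

Majority voting over per-view predictions would be an equally plausible reading of "voting"; averaging scores is used here only because it is simple to state.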
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Sadouk, L., Gadi, T., Essoufi, E.H. (2017). A Novel Approach of Deep Convolutional Neural Networks for Sketch Recognition. In: Abraham, A., Haqiq, A., Alimi, A., Mezzour, G., Rokbani, N., Muda, A. (eds) Proceedings of the 16th International Conference on Hybrid Intelligent Systems (HIS 2016). HIS 2016. Advances in Intelligent Systems and Computing, vol 552. Springer, Cham. https://doi.org/10.1007/978-3-319-52941-7_11
DOI: https://doi.org/10.1007/978-3-319-52941-7_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-52940-0
Online ISBN: 978-3-319-52941-7
eBook Packages: Engineering, Engineering (R0)