Abstract
Classic sparse representation, one of the prevalent feature learning methods, has been successfully applied to a range of computer vision tasks. However, it has two intrinsic drawbacks for object detection. First, learning a discriminative dictionary for object detection is difficult. Second, computing dictionary-based features in a traditional exhaustive search manner, such as a sliding window, is usually very time-consuming. In this paper, we propose a novel feature learning framework for object detection that combines a structural sparsity constraint with a classification error minimization constraint to learn a discriminative dictionary. To improve efficiency, we learn sparse representation coefficients only from object candidate regions and feed them to a kernelized SVM classifier. Experiments on the INRIA Person dataset and the Pascal VOC 2007 challenge dataset demonstrate the effectiveness of the proposed approach compared with two state-of-the-art baselines.
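To make the idea of coding a candidate region under a structured sparsity penalty more concrete, the following is a minimal sketch and not the formulation used in the paper, which the abstract does not give. It assumes a fixed dictionary `D`, a hypothetical partition of its atoms into `groups` (for example, one group per class), and a generic group-lasso objective solved by proximal gradient descent (ISTA). The names `group_soft_threshold` and `group_sparse_code`, the value of `lam`, and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def group_soft_threshold(z, groups, tau):
    """Block soft-thresholding: shrink each group's L2 norm by tau."""
    out = np.zeros_like(z)
    for idx in groups:
        norm = np.linalg.norm(z[idx])
        if norm > tau:
            out[idx] = (1.0 - tau / norm) * z[idx]
    return out

def group_sparse_code(x, D, groups, lam=0.1, n_iter=200):
    """Minimize 0.5*||x - D a||^2 + lam * sum_g ||a_g||_2 with ISTA."""
    # Step size 1/L, where L is the Lipschitz constant of the smooth term.
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = group_soft_threshold(a - step * grad, groups, step * lam)
    return a

# Synthetic stand-in for a feature vector extracted from one candidate region.
rng = np.random.default_rng(0)
D = rng.standard_normal((32, 12))
D /= np.linalg.norm(D, axis=0)  # unit-norm atoms
# Hypothetical grouping: three groups of four atoms each.
groups = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]
x = D[:, 4:8] @ np.array([1.0, -0.5, 0.8, 0.3])  # built from the second group only

a = group_sparse_code(x, D, groups)
group_energy = [float(np.linalg.norm(a[g])) for g in groups]
```

In a pipeline like the one described, a coefficient vector such as `a` would be computed for each candidate region and used as the feature passed to the kernelized SVM classifier. The group penalty encourages whole groups of atoms to switch on or off together, which is one common way to impose structure on the code.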
Additional information
Foundation item: Supported by the National Natural Science Foundation of China (61231015, 61170023), National High Technology Research and Development Program of China (863 Program, 2015AA016306), Internet of Things Development Funding Project of Ministry of Industry in 2013 (No. 25), Technology Research Program of Ministry of Public Security (2014JSYJA016), Major Science and Technology Innovation Plan of Hubei Province (2013AAA020), and the Natural Science Foundation of Hubei Province (2014CFB712)
Biography: FANG Wenhua, male, Ph.D. candidate, research direction: multimedia analysis and computer vision.
About this article
Cite this article
Fang, W., Chen, J. & Hu, R. Structural sparse representation for object detection. Wuhan Univ. J. Nat. Sci. 22, 318–322 (2017). https://doi.org/10.1007/s11859-017-1253-2
DOI: https://doi.org/10.1007/s11859-017-1253-2