Abstract
Robust object recognition has drawn increasing attention in computer vision and machine learning, driven by rapid progress in feature extraction and classification techniques and by the release of public datasets such as the Caltech datasets, Pascal Visual Object Classes, and ImageNet. Recently, deep-learning-based object recognition systems have shown significant performance improvements in visual object recognition tasks through new learning methodologies. However, searching and recognizing in a high-dimensional space is time consuming, which motivates reconsidering how point and range queries in high dimensions are performed for object recognition. This paper proposes optimized dimensionality reduction using structured sparse principal component analysis. The proposed method retains the structure of the high-dimensional features, removes redundant features that do not contribute to similarity, and classifies the query image against a large database. Qualitative and quantitative experimental results, including a comparison with current state-of-the-art visual object recognition algorithms, verify that the proposed recognition algorithm performs favorably in reducing both the dimension of the query image and the number of training images.
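To make the idea of sparse dimensionality reduction followed by classification concrete, the following is a minimal sketch and not the authors' algorithm. It assumes plain (unstructured) sparse PCA computed by truncated power iteration, a swapped-in simplification of the structured sparse PCA named above, and a one-nearest-neighbor classifier in the reduced space. All function names, the toy feature vectors, and the sparsity level `k` are hypothetical and chosen only for illustration.

```python
import math

def center(rows):
    """Subtract the per-feature mean; return the centered rows and the mean."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - mean[j] for j in range(d)] for r in rows], mean

def covariance(centered):
    """Sample covariance matrix (d x d) of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
            for i in range(d)]

def sparse_component(cov, k, iters=50):
    """Leading sparse direction: power iteration that keeps only the k
    largest-magnitude loadings at every step (assumed simplification)."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        keep = set(sorted(range(d), key=lambda i: -abs(w[i]))[:k])
        w = [w[i] if i in keep else 0.0 for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(row, mean, v):
    """Reduce one feature vector to its score along the sparse direction."""
    return sum((x - m) * c for x, m, c in zip(row, mean, v))

def classify(query, train, labels, mean, v):
    """Label of the training sample nearest to the query in the reduced space."""
    q = project(query, mean, v)
    scores = [project(r, mean, v) for r in train]
    best = min(range(len(train)), key=lambda i: abs(scores[i] - q))
    return labels[best]

# Made-up 4-D features: dimensions 0 and 1 separate the two classes,
# dimensions 2 and 3 are nearly constant and therefore redundant.
TRAIN = [
    [1.0, 1.1, 0.5, 0.2], [1.2, 0.9, 0.4, 0.3], [0.9, 1.0, 0.6, 0.2],
    [3.0, 3.1, 0.5, 0.3], [3.2, 2.9, 0.4, 0.2], [2.9, 3.0, 0.6, 0.3],
]
LABELS = ["a", "a", "a", "b", "b", "b"]

centered, mean = center(TRAIN)
v = sparse_component(covariance(centered), k=2)
print([i for i, c in enumerate(v) if c != 0.0])  # indices of retained features
print(classify([3.1, 3.0, 0.5, 0.2], TRAIN, LABELS, mean, v))
```

In this toy setting the sparse direction keeps only the two informative dimensions, so the query is compared against the database using a single projected score rather than the full feature vector. A structured variant would additionally constrain which groups of features may be selected together.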
Acknowledgements
J. Song and S.M. Yoon were supported by National Research Foundation of Korea grants (No. 2015R1A5A7037615, No. 2016R1D1A1B04932889) and by IITP (#2014-0-00501), funded by the Korean Government. H. Cho was supported by the National Research Foundation of Korea (No. 2017R1A2B4011015). G.J. Yoon was supported by the National Institute for Mathematical Sciences (NIMS).
About this article
Cite this article
Song, J., Yoon, G., Cho, H. et al. Structure preserving dimensionality reduction for visual object recognition. Multimed Tools Appl 77, 23529–23545 (2018). https://doi.org/10.1007/s11042-018-5682-5
DOI: https://doi.org/10.1007/s11042-018-5682-5