Reducing and Stretching Deep Convolutional Activation Features for Accurate Image Classification
Deep convolutional activation features (DeCAF) are commonly used to extract effective representations of data with deep learning models. However, because the deep models that produce DeCAF are generally pre-trained, the dimensionality of DeCAF is fixed to a constant (e.g., 4096 dimensions). This raises two questions: is DeCAF good enough for image classification, and can its performance be improved further? To answer these questions, we propose a new model called RS-DeCAF, based on “reducing” and “stretching” the dimensionality of DeCAF. In RS-DeCAF, we reduce the dimensionality of DeCAF using dimensionality reduction methods and increase it by stretching the weight matrix between successive layers. To further improve performance, we also present a modified version of RS-DeCAF that applies fine-tuning. Extensive experiments on several image classification tasks show that RS-DeCAF not only improves on DeCAF but also outperforms previous “stretching” approaches. More importantly, the results show that RS-DeCAF generally achieves its highest classification accuracy when its dimensionality is two to four times that of DeCAF.
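As a rough illustration of the “reducing” step only, the sketch below projects 4096-dimensional feature vectors onto their leading principal components. It assumes PCA as the dimensionality reduction method and uses synthetic random vectors in place of real DeCAF activations; the function name `pca_reduce` and the target size of 64 are hypothetical choices for this example, not the authors' implementation. The “stretching” step is not shown.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project row-vector features onto their top principal components.

    Hypothetical helper for illustration; not the authors' code.
    """
    mean = features.mean(axis=0)
    centered = features - mean
    # Economy SVD: rows of vt are principal directions, ordered by
    # decreasing singular value (i.e., decreasing explained variance).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    return centered @ components.T, components, mean


rng = np.random.default_rng(0)
# Synthetic stand-in for DeCAF activations: 200 images x 4096 dimensions.
decaf = rng.standard_normal((200, 4096))
reduced, components, mean = pca_reduce(decaf, n_components=64)
print(reduced.shape)  # (200, 64)
```

The returned `components` and `mean` could be reused to project held-out features with the same transform, so that training and test features share one reduced space.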
Keywords: Image classification; Feature learning; Deep convolutional neural network; DeCAF; Stretching
This work was supported by the National Natural Science Foundation of China (No. 61271405, 61403353), the Ph.D. Program Foundation of the Ministry of Education of China (No. 20120132110018), and the Fundamental Research Funds for the Central Universities of China.
Compliance with Ethical Standards
Conflict of interests
The authors declare that they have no conflict of interest.
This article does not contain any studies with human participants or animals performed by any of the authors.