Multimedia Tools and Applications, Volume 76, Issue 8, pp 11065–11079

Image classification based on convolutional neural networks with cross-level strategy

Article

Abstract

In the past few years, convolutional neural networks (CNNs) have exhibited great potential in the field of image classification. In this paper, we present a novel strategy, named cross-level, to improve the architecture of existing networks, in which the different levels of feature representation are merely connected in series. The basic idea of cross-level is to establish a convolutional layer between two nonadjacent levels, so that richer multi-scale features are extracted at each feature representation level. The proposed cross-level strategy can be naturally integrated into an existing network without any change to its original architecture, which makes it practical and convenient. Four popular convolutional networks for image classification are employed to illustrate its implementation in detail. Experimental results on the dataset adopted by the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC) verify the effectiveness of the cross-level strategy for image classification. Furthermore, a new convolutional network with a cross-level architecture is presented to demonstrate the potential of the proposed strategy in future network design.
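The cross-level idea can be illustrated with a small shape-level sketch. It is not the authors' implementation: the layer sizes, the random filters, the helper names (conv2d, random_filters, shape) and, in particular, the choice to merge the cross-level output with the serial output by channel concatenation are all assumptions made for illustration, since the abstract does not specify them.

```python
import random


def conv2d(x, filters, stride=1):
    """Unpadded 2-D convolution followed by ReLU.

    x:       list of C_in channels, each an H x W list of lists.
    filters: list of C_out filters, each shaped [C_in][k][k].
    Returns a list of C_out output channels.
    """
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    k = len(filters[0][0])
    out_h = (h - k) // stride + 1
    out_w = (w - k) // stride + 1
    out = []
    for f in filters:
        channel = []
        for i in range(out_h):
            row = []
            for j in range(out_w):
                s = 0.0
                for c in range(c_in):
                    for di in range(k):
                        for dj in range(k):
                            s += f[c][di][dj] * x[c][i * stride + di][j * stride + dj]
                row.append(max(s, 0.0))  # ReLU non-linearity
            channel.append(row)
        out.append(channel)
    return out


def random_filters(c_out, c_in, k, rng):
    """Hypothetical helper: small random weights standing in for trained ones."""
    return [[[[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(k)]
             for _ in range(c_in)] for _ in range(c_out)]


def shape(x):
    """Return (channels, height, width) of a nested-list feature map."""
    return (len(x), len(x[0]), len(x[0][0]))


rng = random.Random(0)

# Level 1: a 3 x 16 x 16 stand-in for an input or early feature map.
level1 = [[[rng.random() for _ in range(16)] for _ in range(16)] for _ in range(3)]

# Original serial path: level 1 -> level 2 -> level 3.
level2 = conv2d(level1, random_filters(4, 3, 3, rng), stride=1)  # 4 x 14 x 14
level3 = conv2d(level2, random_filters(6, 4, 3, rng), stride=2)  # 6 x 6 x 6

# Cross-level branch: one convolution straight from level 1 to level 3.
# Kernel 6 and stride 2 are chosen so the spatial size matches level 3
# while the receptive field differs from that of the serial path.
cross = conv2d(level1, random_filters(2, 3, 6, rng), stride=2)   # 2 x 6 x 6

# ASSUMPTION: merge by concatenating along the channel axis.
level3_cross = level3 + cross                                    # 8 x 6 x 6

print(shape(level3), shape(cross), shape(level3_cross))
```

The serial layers are left untouched in this sketch; the extra branch only adds channels at the later level, which is one way to read the claim that the strategy needs no change to the original architecture.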

Keywords

Convolutional neural networks (CNNs); Image classification; Network architecture; Feature representation; Deep learning

Notes

Acknowledgments

The authors would like to thank the editors and anonymous reviewers for their constructive comments and valuable suggestions. This work was supported by the National Natural Science Foundation of China (No. 61472393 and No. 61303150), the National Science and Technology Major Project of the Ministry of Science and Technology of China (No. 2012GB102007), and the Anhui Province Initiative Funds on Intelligent Speech Technology and Industrialization (No. 13Z02008). The authors gratefully acknowledge the support of IFLYTEK CO., LTD.

Compliance with Ethical Standards

Conflict of interests

The authors declare that they have no conflict of interest.

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Department of Automation, University of Science and Technology of China, Hefei, People’s Republic of China
  2. Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, People’s Republic of China
