Single-label and multi-label conceptor classifiers in pre-trained neural networks

  • Guangwu Qian
  • Lei Zhang
  • Yan Wang
Original Article

Abstract

Training large neural network models from scratch is often infeasible: such models over-fit on small datasets and take too long to train on large ones. Transfer learning, which reuses the feature-extracting capacity already learned by large models, has therefore become an active research topic in the neural network community. At the classification stage of a pre-trained neural network model, either a linear SVM classifier or a Softmax classifier is typically employed, and this classifier is the only part of the whole model that is trained. In this paper, inspired by transfer learning, we propose a conceptor-based classifier called the Multi-label Conceptor Classifier (MCC) to handle multi-label classification in pre-trained neural networks. When no multi-label sample exists, MCC reduces to the Fast Conceptor Classifier, a fast single-label classifier proposed in our previous work, and is therefore also applicable to single-label classification. Moreover, by introducing a random search algorithm, we further improve the performance of MCC on the single-label datasets Caltech-101 and Caltech-256, where it achieves state-of-the-art results. We also evaluate MCC with pre-trained, rather than fine-tuned, neural networks on the multi-label dataset PASCAL VOC-2007, where it achieves comparable results.
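The abstract does not spell out how a conceptor classifier is computed, so the following is only a minimal sketch of the general idea, not the authors' MCC. It assumes the standard conceptor definition from Jaeger's conceptor report, C = R (R + α⁻² I)⁻¹ with R the feature correlation matrix and α the aperture, and scores a feature vector x against each class by the quadratic form xᵀ C x. The function names, the toy data, and the aperture value are all hypothetical, and the multi-label decision rule and random search described in the abstract are not reproduced here.

```python
import numpy as np

def conceptor(features, aperture=1.0):
    """Conceptor C = R (R + aperture^-2 I)^-1 for features of shape (n_samples, dim)."""
    n_samples, dim = features.shape
    R = features.T @ features / n_samples          # feature correlation matrix
    return R @ np.linalg.inv(R + aperture ** -2 * np.eye(dim))

def evidence(x, conceptors):
    """Quadratic-form score x^T C x of one feature vector against each class conceptor."""
    return np.array([x @ C @ x for C in conceptors])

# Toy stand-in for features extracted by a pre-trained network (assumption:
# two classes whose features concentrate along different axes of R^3).
rng = np.random.default_rng(0)
class_a = rng.normal(size=(200, 3)) * np.array([3.0, 0.1, 0.1])
class_b = rng.normal(size=(200, 3)) * np.array([0.1, 3.0, 0.1])
conceptors = [conceptor(class_a), conceptor(class_b)]

x = np.array([2.0, 0.05, 0.0])                     # lies mostly along class_a's axis
scores = evidence(x, conceptors)
single_label = int(np.argmax(scores))              # single-label decision: best-matching class
print(scores, single_label)
```

In this sketch only the per-class correlation matrices are estimated from data, which matches the abstract's point that the classifier is the only trained part of a pre-trained model.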

Keywords

  • Conceptors
  • Multi-label classification
  • Pre-trained neural networks
  • Transfer learning

Notes

Acknowledgements

This work was supported by the National Key Research Priorities Program of China (Grant 2016YFC0801800), the Fok Ying Tung Education Foundation (Grant 151068), the National Natural Science Foundation of China (Grants 61772353 and 61332002) and the Foundation for Youth Science and Technology Innovation Research Team of Sichuan Province (Grant 2016TD0018). The authors would like to extend their sincere appreciation to Prof. Dr Herbert Jaeger for his help.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Copyright information

© The Natural Computing Applications Forum 2018

Authors and Affiliations

  1. Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu, People’s Republic of China