
Cognitive Computation, Volume 10, Issue 2, pp 321–333

End-to-End Lifelong Learning: a Framework to Achieve Plasticities of both the Feature and Classifier Constructions

  • Wangli Hao
  • Junsong Fan
  • Zhaoxiang Zhang
  • Guibo Zhu

Abstract

Plasticity in the brain gives us a remarkable ability to learn about and adapt to the world. Although great successes have been achieved in many fields, few bio-inspired machine learning methods have mimicked this ability. Consequently, when faced with large-scale or time-varying data, these methods become infeasible, because they lack plasticity and require all training data to be loaded into memory at once. Furthermore, even popular deep convolutional neural network (CNN) models have relatively fixed structures and cannot handle time-varying data well. Through incremental methodologies, this paper explores an end-to-end lifelong learning framework that achieves plasticity in both feature and classifier construction. The proposed model comprises three parts: Gabor filters followed by a max-pooling layer, which provide shift and scale tolerance to input samples; incremental unsupervised feature extraction; and an incremental SVM, which together realize plasticity in both feature learning and classifier construction. Unlike CNNs, plasticity in our model requires no back-propagation (BP) process and no huge number of parameters. Our incremental models, IncPCANet and IncKmeansNet, achieve better results than PCANet and KmeansNet on the MNIST and Caltech-101 datasets, respectively. Meanwhile, IncPCANet and IncKmeansNet show promising plasticity of feature extraction and classifier construction when the distribution of the data changes. Extensive experiments validate the performance of our model and support the physiological hypothesis that plasticity is more prominent in higher-level layers than in lower-level layers.
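
To make the incremental pipeline concrete, the following sketch illustrates the idea of updating the feature extractor and the classifier batch by batch, without revisiting old data. It is an illustration under stated assumptions, not the authors' implementation: scikit-learn's IncrementalPCA stands in for the incremental PCA filter learning of IncPCANet, and an SGDClassifier with hinge loss stands in for the incremental SVM; the Gabor filtering, max pooling, and PCANet-style binarization and histogram stages are omitted.

    # Minimal sketch of incremental feature learning plus an incremental linear classifier.
    # Assumptions: IncrementalPCA approximates the incremental PCA filter update of
    # IncPCANet, and SGDClassifier(loss="hinge") approximates the incremental SVM.
    import numpy as np
    from sklearn.decomposition import IncrementalPCA
    from sklearn.linear_model import SGDClassifier

    def lifelong_pipeline(batches, n_components=8, classes=None):
        """Process data batch by batch, updating features and classifier incrementally."""
        feat = IncrementalPCA(n_components=n_components)
        clf = SGDClassifier(loss="hinge")           # linear SVM trained by stochastic gradient descent
        for X, y in batches:                        # each batch: (n_samples, n_features) data, labels
            feat.partial_fit(X)                     # plasticity of the feature construction
            Z = feat.transform(X)
            clf.partial_fit(Z, y, classes=classes)  # plasticity of the classifier construction
        return feat, clf

    # Toy usage: random data stands in for pooled Gabor responses of image patches.
    rng = np.random.default_rng(0)
    batches = [(rng.normal(size=(64, 100)), rng.integers(0, 10, size=64)) for _ in range(5)]
    feat, clf = lifelong_pipeline(batches, n_components=8, classes=np.arange(10))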

Keywords

Plasticity · Lifelong learning · End-to-end · Incremental PCANet · Incremental KMeansNet · Incremental SVM

Notes

Funding Information

This work was supported in part by the National Natural Science Foundation of China under Grant 61773375, Grant 61375036, and Grant 61511130079, and in part by the Microsoft Collaborative Research Project.

Compliance with Ethical Standards

Conflict of Interest

The authors declare that they have no conflict of interest.

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.


Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  • Wangli Hao (1)
  • Junsong Fan (1)
  • Zhaoxiang Zhang (1, 2, 3)
  • Guibo Zhu (1)

  1. Institute of Automation, University of Chinese Academy of Sciences (UCAS), Beijing, China
  2. CAS Center for Excellence in Brain Science and Intelligence Technology (CEBSIT), Beijing, China
  3. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences (NLPR, CASIA), Beijing, China
