Memory-Efficient Incremental Learning Through Feature Adaptation

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12361)


Unlike most existing work, we introduce an approach to incremental learning that preserves feature descriptors of training images from previously learned classes rather than the images themselves. Keeping the much lower-dimensional feature embeddings of images reduces the memory footprint significantly. We assume that the model is updated incrementally for new classes as new data becomes available sequentially. This requires adapting the previously stored feature vectors to the updated feature space without access to the corresponding original training images. Feature adaptation is learned with a multi-layer perceptron, trained on feature pairs corresponding to the outputs of the original and updated networks on a training image. We validate experimentally that such a transformation generalizes well to the features of the previous set of classes, and maps features to a discriminative subspace of the feature space. As a result, the classifier is optimized jointly over new and old classes without requiring old-class images. Experimental results show that our method achieves state-of-the-art classification accuracy on incremental learning benchmarks, while having at least an order of magnitude lower memory footprint than image-preserving strategies.
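The core idea above can be sketched in a few lines. The sketch below simplifies the paper's multi-layer perceptron to a linear map fitted by least squares, and uses randomly generated features as a stand-in for the outputs of the original and updated networks; all names and dimensions are illustrative assumptions, not the authors' implementation. Given feature pairs from current-task images passed through both networks, we fit an adaptation map and apply it to stored old-class features so they live in the updated feature space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: both networks produce 64-d features.
d = 64
n_pairs = 500  # current-task images seen by both the old and updated network

# Simulated feature pairs: h_new = W_true @ h_old + noise
# (stand-in for real network outputs on the same images).
W_true = rng.normal(size=(d, d)) / np.sqrt(d)
H_old = rng.normal(size=(n_pairs, d))
H_new = H_old @ W_true.T + 0.01 * rng.normal(size=(n_pairs, d))

# Learn the feature adaptation as a least-squares linear map
# (the paper trains an MLP; a linear map keeps this sketch dependency-free).
W, *_ = np.linalg.lstsq(H_old, H_new, rcond=None)

# Adapt stored old-class feature vectors into the updated feature space,
# without ever touching the original old-class images.
stored_old = rng.normal(size=(10, d))
adapted = stored_old @ W

# Sanity check: adaptation error on held-out feature pairs should be small.
H_old_test = rng.normal(size=(100, d))
H_new_test = H_old_test @ W_true.T
err = np.linalg.norm(H_old_test @ W - H_new_test) / np.linalg.norm(H_new_test)
print(f"relative adaptation error: {err:.4f}")
```

The adapted features can then be used alongside the new-class features to train the joint classifier, which is where the memory saving comes from: a 64-d float vector is orders of magnitude smaller than a stored image.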



This research was funded in part by NSF grants IIS 1563727 and IIS 1718221, a Google Research Award, an Amazon Research Award, and an AWS Machine Learning Research Award.

Supplementary material

Supplementary material 1 (pdf, 388 KB)



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Google Research, Meylan, France
  2. University of Illinois at Urbana-Champaign, Champaign, USA
