
Lifelong learning gets better with MixUp and unsupervised continual representation

Published in Applied Intelligence

Abstract

Continual learning enables learning systems to adapt to evolving data distributions by sequentially acquiring knowledge from a series of tasks. Unsupervised lifelong learning refers to the ability to learn over time, retaining previously acquired patterns, without supervision. However, most prior methods in this field rely on supervised or reinforcement learning and therefore require annotated data, which limits their scalability in real-world applications where data is often biased and unannotated. To overcome these challenges, this work introduces a novel approach called Lifelong Learning gets better with MixUp and Unsupervised Continual Representation (LL-UCR). LL-UCR learns feature representations from unlabeled tasks, eliminating the need for annotated data. Within the LL-UCR framework, two techniques are introduced: LL-MixUp, which mitigates catastrophic forgetting by interpolating samples between the current and previous tasks, and Dark Experience Replay (DER) (Buzzega et al. [1]) adapted for unsupervised continual representation, which aligns network logits across tasks. To overcome buffer-size limitations in replay-based methods, the Retrospective Adversarial Replay (RAR) framework is incorporated, facilitating diverse replay-sample generation. Through systematic analysis, we demonstrate that unsupervised visual representations are remarkably resilient to catastrophic forgetting, consistently outperforming supervised methods in both accuracy and generalization to out-of-distribution tasks. Furthermore, our qualitative analysis reveals that LL-UCR fosters a smoother loss landscape and acquires meaningful feature representations. Extensive experiments on diverse datasets validate the superior performance of LL-UCR over state-of-the-art supervised continual learning methods and the unsupervised LUMP method (Madaan et al. [2]), effectively mitigating catastrophic forgetting.
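The abstract describes LL-MixUp and the DER-based logit alignment only at a high level. The PyTorch-style sketch below illustrates the general shape of such replay-time losses; it is a minimal illustration under stated assumptions (the `model.ssl_loss` hook, the Beta mixing coefficient, and the weight `beta` are hypothetical placeholders), not the authors' implementation.

```python
import torch
import torch.nn.functional as F


def ll_mixup_loss(model, current_x, replay_x, alpha=0.4):
    """Mixup-style interpolation between current-task samples and samples
    replayed from earlier tasks, trained with a self-supervised objective."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    mixed_x = lam * current_x + (1.0 - lam) * replay_x
    # model.ssl_loss stands in for whatever contrastive / redundancy-reduction
    # objective the representation learner uses (an assumption, not LL-UCR's API).
    return model.ssl_loss(mixed_x)


def der_alignment_loss(model, replay_x, stored_logits, beta=0.5):
    """DER-style regularizer: pull the network's current outputs on buffered
    samples toward the logits recorded when those samples were first seen."""
    current_logits = model(replay_x)
    return beta * F.mse_loss(current_logits, stored_logits)
```

In a typical replay-based training loop, both terms would be added to the self-supervised loss on the current batch, with `replay_x` and `stored_logits` drawn from the memory buffer.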

Data availability and access

We used publicly available datasets: Split CIFAR-10 [42], Split CIFAR-100 [42], and Split Tiny-ImageNet [3]. As these datasets are publicly accessible, no ethical approval or informed consent was required.

References

  1. Buzzega P, Boschini M, Porrello A, Abati D, Calderara S (2020) Dark experience for general continual learning: a strong, simple baseline. Adv Neural Inf Process Syst 33:15920–15930

  2. Madaan D, Yoon J, Li Y, Liu Y, Hwang SJ (2022) Representational continuity for unsupervised continual learning. In: International conference on learning representations

  3. Krizhevsky A, Sutskever I, Hinton GE (2017) Imagenet classification with deep convolutional neural networks. Commun ACM 60(6):84–90

  4. Sun P, Zhang R, Jiang Y, Kong T, Xu C, Zhan W, Tomizuka M, Li L, Yuan Z, Wang C et al (2021) Sparse R-CNN: End-to-end object detection with learnable proposals. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 14454–14463

  5. Zheng S, Lu J, Zhao H, Zhu X, Luo Z, Wang Y, Fu Y, Feng J, Xiang T, Torr PH et al (2021) Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 6881–6890

  6. De Lange M, Aljundi R, Masana M, Parisot S, Jia X, Leonardis A, Slabaugh G, Tuytelaars T (2021) A continual learning survey: Defying forgetting in classification tasks. IEEE Trans Pattern Anal Mach Intell 44(7):3366–3385

  7. Thrun S (1995) A lifelong learning perspective for mobile robot control. In: Intelligent robots and systems, Elsevier, pp 201–214

  8. McCloskey M, Cohen NJ (1989) Catastrophic interference in connectionist networks: The sequential learning problem. Psychol Learn Motiv 24:109–165

  9. Kirkpatrick J, Pascanu R, Rabinowitz N, Veness J, Desjardins G, Rusu AA, Milan K, Quan J, Ramalho T, Grabska-Barwinska A et al (2017) Overcoming catastrophic forgetting in neural networks. Proc Natl Acad Sci 114(13):3521–3526

  10. Zenke F, Poole B, Ganguli S (2017) Continual learning through synaptic intelligence. In: International conference on machine learning, PMLR, pp 3987–3995

  11. Yoon J, Yang E, Lee J, Hwang SJ (2018) Lifelong learning with dynamically expandable networks. In: International conference on learning representations

  12. He K, Fan H, Wu Y, Xie S, Girshick R (2020) Momentum contrast for unsupervised visual representation learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9729–9738

  13. Chen T, Kornblith S, Norouzi M, Hinton G (2020) A simple framework for contrastive learning of visual representations. In: International conference on machine learning, PMLR, pp 1597–1607

  14. Chen X, He K (2021) Exploring simple siamese representation learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 15750–15758

  15. Zbontar J, Jing L, Misra I, LeCun Y, Deny S (2021) Barlow twins: Self-supervised learning via redundancy reduction. In: International conference on machine learning, PMLR, pp 12310–12320

  16. Kumari L, Wang S, Zhou T, Bilmes J (2022) Retrospective adversarial replay for continual learning. In: Advances in neural information processing systems

  17. Rolnick D, Ahuja A, Schwarz J, Lillicrap T, Wayne G (2019) Experience replay for continual learning. In: Advances in neural information processing systems 32

  18. Li Z, Hoiem D (2017) Learning without forgetting. IEEE Trans Pattern Anal Mach Intell 40(12):2935–2947

  19. Schwarz J, Czarnecki W, Luketina J, Grabska-Barwinska A, Teh YW, Pascanu R, Hadsell R (2018) Progress & compress: A scalable framework for continual learning. In: International Conference on Machine Learning, PMLR, pp 4528–4537

  20. Ahn H, Cha S, Lee D, Moon T (2019) Uncertainty-based continual learning with adaptive regularization. In: Advances in neural information processing systems 32

  21. Huszár F (2018) Note on the quadratic penalties in elastic weight consolidation. Proc Natl Acad Sci 115(11):2496–2497

  22. Rebuffi S-A, Kolesnikov A, Sperl G, Lampert CH (2017) iCaRL: Incremental classifier and representation learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2001–2010

  23. Riemer M, Cases I, Ajemian R, Liu M, Rish I, Tu Y, Tesauro G (2019) Learning to learn without forgetting by maximizing transfer and minimizing interference. In: International conference on learning representations

  24. Wang L, Zhang X, Yang K, Yu L, Li C, HONG L, Zhang S, Li Z, Zhong Y, Zhu J (2022) Memory replay with data compression for continual learning. In: International conference on learning representations

  25. Aljundi R, Lin M, Goujaud B, Bengio Y (2019) Gradient based sample selection for online continual learning. In: Advances in neural information processing systems 32

  26. Chaudhry A, Ranzato M, Rohrbach M, Elhoseiny M (2019) Efficient lifelong learning with a-gem. In: International conference on learning representations

  27. Chaudhry A, Gordo A, Dokania P, Torr P, Lopez-Paz D (2021) Using hindsight to anchor past knowledge in continual learning. Proceedings of the AAAI conference on artificial intelligence 35:6993–7001

  28. Rusu AA, Rabinowitz NC, Desjardins G, Soyer H, Kirkpatrick J, Kavukcuoglu K, Pascanu R, Hadsell R (2016) Progressive neural networks. arXiv preprint arXiv:1606.04671

  29. Liu Y, Wu X, Bo Y, Zheng Z, Yin M (2023) Incremental learning without looking back: a neural connection relocation approach. Neural Comput Appl 35(19):14093–14107

  30. Xu J, Zhu Z (2018) Reinforced continual learning. In: Advances in neural information processing systems 31

  31. Grill J-B, Strub F, Altché F, Tallec C, Richemond P, Buchatskaya E, Doersch C, Avila Pires B, Guo Z, Gheshlaghi Azar M et al (2020) Bootstrap your own latent-a new approach to self-supervised learning. Advances in neural information processing systems 33:21271–21284

  32. Lin Z, Wang Y, Lin H (2022) Continual contrastive learning for image classification. In: 2022 IEEE International conference on multimedia and expo (ICME), IEEE, pp 1–6

  33. Bromley J, Guyon I, LeCun Y, Säckinger E, Shah R (1993) Signature verification using a "siamese" time delay neural network. In: Advances in neural information processing systems 6

  34. Pfeifer B, Holzinger A, Schimek MG (2022) Robust random forest-based all-relevant feature ranks for trustworthy ai. Stud Health Technol Inform 294:137–138

  35. Huo J, Zyl TL (2023) Incremental class learning using variational autoencoders with similarity learning. Neural Comput Appl 1–16

  36. Rao D, Visin F, Rusu A, Pascanu R, Teh YW, Hadsell R (2019) Continual unsupervised representation learning. In: Advances in neural information processing systems 32

  37. Fini E, Da Costa VGT, Alameda-Pineda X, Ricci E, Alahari K, Mairal J (2022) Self-supervised models are continual learners. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9621–9630

  38. Yu X, Rosing T, Guo Y (2024) Evolve: Enhancing unsupervised continual learning with multiple experts. In: Proceedings of the IEEE/CVF winter conference on applications of computer vision, pp 2366–2377

  39. Zhang H, Cisse M, Dauphin YN, Lopez-Paz D (2018) mixup: Beyond empirical risk minimization. In: International conference on learning representations

  40. Zhang L, Deng Z, Kawaguchi K, Ghorbani A, Zou J (2021) How does mixup help with robustness and generalization? In: International conference on learning representations

  41. Hinton G, Vinyals O, Dean J (2014) Dark knowledge. Presented as the keynote in BayLearn 2(2)

  42. Krizhevsky A, Hinton G et al (2009) Learning multiple layers of features from tiny images. Technical report, University of Toronto

  43. Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) Imagenet: A large-scale hierarchical image database. In: 2009 IEEE Conference on computer vision and pattern recognition, IEEE, pp 248–255

  44. Wu Z, Xiong Y, Yu SX, Lin D (2018) Unsupervised feature learning via non-parametric instance discrimination. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3733–3742

  45. De Lange M, Aljundi R, Masana M, Parisot S, Jia X, Leonardis A, Slabaugh G, Tuytelaars T (2021) A continual learning survey: Defying forgetting in classification tasks. IEEE Trans Pattern Anal Mach Intell 44(7):3366–3385

  46. Lopez-Paz D, Ranzato M (2017) Gradient episodic memory for continual learning. In: Advances in neural information processing systems 30

  47. Yin H, Molchanov P, Alvarez JM, Li Z, Mallya A, Hoiem D, Jha NK, Kautz J (2020) Dreaming to distill: Data-free knowledge transfer via deepinversion. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 8715–8724

  48. Kornblith S, Norouzi M, Lee H, Hinton G (2019) Similarity of neural network representations revisited. In: International conference on machine learning, PMLR, pp 3519–3529

Author information

Contributions

Prashant Kumar: Conceptualization, Methodology, Validation, Writing - original draft. Durga Toshniwal: Writing - review & editing.

Corresponding author

Correspondence to Durga Toshniwal.

Ethics declarations

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical and informed consent for data used

The datasets used in this paper are named explicitly and are publicly available; no ethical approval or informed consent was required for their use.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kumar, P., Toshniwal, D. Lifelong learning gets better with MixUp and unsupervised continual representation. Appl Intell (2024). https://doi.org/10.1007/s10489-024-05434-w
