Abstract
Replay is an effective strategy for mitigating catastrophic forgetting, a key challenge in continual learning. However, the exemplar sets selected by replay-based methods generally capture only local information about the data distribution, which upsets the model's plasticity-stability balance on older tasks. This paper proposes a novel method called non-similar sample storage (NSS), where "non-similar" means that the feature vectors of two samples are far apart in Euclidean distance. After training on the current task, NSS extracts a feature vector for each sample and computes pairwise feature similarities; among similar samples, those that contribute less to the model's classification performance are iteratively deleted, so that a low-similarity subset is retained. In addition, NSS reserves 30% of the storage budget for samples near the center of the sample set. Because the low-similarity samples stored by NSS incur larger losses during replay, they reduce training effectiveness on the current task. To address this, a knowledge distillation strategy is introduced in which a variable parameter balances the classification loss of the new task against the distillation loss of the old tasks (NSS-D). Experiments on CIFAR-10 and imbalanced CIFAR-10 show that NSS preserves more of the data's global information and better retains the model's ability to recognize old tasks. Compared with classical algorithms, NSS-D also performs better on CIFAR-100 (48.8%) and ImageNet-200 (36.7%).
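To make the procedure concrete, the following is a minimal, hypothetical Python sketch of the two core ideas, not the authors' implementation: (1) exemplar selection that reserves 30% of the buffer for samples nearest the class centroid and iteratively deletes one member of the most-similar pair from the rest, and (2) an NSS-D loss in which a variable parameter `lam` trades off new-task cross-entropy against old-task distillation. All function names are ours; the "drop the more redundant member" tie-break and the Hinton-style temperature-`T` distillation term are assumptions, since the paper states only that similar, low-contribution samples are deleted and that a variable parameter balances the two losses.

```python
import numpy as np
import torch
import torch.nn.functional as F


def nss_select(features, budget, center_frac=0.3):
    """Sketch of NSS exemplar selection for one class.

    features    : (n, d) array of feature vectors extracted after
                  training on the current task.
    budget      : total number of exemplars to keep.
    center_frac : fraction of the budget reserved for samples nearest
                  the feature centroid (30% in the paper).
    """
    n = len(features)
    idx = np.arange(n)

    # Step 1: reserve ~30% of the buffer for samples near the center
    # of the sample set (the class's feature centroid).
    n_center = min(int(round(center_frac * budget)), n)
    centroid = features.mean(axis=0)
    to_center = np.linalg.norm(features - centroid, axis=1)
    center_part = idx[np.argsort(to_center)[:n_center]]

    # Step 2: from the rest, iteratively delete one member of the
    # closest (most similar) pair in Euclidean distance until only a
    # low-similarity subset of the required size remains.
    rest = [i for i in idx if i not in set(center_part)]
    while len(rest) > max(budget - n_center, 0):
        sub = features[rest]
        dist = np.linalg.norm(sub[:, None, :] - sub[None, :, :], axis=-1)
        np.fill_diagonal(dist, np.inf)
        i, j = np.unravel_index(np.argmin(dist), dist.shape)
        # Stand-in for the paper's "contribution" criterion: of the two
        # most-similar samples, drop the one with the smaller total
        # distance to all others, i.e. the more redundant one.
        row_sums = np.where(np.isfinite(dist), dist, 0.0).sum(axis=1)
        drop = i if row_sums[i] <= row_sums[j] else j
        rest.pop(int(drop))

    return np.concatenate([center_part, np.asarray(rest, dtype=int)])


def nss_d_loss(new_logits, targets, old_logits, lam=0.5, T=2.0):
    """Sketch of the NSS-D objective: `lam` balances the new task's
    classification loss against distillation from the frozen old model.
    """
    ce = F.cross_entropy(new_logits, targets)
    kd = F.kl_div(
        F.log_softmax(new_logits / T, dim=1),
        F.softmax(old_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    return (1.0 - lam) * ce + lam * kd
```

In a class-incremental run, one might call `nss_select` once per class after finishing each task and apply `nss_d_loss` when replaying the stored exemplars, varying `lam` over training to shift weight between new-task plasticity and old-task stability.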
Acknowledgments
This work was supported by the National Natural Science Foundation of China under Grants 62272355, 61702383, and 62176191.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Min, Q., He, J., Yang, L., Fu, Y. (2023). Continual Learning with a Memory of Non-similar Samples. In: Pan, L., Zhao, D., Li, L., Lin, J. (eds.) Bio-Inspired Computing: Theories and Applications. BIC-TA 2022. Communications in Computer and Information Science, vol. 1801. Springer, Singapore. https://doi.org/10.1007/978-981-99-1549-1_25