Abstract
Humans can draw on acquired experience to learn new skills quickly without forgetting what they already know. Neural networks, however, cannot learn continually in the same way: they are prone to the stability-plasticity dilemma, which leads to catastrophic forgetting. Because meta-learning treats previously acquired knowledge as a prior and can directly optimize the final objective, this paper proposes LGCMLA (Lie Group Continual Meta Learning Algorithm), a meta-learning-based improvement of the CMLA (Continual Meta Learning Algorithm) proposed by Jiang et al. On the one hand, LGCMLA strengthens the continuity between tasks by changing the inner-loop update rule: instead of starting each task from randomly initialized parameters, each task starts from the updated parameters of the previous task. On the other hand, it restricts the parameter space to orthogonal groups and adopts natural Riemannian gradient descent to accelerate convergence. LGCMLA thereby addresses the poor convergence and stability of CMLA, further improves the generalization performance of the model, and resolves the stability-plasticity dilemma more effectively. Experiments on the miniImageNet, tieredImageNet and Fewshot-CIFAR100 (Canadian Institute For Advanced Research) datasets demonstrate the effectiveness of LGCMLA. In particular, compared with MAML (Model-Agnostic Meta-Learning) using a standard four-layer convolutional network, LGCMLA improves 5-way accuracy on miniImageNet by 16.4% in the 1-shot setting and by 17.99% in the 5-shot setting.
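The two ingredients named above, carrying parameters from one task to the next and keeping them on an orthogonal group during gradient descent, can be illustrated with a minimal sketch. It is not the authors' implementation: the Cayley-transform retraction, the toy least-squares loss, the random tasks, and all names (`cayley_step`, `task_loss_and_grad`, `inner_loop`) are assumptions made only for illustration, and the outer meta-update is omitted.

```python
import numpy as np

def cayley_step(W, G, lr):
    """One descent step that keeps W orthogonal (assumed Cayley retraction)."""
    # Skew-symmetric direction built from the Euclidean gradient G.
    A = G @ W.T - W @ G.T
    I = np.eye(W.shape[0])
    # W_new = (I + lr/2 A)^{-1} (I - lr/2 A) W is orthogonal whenever W is.
    return np.linalg.solve(I + 0.5 * lr * A, (I - 0.5 * lr * A) @ W)

def task_loss_and_grad(W, X, Y):
    """Toy task loss 0.5 * ||W X - Y||_F^2 and its Euclidean gradient."""
    R = W @ X - Y
    return 0.5 * np.sum(R ** 2), R @ X.T

def inner_loop(W, tasks, lr=0.05, steps=20):
    """Each task starts from the parameters left by the previous task."""
    losses = []
    for X, Y in tasks:
        for _ in range(steps):
            _, G = task_loss_and_grad(W, X, Y)
            W = cayley_step(W, G, lr)
        losses.append(task_loss_and_grad(W, X, Y)[0])
    return W, losses

# Hypothetical toy data: three tasks, each defined by a random orthogonal map.
rng = np.random.default_rng(0)
n = 4
W0, _ = np.linalg.qr(rng.standard_normal((n, n)))
tasks = []
for _ in range(3):
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    X = rng.standard_normal((n, 8))
    tasks.append((X, Q @ X))

W_final, losses = inner_loop(W0, tasks)
orth_error = np.linalg.norm(W_final.T @ W_final - np.eye(n))
```

In this sketch `orth_error` stays at the level of floating-point round-off, because the Cayley transform of a skew-symmetric matrix is itself orthogonal. Other retractions (for example the matrix exponential or a QR-based one) would serve the same purpose.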
Data Availability Statement
The datasets used during this study are available upon reasonable request to the authors. The code is publicly available at https://github.com/
References
Parisi GI, Kemker R, Part JL, Kanan C, Wermter S (2019) Continual lifelong learning with neural networks: a review. Neural Networks 113:54–71. https://doi.org/10.1016/j.neunet.2019.01.012
Li C, Li Y, Zhao Y et al (2021) SLER: Self-generated long-term experience replay for continual reinforcement learning. Applied Intelligence 51:185–201. https://doi.org/10.1007/s10489-020-01786-1
Jiang MJ, Li FZ, Liu L (2021) Continual meta learning algorithm. Appl Intell. https://doi.org/10.1007/s10489-021-02543-8
Chaudhry A, Ranzato M, Rohrbach M, Elhoseiny M (2019) Efficient lifelong learning with A-GEM. In: 7th International conference on learning representations. https://openreview.net/forum?id=Hkf2_sC5FX
Chaudhry A, Rohrbach M, Elhoseiny M, Ajanthan T, Dokania PK, Torr PHS et al (2019) Continual learning with tiny episodic memories
Lopez-Paz D, Ranzato M (2017) Gradient episodic memory for continual learning. Neural Inf Process Syst:6467-6476. https://proceedings.neurips.cc/paper/2017/hash/f87522788a2be2d171666752f97ddebb-Abstract.html
Cortes C, Gonzalvo X, Kuznetsov V, Mohri M, Yang S (2017) AdaNet: adaptive structural learning of artificial neural networks. In: Proceedings of the 34th international conference on machine learning, vol 70, pp 874–883. http://proceedings.mlr.press/v70/cortes17a.html
Rebuffi S, Kolesnikov A, Sperl G, Lampert CH (2017) iCaRL: Incremental classifier and representation learning. In: 2017 IEEE conference on computer vision and pattern recognition, pp 5533–5542. https://doi.org/10.1109/CVPR.2017.587
Rusu AA, Rabinowitz NC, Desjardins G, Soyer H, Kirkpatrick J, Kavukcuoglu K, Pascanu R, Hadsell R (2016) Progressive neural networks
Fini E, Lathuiliere S, Sangineto E, Nabi M, Ricci E (2020) Online continual learning under extreme memory constraints. European Conference on Computer Vision 12373:720–735. https://doi.org/10.1007/978-3-030-58604-1_43
Kirkpatrick J, Pascanu R, Rabinowitz N, Veness J, Desjardins G, Rusu AA, Milan K, Quan J, Ramalho T, Grabska-Barwinska A et al (2017) Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences 114(13):3521–3526
Li Z, Hoiem D (2016) Learning without forgetting. European Conference on Computer Vision 9908:614–629. https://doi.org/10.1007/978-3-319-46493-0_37
Wang Y, Yao Q, Kwok JT, Ni LM (2020) Generalizing from a few examples: a survey on few-shot learning. ACM Computing Surveys 53(3):1–34. https://doi.org/10.1145/3386252
Ravi S, Larochelle H (2017) Optimization as a model for few-shot learning. In: Proceedings of the 5th international conference on learning representations. https://openreview.net/forum?id=rJY0-Kcll
Mishra N, Rohaninejad M, Chen X, Abbeel P (2017) A simple neural attentive meta-learner. In: 6th International conference on learning representations. https://openreview.net/forum?id=B1DmUzWAW
Finn C, Abbeel P, Levine S (2017) Model-agnostic meta-learning for fast adaptation of deep networks. In: 34th International conference on machine learning, vol 70, pp 1126–1135. http://proceedings.mlr.press/v70/finn17a.html
Rusu AA, Rao D, Sygnowski J, Vinyals O, Pascanu R, Osindero S, Hadsell R (2019) Meta-Learning with latent embedding optimization. In: 7th International conference on learning representations. https://openreview.net/forum?id=BJgklhAcK7
Sung F, Yang Y, Zhang L, Xiang T, Torr P, Hospedales TM (2018) Learning to compare: relation network for few-shot learning. https://doi.org/10.1109/CVPR.2018.00131
Vinyals O, Blundell C, Lillicrap T, Kavukcuoglu K, Wierstra D (2016) Matching networks for one shot learning. Neural Inf Process Syst:3630–3638. https://proceedings.neurips.cc/paper/2016/hash/90e1357833654983612fb05e3ec9148c-Abstract.html
Snell J, Swersky K, Zemel R (2017) Prototypical networks for few-shot learning. Neural Inf Process Syst:4077–4087. https://proceedings.neurips.cc/paper/2017/hash/cb8da6767461f2812ae4290eac7cbc42-Abstract.html
Krishna NR, Balaprakash P (2020) Meta continual learning via dynamic programming
Yang H, He H, Zhang W et al (2021) Lie group manifold analysis: an unsupervised domain adaptation approach for image classification. Appl Intell. https://doi.org/10.1007/s10489-021-02564-3
Absil PA, Mahony R, Sepulchre R (2008) Optimization algorithms on matrix manifolds. Princeton University Press. http://press.princeton.edu/titles/8586.html
Nishimori Y (2021) A neural stiefel learning based on geodesics revisited
Harandi M, Fernando B (2016) Generalized backpropagation, Étude de cas: orthogonality
Amari SI (1998) Natural gradient works efficiently in learning. Neural Computation 10(2):251–276. https://doi.org/10.1162/089976698300017746
Choi S, Cichocki A, Amari SI (2000) Flexible independent component analysis. Journal of VLSI Signal Processing 26(1–2):25–38. https://doi.org/10.1023/A:1008135131269
Ren M, Triantafillou E, Ravi S, Snell J, Swersky K (2018) Meta-learning for semi-supervised few-shot classification. In: 6th International conference on learning representations. https://openreview.net/forum?id=HJcSzz-CZ
Krizhevsky A, Hinton G (2009) Learning multiple layers of features from tiny images. Technical report, University of Toronto
Oreshkin BN, Lacoste A, Rodriguez P (2018) Tadam: task dependent adaptive metric for improved few-shot learning. Neural Inf Process Syst:719–729. https://proceedings.neurips.cc/paper/2018/hash/66808e327dc79d135ba18e051673d906-Abstract.html
Bottou L (2010) Large-scale machine learning with stochastic gradient descent. In: 19th International conference on computational statistics, pp 177–186. https://doi.org/10.1007/978-3-7908-2604-3_16
Kingma D, Ba J (2015) Adam: a method for stochastic optimization. In: 3rd International conference on learning representations
Lifchitz Y, Avrithis Y, Picard S, Bursuc A (2019) Dense classification and implanting for few-shot learning. In: 2019 IEEE/CVF Conference on computer vision and pattern recognition, pp 9258–9267. https://doi.org/10.1109/CVPR.2019.00948
Sun Q, Liu Y, Chua TS, Schiele B (2019) Meta-transfer learning for few-shot learning. In: 2019 IEEE/CVF conference on computer vision and pattern recognition, pp 403–412. https://doi.org/10.1109/CVPR.2019.00049
Lee K, Maji S, Ravichandran A, Soatto S (2019) Meta-learning with differentiable convex optimization. In: 2019 IEEE/CVF Conference on computer vision and pattern recognition, pp 10649–10657. https://doi.org/10.1109/CVPR.2019.01091
Zhang RX, Che T, Ghahramani Z, Bengio Y, Song Y (2018) MetaGAN: an adversarial approach to few-shot learning. In: Neural information processing systems, pp 2371–2380. https://proceedings.neurips.cc/paper/2018/hash/4e4e53aa080247bc31d0eb4e7aeb07a0-Abstract.html
Li Z, Zhou F, Chen F, Li H (2017) Meta-SGD: learning to learn quickly for few shot learning
Chen Y, Wang X, Liu Z, Xu H, Darrell T (2020) A new meta-baseline for few-shot learning
Ye H, Hu H, Zhan D, Sha F (2021) Learning adaptive classifiers synthesis for generalized few-shot learning. International Journal of Computer Vision 129:1930–1953. https://doi.org/10.1007/s11263-020-01381-4
Ji Z, Chai X, Yu Y, Zhang Z (2021) Reweighting and information-guidance networks for few-shot learning. Neurocomputing 423:13–23. https://doi.org/10.1016/j.neucom.2020.07.128
Huang H, Wu Z, Li W, Huo J, Gao Y (2021) Local descriptor-based multi-prototype network for few-shot learning. Pattern Recognition 116(4):107935. https://doi.org/10.1016/j.patcog.2021.107935
Xu H, Wang J, Li H, Ouyang D, Shao J (2021) Unsupervised meta-learning for few-shot learning. Pattern Recognition 116(6):107951. https://doi.org/10.1016/j.patcog.2021.107951
Qin Y, Zhang W, Zhao C, Wang Z, Zhu X, Qi G et al (2021) Prior-knowledge and attention-based meta-learning for few-shot learning. Knowledge-Based Systems 213:106609. https://doi.org/10.1016/j.knosys.2020.106609
Lai N, Kan M, Han C, Song X, Shan S (2021) Learning to learn adaptive classifier-predictor for few-shot learning. IEEE Trans Neural Netw Learn Syst (99):1–13. https://doi.org/10.1109/TNNLS.2020.3011526
Zhang C, Cai Y, Lin G, Shen C (2020) DeepEMD: differentiable earth mover’s distance for few-shot learning. In: 2020 IEEE/CVF Conference on computer vision and pattern recognition, pp 12200–12210. https://doi.org/10.1109/CVPR42600.2020.01222
Tian Y, Wang Y, Krishnan D, Tenenbaum J, Isola P (2020) Rethinking few-shot image classification: a good embedding is all you need? European Conference on Computer Vision 12359:266–282. https://doi.org/10.1007/978-3-030-58568-6_16
Afrasiyabi A, Lalonde JF, Gagné C (2020) Associative alignment for few-shot image classification. European Conference on Computer Vision 12350:18–35. https://doi.org/10.1007/978-3-030-58558-7_2
Liu Y, Schiele B, Sun Q (2020) An ensemble of epoch-wise empirical Bayes for few-shot learning. European Conference on Computer Vision 12361:404–421. https://doi.org/10.1007/978-3-030-58517-4_24
Acknowledgements
We would like to thank our classmates and friends for their encouragement and support. We are also grateful for the computing resources and other support provided by the Machine Learning Laboratory of Soochow University.
Funding
This work was supported by the National Key R&D Program of China (2018YFA0701700; 2018YFA0701701) and the National Natural Science Foundation of China under Grant Nos. 61672364, 62176172 and 61902269.
Author information
Contributions
All authors have contributed to the conception and design of the study. Conceptualization: Mengjuan Jiang, Fanzhang Li; Experimentation: Mengjuan Jiang; Writing-original draft preparation: Mengjuan Jiang; Writing-review and editing: Mengjuan Jiang, Fanzhang Li; Funding acquisition: Fanzhang Li; Supervision: Fanzhang Li. All authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflicts of interest
The authors declare no conflict of interest.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Jiang, M., Li, F. Lie group continual meta learning algorithm. Appl Intell 52, 10965–10978 (2022). https://doi.org/10.1007/s10489-021-03036-4
DOI: https://doi.org/10.1007/s10489-021-03036-4