Lie group continual meta learning algorithm

Abstract

Humans can use acquired experience to learn new skills quickly without forgetting the knowledge they already have. Neural networks, however, cannot perform continual learning in this way: they easily fall into the stability-plasticity dilemma, which leads to catastrophic forgetting. Since meta-learning treats already acquired knowledge as a prior and can directly optimize the final goal, this paper proposes LGCMLA (Lie Group Continual Meta Learning Algorithm), an improvement of the CMLA (Continual Meta Learning Algorithm) proposed by Jiang et al. On the one hand, LGCMLA enhances the continuity between tasks by changing the inner-loop update rule: instead of starting each task from randomly initialized parameters, each task starts from the parameters updated on the previous task. On the other hand, it uses orthogonal groups to constrain the parameter space and adopts natural Riemannian gradient descent to accelerate convergence. LGCMLA not only corrects the poor convergence and stability of CMLA, but also further improves the generalization performance of the model and resolves the stability-plasticity dilemma more effectively. Experiments on the miniImageNet, tieredImageNet and Fewshot-CIFAR100 (Canadian Institute For Advanced Research) datasets demonstrate the effectiveness of LGCMLA. In particular, compared to MAML (Model-Agnostic Meta-Learning) with the standard four-layer convolutional backbone, 1-shot and 5-shot accuracy improves by 16.4% and 17.99% respectively under the 5-way setting on miniImageNet.
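The two algorithmic changes named in the abstract can be illustrated concretely. The sketch below is a minimal, hypothetical NumPy illustration (not the authors' released code) of (1) an inner loop in which each task is warm-started from the parameters adapted on the previous task rather than from a fresh random initialization, and (2) a descent step restricted to the orthogonal group, where the Euclidean gradient is projected onto the tangent space at the current point and the iterate is retracted back to the manifold with a QR decomposition. Names such as riemannian_step, inner_loop, task_loss_grad and the toy task stream are assumptions made only for this example.

```python
# Minimal sketch of two ideas from the abstract: task-to-task warm starting
# in the inner loop, and gradient descent constrained to the orthogonal group.
# This is an illustrative assumption-laden sketch, not the LGCMLA implementation.
import numpy as np

def riemannian_step(W, euclid_grad, lr):
    """One descent step on the orthogonal group O(n).

    Projects the Euclidean gradient onto the tangent space at W
    (via the skew-symmetric part of W^T G) and retracts back to the
    manifold with a QR decomposition, so W stays orthogonal.
    """
    A = W.T @ euclid_grad
    riem_grad = W @ (A - A.T) / 2.0           # tangent-space projection
    Q, R = np.linalg.qr(W - lr * riem_grad)   # QR retraction
    return Q * np.sign(np.diag(R))            # fix column signs for uniqueness

def inner_loop(W_init, tasks, task_loss_grad, lr=0.1, inner_steps=5):
    """Adapt sequentially over a stream of tasks.

    Unlike an inner loop that resets to W_init for every task, each task
    starts from the parameters adapted on the previous task, which is the
    continuity change described in the abstract.
    """
    W = W_init
    adapted = []
    for task in tasks:
        for _ in range(inner_steps):
            W = riemannian_step(W, task_loss_grad(W, task), lr)
        adapted.append(W)
    return adapted

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 8
    W0, _ = np.linalg.qr(rng.standard_normal((n, n)))   # orthogonal init
    targets = [np.linalg.qr(rng.standard_normal((n, n)))[0] for _ in range(3)]

    # Toy per-task loss: squared Frobenius distance to a target orthogonal matrix.
    def toy_grad(W, target):
        return 2.0 * (W - target)

    for i, W in enumerate(inner_loop(W0, targets, toy_grad)):
        print(f"task {i}: ||W^T W - I|| = {np.linalg.norm(W.T @ W - np.eye(n)):.2e}")
```

In this toy setup the QR retraction keeps every iterate exactly orthogonal, which is the role the abstract assigns to the orthogonal-group constraint; the actual LGCMLA update and metric may differ.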


Data Availability Statement

The datasets used during this study are available upon reasonable request to the authors. The code is publicly available at https://github.com/

References

  1. Parisi GI, Kemker R, Part JL, Kanan C, Wermter S (2019) Continual lifelong learning with neural networks: a review. Neural Networks 113:54–71. https://doi.org/10.1016/j.neunet.2019.01.012

  2. Li C, Li Y, Zhao Y et al (2021) SLER: Self-generated long-term experience replay for continual reinforcement learning. Applied Intelligence 51:185–201. https://doi.org/10.1007/s10489-020-01786-1

  3. Jiang MJ, Li FZ, Liu L (2021) Continual meta learning algorithm. Appl Intell. https://doi.org/10.1007/s10489-021-02543-8

  4. Chaudhry A, Ranzato M, Rohrbach M, Elhoseiny M (2019) Efficient lifelong learning with A-GEM. In: 7th International conference on learning representations. https://openreview.net/forum?id=Hkf2_sC5FX

  5. Chaudhry A, Rohrbach M, Elhoseiny M, Ajanthan T, Dokania PK, Torr PHS et al (2019) Continual learning with tiny episodic memories

  6. Lopez-Paz D, Ranzato M (2017) Gradient episodic memory for continual learning. Neural Inf Process Syst:6467-6476. https://proceedings.neurips.cc/paper/2017/hash/f87522788a2be2d171666752f97ddebb-Abstract.html

  7. Cortes C, Gonzalvo X, Kuznetsov V, Mohri M, Yang S (2016) AdaNet: adaptive structural learning of artificial neural networks. In: Proceedings of the 34th international conference on machine learning, vol 70, pp 874–883. http://proceedings.mlr.press/v70/cortes17a.html

  8. Rebuffi S, Kolesnikov A, Sperl G, Lampert CH (2017) iCaRL: Incremental classifier and representation learning. In: 2017 IEEE conference on computer vision and pattern recognition, pp 5533–5542. https://doi.org/10.1109/CVPR.2017.587

  9. Rusu AA, Rabinowitz NC, Desjardins G, Soyer H, Kirkpatrick J, Kavukcuoglu K, Pascanu R, Hadsell R (2016) Progressive neural networks

  10. Fini E, Lathuiliere S, Sangineto E, Nabi M, Ricci E (2020) Online continual learning under extreme memory constraints. European Conference on Computer Vision 12373:720–735. https://doi.org/10.1007/978-3-030-58604-1_43

  11. Kirkpatrick J, Pascanu R, Rabinowitz N, Veness J, Desjardins G, Rusu AA, Milan K, Quan J, Ramalho T, Grabska-Barwinska A et al (2016) Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences 114(13):3521–3526

  12. Li Z, Hoiem D (2016) Learning without forgetting. European Conference on Computer Vision 9908:614–629. https://doi.org/10.1007/978-3-319-46493-0_37

  13. Wang Y, Yao Q, Kwok JT, Ni LM (2020) Generalizing from a few examples: a survey on few-shot learning. ACM Computing Surveys 53(3):1–34. https://doi.org/10.1145/3386252

  14. Ravi S, Larochelle H (2017) Optimization as a model for few-shot learning. In: Proceedings of the 5th international conference on learning representations. https://openreview.net/forum?id=rJY0-Kcll

  15. Mishra N, Rohaninejad M, Chen X, Abbeel P (2017) A simple neural attentive meta-learner. In: 6th International conference on learning representations. https://openreview.net/forum?id=B1DmUzWAW

  16. Finn C, Abbeel P, Levine S (2017) Model-agnostic meta-learning for fast adaptation of deep networks. In: 34th International conference on machine learning, vol 70, pp 1126–1135. http://proceedings.mlr.press/v70/finn17a.html

  17. Rusu AA, Rao D, Sygnowski J, Vinyals O, Pascanu R, Osindero S, Hadsell R (2019) Meta-Learning with latent embedding optimization. In: 7th International conference on learning representations. https://openreview.net/forum?id=BJgklhAcK7

  18. Sung F, Yang Y, Zhang L, Xiang T, Torr P, Hospedales TM (2018) Learning to compare: relation network for few-shot learning. In: 2018 IEEE/CVF conference on computer vision and pattern recognition. https://doi.org/10.1109/CVPR.2018.00131

  19. Vinyals O, Blundell C, Lillicrap T, Kavukcuoglu K, Wierstra D (2016) Matching networks for one shot learning. Neural Inf Process Syst:3630–3638. https://proceedings.neurips.cc/paper/2016/hash/90e1357833654983612fb05e3ec9148c-Abstract.html

  20. Snell J, Swersky K, Zemel R (2017) Prototypical networks for few-shot learning. Neural Inf Process Syst:4077–4087. https://proceedings.neurips.cc/paper/2017/hash/cb8da6767461f2812ae4290eac7cbc42-Abstract.html

  21. Krishna NR, Balaprakash P (2020) Meta continual learning via dynamic programming

  22. Yang H, He H, Zhang W et al (2021) Lie group manifold analysis: an unsupervised domain adaptation approach for image classification. Appl Intell. https://doi.org/10.1007/s10489-021-02564-3

  23. Absil PA, Mahony R, Sepulchre R (2008) Optimization algorithms on matrix manifolds. Princeton University Press. http://press.princeton.edu/titles/8586.html

  24. Nishimori Y (2021) A neural Stiefel learning based on geodesics revisited

  25. Harandi M, Fernando B (2016) Generalized backpropagation, Étude de cas: orthogonality

  26. Amari SI (1998) Natural gradient works efficiently in learning. Neural Computation 10(2):251–276. https://doi.org/10.1162/089976698300017746

  27. Choi S, Cichocki A, Amari SI (2000) Flexible independent component analysis. Journal of Vlsi Signal Processing 26(1–2):25–38. https://doi.org/10.1023/A:1008135131269

  28. Ren M, Triantafillou E, Ravi S, Snell J, Swersky K (2018) Meta-learning for semi-supervised few-shot classification. In: 6th International conference on learning representations. https://openreview.net/forum?id=HJcSzz-CZ

  29. Krizhevsky A, Hinton G (2009) Learning multiple layers of features from tiny images. Technical report, University of Toronto

  30. Oreshkin BN, Lacoste A, Rodriguez P (2018) Tadam: task dependent adaptive metric for improved few-shot learning. Neural Inf Process Syst:719–729. https://proceedings.neurips.cc/paper/2018/hash/66808e327dc79d135ba18e051673d906-Abstract.html

  31. Bottou L (2010) Large-scale machine learning with stochastic gradient descent. In: 19th International conference on computational statistics, pp 177–186. https://doi.org/10.1007/978-3-7908-2604-3_16

  32. Kingma D, Ba J (2015) Adam: a method for stochastic optimization. In: 3rd International conference on learning representations

  33. Lifchitz Y, Avrithis Y, Picard S, Bursuc A (2019) Dense classification and implanting for few-shot learning. In: 2019 IEEE/CVF Conference on computer vision and pattern recognition, pp 9258–9267. https://doi.org/10.1109/CVPR.2019.00948

  34. Sun Q, Liu Y, Chua TS, Schiele B (2019) Meta-transfer learning for few-shot learning. In: 2019 IEEE/CVF conference on computer vision and pattern recognition, pp 403–412. https://doi.org/10.1109/CVPR.2019.00049

  35. Lee K, Maji S, Ravichandran A, Soatto S (2019) Meta-learning with differentiable convex optimization. In: 2019 IEEE/CVF Conference on computer vision and pattern recognition, pp 10649–10657. https://doi.org/10.1109/CVPR.2019.01091

  36. Zhang RX, Che T, Grahahramani Z, Bengio Y, Song Y (2018) Metagan: an adversarial approach to few-shot learning. In: Neural information processing systems, pp 2371–2380. https://proceedings.neurips.cc/paper/2018/hash/4e4e53aa080247bc31d0eb4e7aeb07a0-Abstract.html

  37. Li Z, Zhou F, Chen F, Li H (2017) Meta-SGD: learning to learn quickly for few shot learning

  38. Chen Y, Wang X, Liu Z, Xu H, Darrell T (2020) A new meta-baseline for few-shot learning

  39. Ye H, Hu H, Zhan D, Sha F (2021) Learning adaptive classifiers synthesis for generalized few-shot learning. International Journal of Computer Vision 129:1930–1953. https://doi.org/10.1007/s11263-020-01381-4

  40. Zhong JA, Xc A, Yy B, Zz C (2021) Reweighting and information-guidance networks for few-shot learning. Neurocomputing 423:13–23. https://doi.org/10.1016/j.neucom.2020.07.128

  41. Huang H, Wu Z, Li W, Huo J, Gao Y (2021) Local descriptor-based multi-prototype network for few-shot learning. Pattern Recognition 116(4):107935. https://doi.org/10.1016/j.patcog.2021.107935

  42. Xu H, Wang J, Li H, Ouyang D, Shao J (2021) Unsupervised meta-learning for few-shot learning. Pattern Recognition 116(6):107951. https://doi.org/10.1016/j.patcog.2021.107951

  43. Qin Y, Zhang W, Zhao C, Wang Z, Zhu X, Qi G et al (2021) Prior-knowledge and attention-based meta-learning for few-shot learning. Knowledge-Based Systems 213:106609. https://doi.org/10.1016/j.knosys.2020.106609

  44. Lai N, Kan M, Han C, Song X, Shan S (2021) Learning to learn adaptive classifier-predictor for few-shot learning. IEEE Trans Neural Netw Learn Syst (99):1–13. https://doi.org/10.1109/TNNLS.2020.3011526

  45. Zhang C, Cai Y, Lin G, Shen C (2020) Deepemd: differentiable earth mover’s distance for few-shot learning. In: 2020 IEEE/CVF Conference on computer vision and pattern recognition, pp 12200–12210. https://doi.org/10.1109/CVPR42600.2020.01222

  46. Tian Y, Wang Y, Krishnan D, Tenenbaum J, Isola P (2020) Rethinking few-shot image classification: a good embedding is all you need? European Conference on Computer Vision 12359:266–282. https://doi.org/10.1007/978-3-030-58568-6_16

  47. Afrasiyabi A, Lalonde JF, Gagné C (2020) Associative alignment for few-shot image classification. European Conference on Computer Vision 12350:18–35. https://doi.org/10.1007/978-3-030-58558-7_2

  48. Liu Y, Schiele B, Sun Q (2020) An ensemble of epoch-wise empirical Bayes for few-shot learning. European Conference on Computer Vision 12361:404–421. https://doi.org/10.1007/978-3-030-58517-4_24

Acknowledgements

We would like to thank our classmates and friends for their encouragement and support. We would also like to thank the Machine Learning Laboratory of Soochow University for providing computing resources and other support.

Funding

This work was supported by the National Key R&D Program of China (2018YFA0701700; 2018YFA0701701) and the National Natural Science Foundation of China under Grants No. 61672364, No. 62176172 and No. 61902269.

Author information

Contributions

All authors have contributed to the conception and design of the study. Conceptualization: Mengjuan Jiang, Fanzhang Li; Experimentation: Mengjuan Jiang; Writing-original draft preparation: Mengjuan Jiang; Writing-review and editing: Mengjuan Jiang, Fanzhang Li; Funding acquisition: Fanzhang Li; Supervision: Fanzhang Li. All authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Fanzhang Li.

Ethics declarations

Conflicts of interest

The authors declare no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Jiang, M., Li, F. Lie group continual meta learning algorithm. Appl Intell 52, 10965–10978 (2022). https://doi.org/10.1007/s10489-021-03036-4
