
REMIND Your Neural Network to Prevent Catastrophic Forgetting

Part of the Lecture Notes in Computer Science book series (LNIP, volume 12353)

Abstract

People learn throughout life. However, incrementally updating conventional neural networks leads to catastrophic forgetting. A common remedy is replay, which is inspired by how the brain consolidates memory. Replay involves fine-tuning a network on a mixture of new and old instances. While there is neuroscientific evidence that the brain replays compressed memories, existing methods for convolutional networks replay raw images. Here, we propose REMIND, a brain-inspired approach that enables efficient replay with compressed representations. REMIND is trained in an online manner, meaning it learns one example at a time, which is closer to how humans learn. Under the same constraints, REMIND outperforms other methods for incremental class learning on the ImageNet ILSVRC-2012 dataset. We probe REMIND’s robustness to data ordering schemes known to induce catastrophic forgetting. We demonstrate REMIND’s generality by pioneering online learning for Visual Question Answering (VQA) (https://github.com/tyler-hayes/REMIND).
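The replay mechanism described above can be sketched in a few lines. Note the hedging: REMIND itself quantizes mid-level CNN feature maps with product quantization and replays reconstructions through the remaining layers; the sketch below substitutes a simple uniform 8-bit quantizer and a fixed-capacity buffer with random replacement, so the class and parameter names are illustrative, not the paper's implementation.

```python
import numpy as np

class CompressedReplayBuffer:
    """Illustrative buffer storing quantized feature vectors for replay.

    Stands in for REMIND's product-quantized memory: features are stored
    as uint8 codes (4x smaller than float32) and decoded at replay time.
    """

    def __init__(self, capacity, lo=-4.0, hi=4.0):
        self.capacity = capacity
        self.lo, self.hi = lo, hi  # assumed feature range for the quantizer
        self.codes, self.labels = [], []

    def _encode(self, z):
        # Clip to the assumed range, then map to 256 uniform levels.
        z = np.clip(z, self.lo, self.hi)
        return np.round((z - self.lo) / (self.hi - self.lo) * 255).astype(np.uint8)

    def _decode(self, code):
        # Invert the uniform quantizer (lossy: at most half a step of error).
        return code.astype(np.float32) / 255 * (self.hi - self.lo) + self.lo

    def add(self, z, y):
        # Online update: one example at a time; evict a random slot when full.
        if len(self.codes) >= self.capacity:
            idx = np.random.randint(len(self.codes))
            self.codes[idx], self.labels[idx] = self._encode(z), y
        else:
            self.codes.append(self._encode(z))
            self.labels.append(y)

    def sample(self, n):
        # Draw stored examples to mix with the incoming one before a gradient step.
        idx = np.random.choice(len(self.codes), size=min(n, len(self.codes)),
                               replace=False)
        return ([self._decode(self.codes[i]) for i in idx],
                [self.labels[i] for i in idx])
```

In use, each incoming example would be encoded into the buffer and a small minibatch of decoded old features would be mixed with it to fine-tune the upper layers of the network, which is the replay step the abstract refers to.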

Keywords

  • Online learning
  • Brain-inspired
  • Deep learning

T. L. Hayes, K. Kafle, and R. Shrestha contributed equally.



Acknowledgements

This work was supported in part by the DARPA/MTO Lifelong Learning Machines program [W911NF-18-2-0263], AFOSR grant [FA9550-18-1-0121], NSF award #1909696, and a gift from Adobe Research. We thank NVIDIA for the GPU donation. The views and conclusions contained herein are those of the authors and should not be interpreted as representing the official policies or endorsements of any sponsor. We thank Michael Mozer, Ryne Roady, and Zhongchao Qian for feedback on early drafts of this paper.

Author information

Correspondence to Tyler L. Hayes.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 2305 KB)


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Hayes, T.L., Kafle, K., Shrestha, R., Acharya, M., Kanan, C. (2020). REMIND Your Neural Network to Prevent Catastrophic Forgetting. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) Computer Vision – ECCV 2020. Lecture Notes in Computer Science, vol. 12353. Springer, Cham. https://doi.org/10.1007/978-3-030-58598-3_28


  • DOI: https://doi.org/10.1007/978-3-030-58598-3_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58597-6

  • Online ISBN: 978-3-030-58598-3

  • eBook Packages: Computer Science (R0)