Equivariance and Invariance Inductive Bias for Learning from Insufficient Data

  • Conference paper in the proceedings of Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13671)

Abstract

We are interested in learning robust models from insufficient data, without the need for any externally pre-trained checkpoints. First, compared to sufficient data, we show why insufficient data renders the model more easily biased to the limited training environments that are usually different from testing. For example, if all the training swan samples are “white”, the model may wrongly use the “white” environment to represent the intrinsic class swan. Then, we justify that equivariance inductive bias can retain the class feature while invariance inductive bias can remove the environmental feature, leaving the class feature that generalizes to any environmental changes in testing. To impose them on learning, for equivariance, we demonstrate that any off-the-shelf contrastive-based self-supervised feature learning method can be deployed; for invariance, we propose a class-wise invariant risk minimization (IRM) that efficiently tackles the challenge of missing environmental annotation in conventional IRM. State-of-the-art experimental results on real-world benchmarks (VIPriors, ImageNet100 and NICO) validate the great potential of equivariance and invariance in data-efficient learning. The code is available at https://github.com/Wangt-CN/EqInv.
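The invariance side of the abstract builds on invariant risk minimization (IRM) [4]. As a rough sketch of the standard IRMv1 objective that the proposed class-wise variant extends (not the authors' method itself): to the sum of per-environment risks, IRMv1 adds a penalty equal to the squared gradient of each environment's risk with respect to a fixed scalar "dummy classifier" w = 1. The function names below are illustrative and the loss is simplified to a squared loss:

```python
import numpy as np

def irm_penalty(logits, labels):
    """IRMv1-style penalty for a squared loss and a scalar dummy
    classifier w, evaluated at w = 1:
      risk_e(w) = mean((w * logits - labels)^2)
      d risk_e / dw |_{w=1} = mean(2 * logits * (logits - labels))
    The penalty is the squared gradient."""
    grad = np.mean(2.0 * logits * (logits - labels))
    return grad ** 2

def irm_objective(envs, lam=1.0):
    """Sum of per-environment empirical risks plus lam times the sum
    of per-environment IRM penalties; envs is a list of
    (logits, labels) arrays, one pair per training environment."""
    risk = sum(np.mean((lg - lb) ** 2) for lg, lb in envs)
    pen = sum(irm_penalty(lg, lb) for lg, lb in envs)
    return risk + lam * pen
```

A predictor whose logits already match the labels in every environment incurs zero penalty, while a predictor that fits one environment at the expense of another is penalized.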


Notes

  1.

    See https://vipriors.github.io/ for details.

  2.

    The distance between the “white dog” sample vector \((\textbf{z}_{\text {white}}, \textbf{z}_{\text {dog}})\) and the swan model vector \((\textbf{z}_{\text {white}}, \textbf{0})\) is: \(\Vert (\textbf{z}_{\text {white}}, \textbf{z}_{\text {dog}})-(\textbf{z}_{\text {white}}, {\textbf {0}})\Vert = \Vert ({\textbf {0}}, \textbf{z}_{\text {dog}})\Vert = \Vert \textbf{z}_{\text {dog}}\Vert \); similarly, we have the distance between “white dog” and dog model as \(\Vert \textbf{z}_{\text {white}}\Vert \).

  3.

    It is modified from the MNIST dataset [50] by injecting a color bias on each digit (class). The non-bias ratio is \(0.5\%\): e.g., \(99.5\%\) of the samples of digit 0 are red and only \(0.5\%\) are in uniform (random) colors.

  4.

    Note that in implementation we adopt an advanced version of IRM [47]; please check the appendix for details.
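The distance identity in footnote 2 can be checked numerically. The split of the feature vector into an "environment" half and a "class" half follows the paper's toy setup; the dimensions and random values below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
z_white = rng.normal(size=4)  # "environment" half of the feature
z_dog = rng.normal(size=4)    # "class" half of the feature

# "white dog" sample vector (z_white, z_dog)
sample = np.concatenate([z_white, z_dog])
# swan model vector (z_white, 0): the biased model keeps only "white"
swan_model = np.concatenate([z_white, np.zeros(4)])

# the environment halves cancel, so the distance collapses to ||z_dog||
assert np.isclose(np.linalg.norm(sample - swan_model),
                  np.linalg.norm(z_dog))
```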
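Footnote 3's color-bias injection can be sketched as follows. The `colorize` helper, the palette handling, and the exact sampling scheme are hypothetical stand-ins, not the authors' script:

```python
import numpy as np

def colorize(images, labels, palette, nonbias_ratio=0.005, seed=0):
    """Inject a class-correlated color bias into grayscale digits:
    with probability 1 - nonbias_ratio a digit is tinted with its
    class color, otherwise with a uniformly random palette color.
    (Hypothetical helper sketching the Colored-MNIST construction.)"""
    rng = np.random.default_rng(seed)
    n = len(images)
    colored = np.zeros((n, *images.shape[1:], 3))
    for i in range(n):
        if rng.random() < nonbias_ratio:
            color = palette[rng.integers(len(palette))]  # uniform color
        else:
            color = palette[labels[i]]                   # class-biased color
        # broadcast the grayscale image against the RGB color
        colored[i] = images[i][..., None] * color
    return colored
```

With `nonbias_ratio=0.005`, 99.5% of each class's samples share one color, reproducing the strong spurious correlation the footnote describes.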
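Footnote 4 swaps plain IRM for the REx variant [47]. In its V-REx form, the penalty is the variance of the per-environment risks; the sketch below implements that form under the assumption that `beta` is the trade-off weight:

```python
import numpy as np

def vrex_objective(risks, beta=10.0):
    """V-REx-style objective: mean of per-environment risks plus
    beta times their variance, penalizing environments whose risks
    diverge from one another."""
    risks = np.asarray(risks, dtype=float)
    return risks.mean() + beta * risks.var()
```

Equal per-environment risks incur no penalty; the larger the spread across environments, the larger the objective.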

References

  1. Ahuja, K., Shanmugam, K., Varshney, K., Dhurandhar, A.: Invariant risk minimization games. In: ICML, pp. 145–155. PMLR (2020)

  2. Ahuja, K., Wang, J., Dhurandhar, A., Shanmugam, K., Varshney, K.R.: Empirical or invariant risk minimization? A sample complexity perspective. arXiv preprint (2020)

  3. Allen-Zhu, Z., Li, Y.: What can ResNet learn efficiently, going beyond kernels? In: NeurIPS, vol. 32 (2019)

  4. Arjovsky, M., Bottou, L., Gulrajani, I., Lopez-Paz, D.: Invariant risk minimization. arXiv preprint arXiv:1907.02893 (2019)

  5. Austin, P.C.: An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multiv. Behav. Res. 46(3), 399–424 (2011)

  6. Bahng, H., Chun, S., Yun, S., Choo, J., Oh, S.J.: Learning de-biased representations with biased representations. In: ICML, pp. 528–539. PMLR (2020)

  7. Bardes, A., Ponce, J., LeCun, Y.: VICReg: variance-invariance-covariance regularization for self-supervised learning. arXiv preprint arXiv:2105.04906 (2021)

  8. Barz, B., Brigato, L., Iocchi, L., Denzler, J.: A strong baseline for the VIPriors data-efficient image classification challenge. arXiv preprint arXiv:2109.13561 (2021)

  9. Battaglia, P.W., et al.: Relational inductive biases, deep learning, and graph networks. arXiv preprint arXiv:1806.01261 (2018)

  10. Bietti, A., Mairal, J.: On the inductive bias of neural tangent kernels. In: NeurIPS, vol. 32 (2019)

  11. Bouchacourt, D., Ibrahim, M., Morcos, A.: Grounding inductive biases in natural images: invariance stems from variations in data. In: NeurIPS, vol. 34 (2021)

  12. Bruintjes, R.J., Lengyel, A., Rios, M.B., Kayhan, O.S., van Gemert, J.: VIPriors 1: visual inductive priors for data-efficient deep learning challenges. arXiv preprint arXiv:2103.03768 (2021)

  13. Castro, F.M., Marín-Jiménez, M.J., Guil, N., Schmid, C., Alahari, K.: End-to-end incremental learning. In: ECCV, pp. 233–248 (2018)

  14. Chen, S., Dobriban, E., Lee, J.: A group-theoretic framework for data augmentation. NeurIPS 33, 21321–21333 (2020)

  15. Chen, T., Kornblith, S., Norouzi, M., Hinton, G.: A simple framework for contrastive learning of visual representations. In: ICML, pp. 1597–1607. PMLR (2020)

  16. Chen, X., Fan, H., Girshick, R., He, K.: Improved baselines with momentum contrastive learning. arXiv preprint arXiv:2003.04297 (2020)

  17. Chen, X., He, K.: Exploring simple siamese representation learning. arXiv preprint arXiv:2011.10566 (2020)

  18. Chrupała, G.: Symbolic inductive bias for visually grounded learning of spoken language. arXiv preprint arXiv:1812.09244 (2018)

  19. Cohen, T., Weiler, M., Kicanaoglu, B., Welling, M.: Gauge equivariant convolutional networks and the icosahedral CNN. In: ICML, pp. 1321–1330. PMLR (2019)

  20. Cohen, T., Welling, M.: Group equivariant convolutional networks. In: ICML, pp. 2990–2999. PMLR (2016)

  21. Creager, E., Jacobsen, J.H., Zemel, R.: Environment inference for invariant learning. In: ICML, pp. 2189–2200. PMLR (2021)

  22. Cubuk, E.D., Zoph, B., Shlens, J., Le, Q.V.: RandAugment: practical automated data augmentation with a reduced search space. In: CVPR Workshops, pp. 702–703 (2020)

  23. Daneshmand, H., Joudaki, A., Bach, F.: Batch normalization orthogonalizes representations in deep random networks. In: NeurIPS, vol. 34 (2021)

  24. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: CVPR, pp. 248–255. IEEE (2009)

  25. Geirhos, R.: Shortcut learning in deep neural networks. Nat. Mach. Intell. 2(11), 665–673 (2020)

  26. Geirhos, R., Rubisch, P., Michaelis, C., Bethge, M., Wichmann, F.A., Brendel, W.: ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. ICLR (2018)

  27. Gondal, M.W., et al.: On the transfer of inductive bias from simulation to the real world: a new disentanglement dataset. In: NeurIPS, vol. 32 (2019)

  28. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2014)

  29. He, K., Chen, X., Xie, S., Li, Y., Dollár, P., Girshick, R.: Masked autoencoders are scalable vision learners. arXiv preprint arXiv:2111.06377 (2021)

  30. He, K., Fan, H., Wu, Y., Xie, S., Girshick, R.: Momentum contrast for unsupervised visual representation learning. arXiv preprint arXiv:1911.05722 (2019)

  31. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)

  32. He, Y., Shen, Z., Cui, P.: Towards non-IID image classification: a dataset and baselines. Pattern Recogn. 110, 107383 (2021)

  33. Helmbold, D.P., Long, P.M.: On the inductive bias of dropout. J. Mach. Learn. Res. 16(1), 3403–3454 (2015)

  34. Hendrycks, D., Dietterich, T.: Benchmarking neural network robustness to common corruptions and perturbations. ICLR (2019)

  35. Hendrycks, D., Liu, X., Wallace, E., Dziedzic, A., Krishnan, R., Song, D.: Pretrained transformers improve out-of-distribution robustness. arXiv preprint arXiv:2004.06100 (2020)

  36. Hendrycks, D., Mazeika, M., Kadavath, S., Song, D.: Using self-supervised learning can improve model robustness and uncertainty. In: NeurIPS, vol. 32 (2019)

  37. Hendrycks, D., Zhao, K., Basart, S., Steinhardt, J., Song, D.: Natural adversarial examples. In: CVPR, pp. 15262–15271 (2021)

  38. Heo, B., Kim, J., Yun, S., Park, H., Kwak, N., Choi, J.Y.: A comprehensive overhaul of feature distillation. In: ICCV, pp. 1921–1930 (2019)

  39. Hernán, M.A., Robins, J.M.: Causal inference (2010)

  40. Hinton, G., Vinyals, O., Dean, J., et al.: Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 2(7) (2015)

  41. Imbens, G.W., Rubin, D.B.: Causal Inference in Statistics, Social, and Biomedical Sciences. Cambridge University Press, Cambridge (2015)

  42. Jo, Y., Chun, S.Y., Choi, J.: Rethinking deep image prior for denoising. In: ICCV, pp. 5087–5096 (2021)

  43. Jung, Y., Tian, J., Bareinboim, E.: Learning causal effects via weighted empirical risk minimization. NeurIPS 33, 12697–12709 (2020)

  44. Kayhan, O.S., van Gemert, J.C.: On translation invariance in CNNs: convolutional layers can exploit absolute spatial location. In: CVPR, pp. 14274–14285 (2020)

  45. Khosla, P., et al.: Supervised contrastive learning. NeurIPS 33, 18661–18673 (2020)

  46. Kim, D., Yoo, Y., Park, S., Kim, J., Lee, J.: SelfReg: self-supervised contrastive regularization for domain generalization. In: ICCV, pp. 9619–9628 (2021)

  47. Krueger, D., et al.: Out-of-distribution generalization via risk extrapolation (REx). arXiv preprint (2020)

  48. Lahiri, A., Kwatra, V., Frueh, C., Lewis, J., Bregler, C.: LipSync3D: data-efficient learning of personalized 3D talking faces from video using pose and lighting normalization. In: CVPR, pp. 2755–2764 (2021)

  49. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)

  50. LeCun, Y., Cortes, C., Burges, C.: MNIST handwritten digit database (2010)

  51. Lee, J., Kim, E., Lee, J., Lee, J., Choo, J.: Learning debiased representation via disentangled feature augmentation. In: NeurIPS, vol. 34 (2021)

  52. Lenssen, J.E., Fey, M., Libuschewski, P.: Group equivariant capsule networks. In: NeurIPS, vol. 31 (2018)

  53. Little, R.J., Rubin, D.B.: Statistical Analysis with Missing Data, vol. 793. John Wiley & Sons, Hoboken (2019)

  54. Liu, Q., Mohamadabadi, B.B., El-Khamy, M., Lee, J.: Diversification is all you need: towards data efficient image understanding (2020)

  55. Loshchilov, I., Hutter, F.: Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101 (2017)

  56. Van der Maaten, L., Hinton, G.: Visualizing data using t-SNE. J. Mach. Learn. Res. 9(11), 2579–2605 (2008)

  57. Mitchell, T.M.: The need for biases in learning generalizations. Department of Computer Science, Laboratory for Computer Science Research … (1980)

  58. Müller, R., Kornblith, S., Hinton, G.E.: When does label smoothing help? In: NeurIPS, vol. 32 (2019)

  59. Nam, H., Lee, H., Park, J., Yoon, W., Yoo, D.: Reducing domain gap by reducing style bias. In: CVPR, pp. 8690–8699 (2021)

  60. Nam, J., Cha, H., Ahn, S., Lee, J., Shin, J.: Learning from failure: de-biasing classifier from biased classifier. NeurIPS 33, 20673–20684 (2020)

  61. Van den Oord, A., Li, Y., Vinyals, O.: Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748 (2018)

  62. Pan, S.J., Tsang, I.W., Kwok, J.T., Yang, Q.: Domain adaptation via transfer component analysis. IEEE Trans. Neural Netw. 22(2), 199–210 (2010)

  63. Pearl, J.: Causality. Cambridge University Press, Cambridge (2009)

  64. Pezeshki, M., Kaba, O., Bengio, Y., Courville, A.C., Precup, D., Lajoie, G.: Gradient starvation: a learning proclivity in neural networks. NeurIPS 34, 1256–1272 (2021)

  65. Pfister, N., Bühlmann, P., Peters, J.: Invariant causal prediction for sequential data. J. Am. Stat. Assoc. 114(527), 1264–1276 (2019)

  66. Radford, A., et al.: Learning transferable visual models from natural language supervision. arXiv preprint arXiv:2103.00020 (2021)

  67. Recht, B., Roelofs, R., Schmidt, L., Shankar, V.: Do imagenet classifiers generalize to imagenet? In: ICML, pp. 5389–5400. PMLR (2019)

  68. Saito, K., Kim, D., Sclaroff, S., Saenko, K.: Universal domain adaptation through self supervision. NeurIPS 33, 16282–16292 (2020)

  69. Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-cam: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE international conference on computer vision, pp. 618–626 (2017)

  70. Shen, Z., Liu, J., He, Y., Zhang, X., Xu, R., Yu, H., Cui, P.: Towards out-of-distribution generalization: a survey. arXiv preprint arXiv:2108.13624 (2021)

  71. Snell, J., Swersky, K., Zemel, R.: Prototypical networks for few-shot learning. In: NeurIPS, vol. 30 (2017)

  72. Sun, P., Jin, X., Su, W., He, Y., Xue, H., Lu, Q.: A visual inductive priors framework for data-efficient image classification. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12536, pp. 511–520. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-66096-3_35

  73. Sun, Q., Liu, Y., Chua, T.S., Schiele, B.: Meta-transfer learning for few-shot learning. In: CVPR, pp. 403–412 (2019)

  74. Sung, F., Yang, Y., Zhang, L., Xiang, T., Torr, P.H., Hospedales, T.M.: Learning to compare: Relation network for few-shot learning. In: CVPR, pp. 1199–1208 (2018)

  75. Tian, Y., Krishnan, D., Isola, P.: Contrastive multiview coding. arXiv preprint arXiv:1906.05849 (2019)

  76. Vapnik, V.: Principles of risk minimization for learning theory. In: NeurIPS (1992)

  77. Wang, T., Yue, Z., Huang, J., Sun, Q., Zhang, H.: Self-supervised learning disentangled group representation as feature. In: Conference and Workshop on Neural Information Processing Systems (NeurIPS) (2021)

  78. Wang, T., Zhou, C., Sun, Q., Zhang, H.: Causal attention for unbiased visual recognition. In: ICCV, pp. 3091–3100 (2021)

  79. Wen, Z., Li, Y.: Toward understanding the feature learning process of self-supervised contrastive learning. In: ICML, pp. 11112–11122. PMLR (2021)

  80. Xu, Y., Zhang, Q., Zhang, J., Tao, D.: Vitae: vision transformer advanced by exploring intrinsic inductive bias. In: NeurIPS, vol. 34 (2021)

  81. You, K., Long, M., Cao, Z., Wang, J., Jordan, M.I.: Universal domain adaptation. In: CVPR, pp. 2720–2729 (2019)

  82. Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 818–833. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10590-1_53

  83. Zhang, H., Cisse, M., Dauphin, Y.N., Lopez-Paz, D.: Mixup: beyond empirical risk minimization. arXiv preprint arXiv:1710.09412 (2017)

  84. Zhang, X., Zhou, L., Xu, R., Cui, P., Shen, Z., Liu, H.: Towards unsupervised domain generalization. In: CVPR, pp. 4910–4920 (2022)

  85. Zhao, B., Wen, X.: Distilling visual priors from self-supervised learning. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12536, pp. 422–429. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-66096-3_29

  86. Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A.: Learning deep features for discriminative localization. In: CVPR, pp. 2921–2929 (2016)

  87. Zhu, F., Cheng, Z., Zhang, X.Y., Liu, C.L.: Class-incremental learning via dual augmentation. In: NeurIPS, vol. 34 (2021)

Acknowledgement

The authors would like to thank all reviewers for their constructive suggestions. This research is partly supported by the Alibaba-NTU Joint Research Institute, AISG, A*STAR under its AME YIRG Grant (Project No.A20E6c0101).

Author information

Corresponding author

Correspondence to Tan Wad.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 6054 KB)

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Wad, T., Sun, Q., Pranata, S., Jayashree, K., Zhang, H. (2022). Equivariance and Invariance Inductive Bias for Learning from Insufficient Data. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13671. Springer, Cham. https://doi.org/10.1007/978-3-031-20083-0_15

  • DOI: https://doi.org/10.1007/978-3-031-20083-0_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20082-3

  • Online ISBN: 978-3-031-20083-0

  • eBook Packages: Computer Science, Computer Science (R0)
