Interpolating Convolutional Neural Networks Using Batch Normalization

  • Gratianus Wesley Putra Data
  • Kirjon Ngu
  • David William Murray
  • Victor Adrian Prisacariu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11217)

Abstract

Perceiving a visual concept as a mixture of learned ones is natural for humans, helping them grasp new concepts and strengthen old ones. For all their power and recent success, deep convolutional networks do not have this ability. Inspired by recent work on universal representations for neural networks, we propose a simple emulation of this mechanism by repurposing batch normalization layers to discriminate visual classes, and by formulating a way to combine them to solve new tasks. We show that this can be applied to 2-way few-shot learning, where we obtain between 4% and 17% higher accuracy than straightforward full fine-tuning, and we demonstrate that it also extends to the orthogonal application of style transfer.
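The mechanism sketched in the abstract, class-specific batch normalization parameters blended to address a new task, can be illustrated with a rough PyTorch sketch. This is not the authors' exact formulation: the function interpolate_bn and the mixing weight alpha are hypothetical, and only the general idea of linearly combining the affine parameters and running statistics of two BatchNorm layers that modulate a shared backbone is shown.

import torch
import torch.nn as nn

def interpolate_bn(bn_a: nn.BatchNorm2d, bn_b: nn.BatchNorm2d, alpha: float) -> nn.BatchNorm2d:
    # Blend the learnable scale/shift and the running statistics of two
    # class-specific BatchNorm layers; the convolutional weights they
    # modulate are assumed to be shared and kept frozen.
    assert bn_a.num_features == bn_b.num_features
    blended = nn.BatchNorm2d(bn_a.num_features)
    with torch.no_grad():
        blended.weight.copy_(alpha * bn_a.weight + (1.0 - alpha) * bn_b.weight)
        blended.bias.copy_(alpha * bn_a.bias + (1.0 - alpha) * bn_b.bias)
        blended.running_mean.copy_(alpha * bn_a.running_mean + (1.0 - alpha) * bn_b.running_mean)
        blended.running_var.copy_(alpha * bn_a.running_var + (1.0 - alpha) * bn_b.running_var)
    return blended

# Illustrative use: a BatchNorm layer for a new task formed as an equal
# mixture of two layers trained for known classes (alpha = 0.5).
bn_new = interpolate_bn(nn.BatchNorm2d(64), nn.BatchNorm2d(64), alpha=0.5)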

Keywords

Neural network interpolation · Batch normalization · Few-shot learning · Style transfer

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Active Vision Laboratory, Department of Engineering Science, University of Oxford, Oxford, UK
