
HybridNet: Classification and Reconstruction Cooperation for Semi-supervised Learning

  • Thomas Robert
  • Nicolas Thome
  • Matthieu Cord
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11211)

Abstract

In this paper, we introduce a new model for leveraging unlabeled data to improve the generalization performance of image classifiers: a two-branch encoder-decoder architecture called HybridNet. The first branch receives the supervision signal and is dedicated to extracting invariant, class-related representations. The second branch is fully unsupervised and dedicated to modeling the information discarded by the first branch in order to reconstruct the input data. To further support the expected behavior of our model, we propose an original training objective. It favors stability in the discriminative branch and complementarity between the representations learned in the two branches. HybridNet outperforms state-of-the-art results on CIFAR-10, SVHN and STL-10 in various semi-supervised settings. In addition, visualizations and ablation studies validate our contributions and the behavior of the model on both the CIFAR-10 and STL-10 datasets.
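The abstract describes an objective that combines a supervised classification loss on the discriminative branch with an unsupervised reconstruction loss. The sketch below illustrates this general idea only: the loss forms (softmax cross-entropy, mean-squared error) and the weighting scalar `lambda_rec` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def hybrid_loss(logits, label, recon, x, lambda_rec=0.5):
    """Toy hybrid objective: cross-entropy on the discriminative
    branch plus a weighted reconstruction term on the unsupervised
    branch. Illustrative only; not the paper's exact objective."""
    # Numerically stable softmax cross-entropy (supervised branch)
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    ce = -log_probs[label]
    # Mean-squared reconstruction error (unsupervised branch)
    rec = np.mean((recon - x) ** 2)
    return ce + lambda_rec * rec
```

For unlabeled samples, only the reconstruction term would contribute; for labeled samples, both terms are active, which is the sense in which the two branches cooperate.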

Keywords

Deep learning · Semi-supervised learning · Regularization · Reconstruction · Invariance and stability · Encoder-decoder

Notes

Acknowledgements

This work was funded by grant DeepVision (ANR-15-CE23-0029-02, STPGP-479356-15), a joint French/Canadian call by ANR & NSERC.

Supplementary material

Supplementary material 1: 474212_1_En_10_MOESM1_ESM.pdf (PDF, 3.1 MB)


Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Sorbonne Université, CNRS, LIP6, Paris, France
  2. CEDRIC - Conservatoire National des Arts et Métiers, Paris, France
