Attention-Based Generative Adversarial Network for Semi-supervised Image Classification

  • Xuezhi Xiang (email author)
  • Zeting Yu
  • Ning Lv
  • Xiangdong Kong
  • Abdulmotaleb El Saddik

Abstract

Semi-supervised image classification is an active area of computer vision that builds better classifiers from a few labeled images and many unlabeled images. Recently, semi-supervised image classification methods based on the generative adversarial network (GAN) have achieved promising results. In this paper, we introduce a self-attention mechanism and propose an attention-based GAN for semi-supervised image classification, which can capture global dependencies and adaptively extract important information. Furthermore, we apply spectral normalization, which stabilizes the training of the attention-based GAN. We also adopt manifold regularization as an additional regularization term to make the most of the unlabeled images. We test the proposed method on the SVHN and CIFAR-10 datasets. The experimental results show that the proposed method is comparable with state-of-the-art GAN-based semi-supervised image classification methods.
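
To make the phrase "capture global dependencies" concrete, the following is a minimal sketch of a self-attention layer over a flattened feature map, written in plain Python. It is not the architecture from the paper, whose details are not given in this abstract. It assumes a generic formulation in which every spatial position attends to every other position and the attended features are added back to the input through a scale factor. The names `self_attention`, `w_q`, `w_k`, `w_v`, and `gamma` are illustrative assumptions.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def self_attention(x, w_q, w_k, w_v, gamma):
    """Apply self-attention to a flattened feature map.

    x     : N positions x C channels (a feature map with its spatial dims flattened)
    w_q   : C x D query projection  (hypothetical parameter)
    w_k   : C x D key projection    (hypothetical parameter)
    w_v   : C x C value projection  (hypothetical parameter)
    gamma : scalar weight on the attended features (hypothetical parameter)

    Returns (output, attention), where attention is N x N and each row sums to 1.
    """
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    v = matmul(x, w_v)
    # Each position scores every other position, so the dependency is global.
    scores = matmul(q, transpose(k))
    attention = [softmax(row) for row in scores]
    context = matmul(attention, v)
    # Residual connection: gamma = 0 leaves the input features unchanged.
    output = [
        [gamma * c + xi for c, xi in zip(c_row, x_row)]
        for c_row, x_row in zip(context, x)
    ]
    return output, attention


if __name__ == "__main__":
    # Three positions with two channels each; identity projections for simplicity.
    features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    out, attn = self_attention(features, identity, identity, identity, gamma=0.5)
    for row in attn:
        print([round(a, 3) for a in row])
```

With `gamma` set to zero the layer reduces to the identity, so in this sketch the attended features can be blended in gradually as `gamma` grows.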

Keywords

  • Image classification
  • Semi-supervised
  • Convolutional neural network
  • Generative adversarial network
  • Self-attention

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Xuezhi Xiang (1, email author)
  • Zeting Yu (1)
  • Ning Lv (1)
  • Xiangdong Kong (1)
  • Abdulmotaleb El Saddik (2)
  1. School of Information and Communication Engineering, Harbin Engineering University, Harbin, China
  2. School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, Canada
