Multi-match: mutual information maximization and CutEdge for semi-supervised learning

Multimedia Tools and Applications

Abstract

Deep supervised learning has achieved great success in tackling complex computer vision tasks. However, it typically requires large amounts of labeled data, which are expensive to obtain in practical applications. Semi-supervised learning, which leverages the hidden structure in unlabeled data, has therefore attracted much attention. In this work, a semi-supervised classification model named Multi-Match is proposed. It includes two augmentation branches and encourages the output of the complex augmentation branch to stay close to the predictions of the simple augmentation branch. A mutual information (MI) loss is introduced to maximize MI not only between the input and the output representation, but also between the class assignments within the simple augmentation branch. A novel information dropping method named CutEdge, which removes multiple regions near the edges of the input, is proposed to further improve robustness. Experimental results on CIFAR-10, CIFAR-100, and SVHN with varying numbers of labels demonstrate that the proposed model outperforms the compared semi-supervised learning methods. The gains come from the MI loss, the combination of affine transformations and CutEdge, and the use of multiple branches.
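To make the CutEdge idea above concrete, here is a minimal NumPy sketch under one plausible reading of "removing multiple regions near the input edges": several rectangular patches anchored at the image borders are zeroed out. The function name cut_edge, the border-anchored placement rule, and the default parameters are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def cut_edge(image, num_regions=4, max_size=8, rng=None):
    """Hypothetical CutEdge-style augmentation: zero out several
    rectangular patches placed along the borders of an HxWxC image."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    out = image.copy()
    for _ in range(num_regions):
        side = int(rng.integers(4))              # 0: top, 1: bottom, 2: left, 3: right
        ph = int(rng.integers(1, max_size + 1))  # patch height
        pw = int(rng.integers(1, max_size + 1))  # patch width
        if side == 0:
            y0, x0 = 0, int(rng.integers(0, max(w - pw, 1)))
        elif side == 1:
            y0, x0 = h - ph, int(rng.integers(0, max(w - pw, 1)))
        elif side == 2:
            y0, x0 = int(rng.integers(0, max(h - ph, 1))), 0
        else:
            y0, x0 = int(rng.integers(0, max(h - ph, 1))), w - pw
        out[y0:y0 + ph, x0:x0 + pw] = 0          # drop the information in the patch
    return out
```

In a Multi-Match-style pipeline, such a transform would sit in the complex augmentation branch (e.g., applied after an affine transformation), while the simple branch sees only weak augmentation.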


Acknowledgements

This research was funded in part by the National Key Research and Development Program of China (2021YFB2800300), the Shandong Provincial Key Research and Development Program (Major Scientific and Technological Innovation Project) under Grant 2019JZZY010119, the National Natural Science Foundation of China under Grant No. 62001267, and the Future Plan for Young Scholars of Shandong University.

Author information

Corresponding author

Correspondence to Hongchao Zhou.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Abbreviations and descriptions

Multi-Match: The proposed semi-supervised classification model based on multiple branches

MI: Mutual information

CutEdge: The proposed information dropping method that removes regions near the input edges

GAN: Generative adversarial network

CutOut: Information dropping method based on random region deletion

Π model: Method that maximizes prediction consistency under different perturbations

VAT: Virtual adversarial training; perturbs the model input in its most sensitive direction

MixMatch: Method combining entropy minimization and consistency regularization

MixUp: Method that fuses pairs of samples and their labels by convex combination (see the sketch after this table)

KL: Kullback-Leibler divergence; a measure of the difference between two probability distributions

DIM: Deep InfoMax; maximizes global and local mutual information of images

AMDIM: Augmented multi-scale Deep InfoMax; maximizes mutual information between input images and their output representations

CPC: Contrastive predictive coding; maximizes a lower bound on mutual information

IIC: Invariant information clustering; maximizes mutual information between image outputs

NMS: Non-maximum suppression

EMA: Exponential moving average of previous model parameters (see the sketch after this table)

CAM: Class activation mapping; generates a class activation heat map for an input
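For the MixUp and EMA entries above, the following minimal NumPy sketches show the standard formulations summarized in the table; the function names and default parameters are illustrative, not taken from the paper.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.75, rng=None):
    """Standard MixUp: convex combination of two samples and their
    one-hot labels, with the weight drawn from Beta(alpha, alpha)."""
    rng = np.random.default_rng() if rng is None else rng
    lam = float(rng.beta(alpha, alpha))
    return lam * x1 + (1.0 - lam) * x2, lam * y1 + (1.0 - lam) * y2

def ema_update(ema_params, params, decay=0.999):
    """Exponential moving average of model parameters, as used for
    teacher models: ema <- decay * ema + (1 - decay) * current."""
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema_params, params)]
```

A typical use in semi-supervised training is to mix labeled and unlabeled batches with mixup and to evaluate with the EMA copy of the model, which is usually more stable than the raw weights.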

About this article

Cite this article

Wu, Y., Chen, L., Zhao, D. et al. Multi-match: mutual information maximization and CutEdge for semi-supervised learning. Multimed Tools Appl 82, 479–496 (2023). https://doi.org/10.1007/s11042-022-13126-1

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11042-022-13126-1

Keywords

Navigation