Maximizing bi-mutual information of features for self-supervised deep clustering

  • Original Article
Advances in Computational Intelligence

Abstract

Self-supervised learning based on mutual information uses classification models and the label information produced by clustering tasks to train network parameters, and then updates the downstream clustering assignment by maximizing the mutual information between label information. Such methods have attracted growing attention and made good progress, but a considerable gap remains relative to supervised learning, especially on challenging image datasets. To this end, a self-supervised deep clustering method based on maximizing mutual information (bi-MIM-SSC) is proposed, in which a deep convolutional network is employed as the feature encoder. Its objective combines two terms. The first term maximizes the mutual information between output-feature pairs, importing more semantic meaning into the output features. The second term maximizes the mutual information between an input image and the feature generated for it by the encoder, keeping as much of the useful information of the original image as possible in the latent space. Furthermore, pre-training is carried out to further enhance the representation ability of the encoder, and auxiliary over-clustering is added to the clustering network. The performance of bi-MIM-SSC is compared with that of other clustering methods on the CIFAR10, CIFAR100 and STL10 datasets. Experimental results demonstrate that the proposed bi-MIM-SSC method has better feature representation ability and provides better clustering results.
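
To make the first term more concrete, the following is a minimal sketch of how the mutual information between paired outputs might be computed when each output is a soft cluster assignment. It is not the authors' implementation: the abstract does not give the estimator, so the function name `paired_mutual_information`, the symmetrised joint distribution, and the assumption that the two inputs are softmax outputs for two views of the same images are all illustrative choices, in the style of invariant-information-clustering objectives.

```python
import numpy as np


def paired_mutual_information(p, q, eps=1e-12):
    """Mutual information (in nats) between two batches of soft assignments.

    p, q: arrays of shape (n, k). Row i of p and row i of q are assumed to be
    probability vectors over k clusters for two views of the same image.
    This pairing is an illustrative assumption, not taken from the paper.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)

    # Empirical joint distribution over (cluster of view 1, cluster of view 2).
    joint = p.T @ q / p.shape[0]
    # Symmetrise, since the order of the two views is arbitrary.
    joint = (joint + joint.T) / 2.0
    # Avoid log(0), then renormalise so the entries sum to one.
    joint = np.clip(joint, eps, None)
    joint /= joint.sum()

    row = joint.sum(axis=1, keepdims=True)  # marginal of view 1
    col = joint.sum(axis=0, keepdims=True)  # marginal of view 2
    return float(np.sum(joint * (np.log(joint) - np.log(row) - np.log(col))))


# Confident, balanced, consistent assignments give high mutual information.
consistent = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
mi_consistent = paired_mutual_information(consistent, consistent)

# Uninformative (uniform) assignments give mutual information near zero.
uniform = np.full((4, 2), 0.5)
mi_uniform = paired_mutual_information(uniform, uniform)
```

In a training loop the negative of this quantity would serve as a loss, so that minimising it pushes paired outputs toward confident, balanced, and mutually consistent cluster assignments. A differentiable framework would replace NumPy in practice.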



Acknowledgements

This work was supported by the Hebei Province Introduction of Studying Abroad Talent Funded Project (No. C20200302); the Opening Fund of Hebei Key Laboratory of Machine Learning and Computational Intelligence (2019-2021-A; ZZ201909-202109-1); the Key R&D Project of Hebei Science and Technology Plan (No. 19210310D); and the Natural Science Foundation of Hebei Province (F2021201020).

Author information

Corresponding author

Correspondence to Junfen Chen.

Cite this article

Zhao, J., Chen, J., Meng, X. et al. Maximizing bi-mutual information of features for self-supervised deep clustering. Adv. in Comp. Int. 2, 3 (2022). https://doi.org/10.1007/s43674-021-00012-w

  • DOI: https://doi.org/10.1007/s43674-021-00012-w
