
Deep generative clustering methods based on disentangled representations and augmented data

  • Original Article
  • Published:
International Journal of Machine Learning and Cybernetics

Abstract

This paper presents a novel clustering approach that uses variational autoencoders (VAEs) with disentangled representations to improve both the efficiency and the effectiveness of clustering. Traditional VAE-based clustering models often conflate generative and clustering information, leading to suboptimal clustering performance. To overcome this, our model separates the latent representation into two modules, one for clustering and one for generation, and this separation significantly improves clustering performance. In addition, we employ augmented data to maximize the mutual information between the cluster assignment variables and the optimized latent variables. This strategy not only enhances clustering effectiveness but also allows the construction of latent variables that combine clustering information from the original data with generative information from the augmented data. Extensive experiments show that our model achieves superior clustering performance without pre-training, outperforming existing deep generative clustering models, and on certain datasets it reaches state-of-the-art clustering accuracy, surpassing models that require pre-training.
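The mutual-information strategy described above can be illustrated with a small, self-contained sketch. This is not the authors' implementation (the exact objective is given in the full article); it only shows, under the assumption that the encoder yields soft cluster assignments for an original batch and its augmented counterpart, how the mutual information between the two sets of assignments can be estimated from their empirical joint distribution:

```python
import numpy as np

def paired_assignment_mutual_information(p_x, p_tx):
    """Estimate I(c; c') from paired soft cluster assignments.

    p_x  : (N, K) soft cluster assignments of original samples
    p_tx : (N, K) assignments of the corresponding augmented samples
    """
    # Empirical joint distribution over cluster pairs, symmetrised
    joint = p_x.T @ p_tx / p_x.shape[0]        # (K, K)
    joint = (joint + joint.T) / 2.0
    marg_i = joint.sum(axis=1, keepdims=True)  # marginal, original view
    marg_j = joint.sum(axis=0, keepdims=True)  # marginal, augmented view
    eps = 1e-12                                # avoid log(0)
    return float(np.sum(joint * (np.log(joint + eps)
                                 - np.log(marg_i + eps)
                                 - np.log(marg_j + eps))))

# Toy check with hypothetical assignments over K = 2 clusters:
# identical one-hot assignments give MI near log 2, while
# uninformative (uniform) assignments give MI near zero.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=1000)
one_hot = np.eye(2)[labels]
mi_agree = paired_assignment_mutual_information(one_hot, one_hot)
mi_indep = paired_assignment_mutual_information(
    one_hot, np.full((1000, 2), 0.5))
```

Maximizing an estimate of this kind during training pushes the cluster assignments of original and augmented samples to agree while keeping the clusters balanced; the function name and the toy data here are illustrative, not taken from the paper.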


[Figures 1–5 and Algorithm 1 appear in the full article.]


Data Availability

The MNIST data set is available at [http://yann.lecun.com/exdb/mnist/]. The USPS data set is available at [https://www.kaggle.com/datasets/bistaumanga/usps-dataset]. The GTSRB data set is available at [https://benchmark.ini.rub.de/gtsrb_news.html]. The YTF data set is available at [https://www.cs.tau.ac.il/~wolf/ytfaces/]. The F-MNIST data set is available at [https://www.kaggle.com/datasets/zalando-research/fashionmnist].


Acknowledgements

The completion of this work was supported by the National Natural Science Foundation of China (62276106), the Guangdong Provincial Key Laboratory IRADS (2022B1212010006, R0400001-22) and the UIC Start-up Research Fund (UICR0700056-23).

Author information

Corresponding author

Correspondence to Wentao Fan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Xu, K., Fan, W. & Liu, X. Deep generative clustering methods based on disentangled representations and augmented data. Int. J. Mach. Learn. & Cyber. (2024). https://doi.org/10.1007/s13042-024-02173-9

