Parallel multi-view concept clustering in distributed computing

Abstract

Multi-view clustering (MvC) is an emerging task in data mining that aims to partition data sampled from multiple views. Despite a great deal of research, the task remains very challenging. One important problem is that MvC requires a large amount of computation. To address this problem, we propose a parallel MvC method for distributed computing environments. The proposed method, denoted parallel multi-view concept clustering (PMCC), builds upon concept factorization with local manifold learning. Concept factorization learns a compressed representation of the data, while local manifold learning preserves the intrinsic local geometric structure of the data. The weight of each view is learned automatically, and a cooperative normalized approach is proposed to better guide the learning of a consensus representation for all views. In the proposed PMCC architecture, each part is computed independently, so PMCC can be executed in a distributed computing environment. Experimental results on real-world datasets demonstrate the effectiveness of the proposed method.
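To make the two main ideas in the abstract concrete, the following is a minimal sketch, not the authors' PMCC implementation. It assumes plain concept factorization with the multiplicative updates of Xu and Gong [40], run independently on each view in a thread pool. Local manifold regularization, automatic view weights, and the consensus representation are deliberately omitted, and all function names (`concept_factorization`, `factorize_views`, `reconstruction_error`) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def concept_factorization(X, k, n_iter=200, eps=1e-10, seed=0):
    """Approximate X (d x n, nonnegative, columns are samples) by X @ W @ V.T.

    Assumption: standard multiplicative updates from Xu and Gong (2004).
    Returns W (n x k) and V (n x k); row i of V is the compressed
    representation of sample i.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    K = X.T @ X  # n x n linear kernel between samples
    W = rng.random((n, k))
    V = rng.random((n, k))
    for _ in range(n_iter):
        # eps only guards against division by zero
        W *= (K @ V) / (K @ W @ (V.T @ V) + eps)
        V *= (K @ W) / (V @ (W.T @ K @ W) + eps)
    return W, V


def factorize_views(views, k, max_workers=None):
    """Factorize every view independently; no view needs another's result."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda X: concept_factorization(X, k), views))


def reconstruction_error(X, W, V):
    """Relative Frobenius error of the approximation X @ W @ V.T."""
    return np.linalg.norm(X - X @ W @ V.T) / np.linalg.norm(X)


if __name__ == "__main__":
    # Toy data: two views of the same 30 samples with different feature counts.
    rng = np.random.default_rng(42)
    views = [rng.random((6, 30)), rng.random((10, 30))]
    for X, (W, V) in zip(views, factorize_views(views, k=3)):
        print(X.shape, "relative error:", reconstruction_error(X, W, V))
```

Because each view is factorized without reference to the others, the per-view calls could equally be dispatched to separate processes or cluster nodes; the thread pool here is only a stand-in for whatever distributed runtime is used.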

Notes

  1. Based on our prior work [32], the value of \(\hat{t}\) is less than 50.

  2. http://lig-membres.imag.fr/grimal/data.html.

  3. http://mlg.ucd.ie/datasets/3sources.html.

  4. http://elki.dbs.ifi.lmu.de/wiki/DataSets/MultiView.

  5. http://mlg.ucd.ie/datasets/segment.html.

  6. https://cs.nyu.edu//~roweis/data.html.

  7. http://www.svcl.ucsd.edu/projects/crossmodal/.

References

  1. Appice A, Malerba D (2016) A co-training strategy for multiple view clustering in process mining. IEEE Trans Serv Comput 9(6):832–845

  2. Belkin M, Niyogi P, Sindhwani V (2006) Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. J Mach Learn Res 7:2399–2434

  3. Cai D, He X, Han J (2011a) Locally consistent concept factorization for document clustering. IEEE Trans Knowl Data Eng 23(6):902–913

  4. Cai D, He X, Han J, Huang TS (2011b) Graph regularized nonnegative matrix factorization for data representation. IEEE Trans Pattern Anal Mach Intell 33(8):1548–1560

  5. Cai X, Nie F, Huang H (2013) Multi-view K-means clustering on big data. In: Proceedings of the international joint conferences on artificial intelligence, pp 2598–2604

  6. Chao G, Sun S, Bi J (2017) A survey on multi-view clustering. arXiv:1712.06246

  7. Chen J, Li K, Tang Z, Bilal K, Yu S, Weng C, Li K (2017) A parallel random forest algorithm for big data in a Spark cloud computing environment. IEEE Trans Parallel Distrib Syst 28(4):919–933

  8. Chen J, Li K, Bilal K, Zhou X, Li K, Yu P (2018a) A bi-layered parallel training architecture for large-scale convolutional neural networks. IEEE Trans Parallel Distrib Syst. https://doi.org/10.1109/TPDS.2018.2877359

  9. Chen J, Li K, Rong H, Bilal K, Li K, Yu PS (2018b) A periodicity-based parallel time series prediction algorithm in cloud computing environments. Inf Sci. https://doi.org/10.1016/j.ins.2018.06.045

  10. Ding C, Li T, Jordan MI (2010) Convex and semi-nonnegative matrix factorizations. IEEE Trans Pattern Anal Mach Intell 32(1):45–55

  11. Gao H, Nie F, Li X, Huang H (2015) Multi-view subspace clustering. In: Proceedings of the IEEE international conference on computer vision, pp 4238–4246

  12. Hou C, Nie F, Tao H, Yi D (2017) Multi-view unsupervised feature selection with adaptive similarity and view weight. IEEE Trans Knowl Data Eng 29(9):1998–2011

  13. Huang S, Kang Z, Xu Z (2018a) Self-weighted multi-view clustering with soft capped norm. Knowl Based Syst 158:1–8

  14. Huang S, Ren Y, Xu Z (2018b) Robust multi-view data clustering with multi-view capped-norm k-means. Neurocomputing 311:197–208

  15. Huang S, Kang Z, Tsang IW, Xu Z (2019) Auto-weighted multi-view clustering via kernelized graph learning. Pattern Recognit 88:174–184

  16. Hussain SF, Bashir S (2016) Co-clustering of multi-view datasets. Knowl Inf Syst 47(3):1–26

  17. Kumar A, Rai P, Daumé H III (2011) Co-regularized multi-view spectral clustering. In: Proceedings of the advances in neural information processing systems, pp 1413–1421

  18. Lee DD, Seung HS (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401(6755):788

  19. Lee DD, Seung HS (2001) Algorithms for non-negative matrix factorization. In: Proceedings of the advances in neural information processing systems, pp 556–562

  20. Li K, Tang X, Veeravalli B, Li K (2015a) Scheduling precedence constrained stochastic tasks on heterogeneous cluster systems. IEEE Trans Comput 64(1):191–204

  21. Li K, Yang W, Li K (2015b) Performance analysis and optimization for SpMV on GPU using probabilistic modeling. IEEE Trans Parallel Distrib Syst 26(1):196–205

  22. Liu J, Wang C, Gao J, Han J (2013a) Multi-view clustering via joint nonnegative matrix factorization. In: Proceedings of the SIAM international conferences on data mining, pp 252–260

  23. Liu X, Wang L, Yin J, Zhu E, Zhang J (2013b) An efficient approach to integrating radius information into multiple kernel learning. IEEE Trans Cybern 43(2):557–569

  24. Liu X, Zhu X, Li M, Wang L, Tang C, Yin J, Shen D, Wang H, Gao W (2018) Late fusion incomplete multi-view clustering. IEEE Trans Pattern Anal Mach Intell. https://doi.org/10.1109/TPAMI.2018.2879108

  25. Lu C, Yan S, Lin Z (2016) Convex sparse spectral clustering: single-view to multi-view. IEEE Trans Image Process 25(6):2833–2843

  26. Nie F, Li J, Li X (2017) Self-weighted multiview clustering with multiple graphs. In: Proceedings of the international joint conferences on artificial intelligence, pp 2564–2570

  27. Nie F, Cai G, Li J, Li X (2018) Auto-weighted multi-view learning for image clustering and semi-supervised classification. IEEE Trans Image Process 27(3):1501–1511

  28. Sun J, Lu J, Xu T, Bi J (2015) Multi-view sparse co-clustering via proximal alternating linearized minimization. In: Proceedings of the international conference on machine learning, pp 757–766

  29. Tao H, Hou C, Liu X, Liu T, Yi D, Zhu J (2018) Reliable multi-view clustering. In: Proceedings of the AAAI conference on artificial intelligence, pp 4123–4130

  30. Tong M, Chen Y, Zhao M, Bu H, Xi S (2018) A deep discriminative and robust nonnegative matrix factorization network method with soft label constraint. Neural Comput Appl. https://doi.org/10.1007/s00521-018-3554-6

  31. Tzortzis G, Likas A (2012) Kernel-based weighted multi-view clustering. In: Proceedings of the international conferences on data mining, pp 675–684

  32. Wang H, Yang Y, Li T (2016) Multi-view clustering via concept factorization with local manifold regularization. In: Proceedings of the international conferences on data mining, pp 1245–1250

  33. Wang H, Yang Y, Liu B (2019a) GMC: graph-based multi-view clustering. IEEE Trans Knowl Data Eng. https://doi.org/10.1109/TKDE.2019.2903810

  34. Wang H, Yang Y, Liu B, Fujita H (2019b) A study of graph-based system for multi-view clustering. Knowl Based Syst 163:1009–1019

  35. Wang Y, Lin X, Wu L, Zhang W, Zhang Q, Huang X (2015a) Robust subspace clustering for multi-view data by exploiting correlation consensus. IEEE Trans Image Process 24(11):3939–3949

  36. Wang Y, Liu X, Dou Y, Li R (2017) Multiple kernel clustering framework with improved kernels. In: Proceedings of the international joint conferences on artificial intelligence, pp 2999–3005

  37. Wang Z, Kong X, Fu H, Li M, Zhang Y (2015b) Feature extraction via multi-view non-negative matrix factorization with local graph regularization. In: IEEE international conference image processing, pp 3500–3504

  38. Xia R, Pan Y, Du L, Yin J (2014) Robust multi-view spectral clustering via low-rank and sparse decomposition. In: Proceedings of the AAAI conference on artificial intelligence, pp 2149–2155

  39. Xu C, Tao D, Xu C (2015) Multi-view learning with incomplete views. IEEE Trans Image Process 24(12):5812–5825

  40. Xu W, Gong Y (2004) Document clustering by concept factorization. In: ACM SIGIR conference on research and development in information retrieval, pp 202–209

  41. Yang S, Hou C, Zhang C, Wu Y (2013) Robust non-negative matrix factorization via joint sparse and graph regularization for transfer learning. Neural Comput Appl 23(2):541–559

  42. Yang Y, Wang H (2018) Multi-view clustering: a survey. Big Data Min Anal 1(2):83–107

  43. Yang Y, Teng F, Li T, Wang H, Wang H, Zhang Q (2018) Parallel semi-supervised multi-ant colonies clustering ensemble based on MapReduce methodology. IEEE Trans Cloud Comput 6(3):857–867

  44. Zhan K, Chang X, Guan J, Chen L, Ma Z, Yang Y (2018) Adaptive structure discovery for multimedia analysis using multiple features. IEEE Trans Cybern 49(5):1826–1834

  45. Zong L, Zhang X, Zhao L, Yu H, Zhao Q (2017) Multi-view clustering via multi-manifold regularized non-negative matrix factorization. Neural Netw 88:74–89

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant No. 61572407, and the Seeding Project of Scientific and Technological Innovation in Sichuan Province of China under Grant No. 2018102.

Author information

Corresponding author

Correspondence to Yan Yang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, H., Yang, Y., Zhang, X. et al. Parallel multi-view concept clustering in distributed computing. Neural Comput & Applic 32, 5621–5631 (2020). https://doi.org/10.1007/s00521-019-04243-4

Keywords

  • Multi-view clustering
  • Concept factorization
  • Manifold learning
  • Distributed computing