Adaptive local learning regularized nonnegative matrix factorization for data clustering

  • Yongpan Sheng
  • Meng Wang
  • Tianxing Wu
  • Han Xu
Article

Abstract

Data clustering aims to group input data instances into clusters such that the instances within a cluster are highly similar to one another. It is a fundamental task, either as an end in itself or as an intermediate step, in machine learning, pattern recognition, and information retrieval. Clustering algorithms based on graph regularization have attracted considerable interest over the past couple of decades. The performance of these approaches is largely determined by the data similarity matrix, which is usually computed by a predefined model with a carefully tuned combination of parameters. A similarity matrix fixed in this way lacks flexibility and may not be optimal in practice. In this paper, we consider both discriminative information and the data manifold from a matrix factorization point of view, and propose an adaptive local learning regularized nonnegative matrix factorization (ALLRNMF) approach for data clustering. The approach assumes that instance pairs separated by a smaller distance should have a larger probability of being assigned as probabilistic neighbors. Under this assumption, ALLRNMF simultaneously learns the data similarity matrix and performs the nonnegative matrix factorization. The constraint on the similarity matrix encodes both the discriminative information and the learned adaptive local structure, which benefits clustering on the manifold. To solve the resulting optimization problem, we propose an effective alternating optimization algorithm that decomposes the objective function into several subproblems, each of which has an optimal solution; the convergence of the algorithm is theoretically guaranteed. Experiments on real-world benchmark datasets demonstrate the superior performance of our approach compared with existing clustering approaches.
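
The two ingredients named in the abstract, a similarity matrix in which closer instances receive larger neighbor probabilities and a graph regularized nonnegative matrix factorization, can be illustrated with a minimal sketch. This is not the ALLRNMF algorithm itself, whose objective and update rules are not given in the abstract. As stated assumptions, the sketch uses a closed-form k-nearest "adaptive neighbors" weighting for the similarity matrix and standard multiplicative updates for graph regularized NMF, and it computes the similarity matrix once instead of alternating between the two steps. The function names and the parameters `k`, `lam`, and `n_iter` are hypothetical.

```python
import numpy as np


def adaptive_neighbor_similarity(X, k=5):
    """Row-stochastic similarity matrix (assumed closed form, not ALLRNMF's).

    X has shape (n_samples, n_features) and needs at least k + 2 rows.
    Each row of the result has at most k nonzeros, sums to 1, and gives
    larger weights to closer neighbors.
    """
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped against round-off.
    D = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(D, np.inf)  # a point is never its own neighbor
    S = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(D[i])[: k + 1]  # k nearest plus the (k+1)-th
        d = D[i, idx]
        denom = k * d[k] - d[:k].sum()
        if denom <= 1e-12:
            S[i, idx[:k]] = 1.0 / k  # degenerate case: equal distances
        else:
            S[i, idx[:k]] = (d[k] - d[:k]) / denom
    return S


def graph_regularized_nmf(X, S, n_clusters, lam=1.0, n_iter=200, seed=0):
    """Multiplicative updates for min ||X^T - U V^T||^2 + lam * tr(V^T L V).

    X must be nonnegative. L = Dg - W is the graph Laplacian of the
    symmetrized similarity matrix W. Returns U (features x clusters)
    and V (samples x clusters).
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = (S + S.T) / 2.0
    Dg = np.diag(W.sum(axis=1))
    A = X.T  # features x samples
    U = rng.random((m, n_clusters))
    V = rng.random((n, n_clusters))
    eps = 1e-10  # guards against division by zero
    for _ in range(n_iter):
        U *= (A @ V) / (U @ (V.T @ V) + eps)
        V *= (A.T @ U + lam * (W @ V)) / (V @ (U.T @ U) + lam * (Dg @ V) + eps)
    return U, V


if __name__ == "__main__":
    X = np.random.default_rng(0).random((30, 4))  # toy nonnegative data
    S = adaptive_neighbor_similarity(X, k=5)
    U, V = graph_regularized_nmf(X, S, n_clusters=3)
    labels = V.argmax(axis=1)  # cluster = largest coefficient per sample
    print(labels)
```

In this sketch the cluster label of each instance is read off as the index of its largest coefficient in `V`. An approach of the kind described above would instead re-estimate the similarity matrix and the factors in alternation.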

Keywords

  • Adaptive local structure learning
  • Manifold regularization
  • Nonnegative matrix factorization
  • Data clustering

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
  2. School of Computer Science and Engineering, Southeast University, Nanjing, China
  3. School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
