Knowledge and Information Systems, Volume 36, Issue 3, pp 629–651

A nonnegative matrix factorization framework for semi-supervised document clustering with dual constraints

  • Huifang Ma
  • Weizhong Zhao
  • Zhongzhi Shi
Regular Paper

Abstract

In this paper, we propose a new semi-supervised co-clustering algorithm, Orthogonal Semi-Supervised Nonnegative Matrix Factorization (OSS-NMF), for document clustering. The approach incorporates two kinds of prior knowledge into the NMF co-clustering framework: domain knowledge about data points (documents), in the form of pair-wise constraints, and category knowledge about features (words). Under this framework, clustering is formulated as the problem of finding a local minimizer of an objective function that takes this dual prior knowledge into account. We derive the update rules and design an iterative algorithm for the co-clustering process. We prove the correctness and convergence of the algorithm, demonstrating its mathematical rigor. Our experimental evaluations show that the proposed document clustering model achieves remarkable performance improvements when these constraints are used.
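To make the idea of co-clustering by iterative update rules concrete, the following is a minimal sketch of a plain, unconstrained nonnegative matrix tri-factorization X ≈ F S Gᵀ with standard multiplicative updates. It is not the OSS-NMF algorithm described in the abstract: it omits the orthogonality conditions, the pair-wise document constraints, and the word-level category knowledge. The function names (`tri_factorize`, `scale`, and so on), the toy matrix, and the choice of two document clusters and two word clusters are assumptions made purely for illustration.

```python
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def scale(M, numer, denom, eps=1e-9):
    """Multiplicative step M * numer / denom, elementwise; entries stay nonnegative."""
    return [[m * n / (d + eps) for m, n, d in zip(rm, rn, rd)]
            for rm, rn, rd in zip(M, numer, denom)]

def squared_error(X, F, S, G):
    """Squared Frobenius norm of X - F S G^T."""
    approx = matmul(matmul(F, S), transpose(G))
    return sum((x - a) ** 2 for rx, ra in zip(X, approx) for x, a in zip(rx, ra))

def tri_factorize(X, k_docs, k_words, iters=200, seed=0):
    """Unconstrained tri-factorization of a document-by-word matrix X (illustrative only)."""
    rng = random.Random(seed)

    def rand(r, c):
        return [[rng.random() + 0.1 for _ in range(c)] for _ in range(r)]

    n, m = len(X), len(X[0])
    F, S, G = rand(n, k_docs), rand(k_docs, k_words), rand(m, k_words)
    errors = [squared_error(X, F, S, G)]
    for _ in range(iters):
        # F <- F * (X G S^T) / (F S G^T G S^T)
        GSt = matmul(G, transpose(S))
        F = scale(F, matmul(X, GSt), matmul(F, matmul(transpose(GSt), GSt)))
        # G <- G * (X^T F S) / (G S^T F^T F S)
        FS = matmul(F, S)
        G = scale(G, matmul(transpose(X), FS), matmul(G, matmul(transpose(FS), FS)))
        # S <- S * (F^T X G) / (F^T F S G^T G)
        Ft = transpose(F)
        numer = matmul(matmul(Ft, X), G)
        denom = matmul(matmul(matmul(Ft, F), S), matmul(transpose(G), G))
        S = scale(S, numer, denom)
        errors.append(squared_error(X, F, S, G))
    return F, S, G, errors

# Toy document-by-word count matrix (made-up data): rows are documents, columns are words.
X = [[3, 2, 0, 0],
     [2, 3, 0, 0],
     [0, 0, 2, 3],
     [0, 0, 3, 2]]
F, S, G, errors = tri_factorize(X, k_docs=2, k_words=2)

# Each document is assigned to the cluster with the largest entry in its row of F.
labels = [max(range(len(row)), key=row.__getitem__) for row in F]
```

In this sketch the rows of F act as soft document-cluster memberships and the rows of G as soft word-cluster memberships, so documents and words are clustered at the same time. A semi-supervised variant such as the one the abstract describes would add terms for the prior knowledge to the objective, which changes the update rules accordingly.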

Keywords

Nonnegative matrix factorization, Semi-supervised clustering, Dual constraints, Pair-wise constraints, Word-level constraints

Notes

Acknowledgments

This work is supported by the National Natural Science Foundation of China (No. 61163039, 61105052), National Basic Research Priorities Programme (No. 2007CB311004), Funding of enhancement of young teachers’ research of Northwest Normal University (No. NWNU-LKQN-10-1), Doctoral Start-up Funding of Xiangtan University (No. 10QDZ42).

Copyright information

© Springer-Verlag London 2012

Authors and Affiliations

  1. College of Computer Science and Engineering, Northwest Normal University, Lanzhou, China
  2. College of Information Engineering, Xiangtan University, Xiangtan, China
  3. The Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
