Subtractive Initialization of Nonnegative Matrix Factorizations for Document Clustering
Nonnegative matrix factorizations (NMF) have recently assumed an important role in several fields, such as pattern recognition, automated image exploitation and data clustering. They provide a reduced representation of multivariate data that uses additive components only, which makes it possible to learn parts-based representations of the data. Because all algorithms for computing an NMF are iterative and converge only locally, particular emphasis must be placed on a proper initialization. Selecting appropriate starting matrices becomes more complex when the data carry a specific meaning, as is the case in document clustering. In this paper, we present a new initialization method, based on the fuzzy subtractive clustering scheme, that generates initial matrices for NMF algorithms. A preliminary comparison of the proposed initialization with other commonly adopted initializations is presented for NMF algorithms applied to document clustering.
Keywords: Initialization Strategy, Document Clustering, Alternating Least Squares, Subtractive Clustering, Initial Matrices
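The abstract does not spell out the proposed scheme, so the following is only a minimal sketch of the general idea under stated assumptions: a Chiu-style subtractive clustering step picks k representative documents, these columns seed the basis matrix W, and a standard multiplicative-update NMF (Frobenius objective) refines the factors. The function names, the radius `r_a`, the fixed number of centers, the choice of initial H, and the toy matrix are all hypothetical and are not taken from the paper.

```python
import numpy as np


def subtractive_centers(X, k, r_a=0.5):
    """Return indices of k rows of X chosen by subtractive clustering.

    Rows are scaled to unit length so that one radius r_a is meaningful.
    Stopping after exactly k centers is a simplification (assumption);
    Chiu's original scheme uses accept/reject thresholds instead.
    """
    Xn = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    sq = np.sum(Xn**2, axis=1)
    # Pairwise squared Euclidean distances between normalized points.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Xn @ Xn.T, 0.0)
    alpha = 4.0 / r_a**2
    beta = 4.0 / (1.5 * r_a) ** 2
    potential = np.exp(-alpha * d2).sum(axis=1)
    chosen = []
    for _ in range(k):
        c = int(np.argmax(potential))
        chosen.append(c)
        # Subtract the influence of the new center from every potential.
        potential = potential - potential[c] * np.exp(-beta * d2[c])
        potential[chosen] = -np.inf  # never pick the same point twice
    return chosen


def nmf_multiplicative(A, W0, H0, n_iter=200, eps=1e-9):
    """Multiplicative updates for min ||A - W H||_F subject to W, H >= 0."""
    W, H = W0.copy(), H0.copy()
    for _ in range(n_iter):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy term-by-document matrix (rows: terms, columns: documents), made up for
# illustration: documents 0-1 share one vocabulary, documents 2-3 another.
A = np.array([[3.0, 2.0, 0.0, 0.0],
              [2.0, 3.0, 0.0, 1.0],
              [0.0, 0.0, 4.0, 3.0],
              [0.0, 1.0, 3.0, 4.0]])
k = 2

centers = subtractive_centers(A.T, k)  # documents are the points to cluster
W0 = A[:, centers] + 1e-3              # small offset so no entry is locked at zero
H0 = W0.T @ A                          # nonnegative because W0 and A are
W, H = nmf_multiplicative(A, W0, H0)

# Assign each document to its dominant component, after rescaling the rows
# of H by the column norms of W to remove the scaling ambiguity.
labels = np.argmax(H * np.linalg.norm(W, axis=0)[:, None], axis=0)
print(centers, labels)
```

Seeding W with actual document columns keeps the initial basis nonnegative and interpretable as prototype documents; the small positive offset matters because multiplicative updates can never move an entry away from exactly zero.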