Abstract
Recently, many topic models such as Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF) have made important progress towards generating high-level knowledge from a large corpus. However, because these algorithms rely on random initialization, they generate different results on the same corpus with the same parameters, a problem known as instability. Ensembles of NMF are known to be much more stable and accurate than individual NMFs, but training multiple NMFs for ensembling is computationally expensive. In this paper, we propose a novel scheme that achieves the seemingly contradictory goal of ensembling multiple NMFs without any additional training cost. We train a single NMF with a cyclical learning rate schedule, so that it can converge to several local minima along its optimization path. Each time the model converges, we save the result to the ensemble and then restart the optimization with a large learning rate that helps it escape the current local minimum. Experiments on text corpora, assessed with a number of measures, show that our method reduces instability at no additional training cost while simultaneously yielding more accurate topic models than traditional single and ensemble methods.
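The training loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes projected gradient descent on the objective 0.5·||X − WH||² (Frobenius norm), a cosine-annealed schedule that restarts at its maximum every cycle, step sizes normalized by the spectral norms of HHᵀ and WᵀW, and fixed-length cycles in place of a convergence test. The names `cyclic_lr`, `frob_error`, and `snapshot_nmf` are hypothetical. Whether a restart actually leaves the current basin depends on the data and on `lr_max`; with the default `lr_max=1.0` every step is a descent step, so the sketch mainly shows the schedule and the snapshot bookkeeping.

```python
import numpy as np


def cyclic_lr(step, steps_per_cycle, lr_max):
    """Cosine-annealed rate that jumps back to lr_max at the start of each cycle."""
    t = step % steps_per_cycle
    return 0.5 * lr_max * (1.0 + np.cos(np.pi * t / steps_per_cycle))


def frob_error(X, W, H):
    """Frobenius norm of the reconstruction residual X - WH."""
    return float(np.linalg.norm(X - W @ H))


def snapshot_nmf(X, k, n_cycles=5, steps_per_cycle=100, lr_max=1.0, seed=0):
    """Train one NMF and return one (W, H) snapshot per learning-rate cycle.

    Assumption: lr is a multiple of the safe step 1/L for each factor, where L
    is the spectral norm of H H^T (for W) or W^T W (for H).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_terms = X.shape
    W = rng.random((n_docs, k))
    H = rng.random((k, n_terms))
    eps = 1e-12  # guards against division by zero if a factor collapses
    snapshots = []
    for step in range(n_cycles * steps_per_cycle):
        lr = cyclic_lr(step, steps_per_cycle, lr_max)
        # Projected gradient step on W: the gradient is (WH - X) H^T.
        L_w = np.linalg.norm(H @ H.T, 2) + eps
        W = np.maximum(W - (lr / L_w) * ((W @ H - X) @ H.T), 0.0)
        # Projected gradient step on H, using the updated W.
        L_h = np.linalg.norm(W.T @ W, 2) + eps
        H = np.maximum(H - (lr / L_h) * (W.T @ (W @ H - X)), 0.0)
        # The rate is lowest at the end of a cycle, so save a snapshot there.
        if (step + 1) % steps_per_cycle == 0:
            snapshots.append((W.copy(), H.copy()))
    return snapshots
```

Calling `snapshot_nmf(X, k=4, n_cycles=3)` on a non-negative document-term matrix `X` returns three factor pairs from a single training run. How the snapshots are combined into a final topic model is not specified in the abstract, so that step is left out here.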
Acknowledgements
This research is partially supported by the National Natural Science Foundation of China under grants 61703362, 61702441, and 61402203; the Natural Science Foundation of Jiangsu Province of China under grants BK20170513 and BK20161338; the Natural Science Foundation of the Higher Education Institutions of Jiangsu Province of China under grant 17KJB520045; and the Science and Technology Planning Project of Yangzhou of China under grant YZ2016238.
Cite this article
Qiang, J., Li, Y., Yuan, Y. et al. Snapshot ensembles of non-negative matrix factorization for stability of topic modeling. Appl Intell 48, 3963–3975 (2018). https://doi.org/10.1007/s10489-018-1192-4
DOI: https://doi.org/10.1007/s10489-018-1192-4