Knowledge and Information Systems, Volume 56, Issue 3, pp 503–531

Localized user-driven topic discovery via boosted ensemble of nonnegative matrix factorization

  • Sangho Suh
  • Sungbok Shin
  • Joonseok Lee
  • Chandan K. Reddy
  • Jaegul Choo
Regular Paper


Nonnegative matrix factorization (NMF) has been widely used in topic modeling of large-scale document corpora, where a set of underlying topics is extracted as a low-rank factor matrix from NMF. However, the resulting topics often convey only general, and thus redundant, information about the documents rather than information that may be minor but potentially meaningful to users. To address this problem, we present a novel ensemble method based on NMF that discovers meaningful local topics. Our method brings the idea of an ensemble model, which has shown advantages in supervised learning, into an unsupervised topic modeling context. That is, our model successively performs NMF on a residual matrix obtained from previous stages and generates a sequence of topic sets. The update algorithm we employ is novel in two respects. First, it uses a residual matrix inspired by a state-of-the-art gradient boosting model. Second, it applies a sophisticated local weighting scheme to the given matrix to enhance the locality of topics, which in turn delivers high-quality, focused topics of interest to users. We subsequently extend this ensemble model with keyword- and document-based user interaction to support user-driven topic discovery.
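The stagewise idea described above, running NMF repeatedly on the residual left by earlier stages, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `nmf` and `residual_ensemble_nmf`, the multiplicative-update solver, and the clipping of negative residual entries to zero are all assumptions made for illustration, and the local weighting scheme and user interaction are omitted because the abstract does not specify them.

```python
import numpy as np


def nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Plain multiplicative-update NMF: X (m x n) ~ W (m x k) @ H (k x n).

    Assumed solver for illustration; the abstract does not name one.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(n_iter):
        # Multiplicative updates keep W and H nonnegative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def residual_ensemble_nmf(X, k, n_stages, n_iter=200):
    """Run NMF stage by stage, each time on the residual of earlier stages.

    Returns one (W, H) pair per stage plus the final residual.
    Hypothetical helper: no local weighting is applied here.
    """
    R = np.asarray(X, dtype=float).copy()
    stages = []
    for s in range(n_stages):
        W, H = nmf(R, k, n_iter=n_iter, seed=s)
        stages.append((W, H))
        # Remove what this stage explained; clip so the next input
        # stays nonnegative (an assumption of this sketch).
        R = np.maximum(R - W @ H, 0.0)
    return stages, R
```

With a term-by-document matrix `X`, each column of a stage's `W` can be read as one topic (a weighting over terms), so `n_stages` stages of rank `k` yield `n_stages * k` topics in total. Because every stage subtracts a nonnegative approximation and clips at zero, later stages only see what earlier ones left unexplained.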


Topic modeling, Ensemble learning, Matrix factorization, Gradient boosting, Local weighting



This work was supported in part by the National Science Foundation Grants IIS-1707498, IIS-1619028, and IIS-1646881 and by the Basic Science Research Program through the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. NRF-2016R1C1B2015924). Any opinions, findings, and conclusions or recommendations expressed here are those of the authors and do not necessarily reflect the views of the funding agencies.



Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2018

Authors and Affiliations

  • Sangho Suh (1)
  • Sungbok Shin (2)
  • Joonseok Lee (3)
  • Chandan K. Reddy (4)
  • Jaegul Choo (2)

  1. David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Canada
  2. Department of Computer Science and Engineering, Korea University, Seoul, South Korea
  3. Google Research, Mountain View, USA
  4. Department of Computer Science, Virginia Tech, Arlington, USA
