Coupled Hierarchical Dirichlet Process Mixtures for Simultaneous Clustering and Topic Modeling

  • Masamichi Shimosaka
  • Takeshi Tsukiji
  • Shoji Tominaga
  • Kota Tsubouchi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9852)


We propose a nonparametric Bayesian mixture model for grouped data that simultaneously optimizes topic extraction and group clustering while allowing all topics to be shared by all clusters. To make the model computationally efficient enough for today’s large-scale data, we formulate it so that the posterior distribution can be approximated with a closed-form variational Bayesian method. Experimental results on corpus data show that our model outperforms existing models, achieving a 22% improvement over a state-of-the-art model. Moreover, an experiment with location data from mobile phones shows that our model performs well in big data analysis.


Non-parametric Bayes, Clustering, Hierarchical model, Topic modeling



We thank Tengfei Ma, Issei Sato, and Hiroshi Nakagawa for providing the hNHDP implementation. This work was partly supported by CREST, JST.



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Masamichi Shimosaka (1)
  • Takeshi Tsukiji (2)
  • Shoji Tominaga (2)
  • Kota Tsubouchi (3)
  1. Tokyo Institute of Technology, Tokyo, Japan
  2. The University of Tokyo, Tokyo, Japan
  3. Yahoo Japan Corporation, Tokyo, Japan
