Sampling Table Configurations for the Hierarchical Poisson-Dirichlet Process

  • Changyou Chen
  • Lan Du
  • Wray Buntine
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6911)

Abstract

Hierarchical modeling and reasoning are fundamental to machine intelligence, and here the two-parameter Poisson-Dirichlet Process (PDP) plays an important role. The most popular MCMC sampling algorithms for the hierarchical PDP and the hierarchical Dirichlet Process perform incremental sampling based on the Chinese restaurant metaphor, which originates from the Chinese restaurant process (CRP). In this paper, using the same metaphor, we propose a new table representation for hierarchical PDPs by introducing an auxiliary latent variable, called the table indicator, which records which customer takes responsibility for starting a new table. The new representation retains full exchangeability, an essential condition for a correct Gibbs sampling algorithm. Based on this representation, we develop a block Gibbs sampling algorithm that jointly samples a data item and its table contribution. We test this on the hierarchical Dirichlet process variant of latent Dirichlet allocation (HDP-LDA) developed by Teh, Jordan, Beal and Blei. Experimental results show that the proposed algorithm outperforms their “posterior sampling by direct assignment” algorithm in both out-of-sample perplexity and convergence speed. The representation can be used with many other hierarchical PDP models.
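
To make the Chinese restaurant metaphor concrete: under the two-parameter PDP with discount a and concentration b, the (n+1)-th customer joins an existing table k holding c_k customers with probability (c_k - a)/(n + b), and opens a new table with probability (b + a*K)/(n + b), where K is the current number of tables. The Python sketch below simulates this standard seating rule and records, per customer, a 0/1 flag marking whether that customer opened a new table, loosely mirroring the table-indicator idea described in the abstract. The function crp_pdp_seating, its parameter names, and the 0/1 encoding are illustrative assumptions for exposition, not the paper's actual representation or sampler.

    import random

    def crp_pdp_seating(n_customers, discount=0.5, concentration=1.0, seed=0):
        """Simulate seating under the two-parameter Poisson-Dirichlet CRP.

        Returns table occupancy counts and, for each customer, a 0/1
        "table indicator" marking whether that customer opened a new table.
        (Illustrative sketch only, not the paper's sampler.)
        """
        rng = random.Random(seed)
        tables = []       # tables[k] = number of customers at table k
        indicators = []   # indicators[i] = 1 if customer i started a table

        for n in range(n_customers):
            k_tables = len(tables)
            # P(join table k) is proportional to (c_k - discount);
            # P(new table) is proportional to (concentration + discount * K).
            weights = [c - discount for c in tables]
            weights.append(concentration + discount * k_tables)
            total = n + concentration  # the weights sum to n + concentration
            r = rng.uniform(0, total)
            cum = 0.0
            choice = k_tables  # default: open a new table
            for k, w in enumerate(weights):
                cum += w
                if r < cum:
                    choice = k
                    break
            if choice == k_tables:
                tables.append(1)
                indicators.append(1)  # this customer starts a new table
            else:
                tables[choice] += 1
                indicators.append(0)

        return tables, indicators

    if __name__ == "__main__":
        tables, indicators = crp_pdp_seating(100, discount=0.5, concentration=1.0)
        print("table sizes:", tables)
        print("number of tables:", sum(indicators))

A larger discount produces a heavier (power-law) tail of table sizes, which is why the two-parameter PDP is preferred over the Dirichlet Process for text data.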

Keywords

Hierarchical Poisson-Dirichlet processes · Dirichlet processes · HDP-LDA · block Gibbs sampler

References

  1. Teh, Y.W.: A hierarchical Bayesian language model based on Pitman-Yor processes. In: ACL 2006, pp. 985–992 (2006)
  2. Goldwater, S., Griffiths, T., Johnson, M.: Interpolating between types and tokens by estimating power-law generators. In: NIPS 2006, pp. 459–466 (2006)
  3. Mochihashi, D., Sumita, E.: The infinite Markov model. In: NIPS 2008, pp. 1017–1024 (2008)
  4. Johnson, M., Griffiths, T., Goldwater, S.: Adaptor grammars: A framework for specifying compositional nonparametric Bayesian models. In: NIPS 2007, pp. 641–648 (2007)
  5. Wallach, H., Sutton, C., McCallum, A.: Bayesian modeling of dependency trees using hierarchical Pitman-Yor priors. In: Proceedings of the Workshop on Prior Knowledge for Text and Language (in conjunction with ICML/UAI/COLT), pp. 15–20 (2008)
  6. Wood, F., Archambeau, C., Gasthaus, J., James, L., Teh, Y.: A stochastic memoizer for sequence data. In: ICML 2009, pp. 1129–1136 (2009)
  7. Rasmussen, C.: The infinite Gaussian mixture model. In: NIPS 2000, pp. 554–560 (2000)
  8. Pruteanu-Malinici, I., Ren, L., Paisley, J., Wang, E., Carin, L.: Hierarchical Bayesian modeling of topics in time-stamped documents. TPAMI 32, 996–1011 (2010)
  9. Xu, Z., Tresp, V., Yu, K., Kriegel, H.P.: Infinite hidden relational models. In: UAI 2006, pp. 544–551 (2006)
  10. Teh, Y.W., Jordan, M.I.: Hierarchical Bayesian nonparametric models with applications. In: Bayesian Nonparametrics: Principles and Practice (2010)
  11. Ishwaran, H., James, L.: Gibbs sampling methods for stick-breaking priors. Journal of the ASA 96, 161–173 (2001)
  12. Buntine, W., Jakulin, A.: Discrete components analysis. In: Subspace, Latent Structure and Feature Selection Techniques (2006)
  13. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
  14. Teh, Y.W., Jordan, M.I., Beal, M.J., Blei, D.M.: Hierarchical Dirichlet processes. Journal of the ASA 101, 1566–1581 (2006)
  15. Du, L., Buntine, W., Jin, H.: A segmented topic model based on the two-parameter Poisson-Dirichlet process. Mach. Learn. 81, 5–19 (2010)
  16. Du, L., Buntine, W., Jin, H.: Sequential latent Dirichlet allocation: Discover underlying topic structures within a document. In: ICDM 2010, pp. 148–157 (2010)
  17. Pitman, J., Yor, M.: The two-parameter Poisson-Dirichlet distribution derived from a stable subordinator. Annals Prob. 25, 855–900 (1997)
  18. Buntine, W., Hutter, M.: A Bayesian review of the Poisson-Dirichlet process. Technical Report arXiv:1007.0296, NICTA and ANU, Australia (2010)
  19. Teh, Y.: A Bayesian interpretation of interpolated Kneser-Ney. Technical Report TRA2/06, School of Computing, National University of Singapore (2006)
  20. Buntine, W., Du, L., Nurmi, P.: Bayesian networks on Dirichlet distributed vectors. In: PGM 2010, pp. 33–40 (2010)
  21. Blei, D.M., Griffiths, T.L., Jordan, M.I.: The nested Chinese restaurant process and Bayesian nonparametric inference of topic hierarchies. J. ACM 57, 1–30 (2010)
  22. Teh, Y.: Nonparametric Bayesian mixture models - release 2.1. Technical Report, University College London (2004), http://www.gatsby.ucl.ac.uk/~ywteh/research/software.html
  23. Wang, C., Blei, D.: Variational inference for the nested Chinese restaurant process. In: NIPS 2009, pp. 1990–1998 (2009)
  24. Wang, C., Paisley, J., Blei, D.: Online variational inference for the hierarchical Dirichlet process. In: AISTATS 2011 (2011)
  25. Teh, Y., Kurihara, K., Welling, M.: Collapsed variational inference for HDP. In: NIPS 2007 (2007)
  26. Blunsom, P., Cohn, T., Goldwater, S., Johnson, M.: A note on the implementation of hierarchical Dirichlet processes. In: ACL 2009, pp. 337–340 (2009)
  27. Buntine, W.: Estimating likelihoods for topic models. In: Zhou, Z.-H., Washio, T. (eds.) ACML 2009. LNCS, vol. 5828, pp. 51–64. Springer, Heidelberg (2009)
  28. Wallach, H., Murray, I., Salakhutdinov, R., Mimno, D.: Evaluation methods for topic models. In: ICML 2009, pp. 672–679 (2009)

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Changyou Chen (1, 2)
  • Lan Du (1, 2)
  • Wray Buntine (1, 2)

  1. Research School of Computer Science, The Australian National University, Canberra, Australia
  2. National ICT Australia, Canberra, Australia