Sequential Entity Group Topic Model for Getting Topic Flows of Entity Groups within One Document

  • Young-Seob Jeong
  • Ho-Jin Choi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7301)

Abstract

Topic mining is regarded as a powerful method for analyzing documents, and topic models are used to annotate relationships or to obtain a topic flow. The aim of this paper is to obtain topic flows of entities and entity groups within a single document. We propose two topic models: the Entity Group Topic Model (EGTM) and the Sequential Entity Group Topic Model (S-EGTM). These models make two contributions. First, the topic distributions of entities and entity groups can be analyzed. Second, the topic flow of each entity or entity group can be captured across the segments of a single document. We develop collapsed Gibbs sampling methods for approximate inference in the models. In experiments, we demonstrate the models through topic analysis, prediction performance, and the topic flows over segments within a single document.

Keywords

Sequential topic model, Poisson-Dirichlet process, entity group

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Young-Seob Jeong (1)
  • Ho-Jin Choi (1)
  1. Department of Computer Science, KAIST, Yuseong-gu, Korea (South)
