Abstract
In this paper, we propose the Topical Communities and Personal Interest (TCPI) model for simultaneously modeling topics, topical communities, and users’ topical interests in microblogging data. TCPI accounts for multiple topical communities, differentiates each user’s personal topical interests from those of the communities, and learns how much each user depends on the affiliated communities when generating content. This distinguishes TCPI from existing models, which either do not consider the existence of multiple topical communities or do not differentiate between personal and community topical interests. Our experiments on two Twitter datasets show that TCPI can effectively mine the representative topics for each topical community. We also demonstrate that TCPI significantly outperforms other state-of-the-art topic models in the task of modeling tweet generation.
Copyright information
© 2014 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Hoang, TA., Lim, EP. (2014). On Joint Modeling of Topical Communities and Personal Interest in Microblogs. In: Aiello, L.M., McFarland, D. (eds) Social Informatics. SocInfo 2014. Lecture Notes in Computer Science, vol 8851. Springer, Cham. https://doi.org/10.1007/978-3-319-13734-6_1
DOI: https://doi.org/10.1007/978-3-319-13734-6_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13733-9
Online ISBN: 978-3-319-13734-6
eBook Packages: Computer Science, Computer Science (R0)