Modeling Community Structure and Topics in Dynamic Text Networks

Journal of Classification

Abstract

The last decade has seen great progress in both dynamic network modeling and topic modeling. This paper draws upon both areas to create a bespoke Bayesian model, which we apply to a dataset consisting of the top 467 US political blogs in 2012, their posts over the year, and their links to one another. Our model allows dynamic topic discovery to inform the latent network model, and the network structure to facilitate topic identification. We find complex community structure within this set of blogs, where community membership depends strongly upon the set of topics in which the blogger is interested. As notable and easily interpretable empirical examples, we examine the time-varying nature of the Sensational Crime topic and the network properties of the Election News topic.

Notes

  1. Gingrich, Santorum, and Cain all refer to candidates in the 2012 Republican presidential primary.

  2. George Zimmerman shot and killed Trayvon Martin in March of 2012.

  3. Newt Gingrich gradually faded to political irrelevance after a failed presidential primary run.

  4. Trayvon Martin was a young African American man who was shot by George Zimmerman while walking in Zimmerman’s neighborhood; Zimmerman claimed the shooting was an act of self-defense. The Aurora theater massacre was a mass shooting at a movie theater in Aurora, Colorado. The Sikh Temple shooting was a mass shooting at a Sikh temple in Wisconsin. The Sandy Hook massacre was a mass shooting at an elementary school in Connecticut.

Author information

Correspondence to Teague R. Henry.

Cite this article

Henry, T.R., Banks, D., Owens-Oas, D. et al. Modeling Community Structure and Topics in Dynamic Text Networks. J Classif 36, 322–349 (2019). https://doi.org/10.1007/s00357-018-9289-3

  • DOI: https://doi.org/10.1007/s00357-018-9289-3
