Modeling Community Structure and Topics in Dynamic Text Networks

Henry, Teague R.; Banks, David; Owens-Oas, Derek; Chai, Christine

doi:10.1007/s00357-018-9289-3

Published: 22 January 2019

Journal of Classification, volume 36, pages 322–349 (2019)

Abstract

The last decade has seen great progress in both dynamic network modeling and topic modeling. This paper draws upon both areas to create a bespoke Bayesian model applied to a dataset consisting of the top 467 US political blogs in 2012, their posts over the year, and their links to one another. Our model allows dynamic topic discovery to inform the latent network model and the network structure to facilitate topic identification. Our results reveal complex community structure within this set of blogs, where community membership depends strongly upon the set of topics in which the blogger is interested. We examine the time-varying nature of the Sensational Crime topic, as well as the network properties of the Election News topic, as notable and easily interpretable empirical examples.

Notes

Gingrich, Santorum, and Cain all refer to candidates in the 2012 Republican presidential primary.
George Zimmerman shot and killed Trayvon Martin in February of 2012.
Newt Gingrich gradually faded to political irrelevance after a failed presidential primary run.
Trayvon Martin was a young African American man shot by George Zimmerman while walking in Zimmerman’s neighborhood; Zimmerman claimed the shooting was an act of self-defense. The Aurora theater massacre was a mass shooting at a movie theater in Aurora, Colorado. The Sikh Temple shooting was a mass shooting at a Sikh temple in Wisconsin. The Sandy Hook massacre was a mass shooting at an elementary school in Connecticut.

Author information

Authors and Affiliations

University of North Carolina, Chapel Hill, NC, 27599, USA
Teague R. Henry
Duke University, Durham, NC, USA
David Banks, Derek Owens-Oas & Christine Chai

Corresponding author

Correspondence to Teague R. Henry.

About this article

Cite this article

Henry, T.R., Banks, D., Owens-Oas, D. et al. Modeling Community Structure and Topics in Dynamic Text Networks. J Classif 36, 322–349 (2019). https://doi.org/10.1007/s00357-018-9289-3

Issue Date: July 2019
DOI: https://doi.org/10.1007/s00357-018-9289-3
