Uncovering Geo-Social Semantics from the Twitter Mention Network: An Integrated Approach Using Spatial Network Smoothing and Topic Modeling
Advances in human dynamics research and the availability of geo-referenced communication data provide an unprecedented opportunity for studying the semantics of communication and understanding the interplay between online social networks and geography. Among the most extensively studied topics in geographically-embedded communication networks are the effect of geographic proximity on interpersonal communication; the influence of information diffusion and social networks on real-world geographic events such as group activities and demonstrations; and the structural and geographic characteristics of a communication network. However, little is known about how the content of interpersonal communication varies across geographic space. By integrating methods of spatial network smoothing and probabilistic topic modeling, this paper introduces an approach to extracting and visualizing geo-social semantics, i.e., how the semantics of information vary based on the geographic locations of and communication ties among the users. Unlike previous work that examines the geographic variation in the content produced by individuals, the proposed approach focuses on an analysis of reciprocal conversations among individuals in a geographically-embedded communication network. To demonstrate the approach, geo-located mention tweets in the U.S. from Aug. 1, 2015 to Aug. 1, 2016 were analyzed. Topics extracted from the analysis reflect geo-social dynamics of the society, ways of speaking in the context of friendship, linguistic variation, and the use of social media acronyms. Although the tweets were collected during primary and presidential elections, political topics discovered from the reciprocal mentions focused more on civil rights than on the candidates and primaries. While the topic of primary candidates and elections was prominent at the locations of primary elections and of candidates' core supporters, civil rights was a prominent topic across the whole country.
Keywords: Geo-social semantics; Topic modeling; Geographically-embedded social networks; Reciprocal mention tweets