Abstract
Twitter has attracted millions of users who generate a vast and steady flow of information. The research community has thus started proposing tools to extract meaningful information from tweets. In this paper, we take a different angle from the mainstream of previous work: we explicitly target the analysis of the timeline of tweets from “single users”. We define a framework, named TUCAN, to compare the information offered by target users over time and to pinpoint recurrent topics or topics of interest. First, tweets belonging to the same time window are aggregated into “bird songs”. Several filtering procedures can be selected to remove stop-words and reduce noise. Then, each pair of bird songs is compared using a similarity score to automatically highlight the most common terms, thus revealing recurrent or persistent topics. TUCAN can also be applied naturally to compare pairs of bird songs generated from the timelines of different users. Using actual results for both public profiles and anonymous users, we show how TUCAN helps highlight meaningful information from a target user’s Twitter timeline.
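The pipeline outlined in the abstract (window aggregation, stop-word filtering, pairwise similarity) can be illustrated with a minimal sketch. This is not the TUCAN implementation: the abstract does not specify the term weighting or the similarity score, so the sketch assumes raw term counts and cosine similarity. The stop-word list, the 7-day window, and all function names (`tokenize`, `bird_songs`, `cosine_similarity`, `common_terms`) are hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "in", "is", "on", "for", "at"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop-words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def bird_songs(tweets, window_days=7):
    """Aggregate (timestamp, text) tweets into per-window term counts.

    Windows are counted from the earliest tweet; window_days is an assumed
    parameter, not a value taken from the chapter.
    """
    if not tweets:
        return {}
    start = min(ts for ts, _ in tweets)
    songs = defaultdict(Counter)
    for ts, text in tweets:
        index = (ts - start).days // window_days
        songs[index].update(tokenize(text))
    return dict(songs)


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (assumed score)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def common_terms(a, b, top=5):
    """Shared terms, ranked by their contribution to the dot product."""
    shared = {t: a[t] * b[t] for t in a.keys() & b.keys()}
    return sorted(shared, key=lambda t: (-shared[t], t))[:top]


# Made-up timeline of a single user, for demonstration only.
tweets = [
    (datetime(2013, 1, 1), "Election debate tonight"),
    (datetime(2013, 1, 3), "The election is close"),
    (datetime(2013, 1, 9), "Election results in"),
    (datetime(2013, 1, 10), "Football match tonight"),
]
songs = bird_songs(tweets)
print(cosine_similarity(songs[0], songs[1]))
print(common_terms(songs[0], songs[1]))
```

With this choice of score, comparing a bird song with itself yields 1, and two bird songs with no shared terms yield 0. Comparing timelines of different users only requires building the bird songs from each user's tweets and passing one window from each to `cosine_similarity`.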
Notes
- 1. Note that by definition, \(\textit{VS}(u,i) \otimes \textit{VS}(u,i)=1\).
- 2.
© 2014 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Grimaudo, L., Song, H.H., Baldi, M., Mellia, M., Munafò, M. (2014). TUCAN: Twitter User Centric ANalyzer. In: Kawash, J. (eds) Online Social Media Analysis and Visualization. Lecture Notes in Social Networks. Springer, Cham. https://doi.org/10.1007/978-3-319-13590-8_4
DOI: https://doi.org/10.1007/978-3-319-13590-8_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13589-2
Online ISBN: 978-3-319-13590-8
eBook Packages: Computer Science, Computer Science (R0)