Abstract
Topic detection and tracking aims to extract topics from a stream of textual information sources, or documents, and to quantify their “trend” in real time. These techniques are applied to pieces of text, i.e. posts, produced within social media platforms. Topic detection can produce two types of complementary outputs: either the documents themselves are clustered, or terms are selected and then clustered. In the first method, referred to as document-pivot, a topic is represented by a cluster of documents, whereas in the latter, commonly referred to as feature-pivot, a cluster of terms is produced instead. In the following, we review several popular approaches that fall into one of these two categories. Six state-of-the-art methods, namely Latent Dirichlet Allocation (LDA), Document-Pivot Topic Detection (Doc-p), Graph-Based Feature-Pivot Topic Detection (GFeat-p), Frequent Pattern Mining (FPM), Soft Frequent Pattern Mining (SFPM), and BNgram, are described in detail, as they serve as the performance benchmarks for the proposed system.
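The difference between the two output types can be illustrated with a toy sketch. It is not the Doc-p or GFeat-p method described in the chapter: the function names, the cosine threshold, the minimum document frequency, and the use of connected components over a co-occurrence graph are all assumptions chosen for brevity.

```python
from collections import Counter
from itertools import combinations
import math


def tokenize(text):
    # Lowercase whitespace tokenizer; keeps alphabetic tokens only (assumption).
    return [w for w in text.lower().split() if w.isalpha()]


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def document_pivot(posts, threshold=0.5):
    # Document-pivot: a topic is a cluster of documents.
    # Each post joins the cluster whose first post is most similar to it,
    # or starts a new cluster if no similarity reaches the threshold.
    vectors = [Counter(tokenize(p)) for p in posts]
    clusters = []
    for i, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vec, vectors[cluster[0]])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters


def feature_pivot(posts, min_df=2):
    # Feature-pivot: a topic is a cluster of terms.
    # Step 1: select terms that occur in at least min_df posts.
    docs = [set(tokenize(p)) for p in posts]
    df = Counter(t for d in docs for t in d)
    selected = {t for t, n in df.items() if n >= min_df}
    # Step 2: link selected terms that co-occur in the same post.
    adj = {t: set() for t in selected}
    for d in docs:
        for a, b in combinations(sorted(d & selected), 2):
            adj[a].add(b)
            adj[b].add(a)
    # Step 3: each connected component of the term graph is one topic.
    seen, topics = set(), []
    for term in sorted(selected):
        if term in seen:
            continue
        stack, component = [term], set()
        while stack:
            u = stack.pop()
            if u not in component:
                component.add(u)
                stack.extend(adj[u] - component)
        seen |= component
        topics.append(sorted(component))
    return topics
```

On a handful of short posts, `document_pivot` returns lists of post indices, while `feature_pivot` returns lists of terms, which mirrors the two complementary outputs described above.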
Keywords
- Topic Detection
- Latent Dirichlet Allocation (LDA)
- Frequent Pattern Mining (FPM)
- Textual Information Sources
- Pivotal Paper
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2018 Springer International Publishing AG
About this chapter
Cite this chapter
Milioris, D. (2018). Background and Related Work. In: Topic Detection and Classification in Social Networks. Springer, Cham. https://doi.org/10.1007/978-3-319-66414-9_2
DOI: https://doi.org/10.1007/978-3-319-66414-9_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66413-2
Online ISBN: 978-3-319-66414-9
eBook Packages: Engineering, Engineering (R0)