Encyclopedia of Social Network Analysis and Mining

2018 Edition
Editors: Reda Alhajj, Jon Rokne

Real-Time Detection of Topics in Twitter Streams

  • Rania Ibrahim
  • Ahmed Elbagoury
  • Khaled Ammar
  • Mohamed S. Kamel
  • Fakhri Karray
Reference work entry
DOI: https://doi.org/10.1007/978-1-4939-7131-2_110157






Hadoop distributed file system


Latent Dirichlet allocation


Latent semantic analysis


Non-negative matrix factorization


Singular value decomposition


Topic detection from Twitter is the task of discovering the underlying key topics that occur in a set of tweets. Scalable topic detection techniques are those that continue to perform well when extracting topics from a huge number of tweets.


Recently Twitter has become one of the most popular social networks, where users express themselves by tweeting their thoughts in posts of at most 140 characters. The growing number of users, which exceeded 288 million in 2014 and who produce more than 500 million tweets daily (http://www.statisticbrain.com/twitter-statistics/), has motivated many celebrities and organizations to post their updates on Twitter....



This publication was made possible by a grant from the Qatar National Research Fund through National Priority Research Program (NPRP) No. 06-1220-1-233. Its contents are solely the responsibility of the authors.



Copyright information

© Springer Science+Business Media LLC, part of Springer Nature 2018

Authors and Affiliations

  • Rania Ibrahim (1)
  • Ahmed Elbagoury (1)
  • Khaled Ammar (1)
  • Mohamed S. Kamel (1)
  • Fakhri Karray (1)
  1. University of Waterloo, Waterloo, Canada

Section editors and affiliations

  • Fakhreddine Karray (1)
  1. Department of Electrical and Computer Engineering, Centre for Pattern Analysis and Machine Intelligence (CPAMI), University of Waterloo, Waterloo, Canada