Data Mining and Knowledge Discovery, Volume 29, Issue 5, pp 1374–1405

Multiscale event detection in social media

  • Xiaowen Dong
  • Dimitrios Mavroeidis
  • Francesco Calabrese
  • Pascal Frossard

Abstract

Event detection has been one of the most important research topics in social media analysis. Most traditional approaches detect events at fixed temporal and spatial resolutions, whereas in reality events of different scales, spanning different intervals in time and space, usually occur simultaneously. In this paper, we propose a novel approach to multiscale event detection using social media data, which takes into account the different temporal and spatial scales of events in the data. Specifically, we explore the properties of the wavelet transform, a well-developed multiscale transform in signal processing, to enable automatic handling of the interaction between temporal and spatial scales. We then propose a novel algorithm that computes a data similarity graph at appropriate scales and detects events of different scales simultaneously in a single graph-based clustering process. Furthermore, we present a spatiotemporal statistical analysis of the noisy information present in the data stream, which allows us to define a novel term-filtering procedure for the proposed event detection algorithm and helps us study its behavior using simulated noisy data. Experimental results on both synthetically generated data and real-world data collected from Twitter demonstrate the meaningfulness and effectiveness of the proposed approach. Our framework further extends to numerous application domains that involve multiscale and multiresolution data analysis.
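
The general idea of comparing signals at different scales before clustering can be illustrated with a small sketch. This is not the authors' algorithm: it assumes hypothetical term-count time series, uses a plain Haar approximation as the wavelet transform, links series whose coefficients correlate above a hypothetical threshold, and substitutes connected components for the modularity-based clustering named in the keywords. All function names and parameters are illustrative.

```python
from itertools import combinations
from math import sqrt


def haar_approximation(signal, level):
    """Return Haar approximation coefficients after `level` pairwise-averaging steps."""
    coeffs = list(signal)
    for _ in range(level):
        coeffs = [(coeffs[i] + coeffs[i + 1]) / sqrt(2)
                  for i in range(0, len(coeffs) - 1, 2)]
    return coeffs


def correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def similarity_graph(series, level, threshold):
    """Link two series when their coefficients at `level` correlate at or above `threshold`."""
    coeffs = {name: haar_approximation(s, level) for name, s in series.items()}
    edges = {}
    for u, v in combinations(sorted(coeffs), 2):
        weight = correlation(coeffs[u], coeffs[v])
        if weight >= threshold:
            edges[(u, v)] = weight
    return edges


def connected_components(nodes, edges):
    """Group nodes with union-find; a simple stand-in for modularity-based clustering."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical data: four series, each with one burst in a different time bin.
series = {
    "a": [0, 0, 8, 0, 0, 0, 0, 0],
    "b": [0, 0, 0, 8, 0, 0, 0, 0],
    "c": [0, 0, 0, 0, 0, 0, 8, 0],
    "d": [0, 0, 0, 0, 0, 0, 0, 8],
}

for level in (0, 1):
    edges = similarity_graph(series, level, threshold=0.5)
    print(level, connected_components(series, edges))
```

In this toy data, bursts in adjacent time bins do not correlate at the finest scale (level 0), so every series stays in its own group. At level 1 the adjacent bursts fall into the same coarse bin, and the sketch groups a with b and c with d. This is the sense in which the choice of scale determines which items appear similar.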

Keywords

  • Multiscale event detection
  • Spatiotemporal analysis
  • Wavelet decomposition
  • Modularity-based clustering

Notes

Acknowledgments

X. Dong is supported by a Swiss National Science Foundation Mobility Fellowship. This work was done while X. Dong and D. Mavroeidis were at IBM Research - Ireland.

Copyright information

© The Author(s) 2015

Authors and Affiliations

  • Xiaowen Dong (1)
  • Dimitrios Mavroeidis (2)
  • Francesco Calabrese (3)
  • Pascal Frossard (4)

  1. MIT Media Lab, Cambridge, USA
  2. Philips Research, Eindhoven, Netherlands
  3. IBM Research, Dublin, Ireland
  4. École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
