Spatio-Temporal Event Detection from Multiple Data Sources

  • Aman Ahuja
  • Ashish Baghudana
  • Wei Lu
  • Edward A. Fox
  • Chandan K. Reddy
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11439)

Abstract

The proliferation of Internet-enabled smartphones has ushered in an era where events are reported on social media websites such as Twitter and Facebook. However, the short-text nature of social media posts, combined with the large volume of noise present in such datasets, makes event detection challenging. This problem can be alleviated by using other sources of information, such as news articles, which employ a precise and factual vocabulary and are more descriptive in nature. In this paper, we propose Spatio-Temporal Event Detection (STED), a probabilistic model that discovers events, their associated topics, their time of occurrence, and their geospatial distribution from multiple data sources, such as news and Twitter. The joint modeling of news and Twitter enables our model to distinguish events from other noisy topics present in Twitter data. Furthermore, the presence of geocoordinates and timestamps in tweets helps find the spatio-temporal distribution of the events. We evaluate our model on a large corpus of Twitter and news data, and our experimental results show that STED can effectively discover events and outperforms state-of-the-art techniques.

Keywords

Topic modeling, Probabilistic models, Event detection

Notes

Acknowledgments

This work was supported in part by the US National Science Foundation grants IIS-1619028, IIS-1707498, and IIS-1838730.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Aman Ahuja (1, corresponding author)
  • Ashish Baghudana (2)
  • Wei Lu (3)
  • Edward A. Fox (2)
  • Chandan K. Reddy (1)
  1. Virginia Tech, Arlington, USA
  2. Virginia Tech, Blacksburg, USA
  3. Singapore University of Technology and Design, Singapore, Singapore
