Stability Evaluation of Event Detection Techniques for Twitter

  • Andreas Weiler
  • Joeran Beel (email author)
  • Bela Gipp
  • Michael Grossniklaus
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9897)

Abstract

Twitter continues to gain popularity as a source of up-to-date news and information. As a result, numerous event detection techniques have been proposed to cope with the steadily increasing rate and volume of social media data streams. Although most of these works conduct some evaluation of the proposed technique, comparing their effectiveness is a challenging task. In this paper, we examine the challenges of reproducing evaluation results for event detection techniques. We apply several event detection techniques and vary four parameters, namely the time window (15 vs. 30 vs. 60 minutes), stopwords (include vs. exclude), retweets (include vs. exclude), and the number of terms that define an event (1 to 5 terms). Our experiments use real-world Twitter streaming data and show that varying these parameters alone significantly influences the outcomes of the event detection techniques, sometimes in unforeseen ways. We conclude that even minor variations in event detection techniques may lead to major difficulties in reproducing experiments.
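
To make the size of this parameter space concrete, the following minimal sketch enumerates every combination of the four parameters named in the abstract. The dictionary keys and the function name are hypothetical and chosen for illustration only. The abstract does not state whether the experiments covered the full cross product or varied one parameter at a time, so the full grid shown here is an assumption.

```python
from itertools import product

# Parameter values as listed in the abstract.
# The key names are hypothetical, not taken from the paper.
PARAMETER_GRID = {
    "window_minutes": (15, 30, 60),
    "include_stopwords": (True, False),
    "include_retweets": (True, False),
    "terms_per_event": (1, 2, 3, 4, 5),
}


def configurations(grid):
    """Yield one dict per combination of parameter values."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


# Assumption: a full factorial design over all four parameters.
all_configs = list(configurations(PARAMETER_GRID))
print(len(all_configs))  # 3 * 2 * 2 * 5 = 60
```

Under that assumption, a single technique would have to be run in 60 settings, which illustrates why an evaluation that leaves any of these choices unreported is hard to reproduce.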

Notes

Acknowledgement

The research presented in this paper is funded in part by the Deutsche Forschungsgemeinschaft (DFG), Grant No. GR 4497/4: “Adaptive and Scalable Event Detection Techniques for Twitter Data Streams” and by a fellowship within the FITweltweit programme of the German Academic Exchange Service (DAAD). We would also like to thank the students Christina Papavasileiou, Harry Schilling, and Wai-Lok Cheung for their contributions to the implementations of WATIS, EDCoW, and enBlogue.


Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Andreas Weiler (1)
  • Joeran Beel (2, email author)
  • Bela Gipp (1)
  • Michael Grossniklaus (1)
  1. Department of Computer and Information Science, University of Konstanz, Konstanz, Germany
  2. Digital Contents and Media Sciences Research Division, National Institute of Informatics (NII), Tokyo, Japan