Word Embedding Based Event Detection on Social Media

  • Ali Mert Ertugrul
  • Burak Velioglu
  • Pinar Karagoz
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10334)


Event detection from social media messages is conventionally based on clustering message contents. The most basic approach represents messages as term vectors constructed through traditional natural language processing (NLP) methods, with term weights generally assigned based on frequency. In this study, we use a neural feature extraction approach and explore the performance of event detection using word embeddings. Using a corpus of tweets, message terms are embedded in a continuous space. Message contents represented as vectors of word embeddings are then grouped using hierarchical clustering. The technique is applied to a set of Twitter messages posted in Turkish. Experimental results show that automatically extracted features detect the contextual similarities between tweets better than traditional feature extraction with term frequency-inverse document frequency (TF-IDF) based term vectors.
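
The pipeline outlined in the abstract (embed terms, represent each message as a vector, group messages by hierarchical clustering) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy 3-dimensional `EMBEDDINGS` table stands in for embeddings learned from a tweet corpus, averaging word vectors is one common way to obtain a message vector, and cosine distance with average linkage and a fixed `threshold` is one possible clustering configuration.

```python
import math

# Toy 3-dimensional embeddings, invented for illustration only. A real system
# would learn them from a tweet corpus (for example with a word2vec-style model).
EMBEDDINGS = {
    "earthquake": [0.90, 0.10, 0.00],
    "tremor":     [0.80, 0.20, 0.00],
    "shaking":    [0.85, 0.15, 0.05],
    "match":      [0.00, 0.90, 0.10],
    "goal":       [0.10, 0.80, 0.10],
    "stadium":    [0.05, 0.85, 0.20],
}


def tweet_vector(tokens):
    """Average the embeddings of the known tokens (an assumed composition rule)."""
    vecs = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vecs:
        return None  # no known token, so the message cannot be represented
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine_distance(a, b):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def agglomerative(vectors, threshold):
    """Average-linkage agglomerative clustering over message indices.

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than `threshold`.
    """
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        dist, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if dist > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical, already tokenized messages.
tweets = [
    ["earthquake", "shaking"],
    ["tremor"],
    ["goal", "stadium"],
    ["match", "goal"],
]
vectors = [tweet_vector(t) for t in tweets]
clusters = agglomerative(vectors, threshold=0.3)
print(clusters)
```

With these toy vectors, the first two messages end up in one cluster and the last two in another. The first two messages share no terms, so a purely term-based representation such as TF-IDF would assign them zero cosine similarity, whereas the embedding-based vectors place them close together. This only illustrates the kind of contextual similarity the abstract refers to; it does not reproduce the reported experiments.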


Keywords: Event detection · Neural feature extraction · Word embedding · Neural probabilistic language models



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Ali Mert Ertugrul (1)
  • Burak Velioglu (2)
  • Pinar Karagoz (2, corresponding author)
  1. Informatics Institute, METU, Ankara, Turkey
  2. Computer Engineering Department, METU, Ankara, Turkey
