Raimond: Quantitative Data Extraction from Twitter to Describe Events

  • Thibault Sellam
  • Omar Alonso
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9114)


Social media play a decisive role in communicating and spreading information during global events. In particular, real-time microblogging platforms such as Twitter have become prevalent. Researchers have used microblogging for a number of tasks, including analysis of past events, prediction, and information retrieval. Nevertheless, little attention has been given to quantitative data extraction. In this paper, we address two questions: can we develop a mechanism to extract quantitative data from a collection of tweets, and can we use the salient findings to describe an event? To answer the first question, we introduce Raimond, a virtual text curator specialized in quantitative data extraction from Twitter. To address the second question, we use our system on three events and evaluate its output using a crowdsourcing strategy. We demonstrate the effectiveness of our approach with a number of real-world examples.


Microblogs, Information extraction, Events analysis



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. CWI, Amsterdam, The Netherlands
  2. Microsoft Corporation, Mountain View, USA
