Raimond: Quantitative Data Extraction from Twitter to Describe Events

  • Thibault Sellam
  • Omar Alonso
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9114)


Social media play a decisive role in communicating and spreading information during global events. In particular, real-time microblogging platforms such as Twitter have become prevalent. Researchers have used microblogging data for a number of tasks, including the analysis of past events, prediction, and information retrieval. Nevertheless, little attention has been given to quantitative data extraction. In this paper, we address two questions: can we develop a mechanism to extract quantitative data from a collection of tweets, and can we use the salient findings to describe an event? To answer the first question, we introduce Raimond, a virtual text curator specialized in quantitative data extraction from Twitter. To address the second question, we apply our system to three events and evaluate its output using a crowdsourcing strategy. We demonstrate the effectiveness of our approach with a number of real-world examples.


Microblogs, Information extraction, Events analysis





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. CWI, Amsterdam, The Netherlands
  2. Microsoft Corporation, Mountain View, USA
