A Data Collection for Evaluating the Retrieval of Related Tweets to News Articles

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10772)


Social media users react in real time to local and global events, so social media can be used to measure the impact of particular topics or events and to analyze public opinion. To this end, identifying and ranking social media posts, such as tweets, that are associated with a news article is an important information retrieval task. In this paper, we devise a new data collection to evaluate approaches to the task of related-tweet retrieval for news articles. Starting from two sets covering the same period, (a) mainstream news articles and (b) tweets from curated newsworthy sources, we use a TREC-like pooling approach to associate news articles with relevant tweets. We also provide a benchmark for the related-tweet retrieval task by evaluating a number of retrieval approaches on this new data collection.
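As a rough illustration of what a TREC-like pooling step can look like, the sketch below forms a judgment pool for one news article by taking the union of the top-ranked tweets from several retrieval runs. The abstract does not state the pooling depth or which systems contributed to the pool, so the function name `build_pool`, the run names, the tweet IDs, and the depth used here are all hypothetical.

```python
from typing import Dict, List, Set


def build_pool(runs: Dict[str, List[str]], depth: int) -> Set[str]:
    """Return the union of the top-`depth` tweet IDs from every run.

    `runs` maps a (hypothetical) system name to its ranked list of
    tweet IDs for a single news article, best match first.
    """
    pool: Set[str] = set()
    for ranked_tweet_ids in runs.values():
        pool.update(ranked_tweet_ids[:depth])
    return pool


# Illustrative ranked lists for one article; names and IDs are made up.
runs_for_article = {
    "bm25": ["t1", "t2", "t3", "t4"],
    "tfidf": ["t2", "t5", "t1", "t6"],
    "embedding": ["t7", "t2", "t8", "t1"],
}

# Only the pooled tweets would be shown to assessors for relevance judgment.
pool = build_pool(runs_for_article, depth=2)
print(sorted(pool))
```

Tweets outside the pool are conventionally treated as unjudged, which is why the depth and the diversity of the contributing runs matter for how reusable the resulting judgments are.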


News Articles, Related Tweets, Pooling Approach, Relevance Judgments, Word2vec Model
These keywords were added by machine and not by the authors.


Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK
  2. Signal Media Ltd., London, UK
