Diversifying Microblog Posts

  • Marios Koniaris
  • Giorgos Giannopoulos
  • Timos Sellis
  • Yiannis Vasileiou
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8787)

Abstract

Microblogs have become an important source of information and a medium for following and spreading trends, news and ideas all over the world. As a result, microblog search has emerged as a new option for covering user information needs, especially with respect to timely events, news or trends. However, users are frequently overloaded by the high rate at which microblog posts are produced, and many of these posts carry no new information relative to other, similar posts. In this paper we propose a method that helps users effectively harvest information from a microblogging stream by filtering out redundant data and maximizing diversity among the displayed information. We introduce diversification criteria specific to microblog posts and apply them to heuristic diversification algorithms. We implement these methods in a prototype system that works with data from Twitter. The experimental evaluation demonstrates the effectiveness of applying our problem-specific diversification criteria, as opposed to applying plain content diversity to microblog posts.
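To make the idea of heuristic diversification concrete, the following is a minimal sketch of one well-known greedy heuristic, Maximal Marginal Relevance (MMR), applied to short posts. It is not the method or the criteria proposed in this paper: it uses plain content similarity only (Jaccard overlap of word sets), and the function names, the `lam` trade-off parameter, the relevance scores and the sample posts are all assumptions made for illustration.

```python
from typing import List, Set


def tokens(text: str) -> Set[str]:
    """Lowercased whitespace tokens; a deliberately crude tokenizer."""
    return set(text.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mmr_select(posts: List[str], relevance: List[float],
               k: int, lam: float = 0.5) -> List[int]:
    """Greedily pick up to k post indices, trading relevance against redundancy.

    Hypothetical helper: lam=1.0 ranks by relevance alone, while lower
    values penalise posts that resemble already selected ones.
    """
    toks = [tokens(p) for p in posts]
    selected: List[int] = []
    candidates = list(range(len(posts)))
    while candidates and len(selected) < k:
        def score(i: int) -> float:
            # Redundancy = similarity to the closest already selected post.
            max_sim = max((jaccard(toks[i], toks[j]) for j in selected),
                          default=0.0)
            return lam * relevance[i] - (1.0 - lam) * max_sim
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Made-up posts and relevance scores, for illustration only.
posts = [
    "storm hits athens tonight",
    "storm hits athens tonight rt",
    "athens marathon route announced",
]
relevance = [0.9, 0.85, 0.6]

picked = mmr_select(posts, relevance, k=2)
print(picked)  # [0, 2]
```

With these made-up inputs the near-duplicate second post is skipped in favour of the less relevant but novel third post. Setting `lam=1.0` disables the redundancy penalty and returns the two most relevant posts instead.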

Keywords

Diversification, information retrieval, microblogging services, Twitter, data mining

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Marios Koniaris (1)
  • Giorgos Giannopoulos (2)
  • Timos Sellis (3)
  • Yiannis Vasileiou (1)

  1. School of ECE, NTU Athens, Greece
  2. IMIS Institute, “Athena” Research Center, Greece
  3. RMIT University, Melbourne, Australia
