Result Diversification for Tweet Search

  • Makbule Gulcin Ozsoy
  • Kezban Dilek Onal
  • Ismail Sengor Altingovde
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8787)


Being one of the most popular microblogging platforms, Twitter handles more than two billion queries per day. Given the users’ desire for fresh and novel content but their reluctance to submit long and descriptive queries, there is an inevitable need for generating diversified search results to cover different aspects of a query topic. In this paper, we address diversification of results in tweet search by adopting several methods from the text summarization and web search domains. We provide an exhaustive evaluation of all the methods using a standard dataset specifically tailored for this purpose. Our findings reveal that implicit diversification methods are more promising in the current setup, whereas explicit methods need to be augmented with a better representation of query sub-topics.


Keywords: Microblogging, Tweet search, Novelty, Diversity





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Makbule Gulcin Ozsoy (1)
  • Kezban Dilek Onal (1)
  • Ismail Sengor Altingovde (1)
  1. Middle East Technical University, Ankara, Turkey
