The Impact of Semantic Document Expansion on Cluster-Based Fusion for Microblog Search

  • Shangsong Liang
  • Zhaochun Ren
  • Maarten de Rijke
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8416)


Searching microblog posts is challenging because of their limited length and creative language usage. We frame microblog search as a data fusion problem and examine the effectiveness of a recent cluster-based fusion method on the task of retrieving microblog posts. We find that, in the optimal setting, the clustering information contributes very little, which we hypothesize is due to the limited length of microblog posts. To increase the contribution of the clustering information in cluster-based fusion, we integrate semantic document expansion as a preprocessing step. Using Wikipedia articles, we enrich the content of the microblog posts that appear in the lists to be fused, and we create clusters from the enriched posts. We verify the effectiveness of our combined document expansion plus fusion method by comparing it with microblog search algorithms and other fusion methods.
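To make the data fusion framing concrete, the following is a minimal sketch of score-based fusion of several ranked lists for one query. It uses CombSUM-style summing of min-max normalized scores, which is a standard baseline and not the cluster-based method examined here; the function names, the dictionary representation of a ranked list, and the example post identifiers are all assumptions made for illustration.

```python
from collections import defaultdict


def min_max_normalize(run):
    """Scale the retrieval scores of one ranked list into [0, 1]."""
    if not run:
        return {}
    lo, hi = min(run.values()), max(run.values())
    if hi == lo:
        # All scores are equal, so every post gets the same normalized score.
        return {doc: 1.0 for doc in run}
    return {doc: (score - lo) / (hi - lo) for doc, score in run.items()}


def comb_sum(runs):
    """Fuse ranked lists by summing each post's normalized scores.

    `runs` is a list of dicts mapping a post id to its retrieval score
    (a hypothetical representation chosen for this sketch).
    Returns (post id, fused score) pairs, best first; ties are broken by id.
    """
    fused = defaultdict(float)
    for run in runs:
        for doc, score in min_max_normalize(run).items():
            fused[doc] += score
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Two hypothetical result lists returned for the same query.
run_a = {"t1": 3.0, "t2": 2.0, "t3": 1.0}
run_b = {"t2": 5.0, "t4": 3.0, "t1": 1.0}
fused = comb_sum([run_a, run_b])
print([doc for doc, _ in fused])
```

A post that appears in several lists accumulates score from each of them, so here `t2` moves ahead of `t1` even though `t1` is ranked first in `run_a`. A cluster-based method would additionally use similarities between posts, which this sketch leaves out.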





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Shangsong Liang (1)
  • Zhaochun Ren (1)
  • Maarten de Rijke (1)
  1. ISLA, University of Amsterdam, The Netherlands
