Twinder: Enhancing Twitter Search

  • Ke Tao
  • Fabian Abel
  • Claudia Hauff
  • Geert-Jan Houben
  • Ujwal Gadiraju
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8173)

Abstract

How can the search process on Twitter be improved to better meet the various information needs of its users? As an answer to this question, we have developed the Twinder framework, a scalable search system for Twitter streams. Twinder contains algorithms to determine the relevance of tweets in relation to search requests, as well as components to detect (near-)duplicate content, to diversify search results, and to personalize the search result ranking. In this paper, we report on our current progress, including the system architecture and the different modules for solving specific problems. Finally, we empirically determine the effectiveness of Twinder’s components with experiments on representative datasets.

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Ke Tao (1)
  • Fabian Abel (1, 2)
  • Claudia Hauff (1)
  • Geert-Jan Houben (1)
  • Ujwal Gadiraju (1)
  1. Web Information Systems, TU Delft, Delft, The Netherlands
  2. XING AG, Hamburg, Germany
