Finding Relevant Tweets

  • Deepak P.
  • Sutanu Chakraborti
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7418)

Abstract

When a user of a microblogging site authors or browses a microblog post, that action provides cues as to what topic she is interested in at that point in time. Example-based search, which retrieves similar tweets given one exemplary tweet such as the one just authored, can help provide the user with relevant content. We investigate various components of microblog posts, such as the associated timestamp, the author's social network, and the content of the post, and develop approaches that harness these factors to find relevant tweets given a query tweet. An empirical analysis of these techniques on real-world Twitter data is then presented to quantify the utility of each factor in assessing tweet relevance. We observe that tweets that are similar in content to the query but also contain extra information not already present in it are perceived as useful. We then develop a composite technique that combines the various approaches by scoring tweets using a dynamic, query-specific linear combination of the separate techniques. An empirical evaluation shows that the composite technique is effective and that it outperforms each of its constituents.
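The composite scoring idea can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not specify how the component scores or the query-specific weights are computed, so every function here (`content_score`, `time_score`, `social_score`, `query_weights`), the `Tweet` type, the `follows` mapping, and all constants are hypothetical stand-ins chosen only to show the shape of a per-query linear combination.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    text: str
    author: str
    timestamp: float  # seconds since some epoch


def content_score(query, candidate):
    """Jaccard overlap of lowercase word sets (assumed stand-in for content similarity)."""
    q = set(query.text.lower().split())
    c = set(candidate.text.lower().split())
    if not q or not c:
        return 0.0
    return len(q & c) / len(q | c)


def time_score(query, candidate, half_life=3600.0):
    """Halves for every `half_life` seconds between the two timestamps (assumed decay)."""
    gap = abs(query.timestamp - candidate.timestamp)
    return 0.5 ** (gap / half_life)


def social_score(query, candidate, follows):
    """1.0 for the same author or a followed author, else 0.0 (assumed social signal)."""
    if query.author == candidate.author:
        return 1.0
    return 1.0 if candidate.author in follows.get(query.author, set()) else 0.0


def query_weights(query):
    """Hypothetical query-specific weights: longer queries lean more on content."""
    n_words = len(query.text.split())
    w_content = min(0.8, 0.4 + 0.05 * n_words)
    rest = 1.0 - w_content
    return {"content": w_content, "time": rest / 2, "social": rest / 2}


def rank(query, candidates, follows):
    """Order candidates by a linear combination of the three component scores."""
    w = query_weights(query)
    scored = []
    for cand in candidates:
        score = (w["content"] * content_score(query, cand)
                 + w["time"] * time_score(query, cand)
                 + w["social"] * social_score(query, cand, follows))
        scored.append((score, cand))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [cand for _, cand in scored]
```

Because the weights are recomputed for each query tweet, the same candidate can rank differently under different queries, which is the property the abstract attributes to the composite technique. How the weights should actually be chosen is left open here.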

Keywords

Edit Distance, Mean Average Precision, Graph Distance, Twitter User, Relevance Judgement
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Deepak P. (1)
  • Sutanu Chakraborti (2)
  1. IBM Research, Bangalore, India
  2. Indian Institute of Technology, Madras, India
