An Analysis of Topical Proximity in the Twitter Social Graph

  • Markus Schaal
  • John O’Donovan
  • Barry Smyth
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7710)


Standard approaches to information retrieval are increasingly complemented by social search, even when it comes to rational information needs. Twitter, as a popular source of real-time information, plays an important role in this respect, as both the follower-followee graph and the many relationships among users provide a rich source of information about the social network. However, many hidden factors must be considered if social data are to successfully support the search for high-quality information. Here we focus on one of these factors, namely the relationship between content similarity and social distance in the social network. We compared two methods of computing content similarity among Twitter users in a collection with one document per user: one based on standard term frequency vectors, the other based on topic associations obtained by Latent Dirichlet Allocation (LDA). By comparing these metrics at different hop distances in the social graph, we investigated the utility of prominent features such as Retweets and Hashtags as predictors of similarity, and demonstrated the advantages of topical proximity over textual similarity for friend recommendations.
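
The comparison described in the abstract can be illustrated with a minimal sketch. It covers only the term-frequency side and assumes cosine similarity as the vector comparison, which the abstract does not specify; the LDA-based variant is omitted because it would need a topic-modelling library. The function names, the toy users, and the toy follower graph are all hypothetical and are not taken from the paper.

```python
from collections import Counter, deque
from math import sqrt


def tf_vector(text):
    """Term-frequency vector of a lower-cased, whitespace-tokenised document."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse vectors (assumed metric)."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = sqrt(sum(c * c for c in u.values())) * sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


def hop_distances(graph, source):
    """Breadth-first search: hop distance from source to every reachable user."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def mean_similarity_by_hops(docs, graph, source):
    """Average similarity to the source user, grouped by hop distance."""
    vectors = {user: tf_vector(text) for user, text in docs.items()}
    buckets = {}
    for user, d in hop_distances(graph, source).items():
        if user != source and user in vectors:
            buckets.setdefault(d, []).append(cosine(vectors[source], vectors[user]))
    return {d: sum(s) / len(s) for d, s in sorted(buckets.items())}


# Hypothetical toy data: one document per user, plus a tiny follower graph.
docs = {
    "alice": "python data science python",
    "bob": "python data",
    "carol": "football match tonight",
}
graph = {"alice": ["bob"], "bob": ["carol"], "carol": []}

result = mean_similarity_by_hops(docs, graph, "alice")
for hops, sim in result.items():
    print(f"hop distance {hops}: mean similarity {sim:.3f}")
```

Swapping `tf_vector` for a function that returns per-user topic proportions would give the topical-proximity counterpart, with the grouping by hop distance left unchanged.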


Keywords: micro blogs, topical proximity, social network distance, friend recommendation





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Markus Schaal (1)
  • John O’Donovan (2)
  • Barry Smyth (1)
  1. University College Dublin, Belfield, Dublin, Ireland
  2. University of California, Santa Barbara, USA
