Towards the Profiling of Twitter Users for Topic-Based Filtering

  • Sandra Garcia Esparza
  • Michael P. O’Mahony
  • Barry Smyth
Conference paper


There is no doubting the incredible impact of Twitter on how we communicate, access and share information online. Currently, users can follow other users or hashtags in order to benefit from a stream of data from people they trust or on topics that matter to them. However, the granularity at which Twitter currently lets users follow one another is coarse: users cannot limit their information streams to a set of topics from a given user. Thus, even the most carefully curated information streams can quickly become polluted with extraneous content. In this paper we describe our initial steps to improve this situation by proposing a profiling approach that can be used for both information filtering and recommendation. First, we demonstrate that it is feasible to automatically profile the interests of users by using machine learning techniques to classify the pages that they share via their tweets. We then describe how this profiling mechanism can be used to organise and filter Twitter information streams. In particular, we present a system that provides a more fine-grained way to follow users on specific topics and thereby refines the standard Twitter timeline based on a user’s core topical interests.
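To make the idea concrete, the following is a minimal sketch of the pipeline outlined above: classify each shared page into a topic, aggregate the topics into a user profile, and keep only timeline items that match a followed (user, topic) pair. It is an illustration under stated assumptions, not the system described in the paper. The keyword-overlap classifier stands in for a trained machine learning classifier, and the topic names, `TOPIC_KEYWORDS`, and the functions `classify_page`, `build_profile` and `filter_timeline` are all hypothetical.

```python
from collections import Counter

# Hypothetical topic vocabulary: a toy stand-in for a trained text classifier.
TOPIC_KEYWORDS = {
    "technology": {"software", "phone", "startup", "app"},
    "sports": {"match", "league", "goal", "team"},
    "politics": {"election", "minister", "vote", "policy"},
}


def classify_page(text):
    """Return the topic whose keywords overlap most with the text, or None."""
    words = set(text.lower().split())
    scores = {topic: len(words & kws) for topic, kws in TOPIC_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def build_profile(shared_pages):
    """Return a normalised topic distribution over the pages a user shared."""
    topics = (classify_page(page) for page in shared_pages)
    counts = Counter(t for t in topics if t is not None)
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()} if total else {}


def filter_timeline(timeline, subscriptions):
    """Keep items whose (author, topic) pair appears in the subscriptions.

    `subscriptions` maps an author name to the set of topics followed
    for that author (an assumed data layout).
    """
    kept = []
    for item in timeline:
        topic = classify_page(item["page_text"])
        if topic in subscriptions.get(item["author"], set()):
            kept.append(item)
    return kept


if __name__ == "__main__":
    pages = [
        "New phone app from a startup",
        "The team scored a late goal in the league match",
        "Software update for your phone",
    ]
    print(build_profile(pages))
```

Keeping the subscriptions as a mapping from author to topics mirrors the goal of following a user only on selected topics; in practice the classifier would be replaced by a model trained on labelled pages.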


Keywords (added by machine, not by the authors): Mobile Operator, Cosine Similarity, Twitter User, Information Stream, User Topic





Copyright information

© Springer-Verlag London 2012

Authors and Affiliations

  • Sandra Garcia Esparza (1)
  • Michael P. O’Mahony (1)
  • Barry Smyth (1)
  1. Centre for Sensor Web Technologies, School of Computer Science and Informatics, Dublin, Ireland
