FRISK: A Multilingual Approach to Find twitteR InterestS via wiKipedia

  • Coriane Nana Jipmo
  • Gianluca Quercini
  • Nacéra Bennacer
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10604)


Several studies have shown that the users of Twitter reveal their interests (i.e., what they like) while they share their opinions, preferences and personal stories.

In this paper we describe Frisk, a multilingual unsupervised approach for categorizing the interests of Twitter users. Frisk models the tweets of a user as a bag of Wikipedia articles and each interest (e.g., politics, sports) as a bag of Wikipedia categories, and ranks the interests by relevance, measured as the graph distance between the articles and the categories. To the best of our knowledge, existing unsupervised approaches do not address multilingualism, and they describe users' interests through bags of words (e.g., phone, apps) without a precise categorization (e.g., technology).
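The ranking idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy graph, the function names (`shortest_distances`, `rank_interests`) and the scoring rule (mean of inverse hop distance to the nearest category of each interest) are assumptions made only for illustration, since the abstract states only that relevance is measured as a graph distance between articles and categories.

```python
from collections import deque


def shortest_distances(graph, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def rank_interests(graph, user_articles, interest_categories):
    """Rank interests by closeness of the user's articles to their categories.

    Assumed scoring rule (hypothetical, not from the paper): each article
    contributes 1 / (1 + d), where d is the hop distance to the nearest
    category of the interest; unreachable interests contribute 0.
    """
    # One BFS per article, reused for every interest.
    distances = [shortest_distances(graph, article) for article in user_articles]
    scores = {}
    for interest, categories in interest_categories.items():
        total = 0.0
        for dist in distances:
            reachable = [dist[c] for c in categories if c in dist]
            if reachable:
                total += 1.0 / (1 + min(reachable))
        scores[interest] = total / len(distances) if distances else 0.0
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy article/category graph (hypothetical data, edges point to parent categories).
graph = {
    "IPhone": ["Category:Smartphones"],
    "Category:Smartphones": ["Category:Technology"],
    "Android (operating system)": ["Category:Technology"],
    "FIFA World Cup": ["Category:Football competitions"],
    "Category:Football competitions": ["Category:Sports"],
}
user_articles = ["IPhone", "Android (operating system)", "FIFA World Cup"]
interest_categories = {
    "technology": ["Category:Technology"],
    "sports": ["Category:Sports"],
    "politics": ["Category:Politics"],
}

ranking = rank_interests(graph, user_articles, interest_categories)
for interest, score in ranking:
    print(f"{interest}: {score:.3f}")
```

In this toy example two of the three articles lead to the technology category, so technology ranks first, sports second, and politics (unreachable from any article) last. Because the articles and categories are language-independent graph nodes, the same ranking step would apply regardless of the language of the original tweets, provided the tweets have already been linked to Wikipedia articles.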

We evaluated Frisk on a dataset of 1,347 users and more than three million tweets written in four languages (English, French, Italian and Spanish). The results are promising, including in comparison with approaches based on text classification (SVM, Naive Bayes and Random Forest) and LDA.


Keywords: Interests, Multilingual text processing, Wikipedia, Twitter



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Coriane Nana Jipmo (1)
  • Gianluca Quercini (1)
  • Nacéra Bennacer (1)
  1. LRI, CentraleSupélec, Université Paris-Saclay, Gif-sur-Yvette, France
