FRISK: A Multilingual Approach to Find twitteR InterestS via wiKipedia

  • Conference paper
Advanced Data Mining and Applications (ADMA 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10604)

Abstract

Several studies have shown that the users of Twitter reveal their interests (i.e., what they like) while they share their opinions, preferences and personal stories.

In this paper we describe Frisk, a multilingual unsupervised approach to categorizing the interests of Twitter users. Frisk models the tweets of a user as a bag of Wikipedia articles and each interest (e.g., politics, sports) as a bag of Wikipedia categories, and ranks the interests by relevance, measured as the graph distance between the articles and the categories. To the best of our knowledge, existing unsupervised approaches do not address multilingualism, and they describe users' interests through bags of words (e.g., phone, apps) without a precise categorization (e.g., technology).
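The ranking idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact relevance formula, so the sketch assumes relevance is the mean shortest-path distance from each article to a single category per interest. The toy graph, the names `graph_distance` and `rank_interests`, and the `unreachable_penalty` parameter are all hypothetical.

```python
from collections import deque

# Hypothetical toy graph: each article or category points to its parent categories.
TOY_GRAPH = {
    "IPhone": ["Smartphones"],
    "Smartphones": ["Mobile technology"],
    "Mobile technology": ["Technology"],
    "FIFA World Cup": ["Football competitions"],
    "Football competitions": ["Sports"],
    "Technology": [],
    "Sports": [],
}


def graph_distance(graph, source, target):
    """Return the number of hops from source to target (BFS), or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for parent in graph.get(node, []):
            if parent == target:
                return dist + 1
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, dist + 1))
    return None


def rank_interests(graph, articles, interests, unreachable_penalty=10):
    """Rank interests by mean distance from the user's bag of articles.

    Assumption: a smaller mean distance means a more relevant interest, and
    unreachable categories receive a fixed penalty distance.
    """
    if not articles:
        raise ValueError("the bag of articles must not be empty")
    scores = {}
    for interest in interests:
        total = 0
        for article in articles:
            dist = graph_distance(graph, article, interest)
            total += unreachable_penalty if dist is None else dist
        scores[interest] = total / len(articles)
    ranked = sorted(interests, key=lambda name: scores[name])
    return ranked, scores


# A bag of articles is a multiset, so repeated mentions count more than once.
user_articles = ["IPhone", "IPhone", "FIFA World Cup"]
ranked, scores = rank_interests(TOY_GRAPH, user_articles, ["Sports", "Technology"])
print(ranked)
```

In this toy example the user mentions a technology-related article twice and a sports-related article once, so the technology interest ends up with the smaller mean distance and is ranked first.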

We evaluated Frisk on a dataset of 1,347 users and more than three million tweets written in four languages (English, French, Italian and Spanish). The results are promising, including in comparison with approaches based on text classification (SVM, Naive Bayes and Random Forest) and on LDA.

Notes

  1. developers.google.com/adwords/api/docs/appendix/productsservices.

  2. https://nlp.stanford.edu/software/tmt/tmt-0.4/.

Author information

Corresponding author

Correspondence to Gianluca Quercini.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Jipmo, C.N., Quercini, G., Bennacer, N. (2017). FRISK: A Multilingual Approach to Find twitteR InterestS via wiKipedia. In: Cong, G., Peng, WC., Zhang, W., Li, C., Sun, A. (eds) Advanced Data Mining and Applications. ADMA 2017. Lecture Notes in Computer Science, vol 10604. Springer, Cham. https://doi.org/10.1007/978-3-319-69179-4_17

  • DOI: https://doi.org/10.1007/978-3-319-69179-4_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-69178-7

  • Online ISBN: 978-3-319-69179-4

  • eBook Packages: Computer Science, Computer Science (R0)
