Abstract
Several studies have shown that Twitter users reveal their interests (i.e., what they like) when they share their opinions, preferences and personal stories.
In this paper we describe Frisk, a multilingual unsupervised approach for categorizing the interests of Twitter users. Frisk models the tweets of a user and the interests (e.g., politics, sports) as bags of Wikipedia articles and categories, respectively, and ranks the interests by relevance, measured as the graph distance between the articles and the categories. To the best of our knowledge, existing unsupervised approaches do not address multilingualism and describe the users’ interests through bags of words (e.g., phone, apps), without a precise categorization (e.g., technology).
We evaluated Frisk on a dataset of 1,347 users and more than three million tweets written in four languages (English, French, Italian and Spanish). The results are promising, including in comparison with approaches based on text classification (SVM, Naive Bayes and Random Forest) and on LDA.
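The ranking idea described above can be illustrated with a minimal sketch. It is not the Frisk implementation: the toy graph, the function names (`bfs_distances`, `rank_interests`), the use of the mean hop count as the relevance score, and the `unreachable` penalty are all assumptions made for illustration. The abstract only states that relevance is measured as a graph distance between Wikipedia articles and categories.

```python
from collections import deque


def bfs_distances(graph, source):
    """Return the shortest hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def rank_interests(graph, articles, interests, unreachable=10):
    """Rank interests by mean graph distance from the user's articles.

    Assumption: a smaller mean distance means a more relevant interest.
    `unreachable` is a hypothetical penalty for articles with no path
    to the interest category.
    """
    if not articles:
        raise ValueError("at least one article is required")
    per_article = [bfs_distances(graph, a) for a in articles]
    scores = {
        interest: sum(d.get(interest, unreachable) for d in per_article) / len(articles)
        for interest in interests
    }
    # Sort by score, breaking ties alphabetically for determinism.
    return sorted(interests, key=lambda i: (scores[i], i)), scores


# Toy, hand-made graph (hypothetical): edges go from an article or
# category to its parent categories.
graph = {
    "iPhone": ["Smartphones"],
    "Mobile app": ["Software"],
    "Smartphones": ["Technology"],
    "Software": ["Technology"],
    "FC Barcelona": ["Football clubs"],
    "Football clubs": ["Sports"],
}
articles = ["iPhone", "Mobile app", "FC Barcelona"]  # bag of articles for one user
interests = ["Technology", "Sports", "Politics"]     # candidate interest categories

ranking, scores = rank_interests(graph, articles, interests)
print(ranking)  # ['Technology', 'Sports', 'Politics']
```

In this toy example two of the three articles are two hops from "Technology", so it ranks first, while "Politics" is unreachable from every article and ranks last. A real system would first have to link tweets to Wikipedia articles, a step this sketch leaves out.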
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Jipmo, C.N., Quercini, G., Bennacer, N. (2017). FRISK: A Multilingual Approach to Find twitteR InterestS via wiKipedia. In: Cong, G., Peng, WC., Zhang, W., Li, C., Sun, A. (eds) Advanced Data Mining and Applications. ADMA 2017. Lecture Notes in Computer Science, vol 10604. Springer, Cham. https://doi.org/10.1007/978-3-319-69179-4_17
DOI: https://doi.org/10.1007/978-3-319-69179-4_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69178-7
Online ISBN: 978-3-319-69179-4
eBook Packages: Computer Science, Computer Science (R0)