Large Scale Retrieval of Social Network Pages by Interests of Their Followers

  • Elena Mikhalkova
  • Yuri Karyakin
  • Igor Glukhikh
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10860)


Social networks let people form communities around interests they share on a regular basis (fans of particular music, books, sports, etc.). Every community manifests these interests by producing large amounts of linguistic data, which attracts new followers to certain pages and sustains existing clusters of users. In this article, we propose a model for retrieving, from a large collection of pages, those pages that attract users with similar interests. We test the model on three types of pages manually retrieved from the social network Vkontakte and classified as interesting for (a) football fans, (b) vegetarians, and (c) historical reenactors. We compare the performance of our system with that of several machine learning classifiers: Naive Bayes, SVM, Logistic Regression, and Decision Trees. These classifiers turn out to be poor at retrieving (i.e. singling out) pages with a particular interest when such pages form a small collection of 30 samples within a collection as large as 4,090 samples. Our system exceeds their best result (F1-score = 0.65), achieving an F1-score of 0.72.
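To make the evaluation setting concrete, the sketch below shows how an F1-score for a rare target class can be computed, and why plain accuracy is uninformative when roughly 30 pages must be singled out of 4,090. It is an illustration only, not the authors' system: the function names `f1_for_target` and `accuracy`, the assumption that the 30 target pages are counted inside the 4,090, and both sets of predictions are hypothetical.

```python
def f1_for_target(y_true, y_pred, target=1):
    """F1-score of the target (retrieved) class only."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    if tp == 0:
        return 0.0  # nothing relevant retrieved: precision/recall are 0 or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Share of pages labelled correctly, regardless of class."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Assumption: the 30 target pages are part of the 4,090-page collection.
N_TOTAL, N_TARGET = 4090, 30
y_true = [1] * N_TARGET + [0] * (N_TOTAL - N_TARGET)

# Hypothetical baseline that never retrieves any page.
y_none = [0] * N_TOTAL

# Hypothetical retriever: finds 20 of the 30 targets and raises 10 false alarms.
y_some = [1] * 20 + [0] * 10 + [1] * 10 + [0] * (N_TOTAL - N_TARGET - 10)

print(accuracy(y_true, y_none), f1_for_target(y_true, y_none))
print(accuracy(y_true, y_some), f1_for_target(y_true, y_some))
```

Under these made-up predictions, both outputs have accuracy above 0.99, yet the first retrieves nothing (F1 of 0) while the second reaches an F1 of about 0.67. This is why F1 on the target class, rather than accuracy, is the natural measure for this kind of retrieval task.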


Interest discovery, Social group, Major interest, Social network, Supervised machine learning



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Tyumen State University, Tyumen, Russia
