A Fusion Model of Multi-data Sources for User Profiling in Social Media

  • Liming Zhang
  • Sihui Fu
  • Shengyi Jiang (Email author)
  • Rui Bao
  • Yunfeng Zeng
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11109)


User profiling in social media plays an important role in many applications. Most existing approaches to user profiling are based on user-generated messages, which are not sufficient for inferring user attributes. With the continuous accumulation of data in social media, integrating multiple data sources has become an inevitable trend for precise user profiling. In this paper, we take advantage of text messages, user metadata, followee information and network representations. To integrate these data sources seamlessly, we propose a novel fusion model that captures the complementarity and diversity of the different sources. In addition, we address a limitation of the friendship-based networks used in previous studies by introducing celebrity ties, which enrich the social network and boost the connectivity between users. Experimental results show that our method outperforms several state-of-the-art methods on a real-world dataset.
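
To make the connectivity argument concrete, the following is a minimal sketch and not the construction used in the paper, which the abstract does not specify. It assumes that a celebrity tie is an edge between a user and a celebrity account that the user follows. The user names, celebrity names and the helper `count_components` are all hypothetical. The sketch counts connected components in a toy friendship graph before and after such ties are added.

```python
def count_components(nodes, edges):
    """Count connected components of an undirected graph with union-find."""
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(n) for n in nodes})


# Hypothetical toy data: four users, only one friendship edge.
users = ["u1", "u2", "u3", "u4"]
friend_edges = [("u1", "u2")]

# Assumed form of celebrity ties: user -> followed celebrity accounts.
celebrity_follows = {"u2": ["celebA"], "u3": ["celebA"], "u4": ["celebB"]}
celebrity_edges = [(u, c) for u, cs in celebrity_follows.items() for c in cs]
celebrities = sorted({c for _, c in celebrity_edges})

before = count_components(users, friend_edges)
after = count_components(users + celebrities, friend_edges + celebrity_edges)
print(before, after)
```

In this toy graph, u3 is isolated under friendship edges alone but becomes reachable from u1 and u2 through the shared celebrity node, which is the kind of added connectivity the abstract describes.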


Keywords: User profiling; Social media; Multi-data sources; Fusion model



This work was supported by the National Natural Science Foundation of China (No. 61572145) and the Major Projects of Guangdong Education Department for Foundation Research and Applied Research (No. 2017KZDXM031). The authors would like to thank the anonymous reviewers for their valuable comments and suggestions.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Liming Zhang (1)
  • Sihui Fu (1)
  • Shengyi Jiang (1, 2) (Email author)
  • Rui Bao (1)
  • Yunfeng Zeng (1)

  1. School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou, China
  2. Engineering Research Center for Cyberspace Content Security of Guangdong Province, Guangzhou, China
