User profiling, which infers user traits such as age and gender, plays an increasingly important role in many real-world applications. Most existing methods for user profiling either use only one type of data or do not handle noisy information in the data. Moreover, they usually approach the problem from only one perspective. In this paper, we propose a joint user profiling model with hierarchical attention networks (JUHA) to learn informative user representations for user profiling. JUHA performs user profiling based on both inner-user and inter-user features. We explore inner-user features from user behaviors (e.g., purchased items and posted blogs), and inter-user features from a user-user graph (in which similar users may be connected to each other). As a first round of data preparation, JUHA learns basic sentence and bag representations from multiple separate sources of data (user behaviors). In this module, convolutional neural networks (CNNs) are introduced to capture word and sentence features related to age and gender, while the self-attention mechanism is exploited to down-weight noisy data. Following this, we build another bag that contains a user-user graph. Inter-user features are learned from this bag by propagating information between linked users in the graph. To obtain more robust representations, inter-user features and the other inner-user bag representations are joined into each sentence in the current bag to learn the final bag representation. Subsequently, all of the bag representations are integrated by the self-attention mechanism to learn a comprehensive user representation. Experimental results show that our approach outperforms several state-of-the-art methods in prediction performance.
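The two ideas the abstract leans on, attention-based pooling that gives noisy sentences less weight and information propagation between linked users, can be illustrated with a small sketch. This is not the JUHA implementation: the abstract does not specify the attention scoring function or the propagation rule, so the dot-product scoring against a fixed `query` vector, the mean aggregation over neighbors, and all function and variable names below are assumptions made purely for illustration. In the actual model the sentence vectors would come from CNNs and the attention parameters would be learned.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(vectors, query):
    """Pool vectors into one, weighting each by softmax(dot(vector, query)).

    `query` is a hypothetical stand-in for a learned attention parameter.
    """
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights

def propagate(features, neighbors):
    """One step of mean aggregation over each user and its linked users.

    Assumed propagation rule; the paper's graph module may differ.
    """
    out = {}
    for user, feat in features.items():
        group = [feat] + [features[n] for n in neighbors.get(user, [])]
        out[user] = [sum(col) / len(group) for col in zip(*group)]
    return out

# Toy bag: two sentences that agree with the query and one noisy sentence.
sentences = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]]
query = [2.0, 0.0]
bag, weights = attention_pool(sentences, query)

# Toy user-user graph with two mutually linked users.
features = {"u1": [1.0, 0.0], "u2": [0.0, 1.0]}
neighbors = {"u1": ["u2"], "u2": ["u1"]}
smoothed = propagate(features, neighbors)
```

In this toy bag the third sentence points away from the query, so it receives the smallest attention weight and contributes least to `bag`. The same `attention_pool` function could be reused one level up, over bag representations instead of sentence representations, which is the sense in which the attention is hierarchical.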
This work was supported in part by the National Key Research and Development Program of China (2016YFB1000901), Innovative Research Team in University of the Ministry of Education (IRT17R32), and the National Natural Science Foundation of China (Grant Nos. 91746209 and 61906060).
Xiaojian Liu is currently a PhD student in the School of Computer Science and Information Engineering, Hefei University of Technology, China. He received the BS and MS degrees from Hefei University of Technology, China. His research interests are data mining and knowledge engineering, which include relation extraction, keyword extraction and user profiling.
Yi Zhu is currently an assistant professor in the School of Information Engineering, Yangzhou University, China. He received the BS degree from Anhui University, China, the MS degree from the University of Science and Technology of China, and the PhD degree from Hefei University of Technology, China. His research interests are data mining, knowledge engineering, and recommendation systems.
Xindong Wu is a professor in the School of Computer Science and Information Engineering at the Hefei University of Technology, China, the president of Mininglamp Academy of Sciences, Mininglamp, China, and a fellow of IEEE and AAAS. He received his BS and MS degrees in computer science from the Hefei University of Technology, China, and his PhD degree in artificial intelligence from the University of Edinburgh, UK. His research interests include data mining, big data analytics, and knowledge-based systems.
Cite this article
Liu, X., Zhu, Y. & Wu, X. Joint user profiling with hierarchical attention networks. Front. Comput. Sci. 17, 173608 (2023). https://doi.org/10.1007/s11704-022-1437-6