An Integrated Model for User Attribute Discovery: A Case Study on Political Affiliation Identification

  • Swapna Gottipati
  • Minghui Qiu
  • Liu Yang
  • Feida Zhu
  • Jing Jiang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8443)


Discovering user demographic attributes from social media is a problem of considerable interest. The problem setting can be generalized to include three components: users, topics and behaviors. In recent studies on this problem, however, the behaviors that link users to topics are not effectively incorporated. In this work, we propose an integrated unsupervised model that takes into consideration all three components integral to the task. Furthermore, our model incorporates collaborative filtering with probabilistic matrix factorization to address the data sparsity problem, a computational challenge common to all such tasks. We evaluate our method on a case study of user political affiliation identification and compare it against state-of-the-art baselines. Our model achieves an accuracy of 70.1% on the user party detection task.
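To make the role of probabilistic matrix factorization in handling sparsity more concrete, the following is a minimal sketch of generic PMF (a MAP estimate fitted by gradient descent) on a toy user-by-topic matrix. It is not the integrated model described in the abstract: the function names `pmf` and `observed_rmse`, the toy scores, and all hyperparameters (`rank`, `lam`, `lr`, `epochs`) are assumptions chosen only for illustration.

```python
import numpy as np


def pmf(R, mask, rank=2, lam=0.1, lr=0.05, epochs=2000, seed=0):
    """MAP estimate for probabilistic matrix factorization via gradient descent.

    R    : (n_users, n_topics) matrix of scores; unobserved cells are ignored
    mask : same shape, 1.0 where R is observed and 0.0 elsewhere
    lam  : L2 penalty, corresponding to zero-mean Gaussian priors on the factors
    """
    rng = np.random.default_rng(seed)
    n_users, n_topics = R.shape
    U = 0.1 * rng.standard_normal((n_users, rank))   # latent user factors
    V = 0.1 * rng.standard_normal((n_topics, rank))  # latent topic factors
    for _ in range(epochs):
        E = mask * (R - U @ V.T)        # residuals on observed cells only
        grad_U = -E @ V + lam * U
        grad_V = -E.T @ U + lam * V
        U -= lr * grad_U
        V -= lr * grad_V
    return U, V


def observed_rmse(R, mask, U, V):
    """Root-mean-square error restricted to the observed cells."""
    E = mask * (R - U @ V.T)
    return float(np.sqrt((E ** 2).sum() / mask.sum()))


# Hypothetical toy data: +1 / -1 are a user's stance on a topic.
# Cells where mask is 0 are treated as missing, not as a score of zero.
R = np.array([[ 1.0,  1.0, -1.0,  0.0],
              [ 1.0,  0.0, -1.0, -1.0],
              [-1.0, -1.0,  0.0,  1.0],
              [ 0.0, -1.0,  1.0,  1.0]])
mask = np.array([[1.0, 1.0, 1.0, 0.0],
                 [1.0, 0.0, 1.0, 1.0],
                 [1.0, 1.0, 0.0, 1.0],
                 [0.0, 1.0, 1.0, 1.0]])

U, V = pmf(R, mask)
pred = U @ V.T  # dense predictions, including the missing cells
print("RMSE on observed cells:", observed_rmse(R, mask, U, V))
print("Predicted score for user 0 on topic 3:", pred[0, 3])
```

In this toy setting the low-rank factors fill in the missing user-topic cells from the stances of similar users, which is the sense in which collaborative filtering mitigates sparsity. A real application would need held-out validation to choose `rank` and `lam`.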


Keywords: Unsupervised Integrated Model; Social/feedback networks; Probabilistic Matrix Factorization; Collaborative filtering





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Swapna Gottipati (1)
  • Minghui Qiu (1)
  • Liu Yang (1, 2)
  • Feida Zhu (1)
  • Jing Jiang (1)

  1. School of Information Systems, Singapore Management University, Singapore
  2. School of Software and Microelectronics, Peking University, China
