Inferring Human Traits from Facebook Statuses

  • Conference paper
Social Informatics (SocInfo 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11185)

Abstract

This paper explores the use of language models to predict 20 human traits from users’ Facebook status updates. The data was collected by the myPersonality project and includes user statuses along with their personality, gender, political identification, religion, race, satisfaction with life, IQ, self-disclosure, fair-mindedness, and belief in astrology. A single interpretable model matches state-of-the-art results on well-studied tasks such as predicting gender and personality, and sets the standard on other traits such as IQ, sensational interests, political identity, and satisfaction with life. Additionally, highly weighted words are published for each trait. These lists are valuable for generating hypotheses about human behavior, as well as for understanding what information a model is extracting. Using performance and extracted features, we analyze models built on social media. The real-world problems we explore include gendered classification bias and Cambridge Analytica’s use of psychographic models.

Author information

Corresponding author

Correspondence to Andrew Cutler.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Cutler, A., Kulis, B. (2018). Inferring Human Traits from Facebook Statuses. In: Staab, S., Koltsova, O., Ignatov, D. (eds) Social Informatics. SocInfo 2018. Lecture Notes in Computer Science, vol 11185. Springer, Cham. https://doi.org/10.1007/978-3-030-01129-1_11

  • DOI: https://doi.org/10.1007/978-3-030-01129-1_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01128-4

  • Online ISBN: 978-3-030-01129-1

  • eBook Packages: Computer Science (R0)
