Inferring Human Traits from Facebook Statuses

Cutler, Andrew; Kulis, Brian

doi:10.1007/978-3-030-01129-1_11

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11185)

Included in the following conference series:

International Conference on Social Informatics

Abstract

This paper explores the use of language models to predict 20 human traits from users’ Facebook status updates. The data were collected by the myPersonality project and include user statuses along with their personality, gender, political identification, religion, race, satisfaction with life, IQ, self-disclosure, fair-mindedness, and belief in astrology. A single interpretable model matches state-of-the-art results on well-studied tasks such as predicting gender and personality, and sets the standard on other traits such as IQ, sensational interests, political identity, and satisfaction with life. Additionally, highly weighted words are published for each trait. These lists are valuable for generating hypotheses about human behavior, as well as for understanding what information a model is extracting. Using performance and extracted features, we analyze models built on social media. The real-world problems we explore include gendered classification bias and Cambridge Analytica’s use of psychographic models.
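The idea of reading highly weighted words off an interpretable model can be illustrated with a minimal sketch. It assumes a bag-of-words logistic regression, which is not necessarily the model used in the paper; the toy statuses, the binary label, and the helper names `train_logistic_bow` and `top_words` are all hypothetical and exist only for illustration.

```python
import math
from collections import defaultdict

# Hypothetical toy data: six made-up statuses and a made-up binary trait label.
statuses = [
    "great day at the beach with friends",
    "so happy and grateful today",
    "friends and family make me happy",
    "stuck in traffic again hate this",
    "tired of work and hate mondays",
    "traffic and rain worst day",
]
labels = [1, 1, 1, 0, 0, 0]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train_logistic_bow(texts, targets, epochs=200, lr=0.5, l2=0.01):
    """Fit an L2-regularized logistic regression on word-presence features
    with plain stochastic gradient descent. Returns (weights, bias)."""
    weights = defaultdict(float)  # word -> weight; unseen words weigh 0.0
    bias = 0.0
    docs = [set(tokenize(t)) for t in texts]
    for _ in range(epochs):
        for words, y in zip(docs, targets):
            z = bias + sum(weights[w] for w in words)
            z = max(-30.0, min(30.0, z))  # keep exp() numerically safe
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - y  # gradient of the log loss with respect to z
            bias -= lr * err
            for w in words:
                weights[w] -= lr * (err + l2 * weights[w])
    return weights, bias


def top_words(weights, k=3):
    """Return the k most positive and the k most negative (word, weight) pairs."""
    ranked = sorted(weights.items(), key=lambda kv: kv[1])
    return ranked[-k:][::-1], ranked[:k]


weights, bias = train_logistic_bow(statuses, labels)
positive, negative = top_words(weights, k=3)
print("words pushing toward label 1:", positive)
print("words pushing toward label 0:", negative)
```

Because each word has exactly one weight, the ranked lists show directly which tokens move a prediction in either direction. Lists of this kind are what make a linear model inspectable in a way that a deep network is not.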

Author information

Authors and Affiliations

Boston University, Boston, MA, USA
Andrew Cutler & Brian Kulis

Authors

Andrew Cutler
Brian Kulis

Corresponding author

Correspondence to Andrew Cutler.

Editor information

Editors and Affiliations

University of Koblenz, Koblenz, Germany
Steffen Staab
National Research University Higher School of Economics, St. Petersburg, Russia
Olessia Koltsova
National Research University Higher School of Economics, Moscow, Russia
Dmitry I. Ignatov

About this paper

Cite this paper

Cutler, A., Kulis, B. (2018). Inferring Human Traits from Facebook Statuses. In: Staab, S., Koltsova, O., Ignatov, D. (eds) Social Informatics. SocInfo 2018. Lecture Notes in Computer Science, vol 11185. Springer, Cham. https://doi.org/10.1007/978-3-030-01129-1_11

DOI: https://doi.org/10.1007/978-3-030-01129-1_11
Published: 20 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01128-4
Online ISBN: 978-3-030-01129-1
eBook Packages: Computer Science, Computer Science (R0)