Computational personality recognition in social media

  • Golnoosh Farnadi
  • Geetha Sitaraman
  • Shanu Sushmita
  • Fabio Celli
  • Michal Kosinski
  • David Stillwell
  • Sergio Davalos
  • Marie-Francine Moens
  • Martine De Cock


A variety of approaches have recently been proposed to automatically infer users’ personality from their user-generated content in social media. These approaches differ in the machine learning algorithms and feature sets used, the type of footprint utilized, and the social media environment from which the data are collected. In this paper, we perform a comparative analysis of state-of-the-art computational personality recognition methods on a varied set of social media ground truth data from Facebook, Twitter and YouTube. We answer three questions: (1) Should personality prediction be treated as a multi-label prediction task (i.e., all personality traits of a given user are predicted at once), or should each trait be identified separately? (2) Which predictive features work well across different online environments? (3) What is the decay in accuracy when porting models trained in one social media environment to another?
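Question (3) can be made concrete with a small sketch. The Python snippet below is not the authors' method or data: it fits a one-feature least-squares model on a synthetic "source" environment and compares its error on held-out source users with its error on users from a synthetic "target" environment. The data values, the variable names, and the use of RMSE as the error measure are all assumptions made for illustration.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def rmse(model, xs, ys):
    """Root mean squared error of a (slope, intercept) model on (xs, ys)."""
    slope, intercept = model
    squared = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return sqrt(sum(squared) / len(squared))


# Synthetic, made-up data: one feature value per user and one trait score.
# The feature-trait relationship is deliberately different in the target
# environment, so the model trained on the source environment transfers poorly.
source_train = ([0.0, 1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 5.0, 7.0, 9.0])
source_test = ([0.5, 1.5, 2.5, 3.5], [2.2, 3.8, 6.1, 7.9])
target_test = ([0.5, 1.5, 2.5, 3.5], [1.4, 2.6, 3.4, 4.6])

model = fit_line(*source_train)                  # train on the source environment only
in_domain_rmse = rmse(model, *source_test)       # same environment, held-out users
cross_domain_rmse = rmse(model, *target_test)    # different environment
decay = cross_domain_rmse - in_domain_rmse       # accuracy lost when porting the model

print(f"in-domain RMSE:    {in_domain_rmse:.3f}")
print(f"cross-domain RMSE: {cross_domain_rmse:.3f}")
print(f"decay:             {decay:.3f}")
```

In this toy setting the decay is large by construction. With real data, the same comparison would be repeated for each trait and each pair of environments.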


Big Five personality · Social media · User generated content · Multivariate regression · Feature analysis



We would like to thank the anonymous reviewers for their helpful comments and suggestions. This work was funded in part by the SBO-program of the Flemish Agency for Innovation by Science and Technology (IWT-SBO-Nr. 110067).

Supplementary material 1 (zip 10 KB)



Copyright information

© Springer Science+Business Media Dordrecht 2016

Authors and Affiliations

  • Golnoosh Farnadi (1, 2)
  • Geetha Sitaraman (3)
  • Shanu Sushmita (3)
  • Fabio Celli (4)
  • Michal Kosinski (5)
  • David Stillwell (6)
  • Sergio Davalos (7)
  • Marie-Francine Moens (2)
  • Martine De Cock (3)

  1. Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
  2. Department of Computer Science, Katholieke Universiteit Leuven, Heverlee, Belgium
  3. Center for Data Science, University of Washington Tacoma, Tacoma, USA
  4. Center for Mind/Brain Sciences, University of Trento, Trento, Italy
  5. Stanford Graduate School of Business, Stanford University, Stanford, USA
  6. Judge Business School, University of Cambridge, Cambridge, UK
  7. Milgard School of Business, University of Washington Tacoma, Tacoma, USA
