Multimodal Analysis and Prediction of Latent User Dimensions

  • Laura Wendlandt
  • Rada Mihalcea
  • Ryan L. Boyd
  • James W. Pennebaker
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10539)


Abstract

Humans upload over 1.8 billion digital images to the internet each day, yet the relationship between the images that a person shares with others and his/her psychological characteristics remains poorly understood. In the current research, we analyze the relationship between images, captions, and the latent demographic/psychological dimensions of personality and gender. We consider a wide range of automatically extracted visual and textual features of images/captions that are shared by a large sample of individuals (\(N \approx 1,350\)). Using correlational methods, we identify several visual and textual properties that show strong relationships with individual differences between participants. Additionally, we explore the task of predicting user attributes using a multimodal approach that simultaneously leverages images and their captions. Results from these experiments suggest that images alone have significant predictive power and, additionally, multimodal methods outperform both visual features and textual features in isolation when attempting to predict individual differences.
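The abstract does not specify the paper's exact models, but the core multimodal idea can be illustrated with an early-fusion sketch: per-user visual and textual feature vectors are concatenated before a single classifier. The toy random features and the simple logistic-regression trainer below are assumptions standing in for the paper's real image/caption descriptors and learning method:

```python
import numpy as np

def fuse_features(visual, textual):
    """Early fusion: concatenate each user's visual and textual feature vectors."""
    return np.concatenate([visual, textual], axis=1)

# Toy data: 6 users, 4 visual features (e.g. color statistics) and
# 3 textual features (e.g. caption word-category counts) per user.
rng = np.random.default_rng(0)
visual = rng.normal(size=(6, 4))
textual = rng.normal(size=(6, 3))
labels = np.array([0, 1, 0, 1, 0, 1])  # a binary attribute, e.g. gender

X = fuse_features(visual, textual)

# Minimal logistic regression trained by gradient descent on the fused features.
w = np.zeros(X.shape[1])
b = 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
    w -= 0.5 * (X.T @ (p - labels)) / len(labels)
    b -= 0.5 * np.mean(p - labels)

preds = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(int)
```

The same fused matrix could instead feed any off-the-shelf classifier; the point is only that both modalities contribute columns to a single feature space, which is what lets the combined model outperform either modality alone.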


Keywords: Analysis of latent user dimensions · Multimodal prediction · Joint language/vision models



Acknowledgments

This material is based in part upon work supported by the National Science Foundation (NSF #1344257), the John Templeton Foundation (#48503), and the Michigan Institute for Data Science (MIDAS). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF, the John Templeton Foundation, or MIDAS. We would like to thank Chris Pittman for his aid with the data collection, Shibamouli Lahiri for the readability code, and Steven R. Wilson for the implementation of Mairesse et al.


References

  1. Bentivogli, L., Forner, P., Magnini, B., Pianta, E.: Revising the WordNet domains hierarchy: semantics, coverage and balancing. In: Proceedings of the Workshop on Multilingual Linguistic Resources, pp. 101–108. Association for Computational Linguistics (2004)
  2. Boyd, R.L.: Psychological text analysis in the digital humanities. In: Hai-Jew, S. (ed.) Data Analytics in the Digital Humanities. MMSA, pp. 161–189. Springer Science, New York City (2017). doi: 10.1007/978-3-319-54499-1_7
  3. Bruni, E., Tran, N.K., Baroni, M.: Multimodal distributional semantics. J. Artif. Intell. Res. 49, 1–47 (2014)
  4. Chris, D.P.: Another stemmer. In: ACM SIGIR Forum, vol. 24, pp. 56–61 (1990)
  5. Ciaramita, M., Johnson, M.: Supersense tagging of unknown nouns in WordNet. In: Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, pp. 168–175. Association for Computational Linguistics (2003)
  6. Coltheart, M.: The MRC psycholinguistic database. Q. J. Exp. Psychol. 33(4), 497–505 (1981)
  7. Everingham, M., Van Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The PASCAL visual object classes (VOC) challenge. Int. J. Comput. Vis. 88(2), 303–338 (2010)
  8. Fellbaum, C.: WordNet. Wiley Online Library, Hoboken (1998)
  9. Finkel, J.R., Grenager, T., Manning, C.: Incorporating non-local information into information extraction systems by Gibbs sampling. In: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pp. 363–370 (2005)
  10. Gosling, S.D., Craik, K.H., Martin, N.R., Pryor, M.R.: Material attributes of personal living spaces. Home Cultures 2(1), 51–87 (2005)
  11. Gosling, S.D., Ko, S.J., Mannarelli, T., Morris, M.E.: A room with a cue: personality judgments based on offices and bedrooms. J. Personal. Soc. Psychol. 82(3), 379 (2002)
  12. Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., Darrell, T.: Caffe: convolutional architecture for fast feature embedding. In: Proceedings of the 22nd ACM International Conference on Multimedia, pp. 675–678 (2014)
  13. John, O.P., Srivastava, S.: The big five trait taxonomy: history, measurement, and theoretical perspectives. Handb. Personal.: Theory Res. 2(1999), 102–138 (1999)
  14. Johnson, J., Karpathy, A., Fei-Fei, L.: DenseCap: fully convolutional localization networks for dense captioning. arXiv preprint arXiv:1511.07571 (2015)
  15. Karpathy, A., Fei-Fei, L.: Deep visual-semantic alignments for generating image descriptions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3128–3137 (2015)
  16. Kelly, E.L., Conley, J.J.: Personality and compatibility: a prospective analysis of marital stability and marital satisfaction. J. Personal. Soc. Psychol. 52(1), 27 (1987)
  17. Khouw, N.: The meaning of color for gender. In: Colors Matters-Research (2002)
  18. Koppel, M., Argamon, S., Shimoni, A.R.: Automatically categorizing written texts by author gender. Literary Linguist. Comput. 17(4), 401–412 (2002)
  19. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
  20. Li, J.J., Nenkova, A.: Fast and accurate prediction of sentence specificity. In: AAAI, pp. 2281–2287 (2015)
  21. Liu, H., Mihalcea, R.: Of men, women, and computers: data-driven gender modeling for improved user interfaces. In: International Conference on Weblogs and Social Media (2007)
  22. Liu, L., Preotiuc-Pietro, D., Samani, Z.R., Moghaddam, M.E., Ungar, L.: Analyzing personality through social media profile picture choice. In: Tenth International AAAI Conference on Web and Social Media (2016)
  23. Lovato, P., Bicego, M., Segalin, C., Perina, A., Sebe, N., Cristani, M.: Faved! biometrics: tell me which image you like and I’ll tell you who you are. IEEE Trans. Inf. Forensics Secur. 9(3), 364–374 (2014)
  24. Machajdik, J., Hanbury, A.: Affective image classification using features inspired by psychology and art theory. In: Proceedings of the 18th ACM International Conference on Multimedia, pp. 83–92. ACM (2010)
  25. Mairesse, F., Walker, M.A., Mehl, M.R., Moore, R.K.: Using linguistic cues for the automatic recognition of personality in conversation and text. J. Artif. Intell. Res. 30, 457–500 (2007)
  26. Marcus, M.P., Marcinkiewicz, M.A., Santorini, B.: Building a large annotated corpus of English: the Penn treebank. Comput. Linguist. 19(2), 313–330 (1993)
  27. Mathias, M., Benenson, R., Pedersoli, M., van Gool, L.: Face detection without bells and whistles. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8692, pp. 720–735. Springer, Cham (2014). doi: 10.1007/978-3-319-10593-2_47
  28. McCrae, R.R., John, O.P.: An introduction to the five-factor model and its applications. J. Personal. 60(2), 175–215 (1992)
  29. Meeker, M.: Internet trends 2014-Code conference (2014). Accessed 28 May 2014
  30. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)
  31. Newman, M.L., Groom, C.J., Handelman, L.D., Pennebaker, J.W.: Gender differences in language use: an analysis of 14,000 text samples. Discourse Process. 45(3), 211–236 (2008)
  32. Oberlander, J., Nowson, S.: Whose thumb is it anyway? Classifying author personality from weblog text. In: COLING/ACL, pp. 627–634 (2006)
  33. Park, G., Schwartz, H.A., Eichstaedt, J.C., Kern, M.L., Kosinski, M., Stillwell, D.J., Ungar, L.H., Seligman, M.E.P.: Automatic personality assessment through social media language. J. Personal. Soc. Psychol. 108(6), 934–952 (2014)
  34. Pennebaker, J.W., King, L.A.: Linguistic styles: language use as an individual difference. J. Personal. Soc. Psychol. 77(6), 1296 (1999)
  35. Redi, M., Quercia, D., Graham, L., Gosling, S.: Like partying? Your face says it all. Predicting the ambiance of places with profile pictures. In: Ninth International AAAI Conference on Web and Social Media (2015)
  36. Roberts, B., Kuncel, N., Shiner, R., Caspi, A., Goldberg, L.: The power of personality: the comparative validity of personality traits, socioeconomic status, and cognitive ability for predicting important life outcomes. Perspect. Psychol. Sci. 4(2), 313–345 (2007)
  37. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., et al.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 211–252 (2015)
  38. Segalin, C., Cheng, D.S., Cristani, M.: Social profiling through image understanding: personality inference using convolutional neural networks. Comput. Vis. Image Understanding 156, 34–50 (2016)
  39. Segalin, C., Perina, A., Cristani, M., Vinciarelli, A.: The pictures we like are our image: continuous mapping of favorite pictures into self-assessed and attributed personality traits. IEEE Trans. Affect. Comput. 8(2), 268–285 (2016)
  40. Valdez, P., Mehrabian, A.: Effects of color on emotions. J. Exp. Psychol.: Gen. 123(4), 394 (1994)
  41. Van De Weijer, J., Schmid, C., Verbeek, J., Larlus, D.: Learning color names for real-world applications. IEEE Trans. Image Process. 18(7), 1512–1523 (2009)
  42. Yoder, P.J., Blackford, J.U., Waller, N.G., Kim, G.: Enhancing power while controlling family-wise error: an illustration of the issues using electrocortical studies. J. Clin. Exp. Neuropsychol. 26(3), 320–331 (2004)
  43. You, Q., Bhatia, S., Sun, T., Luo, J.: The eyes of the beholder: gender prediction using images posted in online social networks. In: 2014 IEEE International Conference on Data Mining Workshop, pp. 1026–1030. IEEE (2014)
  44. Zhang, D., Islam, M.M., Lu, G.: A review on automatic image annotation techniques. Pattern Recogn. 45(1), 346–362 (2012)
  45. Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., Oliva, A.: Learning deep features for scene recognition using places database. In: Advances in Neural Information Processing Systems, pp. 487–495 (2014)
  46. Zitnick, C.L., Dollár, P.: Edge boxes: locating object proposals from edges. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 391–405. Springer, Cham (2014). doi: 10.1007/978-3-319-10602-1_26

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Laura Wendlandt (1)
  • Rada Mihalcea (1)
  • Ryan L. Boyd (2)
  • James W. Pennebaker (2)

  1. University of Michigan, Ann Arbor, USA
  2. University of Texas at Austin, Austin, USA
