Multilingual visual sentiment concept clustering and analysis

Pappas, Nikolaos; Redi, Miriam; Topkara, Mercan; Liu, Hongyi; Jou, Brendan; Chen, Tao; Chang, Shih-Fu

doi:10.1007/s13735-017-0120-4

Regular Paper
Published: 20 February 2017

Volume 6, pages 51–70, (2017)
International Journal of Multimedia Information Retrieval

Nikolaos Pappas¹,
Miriam Redi²,
Mercan Topkara³,
Hongyi Liu⁴,
Brendan Jou⁴,
Tao Chen⁴ &
Shih-Fu Chang⁴

Abstract

Visual content is a rich medium that can be used to communicate not only facts and events, but also emotions and opinions. In some cases, visual content may carry a universal affective bias (e.g., natural disasters or beautiful scenes). Often, however, matching the affect that visual media evokes in its recipient to the affect its author intended requires a deep understanding, and even sharing, of cultural backgrounds. In this study, we propose a computational framework for the clustering and analysis of multilingual visual affective concepts used in different languages, which enables us to pinpoint alignable differences (via similar concepts) and nonalignable differences (via unique concepts) across cultures. To do so, we crowdsource sentiment labels for the MVSO dataset, which contains 16K multilingual visual sentiment concepts and 7.3M images tagged with these concepts. We then represent these concepts in a distribution-based word vector space via (1) pivotal translation or (2) cross-lingual semantic alignment, and evaluate these representations on three tasks, all across languages: affective concept retrieval, concept clustering, and sentiment prediction. The proposed clustering framework enables both quantitative and qualitative analysis of the large multilingual dataset. We also show a novel use case consisting of a facial image data subset and explore cultural insights about visual sentiment concepts in such portrait-focused images.
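To make the idea of comparing concepts across languages in a shared word vector space more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the word vectors are already aligned across languages, composes a multi-word concept by summing its word vectors, and uses cosine similarity to retrieve the closest concept in another language. The toy vectors, the function names `concept_vector`, `nearest_concept`, and `alignment_label`, and the 0.8 similarity threshold are all hypothetical choices made for illustration.

```python
import numpy as np

# Toy word vectors in a shared, already aligned space.
# The values are invented for illustration, not taken from any trained model.
WORD_VECS = {
    "happy": np.array([0.9, 0.1, 0.0]),
    "feliz": np.array([0.8, 0.2, 0.0]),
    "dog":   np.array([0.1, 0.9, 0.0]),
    "perro": np.array([0.2, 0.8, 0.1]),
    "sad":   np.array([-0.9, 0.1, 0.0]),
    "storm": np.array([0.0, 0.1, 0.9]),
}


def concept_vector(concept):
    """Sum the word vectors of a multi-word concept and L2-normalize the result.

    Summation is an assumed composition function chosen for simplicity.
    """
    v = np.sum([WORD_VECS[w] for w in concept.split()], axis=0)
    return v / np.linalg.norm(v)


def nearest_concept(query, candidates):
    """Return the candidate with the highest cosine similarity to the query."""
    q = concept_vector(query)
    sims = {c: float(q @ concept_vector(c)) for c in candidates}
    best = max(sims, key=sims.get)
    return best, sims[best]


def alignment_label(concept, other_language, threshold=0.8):
    """Label a concept 'alignable' if a close match exists in the other language.

    The threshold is a hypothetical cut-off, not a value from the paper.
    """
    _, sim = nearest_concept(concept, other_language)
    return "alignable" if sim >= threshold else "unique"


english = ["happy dog", "sad storm"]
spanish = ["perro feliz"]

for c in english:
    match, sim = nearest_concept(c, spanish)
    print(c, "->", match, round(sim, 3), alignment_label(c, spanish))
```

In this toy setting, a concept with a close neighbour in the other language is treated as alignable, and one without such a neighbour as unique to its language. A real pipeline would replace the hand-written vectors with embeddings obtained through translation or cross-lingual alignment, and would cluster all concepts rather than compare them pairwise.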

Notes

http://www.crowdflower.com.
https://cloud.google.com/translate.
We did not perform lemmatization or any other preprocessing step to preserve the original visual concept properties.
http://corpora2.informatik.uni-leipzig.de/download.html.
https://code.google.com/p/word2vec.
http://webscope.sandbox.yahoo.com.

Author information

Authors and Affiliations

Idiap Research Institute, Martigny, Switzerland
Nikolaos Pappas
Nokia Bell Labs, Cambridge, UK
Miriam Redi
Teachers Pay Teachers, New York, NY, USA
Mercan Topkara
Columbia University, New York, NY, USA
Hongyi Liu, Brendan Jou, Tao Chen & Shih-Fu Chang

Corresponding author

Correspondence to Nikolaos Pappas.

Additional information

Nikolaos Pappas, Miriam Redi, Mercan Topkara, and Hongyi Liu contributed equally.

About this article

Cite this article

Pappas, N., Redi, M., Topkara, M. et al. Multilingual visual sentiment concept clustering and analysis. Int J Multimed Info Retr 6, 51–70 (2017). https://doi.org/10.1007/s13735-017-0120-4

Received: 01 December 2016
Revised: 23 January 2017
Accepted: 30 January 2017
Published: 20 February 2017
Issue Date: March 2017
DOI: https://doi.org/10.1007/s13735-017-0120-4
