Experiments in Cross-Lingual Sentiment Analysis in Discussion Forums

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7710)


One of the objectives of sentiment analysis is to classify the polarity of conveyed opinions on the basis of textual evidence. Most work in the field has focused on English, and only a few experiments have explored other languages. In this paper, we present a supervised classification of posts in French online forums, where sentiment analysis is based on shallow linguistic features such as POS tagging, chunking and common negation forms. Furthermore, we incorporate word semantic orientation extracted from the English lexical resource SentiWordNet as an additional feature. Since SentiWordNet is an English resource, lexical entries in the studied French corpus must first be translated into English. For this purpose, we propose a number of French-to-English translation experiments, including machine translation and WordNet synset translation using EuroWordNet. The results show that WordNet synset translation did not significantly improve classification performance over the bag-of-words baseline, owing to insufficient coverage. Machine translation did not significantly improve the results either, owing to its insufficient quality. Proposals for improving classification performance are given at the end of the article.
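As a rough illustration of the feature scheme described in the abstract, the sketch below combines bag-of-words features with a lexicon-based polarity feature obtained by translating French tokens into English before lookup. It is a minimal sketch under stated assumptions, not the authors' implementation: the dictionaries `FR_TO_EN`, `POLARITY` and `NEGATIONS`, the function `extract_features`, the feature names and all scores are hypothetical stand-ins for the translation step, SentiWordNet and the negation forms mentioned above.

```python
# Hypothetical toy resources. A real experiment would replace FR_TO_EN with
# machine translation or EuroWordNet synset translation, and POLARITY with
# SentiWordNet. The scores below are made up for illustration.
FR_TO_EN = {"bon": "good", "mauvais": "bad", "film": "film"}
POLARITY = {"good": (0.75, 0.0), "bad": (0.0, 0.625)}  # (positive, negative)
NEGATIONS = {"pas", "jamais", "aucun"}


def extract_features(tokens):
    """Build a feature dict for one tokenized French post."""
    features = {}

    # Bag-of-words baseline: one count feature per token.
    for token in tokens:
        key = "bow=" + token
        features[key] = features.get(key, 0) + 1

    # Lexicon features: translate each token, then look up its polarity.
    pos_total = 0.0
    neg_total = 0.0
    covered = 0
    for token in tokens:
        english = FR_TO_EN.get(token)
        if english is None or english not in POLARITY:
            continue  # untranslated or not in the lexicon: a coverage gap
        covered += 1
        pos_score, neg_score = POLARITY[english]
        pos_total += pos_score
        neg_total += neg_score

    features["lex_pos"] = pos_total
    features["lex_neg"] = neg_total
    # Flag the presence of a common negation form anywhere in the post.
    features["has_negation"] = float(any(t in NEGATIONS for t in tokens))
    # Share of tokens that received a polarity score.
    features["coverage"] = covered / len(tokens) if tokens else 0.0
    return features


example = extract_features(["ce", "film", "n", "est", "pas", "bon"])
```

The `coverage` feature is included only to make the coverage problem noted in the abstract visible: tokens that cannot be translated, or that are missing from the lexicon, contribute nothing to the polarity totals. The resulting dictionary could be passed to any supervised classifier that accepts sparse feature dictionaries.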


Keywords: Cross-Lingual Sentiment Analysis; Machine Translation; Supervised Classification





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  1. Information and Communication Systems Lab (ISIC), HES-SO, HE-Arc Ingénierie, St-Imier, Switzerland
