
Sentiment Analysis with a Multilingual Pipeline

  • Daniella Bal
  • Malissa Bal
  • Arthur van Bunningen
  • Alexander Hogenboom
  • Frederik Hogenboom
  • Flavius Frasincar
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6997)

Abstract

Sentiment analysis refers to retrieving an author’s sentiment from a text. We analyze the differences that occur in sentiment scoring across languages. We present our experiments for the Dutch and English languages, based on forum, blog, news, and social media texts available on the Web, focusing on differences in language use and on the effect of a language’s grammar on sentiment analysis. We propose a multilingual pipeline for evaluating how an author’s sentiment is conveyed in different languages. We correctly classify positive and negative texts with an accuracy of approximately 71% for English and 79% for Dutch. The evaluation of the results shows, however, that the use of common expressions, emoticons, slang, acronyms, irony, sarcasm, and cynicism, as well as different ways of negation in English, prevents the underlying sentiment scores from being directly comparable.

Keywords

Machine Translation; News Article; Sentiment Analysis; Multiple Language; Chinese Text
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Daniella Bal (1)
  • Malissa Bal (1)
  • Arthur van Bunningen (2)
  • Alexander Hogenboom (1)
  • Frederik Hogenboom (1)
  • Flavius Frasincar (1)
  1. Erasmus University Rotterdam, Rotterdam, The Netherlands
  2. Teezir BV, Utrecht, The Netherlands
