Sentiment Classification in Multiple Languages: Fifty Shades of Customer Opinions

  • Conference paper
Business Challenges in the Changing Economic Landscape - Vol. 2

Part of the book series: Eurasian Studies in Business and Economics (EBES, volume 2/2)

Abstract

Sentiment analysis is a natural language processing task whose goal is to classify the sentiment polarity of expressed opinions. However, the aim of achieving the highest classification accuracy for one particular language does not truly reflect the needs of business. Sentiment analysis is often used by multinational companies operating in multiple markets. Such companies are interested in consumer opinions about their products and services in different countries, and thus in different languages. Most research in multi-language sentiment classification nevertheless relies on automated translation from minor languages into English and then conducts sentiment analysis on the English text. This paper aims to contribute to the multi-language sentiment classification problem and proposes a language-independent approach that could provide a good level of classification accuracy in multiple languages without automated translation or language-dependent components such as lexicons. The results indicate that the proposed approach could provide a high level of sentiment classification accuracy across multiple languages, even without language-dependent components.
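To make the idea of a classifier without translation or lexicons concrete, here is a minimal sketch. The abstract does not say which features or model the authors use, so everything below is an assumption for illustration only: character n-grams as features (they need no tokenizer, dictionary, or translation step) and a small multinomial naive Bayes model. The class name, the n-gram range, and the toy training sentences are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n_min=2, n_max=4):
    """Return every character n-gram of the lowercased, whitespace-normalized text."""
    s = " " + " ".join(text.lower().split()) + " "
    return [s[i:i + n] for n in range(n_min, n_max + 1) for i in range(len(s) - n + 1)]


class CharNgramNaiveBayes:
    """Multinomial naive Bayes over character n-gram counts (illustrative assumption,
    not the method of the paper)."""

    def fit(self, texts, labels):
        self.counts = defaultdict(Counter)  # label -> n-gram counts
        self.docs = Counter(labels)         # label -> number of training texts
        for text, label in zip(texts, labels):
            self.counts[label].update(char_ngrams(text))
        self.vocab = set().union(*self.counts.values())
        return self

    def predict(self, text):
        total_docs = sum(self.docs.values())
        best_label, best_score = None, -math.inf
        for label, grams in self.counts.items():
            # Laplace (add-one) smoothing over the shared n-gram vocabulary.
            denom = sum(grams.values()) + len(self.vocab)
            score = math.log(self.docs[label] / total_docs)
            for g in char_ngrams(text):
                if g in self.vocab:  # skip n-grams never seen in training
                    score += math.log((grams[g] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data mixing English and German; one model serves both languages.
train_texts = [
    "great product, I love it",
    "terrible quality, very disappointed",
    "tolles Produkt, ich liebe es",
    "schreckliche Qualität, sehr enttäuscht",
]
train_labels = ["pos", "neg", "pos", "neg"]

model = CharNgramNaiveBayes().fit(train_texts, train_labels)
print(model.predict("I love this great product"))
print(model.predict("sehr schreckliche Qualität"))
```

The same code path handles both languages because nothing in it depends on a word list or a translation service. A realistic setup would need far more training data per language and a proper evaluation split.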

Author information

Corresponding author

Correspondence to Tomáš Kincl.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Kincl, T., Novák, M., Přibil, J. (2016). Sentiment Classification in Multiple Languages: Fifty Shades of Customer Opinions. In: Bilgin, M., Danis, H., Demir, E., Can, U. (eds) Business Challenges in the Changing Economic Landscape - Vol. 2. Eurasian Studies in Business and Economics, vol 2/2. Springer, Cham. https://doi.org/10.1007/978-3-319-22593-7_19
