Sentiment/subjectivity analysis survey for languages other than English

  • Mohammed Korayem
  • Khalifeh Aljadda
  • David Crandall
Review Article

Abstract

Subjectivity and sentiment analysis have gained considerable attention recently. Most of the resources and systems built so far are for English, and the need for systems for other languages is increasing. This paper surveys the approaches used to build subjectivity and sentiment analysis systems for languages other than English. These approaches fall into three types. The first (and the best) type consists of language-specific systems. The second type reuses or transfers sentiment resources from English to the target language. The third type relies on language-independent methods. The paper also devotes a separate section to Arabic sentiment analysis.

Keywords

Machine Translation; Target Language; Sentiment Analysis; Sentiment Classification; Opinion Word

Copyright information

© Springer-Verlag Wien 2016

Authors and Affiliations

  • Mohammed Korayem (1)
  • Khalifeh Aljadda (1)
  • David Crandall (2)
  1. CareerBuilder, Norcross, USA
  2. School of Informatics and Computing, Indiana University, Bloomington, USA
