Sentiment lexicons and non-English languages: a survey

  • Survey Paper
Knowledge and Information Systems

Abstract

The ever-increasing number of Internet users and online services, such as Amazon, Twitter and Facebook, has motivated people not only to transact over the Internet but also to voice their opinions about products, services, policies, etc. Sentiment analysis is the field of study that extracts and analyzes public views and opinions. However, current research in this field focuses mainly on building systems and resources for the English language. The primary objective of this study is to examine existing research on building sentiment lexicon systems and to classify the methods with respect to non-English datasets. The study also reviews the tools used to build sentiment lexicons for non-English languages, ranging from those using machine translation to graph-based methods. Shortcomings of each approach are highlighted, along with recommendations for improving its performance and areas for further study and research.

Author information

Corresponding author

Correspondence to Vimala Balakrishnan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kaity, M., Balakrishnan, V. Sentiment lexicons and non-English languages: a survey. Knowl Inf Syst 62, 4445–4480 (2020). https://doi.org/10.1007/s10115-020-01497-6

  • DOI: https://doi.org/10.1007/s10115-020-01497-6
