Erratum to: Multilingual Sentiment Analysis: State of the Art and Independent Comparison of Techniques
Erratum to: Cogn Comput DOI 10.1007/s12559-016-9415-7
Unfortunately, the original version of the article was published with a few errors in the Abstract, Conclusion, Acknowledgment, and References.
Also, Dr. Erik Cambria is the co-corresponding author of the article.
The corrected versions of the sections are given below.
With the advent of the internet, people actively express their opinions about products, services, events, political parties, etc., in social media, blogs, and website comments. The amount of research work on sentiment analysis is growing explosively. However, the majority of research efforts are devoted to English language data, while a great share of information is available in other languages. We present a state-of-the-art review on multilingual sentiment analysis. More importantly, we compare our own implementation of existing state-of-the-art approaches on common data. Precision observed in our experiments is typically lower than that reported by the original authors, which we attribute to lack of detail in the original presentation of those approaches. Thus, we compare the existing works by what they really offer to the reader, including whether they allow for accurate implementation and for reliable reproduction of the reported results.
We gave an overview of state-of-the-art multilingual sentiment analysis methods. We described data pre-processing, typical features, and the main resources used for multilingual sentiment analysis. Then, we discussed different approaches applied by their authors to English and other languages. We have classified these approaches into corpus-based, lexicon-based, and hybrid ones.
The real value of any sentiment analysis technique for the research community lies in the results that can be reproduced with it, not in the results its original authors reportedly obtained with it. To evaluate this real value, we have implemented eleven selected approaches as closely as we could, based on their descriptions in the original papers, and tested them on the same two corpora. In the majority of cases, we obtained lower results than those reported by the corresponding authors. We attribute this mainly to the incompleteness of the descriptions in the original papers. In some cases, though, the methods were developed for a specific domain, so comparison on our test corpora may not be fair. A lesson learnt was that for a method to be useful for the research community, authors should provide sufficient detail to allow its correct implementation by the reader.
According to our results, the approach proposed by Singh et al. [52] outperforms other approaches. However, this approach is computationally expensive and has been tested only on English language data. The least accurate approaches of those that we considered were the ones proposed by Zhu et al. [73], Habernal et al. [23], and Mizumoto et al. [34].
The main problem of multilingual sentiment analysis is the lack of lexical resources. In our future work, we are planning to develop a multilingual corpus, which will include Persian, Arabic, Turkish, and English data, and compare a range of state-of-the-art methods.
This work was partly supported by the Royal Society of Edinburgh (RSE) and National Natural Science Foundation of China (NNSFC) joint project Grant No. 61411130162, and the UK Engineering and Physical Sciences Research Council (EPSRC) Grant No. EP/M026981/1. We also wish to thank the anonymous reviewers who helped improve the quality of the paper.
- 2. Ahmad K, Cheng D, Almas Y. Multi-lingual sentiment analysis of financial news streams. In: Proceedings of the 1st international conference on grid in finance; 2006.
- 4. Balahur A, Turchi M. Multilingual sentiment analysis using machine translation? In: Proceedings of the 3rd workshop in computational approaches to subjectivity and sentiment analysis. Association for Computational Linguistics; 2012, p. 52–60.
- 5. Balahur A, Turchi M. Improving sentiment analysis in twitter using multilingual machine translated data. In: RANLP; 2013, p. 49–55.
- 6. Bautin M, Vijayarenu L, Skiena S. International sentiment analysis for news and blogs. In: ICWSM; 2008.
- 7. Berger AL, Pietra VJD, Pietra SAD. A maximum entropy approach to natural language processing. Comput Linguist. 1996;22:39–71.
- 8. Bhaskar J, Sruthi K, Nedungadi P. Enhanced sentiment analysis of informal textual communication in social media by considering objective words and intensifiers. In: Recent advances and innovations in engineering (ICRAIE), 2014. IEEE; 2014, p. 1–6.
- 9. Blitzer J, Dredze M, Pereira F, et al. Biographies, bollywood, boom-boxes and blenders: domain adaptation for sentiment classification. In: ACL; 2007, p. 440–47.
- 11. Cambria E, Olsher D, Rajagopal D. SenticNet 3: A common and common-sense knowledge base for cognition-driven sentiment analysis. In: AAAI, 2014, p. 1515–1521, Quebec City.
- 12. Carroll TZJ. Unsupervised classification of sentiment and objectivity in Chinese text. In: Third international joint conference on natural language processing. 2008, p. 304.
- 13. Chang C-C, Lin C-J. LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol TIST. 2011;2:27.
- 14. Chikersal P, Poria S, Cambria E. SeNTU: sentiment analysis of tweets by combining a rule-based classifier with supervised learning. In: Proceedings of the international workshop on semantic evaluation (SemEval 2015). 2015.
- 16. Cruz-Garcia IO, Gelbukh A, Sidorov G. Implicit aspect indicator extraction for aspect based opinion mining. Int J Comput Linguist Appl. 2014;5(2):135–52.
- 18. Denecke K. Using SentiWordNet for multilingual sentiment analysis. In: IEEE 24th international data engineering workshop, 2008. ICDEW 2008. IEEE; 2008, p. 507–12.
- 19. Duwairi RM, Qarqaz I. Arabic sentiment analysis using supervised classification. In: 2014 international conference on future internet of things and cloud (FiCloud). IEEE; 2014.
- 20. Evans DK, Ku L-W, Seki Y, Chen H-H, Kando N. Opinion analysis across languages: an overview of and observations from the NTCIR6 opinion analysis pilot task. In: Applications of fuzzy sets theory. Berlin: Springer; 2007, p. 456–63.
- 21. Ghorbel H, Jacot D. Further experiments in sentiment analysis of French movie reviews. In: Advances in intelligent web mastering–3. Berlin, Heidelberg: Springer; 2011, p. 19–28.
- 22. Ghosh M, Kar A. Unsupervised linguistic approach for sentiment classification from online reviews using SentiWordNet 3.0. Int J Eng Res Technol. 2013.
- 23. Habernal I, Ptácek T, Steinberger J. Sentiment analysis in Czech social media using supervised machine learning. In: Proceedings of the 4th workshop on computational approaches to subjectivity, sentiment and social media analysis. 2013, p. 65–74.
- 25. Holmes G, Donkin A, Witten IH. Weka: a machine learning workbench. In: Proceedings of the 1994 second Australian and New Zealand conference on intelligent information systems. IEEE; 1994, p. 357–61.
- 26. Hu M, Liu B. Mining and summarizing customer reviews. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining. ACM; 2004, p. 168–77.
- 29. Liu Z, Dong X, Guan Y, Yang J. Reserved self-training: a semisupervised sentiment classification method for Chinese microblogs. In: Proceedings of IJCNLP; 2013.
- 30. Mahyoub FHH, Siddiqui MA, Dahab MY. Building an Arabic sentiment lexicon using semi-supervised learning. J King Saud Univ Comput Inf Sci. 2014;26(4):417–24.
- 33. Mirchev U, Last M. Multi-document summarization by extended graph text representation and importance refinement. Innov Doc Summ Tech Revolut Knowl Underst. 2014;28.
- 34. Mizumoto K, Yanagimoto H, Yoshioka M. Sentiment analysis of stock market news with semi-supervised learning. In: 2012 IEEE/ACIS 11th international conference on computer and information science (ICIS). IEEE; 2012, p. 325–28.
- 35. Morency L-P, Mihalcea R, Doshi P. Towards multimodal sentiment analysis: harvesting opinions from the web. In: Proceedings of the 13th international conference on multimodal interfaces. ACM; 2011, p. 169–76.
- 36. Narayanan V, Arora I, Bhatia A. Fast and accurate sentiment classification using an enhanced Naive Bayes model. In: Intelligent data engineering and automated learning–IDEAL 2013. Berlin: Springer; 2013, p. 194–201.
- 37. Pang B, Lee L. A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts. In: Proceedings of the 42nd annual meeting on association for computational linguistics. Association for Computational Linguistics; 2004, p. 271.
- 38. Pang B, Lee L, Vaithyanathan S. Thumbs up?: sentiment classification using machine learning techniques. In: Proceedings of the ACL-02 conference on empirical methods in natural language processing, vol 10. Association for Computational Linguistics; 2002, p. 79–86.
- 39. Posadas-Durán J-P, Markov I, Gómez-Adorno H, Sidorov G, Batyrshin I, Gelbukh A, Pichardo-Lagunas O. Syntactic N-grams as features for the author profiling task. Notebook for PAN at CLEF 2015. CEUR Workshop Proceedings 1391; 2015.
- 40. Raina P. Sentiment analysis in news articles using sentic computing. In: 2013 IEEE 13th International conference on data mining workshops (ICDMW). IEEE; 2013, p. 959–62.
- 41. Rajagopal D, Cambria E, Olsher D, Kwok K. A graph-based approach to commonsense concept extraction and semantic similarity detection. In: Proceedings of the 22nd international conference on world wide web companion. International World Wide Web Conferences Steering Committee; 2013, p. 565–70.
- 42. Ravi K, Ravi V. A survey on opinion mining and sentiment analysis: tasks, approaches and applications. Knowl Based Syst. 2015.
- 43. Read J. Recognising affect in text using pointwise-mutual information. Unpubl. M Sc Diss. Univ. Sussex UK; 2004.
- 44. Remus R, Quasthoff U, Heyer G. SentiWS-a publicly available German-language resource for sentiment analysis. In: LREC. 2010.
- 45. Saraee M, Bagheri A. Feature selection methods in Persian sentiment analysis. In: Natural language processing and information systems. Springer; 2013, p. 303–308.
- 46. Seki Y, Evans DK, Ku L-W, Sun L, Chen H-H, Kando N, Lin C-Y. Overview of multilingual opinion analysis task at NTCIR-7. In: Proceedings of the 7th NTCIR workshop meeting on evaluation of information access technologies: information retrieval, question answering, and cross-lingual information access. 2008, p. 185–203.
- 47. Shi H-X, Li X-J. A sentiment analysis model for hotel reviews based on supervised learning. In: 2011 international conference on machine learning and cybernetics (ICMLC). IEEE; 2011, p. 950–54.
- 48. Sidorov G. Should syntactic n-grams contain names of syntactic relations? Int J Comput Linguist Appl. 2014;5(2):25–47.
- 49. Sidorov G, Miranda-Jiménez S, Viveros-Jiménez F, Gelbukh A, Castro-Sánchez N, Velásquez F, Díaz-Rangel I, Suárez-Guerra S, Treviño A, Gordon J. Empirical study of opinion mining in Spanish tweets. MICAI 2012. Lect Notes Comput Sci. 2012;7629:1–14.
- 51. Sindhwani V, Melville P. Document-word co-regularization for semi-supervised sentiment analysis. In: Eighth IEEE international conference on data mining, 2008. ICDM’08. IEEE; 2008, p. 1025–30.
- 52. Singh VK, Piryani R, Uddin A, Waila P, et al. Sentiment analysis of textual reviews; Evaluating machine learning, unsupervised and SentiWordNet approaches. In: 2013 5th international conference on knowledge and smart technology (KST). IEEE; 2013, p. 122–27.
- 53. Stone PJ, Dunphy DC, Smith MS. The general inquirer: a computer approach to content analysis; 1966.
- 55. Tong S, Koller D. Support vector machine active learning with applications to text classification. J Mach Learn Res. 2002;2:45–66.
- 56. Tromp E. Multilingual sentiment analysis on social media. Master’s Thesis, Dep. Math. Comput. Sci. Eindh. Univ. Technol.; 2011.
- 57. Wan X. Using bilingual knowledge and ensemble techniques for unsupervised Chinese sentiment analysis. In: Proceedings of the conference on empirical methods in natural language processing. Association for Computational Linguistics; 2008, p. 553–61.
- 59. Wiebe J, Mihalcea R. Word sense and subjectivity. In: Proceedings of the 21st international conference on computational linguistics and the 44th annual meeting of the Association for Computational Linguistics. Association for Computational Linguistics; 2006, p. 1065–72.
- 63. Xia Y, Zhao T, Yao J, Jin P. Measuring Chinese-English crosslingual word similarity with HowNet and parallel corpus. In: Computational linguistics and intelligent text processing, 12th international conference, CICLing 2011, vol. 2. 2011, p. 221–33.
- 64. Xia Y, Li X, Cambria E, Hussain A. A localization toolkit for SenticNet. In: 2014 IEEE international conference on data mining workshop (ICDMW). 2014, p. 403–8.
- 65. Xia R, Zong C. Exploring the use of word relation features for sentiment classification. In: Proceedings of the 23rd international conference on computational linguistics: posters. Association for Computational Linguistics; 2010, p. 1336–44.
- 66. Xu Y, Jones GJ, Li J, Wang B, Sun C. A study on mutual information-based feature selection for text categorization. J Comput Inf Syst. 2007;3:1007–12.
- 67. Xu R, Wong K-F, Lu Q, Xia Y, Li W. Learning knowledge from relevant webpage for opinion analysis. In: IEEE/WIC/ACM international conference on web intelligence and intelligent agent technology, WI-IAT ’08. 2008, p. 307–13.
- 68. Yang Y, Pedersen JO. A comparative study on feature selection in text categorization. In: ICML; 1997, p. 412–20.
- 69. Ye Q, Shi W, Li Y. Sentiment classification for movie reviews in Chinese by improved semantic oriented approach. In: Proceedings of the 39th annual Hawaii international conference on system sciences, HICSS’06. IEEE; 2006, p. 53b–53b.
- 71. Zagibalov T, Carroll J. Automatic seed word selection for unsupervised sentiment classification of Chinese text. In: Proceedings of the 22nd international conference on computational linguistics, vol. 1. Association for Computational Linguistics; 2008, p. 1073–80.
- 72. Zhang Z-Q, Li Y-J, Ye Q, Law R. Sentiment classification for Chinese product reviews using an unsupervised Internet-based method. In: International conference on management science and engineering, 2008. ICMSE 2008. 15th Annual conference proceedings. IEEE; 2008, p. 3–9.
- 73. Zhu S, Xu B, Zheng D, Zhao T. Chinese microblog sentiment analysis based on semi-supervised learning. In: Semantic web and web science. New York: Springer; 2013, p. 325–31.
- 76. Poria S, Gelbukh A, Cambria E, Das D, Bandyopadhyay S. Enriching SenticNet polarity scores through semi-supervised fuzzy clustering. In: Proceedings of ICDM, 2012, p. 709–716.
- 77. Poria S, Gelbukh A, Cambria E, Yang P, Hussain A, Durrani T. Merging SenticNet and WordNet-affect emotion lists for sentiment analysis. In: Proceedings of ICSP, 2012, p. 1251–1255.