Cognitive Computation, Volume 8, Issue 4, pp 772–775

Erratum to: Multilingual Sentiment Analysis: State of the Art and Independent Comparison of Techniques

  • Kia Dashtipour
  • Soujanya Poria
  • Amir Hussain
  • Erik Cambria
  • Ahmad Y. A. Hawalah
  • Alexander Gelbukh
  • Qiang Zhou
Erratum

Erratum to: Cogn Comput DOI 10.1007/s12559-016-9415-7

Unfortunately, the original version of the article was published with a few errors in the Abstract, Conclusion, Acknowledgment, and References.

Also, Dr. Erik Cambria is the co-corresponding author of the article.

The corrected versions of the sections are given below.

Abstract

With the advent of the internet, people actively express their opinions about products, services, events, political parties, etc., in social media, blogs, and website comments. The amount of research work on sentiment analysis is growing explosively. However, the majority of research efforts are devoted to English language data, while a great share of information is available in other languages. We present a state-of-the-art review on multilingual sentiment analysis. More importantly, we compare our own implementation of existing state-of-the-art approaches on common data. Precision observed in our experiments is typically lower than that reported by the original authors, which we attribute to lack of detail in the original presentation of those approaches. Thus, we compare the existing works by what they really offer to the reader, including whether they allow for accurate implementation and for reliable reproduction of the reported results.

Conclusion

We gave an overview of state-of-the-art multilingual sentiment analysis methods. We described data pre-processing, typical features, and the main resources used for multilingual sentiment analysis. Then, we discussed different approaches applied by their authors to English and other languages. We have classified these approaches into corpus-based, lexicon-based, and hybrid ones.

The real value of any sentiment analysis technique for the research community lies in the results that can be reproduced with it, not in the results its original authors reportedly obtained with it. To evaluate this real value, we have implemented eleven selected approaches as closely as we could, based on their descriptions in the original papers, and tested them on the same two corpora. In the majority of the cases, we obtained lower results than those reported by their corresponding authors. We attribute this mainly to the incompleteness of their descriptions in the original papers. In some cases, though, the methods were developed for a specific domain, so in such cases, comparison on our test corpora may not be fair. A lesson learnt was that for a method to be useful for the research community, authors should provide sufficient detail to allow its correct implementation by the reader.

According to our results, the approach proposed by Singh et al. [52] outperforms other approaches. However, this approach is computationally expensive and has been tested only on English language data. The least accurate approaches of those that we considered were the ones proposed by Zhu et al. [73], Habernal et al. [23], and Mizumoto et al. [34].

The main problem of multilingual sentiment analysis is the lack of lexical resources [18]. In our future work, we are planning to develop a multilingual corpus, which will include Persian, Arabic, Turkish, and English data, and compare a range of state-of-the-art methods.

Notes

Acknowledgments

This work was partly supported by the Royal Society of Edinburgh (RSE) and National Natural Science Foundation of China (NNSFC) joint project Grant No. 61411130162, and the UK Engineering and Physical Sciences Research Council (EPSRC) Grant No. EP/M026981/1. We also wish to thank the anonymous reviewers who helped improve the quality of the paper.

References

  1. Agarwal B, Poria S, Mittal N, Gelbukh A, Hussain A. Concept-level sentiment analysis with dependency-based semantic parsing: a novel approach. Cogn Comput. 2015;7(4):487–99.
  2. Ahmad K, Cheng D, Almas Y. Multi-lingual sentiment analysis of financial news streams. In: Proceedings of the 1st international conference on grid in finance; 2006.
  3. Al-Ayyoub M, Essa SB, Alsmadi I. Lexicon-based sentiment analysis of Arabic tweets. Int J Soc Netw Min. 2015;2:101–14.
  4. Balahur A, Turchi M. Multilingual sentiment analysis using machine translation? In: Proceedings of the 3rd workshop in computational approaches to subjectivity and sentiment analysis. Association for Computational Linguistics; 2012, p. 52–60.
  5. Balahur A, Turchi M. Improving sentiment analysis in Twitter using multilingual machine translated data. In: RANLP; 2013, p. 49–55.
  6. Bautin M, Vijayarenu L, Skiena S. International sentiment analysis for news and blogs. In: ICWSM; 2008.
  7. Berger AL, Pietra VJD, Pietra SAD. A maximum entropy approach to natural language processing. Comput Linguist. 1996;22:39–71.
  8. Bhaskar J, Sruthi K, Nedungadi P. Enhanced sentiment analysis of informal textual communication in social media by considering objective words and intensifiers. In: Recent advances and innovations in engineering (ICRAIE), 2014. IEEE; 2014, p. 1–6.
  9. Blitzer J, Dredze M, Pereira F, et al. Biographies, Bollywood, boom-boxes and blenders: domain adaptation for sentiment classification. In: ACL; 2007, p. 440–47.
  10. Boiy E, Moens M-F. A machine learning approach to sentiment analysis in multilingual Web texts. Inf Retr. 2009;12:526–58.
  11. Cambria E, Olsher D, Rajagopal D. SenticNet 3: A common and common-sense knowledge base for cognition-driven sentiment analysis. In: AAAI, 2014, p. 1515–1521, Quebec City.
  12. Carroll TZJ. Unsupervised classification of sentiment and objectivity in Chinese text. In: Third international joint conference on natural language processing. 2008, p. 304.
  13. Chang C-C, Lin C-J. LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol TIST. 2011;2:27.
  14. Chikersal P, Poria S, Cambria E. SeNTU: sentiment analysis of tweets by combining a rule-based classifier with supervised learning. In: Proceedings of the international workshop on semantic evaluation (SemEval 2015). 2015.
  15. Croft WB, Lafferty J. Language modeling for information retrieval. Berlin: Springer; 2003.
  16. Cruz-Garcia IO, Gelbukh A, Sidorov G. Implicit aspect indicator extraction for aspect based opinion mining. Int J Comput Linguist Appl. 2014;5(2):135–52.
  17. Das N, Ghosh S, Gonçalves T, Quaresma P. Comparison of different graph distance metrics for semantic text based classification. Polibits. 2014;49:51–7.
  18. Denecke K. Using SentiWordNet for multilingual sentiment analysis. In: IEEE 24th international data engineering workshop, 2008. ICDEW 2008. IEEE; 2008, p. 507–12.
  19. Duwairi RM, Qarqaz I. Arabic sentiment analysis using supervised classification. In: 2014 international conference on future internet of things and cloud (FiCloud). IEEE; 2014.
  20. Evans DK, Ku L-W, Seki Y, Chen H-H, Kando N. Opinion analysis across languages: an overview of and observations from the NTCIR6 opinion analysis pilot task. In: Applications of fuzzy sets theory. Berlin: Springer; 2007, p. 456–63.
  21. Ghorbel H, Jacot D. Further experiments in sentiment analysis of French movie reviews. In: Advances in intelligent web mastering–3. Berlin, Heidelberg: Springer; 2011, p. 19–28.
  22. Ghosh M, Kar A. Unsupervised linguistic approach for sentiment classification from online reviews using SentiWordNet 3.0. Int J Eng Res Technol. 2013.
  23. Habernal I, Ptácek T, Steinberger J. Sentiment analysis in Czech social media using supervised machine learning. In: Proceedings of the 4th workshop on computational approaches to subjectivity, sentiment and social media analysis. 2013, p. 65–74.
  24. He Y, Zhou D. Self-training from labeled features for sentiment analysis. Inf Process Manag. 2011;47:606–16.
  25. Holmes G, Donkin A, Witten IH. Weka: a machine learning workbench. In: Proceedings of the 1994 second Australian and New Zealand conference on intelligent information systems. IEEE; 1994, p. 357–61.
  26. Hu M, Liu B. Mining and summarizing customer reviews. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining. ACM; 2004, p. 168–77.
  27. Jimenez S, Gonzalez FA, Gelbukh A. Soft cardinality in semantic text processing: experience of the SemEval international competitions. Polibits. 2015;51:63–72.
  28. Liu B. Sentiment analysis: mining opinions, sentiments, and emotions. Cambridge: Cambridge University Press; 2015.
  29. Liu Z, Dong X, Guan Y, Yang J. Reserved self-training: a semisupervised sentiment classification method for Chinese microblogs. In: Proceedings of IJCNLP; 2013.
  30. Mahyoub FHH, Siddiqui MA, Dahab MY. Building an Arabic sentiment lexicon using semi-supervised learning. J King Saud Univ Comput Inf Sci. 2014;26(4):417–24.
  31. Manning CD, Raghavan P, Schütze H. Introduction to information retrieval. Cambridge: Cambridge University Press; 2008.
  32. Medhat W, Hassan A, Korashy H. Sentiment analysis algorithms and applications: a survey. Ain Shams Eng J. 2014;5:1093–113.
  33. Mirchev U, Last M. Multi-document summarization by extended graph text representation and importance refinement. Innov Doc Summ Tech Revolut Knowl Underst. 2014; 28.
  34. Mizumoto K, Yanagimoto H, Yoshioka M. Sentiment analysis of stock market news with semi-supervised learning. In: 2012 IEEE/ACIS 11th international conference on computer and information science (ICIS). IEEE, 2012; p. 325–28.
  35. Morency L-P, Mihalcea R, Doshi P. Towards multimodal sentiment analysis: harvesting opinions from the web. In: Proceedings of the 13th international conference on multimodal interfaces. ACM; 2011, p. 169–76.
  36. Narayanan V, Arora I, Bhatia A. Fast and accurate sentiment classification using an enhanced Naive Bayes model. In: Intelligent data engineering and automated learning–IDEAL 2013. Berlin: Springer; 2013, p. 194–201.
  37. Pang B, Lee L. A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts. In: Proceedings of the 42nd annual meeting on association for computational linguistics. Association for Computational Linguistics; 2004, p. 271.
  38. Pang B, Lee L, Vaithyanathan S. Thumbs up?: sentiment classification using machine learning techniques. In: Proceedings of the ACL-02 conference on empirical methods in natural language processing, vol 10. Association for Computational Linguistics, 2002; p. 79–86.
  39. Posadas-Durán J-P, Markov I, Gómez-Adorno H, Sidorov G, Batyrshin I, Gelbukh A, Pichardo-Lagunas O. Syntactic N-grams as features for the author profiling task. Notebook for PAN at CLEF 2015. CEUR Workshop Proceedings 1391; 2015.
  40. Raina P. Sentiment analysis in news articles using sentic computing. In: 2013 IEEE 13th International conference on data mining workshops (ICDMW). IEEE; 2013, p. 959–62.
  41. Rajagopal D, Cambria E, Olsher D, Kwok K. A graph-based approach to commonsense concept extraction and semantic similarity detection. In: Proceedings of the 22nd international conference on world wide web companion. International World Wide Web Conferences Steering Committee; 2013, p. 565–70.
  42. Ravi K, Ravi V. A survey on opinion mining and sentiment analysis: tasks, approaches and applications. Knowl Based Syst. 2015.
  43. Read J. Recognising affect in text using pointwise-mutual information. Unpubl. M Sc Diss. Univ. Sussex UK; 2004.
  44. Remus R, Quasthoff U, Heyer G. SentiWS-a publicly available German-language resource for sentiment analysis. In: LREC. 2010.
  45. Saraee M, Bagheri A. Feature selection methods in Persian sentiment analysis. In: Natural language processing and information systems. Springer; 2013, p. 303–308.
  46. Seki Y, Evans DK, Ku L-W, Sun L, Chen H-H, Kando N, Lin C-Y. Overview of multilingual opinion analysis task at NTCIR-7. In: Proceedings of the 7th NTCIR workshop meeting on evaluation of information access technologies: information retrieval, question answering, and cross-lingual information access. 2008, p. 185–203.
  47. Shi H-X, Li X-J. A sentiment analysis model for hotel reviews based on supervised learning. In: 2011 international conference on machine learning and cybernetics (ICMLC). IEEE; 2011, p. 950–54.
  48. Sidorov G. Should syntactic n-grams contain names of syntactic relations? Int J Comput Linguist Appl. 2014;5(2):25–47.
  49. Sidorov G, Miranda-Jiménez S, Viveros-Jiménez F, Gelbukh A, Castro-Sánchez N, Velásquez F, Díaz-Rangel I, Suárez-Guerra S, Treviño A, Gordon J. Empirical study of opinion mining in Spanish tweets. MICAI 2012. Lect Notes Comput Sci. 2012;7629:1–14.
  50. Sidorov G, Velasquez F, Stamatatos E, Gelbukh A, Chanona-Hernández L. Syntactic n-grams as machine learning features for natural language processing. Expert Syst Appl. 2014;41(3):853–60.
  51. Sindhwani V, Melville P. Document-word co-regularization for semi-supervised sentiment analysis. In: Eighth IEEE international conference on data mining, 2008. ICDM’08. IEEE; 2008, p. 1025–30.
  52. Singh VK, Piryani R, Uddin A, Waila P, et al. Sentiment analysis of textual reviews; Evaluating machine learning, unsupervised and SentiWordNet approaches. In: 2013 5th international conference on knowledge and smart technology (KST). IEEE; 2013, p. 122–27.
  53. Stone PJ, Dunphy DC, Smith MS. The general inquirer: a computer approach to content analysis; 1966.
  54. Tan S, Zhang J. An empirical study of sentiment analysis for Chinese documents. Expert Syst Appl. 2008;34:2622–9.
  55. Tong S, Koller D. Support vector machine active learning with applications to text classification. J Mach Learn Res. 2002;2:45–66.
  56. Tromp E. Multilingual sentiment analysis on social media. Master’s Thesis, Dep. Math. Comput. Sci. Eindh. Univ. Technol.; 2011.
  57. Wan X. Using bilingual knowledge and ensemble techniques for unsupervised Chinese sentiment analysis. In: Proceedings of the conference on empirical methods in natural language processing. Association for Computational Linguistics; 2008, p. 553–61.
  58. Wang S, Li D, Song X, Wei Y, Li H. A feature selection method based on improved Fisher’s discriminant ratio for text sentiment classification. Expert Syst Appl. 2011;38:8696–702.
  59. Wiebe J, Mihalcea R. Word sense and subjectivity. In: Proceedings of the 21st international conference on computational linguistics and the 44th annual meeting of the Association for Computational Linguistics. Association for Computational Linguistics; 2006, p. 1065–72.
  60. Wong K-F, Xia Y, Xu R, Wu M, Li W. Pattern-based opinion mining for stock market trend prediction. Int J Comput Process Orient Lang. 2008;21(4):347–61.
  61. Xia Y, Cambria E, Hussain A, Zhao H. Word polarity disambiguation using Bayesian model and opinion-level features. Cogn Comput. 2015;7(3):369–80.
  62. Xia Y, Wang L, Wong K-F. Sentiment vector space model for lyric-based song sentiment classification. Int J Comput Process Orient Lang. 2008;21(4):331–45.
  63. Xia Y, Zhao T, Yao J, Jin P. Measuring Chinese-English crosslingual word similarity with HowNet and parallel corpus. In: Computational linguistics and intelligent text processing, 12th international conference, CICLing 2011, vol. 2. 2011, p. 221–33.
  64. Xia Y, Li X, Cambria E, Hussain A. A localization toolkit for SenticNet. In: 2014 IEEE international conference on data mining workshop (ICDMW). 2014, p. 403–8.
  65. Xia R, Zong C. Exploring the use of word relation features for sentiment classification. In: Proceedings of the 23rd international conference on computational linguistics: posters. Association for Computational Linguistics; 2010, p. 1336–44.
  66. Xu Y, Jones GJ, Li J, Wang B, Sun C. A study on mutual information-based feature selection for text categorization. J Comput Inf Syst. 2007;3:1007–12.
  67. Xu R, Wong K-F, Lu Q, Xia Y, Li W. Learning knowledge from relevant webpage for opinion analysis. In: IEEE/WIC/ACM international conference on web intelligence and intelligent agent technology, WI-IAT ’08. 2008, p. 307–13.
  68. Yang Y, Pedersen JO. A comparative study on feature selection in text categorization. In: ICML; 1997, p. 412–20.
  69. Ye Q, Shi W, Li Y. Sentiment classification for movie reviews in Chinese by improved semantic oriented approach. In: Proceedings of the 39th annual Hawaii international conference on system sciences, HICSS’06. IEEE; 2006, p. 53b–53b.
  70. Ye Q, Zhang Z, Law R. Sentiment classification of online reviews to travel destinations by supervised machine learning approaches. Expert Syst Appl. 2009;36:6527–35.
  71. Zagibalov T, Carroll J. Automatic seed word selection for unsupervised sentiment classification of Chinese text. In: Proceedings of the 22nd international conference on computational linguistics, vol. 1. Association for Computational Linguistics; 2008, p. 1073–80.
  72. Zhang Z-Q, Li Y-J, Ye Q, Law R. Sentiment classification for Chinese product reviews using an unsupervised Internet-based method. In: International conference on management science and engineering, 2008. ICMSE 2008. 15th Annual conference proceedings. IEEE; 2008, p. 3–9.
  73. Zhu S, Xu B, Zheng D, Zhao T. Chinese microblog sentiment analysis based on semi-supervised learning. In: Semantic web and web science. New York: Springer; 2013, p. 325–31.
  74. Cambria E. Affective computing and sentiment analysis. IEEE Intell Syst. 2016;31(2):102–7.
  75. Cambria E, Hussain A. Sentic computing: a common-sense-based framework for concept-level sentiment analysis. Cham: Springer; 2015. ISBN 978-3-319-23654-4.
  76. Poria S, Gelbukh A, Cambria E, Das D, Bandyopadhyay S. Enriching SenticNet polarity scores through semi-supervised fuzzy clustering. In: Proceedings of ICDM, 2012, p. 709–716.
  77. Poria S, Gelbukh A, Cambria E, Yang P, Hussain A, Durrani T. Merging SenticNet and WordNet-affect emotion lists for sentiment analysis. In: Proceedings of ICSP, 2012, p. 1251–1255.
  78. Cambria E, Schuller B, Liu B, Wang H, Havasi C. Statistical approaches to concept-level sentiment analysis. IEEE Intell Syst. 2013;28(3):6–9.
  79. Poria S, Cambria E, Howard N, Huang G-B, Hussain A. Fusing audio, visual and textual clues for sentiment analysis from multimodal content. Neurocomputing. 2016;174:50–9.
  80. Poria S, Cambria E, Hussain A, Huang G-B. Towards an intelligent framework for multimodal affective data analysis. Neural Netw. 2015;63:104–16.
  81. Cambria E, Wang H, White B. Guest editorial: big social data analysis. Knowl Based Syst. 2014;69:1–2.

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Kia Dashtipour (1)
  • Soujanya Poria (2)
  • Amir Hussain (1, 4)
  • Erik Cambria (3)
  • Ahmad Y. A. Hawalah (4)
  • Alexander Gelbukh (5)
  • Qiang Zhou (6)

  1. Division of Computing Science and Mathematics, University of Stirling, Stirling, Scotland, UK
  2. Temasek Laboratory, Nanyang Technological University, Singapore, Singapore
  3. School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
  4. College of Computer Science and Engineering, Taibah University, Medina, Saudi Arabia
  5. CIC, Instituto Politécnico Nacional, Mexico City, Mexico
  6. Tsinghua University, Beijing, China
