Abstract
Sentiment analysis of short informal texts, such as tweets, remains a challenging task because of their particular characteristics. The Twitter sentiment analysis literature has devoted much effort to achieving an effective and efficient representation of tweets. In this context, distinct types of features have been proposed and employed, ranging from simple n-gram representations to meta-features and word embeddings. In this work, using a relevant set of twenty-two tweet datasets, we present a thorough evaluation of these features by means of different supervised learning algorithms. We evaluate not only a rich set of meta-features examined in state-of-the-art studies, but also a significant collection of pre-trained word embedding models. We also evaluate and analyze the effect of combining these distinct types of features, in order to detect which combinations may provide core information for the polarity detection task in Twitter sentiment analysis. For this purpose, we exploit different combination strategies, such as feature concatenation and ensemble learning techniques, and show that sentiment detection of tweets benefits from combining the different types of features proposed in the literature.
Notes
Available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
Available at http://www.csie.ntu.edu.tw/~cjlin/liblinear.
We used the negation words available at http://sentiment.christopherpotts.net/lingstruc.html#negation.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Carvalho, J., Plastino, A. On the evaluation and combination of state-of-the-art features in Twitter sentiment analysis. Artif Intell Rev 54, 1887–1936 (2021). https://doi.org/10.1007/s10462-020-09895-6
DOI: https://doi.org/10.1007/s10462-020-09895-6