Abstract
Sentiment analysis (SA) is a key element of many opinion and attitude mining tasks. While various unsupervised SA tools already exist, a central problem is that they are lexicon-based and the lexicons they use are limited, leading to vocabulary mismatch. In this paper, we present an unsupervised, word embedding-based framework for sentiment intensity scoring (SIS). The framework generalizes and combines past work so that pre-existing lexicons (e.g. VADER, LabMT) and word embeddings (e.g. BERT, RoBERTa) can be used to address this problem, with no training required, while providing fine-grained SIS of words and phrases. The framework is scalable and extensible, so that custom lexicons or word embeddings can be used with the core methods, and even to create new corpus-specific lexicons, without the need for extensive supervised learning and retraining. The Python 3 toolkit is open source, freely available from GitHub (https://github.com/cumulative-revelations/awessome), and can be installed directly via pip install awessome.
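To make the general idea concrete, the following is a minimal sketch of how a lexicon and word embeddings might be combined to score sentiment intensity without training. It is an illustration under stated assumptions and not the toolkit's API: the function names, the seed words, and the three-dimensional vectors are all hypothetical, and the scoring rule (mean cosine similarity to positive seeds minus mean cosine similarity to negative seeds) is one plausible reading of the approach rather than the method confirmed by the abstract.

```python
import math

# Hypothetical toy "embeddings": the values are invented for illustration.
# In practice these vectors would come from a pre-trained model.
TOY_EMBEDDINGS = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "bad": [-0.9, 0.1, 0.0],
    "awful": [-0.8, 0.2, 0.1],
    "superb": [0.85, 0.15, 0.05],
    "dreadful": [-0.85, 0.15, 0.05],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def intensity_score(word, positive_seeds, negative_seeds, embeddings):
    """Assumed scoring rule: mean similarity to positive seed words
    minus mean similarity to negative seed words."""
    vec = embeddings[word]
    pos = sum(cosine(vec, embeddings[s]) for s in positive_seeds) / len(positive_seeds)
    neg = sum(cosine(vec, embeddings[s]) for s in negative_seeds) / len(negative_seeds)
    return pos - neg


if __name__ == "__main__":
    # Seed words stand in for entries taken from a sentiment lexicon.
    positive, negative = ["good", "great"], ["bad", "awful"]
    for w in ("superb", "dreadful"):
        print(w, intensity_score(w, positive, negative, TOY_EMBEDDINGS))
```

Because the score depends only on vector similarity to lexicon entries, a word that is absent from the lexicon (here "superb" or "dreadful") can still be scored, which is the vocabulary-mismatch problem the abstract describes.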
Acknowledgement
This work is part of the Cumulative Revelations of Personal Data project, which is supported by the UKRI's EPSRC under grant number EP/R033854/1.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Htait, A., Azzopardi, L. (2021). AWESSOME: An Unsupervised Sentiment Intensity Scoring Framework Using Neural Word Embeddings. In: Hiemstra, D., Moens, MF., Mothe, J., Perego, R., Potthast, M., Sebastiani, F. (eds) Advances in Information Retrieval. ECIR 2021. Lecture Notes in Computer Science, vol 12657. Springer, Cham. https://doi.org/10.1007/978-3-030-72240-1_56
DOI: https://doi.org/10.1007/978-3-030-72240-1_56
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-72239-5
Online ISBN: 978-3-030-72240-1
eBook Packages: Computer Science, Computer Science (R0)