Abstract
In this research, we examine the notions of objectivity and subjectivity and build word embeddings around them for sentiment analysis. Word vectors were trained on two datasets: the English Wikipedia dataset for objectivity and the Amazon Product Reviews dataset for subjectivity. A model incorporating an attention mechanism was proposed and compared against Logistic Regression and Linear Support Vector Classification models; the attention model achieved the highest accuracy once the training data was made large enough through augmentation. Comparing the two embedding types, models trained with the objectivity word embeddings performed worse than those trained with the subjectivity embeddings. However, when compared against BERT, a model that also uses an attention mechanism but has its own word embedding technique, BERT achieved higher accuracy even though its training was performed with transfer learning alone.
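To make the setup above concrete, here is a minimal sketch, not the authors' released code: it trains separate objectivity and subjectivity word2vec embeddings on toy stand-ins for the Wikipedia and Amazon Reviews corpora, then fits the two baseline classifiers named in the abstract on averaged sentence vectors. The corpora, labels, and hyperparameters (`vector_size`, `epochs`) are placeholder assumptions, and the proposed attention model itself is not reproduced here.

```python
# Minimal sketch (illustrative assumptions throughout, not the paper's data
# or settings): one word2vec model per corpus, then baseline classifiers on
# averaged sentence vectors.
import numpy as np
from gensim.models import Word2Vec
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

# Toy stand-ins for the objective (Wikipedia) and subjective (Amazon) corpora.
objective_corpus = [
    ["paris", "is", "the", "capital", "of", "france"],
    ["water", "boils", "at", "one", "hundred", "degrees", "celsius"],
]
subjective_corpus = [
    ["i", "love", "this", "phone", "great", "battery"],
    ["terrible", "product", "would", "not", "recommend"],
]

# One embedding model per corpus, mirroring the paper's two-embedding setup.
embeddings = {
    "objective": Word2Vec(objective_corpus, vector_size=50, min_count=1, epochs=50),
    "subjective": Word2Vec(subjective_corpus, vector_size=50, min_count=1, epochs=50),
}

def sentence_vector(tokens, model):
    """Average the vectors of in-vocabulary tokens; zeros if none match."""
    vecs = [model.wv[t] for t in tokens if t in model.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(model.wv.vector_size)

# Hypothetical labelled sentiment examples: 1 = positive, 0 = negative.
train = [
    (["i", "love", "this", "phone"], 1),
    (["great", "battery"], 1),
    (["terrible", "product"], 0),
    (["would", "not", "recommend"], 0),
]

for name, model in embeddings.items():
    X = np.stack([sentence_vector(tokens, model) for tokens, _ in train])
    y = np.array([label for _, label in train])
    for clf in (LogisticRegression(max_iter=1000), LinearSVC()):
        clf.fit(X, y)
        acc = clf.score(X, y)  # training accuracy only; a toy illustration
        print(f"{name:10s} {type(clf).__name__:18s} acc={acc:.2f}")
```

Because each embedding is trained on its own corpus, tokens absent from that corpus fall back to zero vectors here, which gives one intuition for why the kind of text a vector space is built from (objective versus subjective) can change downstream sentiment accuracy.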
Keywords
- Sentiment analysis
- Objectivity
- Subjectivity
- Word vectors
Acknowledgements
This work was supported by the Ministry of Higher Education, Malaysia, under the Fundamental Research Grant Scheme with grant number FRGS/1/2018/ICT02/MMU/03/6, and by Multimedia University, under the CAPEX fund with grant number MMUI/CAPEX170008.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Lee, W.S., Ng, H., Yap, T.T.V., Ho, C.C., Goh, V.T., Tong, H.L. (2021). Attention Models for Sentiment Analysis Using Objectivity and Subjectivity Word Vectors. In: Alfred, R., Iida, H., Haviluddin, H., Anthony, P. (eds) Computational Science and Technology. Lecture Notes in Electrical Engineering, vol 724. Springer, Singapore. https://doi.org/10.1007/978-981-33-4069-5_5
DOI: https://doi.org/10.1007/978-981-33-4069-5_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-33-4068-8
Online ISBN: 978-981-33-4069-5
eBook Packages: Computer Science, Computer Science (R0)