How Dimensionality Reduction Affects Sentiment Analysis NLP Tasks: An Experimental Study

  • Conference paper
Artificial Intelligence Applications and Innovations (AIAI 2022)

Abstract

Dimensionality reduction is a well-known technique for limiting the size of the feature space and for discovering meaningful latent variables in the input data. It is particularly valuable when the raw data is sparse and processing it with machine learning algorithms becomes computationally very expensive. Sentiment analysis, in turn, refers to a collection of text classification methods that identify the polarity of user opinions in blog posts, reviews, tweets, etc. Because text is naturally very sparse, training classification models is often intractable, which makes dimensionality reduction even more important. In this paper we study the impact of dimensionality reduction on sentiment analysis classification tasks. Through extensive experimentation with traditional algorithms and benchmark datasets, we verify the general intuition that dimensionality reduction methods significantly improve data preprocessing times and model training durations, while sacrificing only small amounts of accuracy. At the same time, we highlight several exceptions to this rule, where the training times actually increase and the accuracy losses are significant.
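As a rough illustration of the idea described in the abstract, the sketch below builds a sparse bag-of-words representation of a few short texts and projects it onto a much smaller feature space. It is not the method evaluated in the paper: the toy corpus, the helper names (`bow_vector`, `random_projection`) and the choice of a Gaussian random projection are all assumptions made for this example, and only the Python standard library is used.

```python
import random
from collections import Counter

# Hypothetical toy corpus, for illustration only.
docs = [
    "great movie loved the acting",
    "terrible plot and awful acting",
    "loved the plot great pacing",
    "awful movie terrible pacing",
]

# Bag-of-words: one dimension per distinct token, so the feature
# space grows with the vocabulary and most entries are zero.
vocab = sorted({tok for d in docs for tok in d.split()})
index = {tok: i for i, tok in enumerate(vocab)}


def bow_vector(doc):
    """Return the token-count vector of a document over the vocabulary."""
    vec = [0.0] * len(vocab)
    for tok, count in Counter(doc.split()).items():
        vec[index[tok]] = float(count)
    return vec


def random_projection(rows, k, seed=0):
    """Project each row onto k dimensions with a Gaussian random matrix.

    This is one simple, assumed choice of reduction technique; the
    matrix entries are scaled by 1/sqrt(k).
    """
    rng = random.Random(seed)
    n_features = len(rows[0])
    proj = [[rng.gauss(0.0, 1.0) / k ** 0.5 for _ in range(k)]
            for _ in range(n_features)]
    return [[sum(x * proj[j][i] for j, x in enumerate(row))
             for i in range(k)]
            for row in rows]


X = [bow_vector(d) for d in docs]
X_reduced = random_projection(X, k=3)
print(len(X[0]), "->", len(X_reduced[0]))  # 10 -> 3
```

A classifier trained on `X_reduced` sees 3 features per document instead of 10. On real corpora the vocabulary can reach tens of thousands of terms, which is where the trade-off between training time and accuracy discussed in the abstract becomes relevant.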

Notes

  1. https://github.com/lakritidis/SADR

  2. https://www.kaggle.com/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews

  3. https://www.kaggle.com/crowdflower/twitter-airline-sentiment

  4. https://www.kaggle.com/vivekrathi055/sentiment-analysis-on-financial-tweets

  5. https://jmcauley.ucsd.edu/data/amazon/


Author information

Correspondence to Leonidas Akritidis.

Copyright information

© 2022 IFIP International Federation for Information Processing

About this paper

Cite this paper

Akritidis, L., Bozanis, P. (2022). How Dimensionality Reduction Affects Sentiment Analysis NLP Tasks: An Experimental Study. In: Maglogiannis, I., Iliadis, L., Macintyre, J., Cortez, P. (eds) Artificial Intelligence Applications and Innovations. AIAI 2022. IFIP Advances in Information and Communication Technology, vol 647. Springer, Cham. https://doi.org/10.1007/978-3-031-08337-2_25

  • DOI: https://doi.org/10.1007/978-3-031-08337-2_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-08336-5

  • Online ISBN: 978-3-031-08337-2

  • eBook Packages: Computer Science, Computer Science (R0)
