Knowledge and Information Systems, Volume 51, Issue 3, pp 851–872

A semi-supervised approach to sentiment analysis using revised sentiment strength based on SentiWordNet

Regular Paper

DOI: 10.1007/s10115-016-0993-1

Cite this article as:
Khan, F.H., Qamar, U. & Bashir, S. Knowl Inf Syst (2017) 51: 851. doi:10.1007/s10115-016-0993-1

Abstract

An immense amount of data has become available with the advent of social media in the last decade. These data can be used for sentiment analysis and decision making. The data present on blogs, news/review sites, social networks, etc., are so voluminous that manual labeling is not feasible, and an automatic approach is required for their analysis. The sentiment of the masses can be understood by analyzing these large-scale, opinion-rich data. The major issues in the application of automated approaches are data unavailability, data sparsity, domain independence and inadequate performance. This research proposes a semi-supervised sentiment analysis approach that combines a lexicon-based methodology with machine learning in order to improve sentiment analysis performance. Mathematical models such as information gain and cosine similarity are employed to revise the sentiment scores defined in SentiWordNet. This research also emphasizes the importance of nouns and employs them as semantic features alongside other parts of speech. The evaluation of performance measures and the comparison with state-of-the-art techniques show that the proposed approach is superior.
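
The abstract names information gain and cosine similarity but does not say how they are combined to revise SentiWordNet scores. As a rough illustration only, the sketch below implements the textbook definitions of both measures. The function names, the representation of documents as token sets, and the toy data are assumptions made for this example, not the authors' implementation.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(docs, labels, term):
    """Reduction in label entropy from knowing whether `term` occurs.

    `docs` is a list of token sets (an assumed representation),
    `labels` the matching list of class labels.
    """
    n = len(labels)
    if n == 0:
        return 0.0
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


# Hypothetical toy corpus: four tiny reviews with polarity labels.
toy_docs = [{"great", "phone"}, {"great", "battery"},
            {"poor", "battery"}, {"poor", "phone"}]
toy_labels = ["pos", "pos", "neg", "neg"]

print(information_gain(toy_docs, toy_labels, "great"))
print(information_gain(toy_docs, toy_labels, "phone"))
print(cosine_similarity([1, 1], [1, 0]))
```

In this toy corpus the term "great" separates the two classes perfectly, so its information gain equals the full label entropy, while "phone" occurs equally in both classes and carries no information about polarity.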

Keywords

Sentiment analysis, Polarity classification, Support vector machine, Cosine similarity, Information gain

Copyright information

© Springer-Verlag London 2016

Authors and Affiliations

  1. Department of Computer Engineering, College of Electrical and Mechanical Engineering, National University of Sciences and Technology (NUST), Islamabad, Pakistan
