Lexicon based semantic detection of sentiments using expected likelihood estimate smoothed odds ratio

Artificial Intelligence Review

Abstract

Sentiment analysis is an active research area because of the abundance of opinionated data on online social networks. Semantic detection is a sub-category of sentiment analysis that identifies the sentiment orientation of a text. Many sentiment applications rely on lexicons to supply features to a model. Various machine learning algorithms and sentiment lexicons have been proposed to improve sentiment categorization. Supervised machine learning algorithms and domain-specific sentiment lexicons generally perform better than unsupervised or semi-supervised, domain-independent, lexicon-based approaches. The main obstacle to applying supervised algorithms or domain-specific sentiment lexicons is that sentiment-labeled training datasets are not available for every domain. On the other hand, the performance of algorithms based on general-purpose sentiment lexicons needs improvement. This research focuses on building a general-purpose sentiment lexicon in a semi-supervised manner. The proposed lexicon defines word semantics using the Expected Likelihood Estimate Smoothed Odds Ratio, which are then incorporated into a supervised, machine-learning-based model selection approach. A comprehensive performance comparison verifies the superiority of the proposed approach.
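
To make the scoring idea concrete, the following is a minimal sketch of an odds ratio smoothed with the expected likelihood estimate (ELE), which adds 0.5 to every count. It is not the authors' implementation: the abstract does not give the exact formulation, so the sketch assumes document-level presence/absence counts per class (two bins) and takes the logarithm of the ratio so that positive scores lean positive and negative scores lean negative. The function names and the toy documents are hypothetical.

```python
import math


def ele_probability(count, total, bins=2, lam=0.5):
    """Expected likelihood estimate: add lam (0.5) to each of `bins` counts."""
    return (count + lam) / (total + lam * bins)


def ele_log_odds_ratio(word, pos_docs, neg_docs):
    """Log odds ratio of `word` occurring in positive vs. negative documents.

    Assumption: each document is a set of tokens, and probabilities are
    estimated from document frequencies (word present / word absent).
    """
    pos_df = sum(1 for doc in pos_docs if word in doc)
    neg_df = sum(1 for doc in neg_docs if word in doc)
    p_pos = ele_probability(pos_df, len(pos_docs))
    p_neg = ele_probability(neg_df, len(neg_docs))
    odds_pos = p_pos / (1.0 - p_pos)
    odds_neg = p_neg / (1.0 - p_neg)
    return math.log(odds_pos / odds_neg)


# Toy labeled documents (hypothetical data, for illustration only).
pos_docs = [{"great", "plot"}, {"great", "acting"}, {"fine", "plot"}]
neg_docs = [{"dull", "plot"}, {"poor", "acting"}, {"dull", "ending"}]

for w in ("great", "dull", "plot", "unseen"):
    print(w, round(ele_log_odds_ratio(w, pos_docs, neg_docs), 3))
```

Because the smoothing keeps every estimated probability strictly between 0 and 1, the ratio stays finite even for a word that appears in only one class or in neither, which an unsmoothed odds ratio cannot guarantee.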

Author information

Corresponding author

Correspondence to Farhan Hassan Khan.

About this article

Cite this article

Khan, F.H., Qamar, U. & Bashir, S. Lexicon based semantic detection of sentiments using expected likelihood estimate smoothed odds ratio. Artif Intell Rev 48, 113–138 (2017). https://doi.org/10.1007/s10462-016-9496-4

  • DOI: https://doi.org/10.1007/s10462-016-9496-4
