Abstract
Online reviews are becoming increasingly important for decision-making. Consumers often refer to online reviews for opinions before making a purchase. Marketers also acknowledge the importance of online reviews and use them to improve product success. However, the massive amount of online review data, as well as its unstructured nature, is a challenge for anyone wanting to derive a conclusion quickly. In this paper, we propose a novel framework for gauging the ratings of online reviews using machine learning techniques. This framework uses a combination of text pre-processing and feature extraction methods. Here, we investigate four different aspects of the new framework. First, we assess the performance of single and ensemble classifiers in predicting sentiment—positive or negative—initially on a specific dataset (Yelp), but subsequently also on two other datasets (Amazon's product reviews and a movie review dataset). Second, using the best identified classifiers, we improve the accuracy with which neutral polarity can be predicted, an ability largely overlooked in the literature. Third, we further improve the performance of these classifiers by testing different pre-processing and feature extraction methods. Finally, we measure how well our deep learning approach performs on the same task compared to the best previously identified classifiers. Our extensive testing shows that the linear-kernel support vector machine, logistic regression and multilayer perceptron are the three best single classifiers in terms of accuracy, precision, recall, and F-measure. Their performance could be further improved if they were used as base classifiers for ensemble models. We also observe that several text pre-processing techniques—negation word identification, word elongation correction, and part-of-speech lemmatisation (combined with term frequency and n-gram features)—can increase accuracy.
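To make the pre-processing steps named above concrete, the following is a minimal pure-Python sketch of negation word identification, word elongation correction, and n-gram term-frequency feature extraction. The negation word list, the "NEG_" tagging convention, and the two-character collapse rule are illustrative assumptions, not the paper's exact implementation.

```python
import re
from collections import Counter

# A small, illustrative set of negation cues (assumption, not exhaustive).
NEGATORS = {"not", "no", "never", "n't", "cannot"}

def correct_elongation(token):
    # Collapse any character repeated three or more times down to two
    # ("goooood" -> "good"), a simple word-elongation correction.
    return re.sub(r"(\w)\1{2,}", r"\1\1", token)

def mark_negation(tokens):
    # Prefix tokens that follow a negation word with "NEG_" until the
    # next punctuation boundary, so "not good" and "good" yield
    # different features.
    out, negated = [], False
    for tok in tokens:
        if tok in NEGATORS:
            negated = True
            out.append(tok)
        elif tok in {".", ",", "!", "?", ";"}:
            negated = False
            out.append(tok)
        else:
            out.append("NEG_" + tok if negated else tok)
    return out

def ngram_term_frequencies(tokens, n=2):
    # Unigram counts plus n-gram (here bigram) counts as term-frequency
    # features for a downstream classifier.
    feats = Counter(tokens)
    feats.update(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return feats

text = "the food was not goooood at all ."
tokens = mark_negation([correct_elongation(t) for t in text.split()])
features = ngram_term_frequencies(tokens)
```

In a full pipeline, `features` would be vectorised and fed to one of the classifiers mentioned above, such as a linear-kernel support vector machine or logistic regression.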
In addition, we demonstrate that general-purpose sentiment lexicons such as SentiWordNet 3.0 and SenticNet 4 can be used to generate features with good results, although deep learning models can perform equally well. Experiments with different datasets confirm that our framework provides consistent outcomes. In particular, we have focused on improving the accuracy of neutral sentiment, and we conclude by showing how this can be achieved without sacrificing the accuracy of positive or negative ratings.
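The lexicon-based, three-way polarity idea can be sketched as follows. The miniature lexicon and the neutral threshold below are hypothetical stand-ins; a real system would draw prior polarity scores from resources such as SentiWordNet 3.0 or SenticNet 4.

```python
# Hypothetical miniature lexicon of prior polarity scores in [-1, 1].
LEXICON = {
    "good": 0.7, "great": 0.9, "tasty": 0.6,
    "bad": -0.7, "awful": -0.9, "slow": -0.4,
}

def polarity(tokens, threshold=0.2):
    # Average the prior polarities of the known words; scores near zero
    # are labelled neutral instead of being forced into a binary class.
    scores = [LEXICON[t] for t in tokens if t in LEXICON]
    mean = sum(scores) / len(scores) if scores else 0.0
    if mean > threshold:
        return "positive"
    if mean < -threshold:
        return "negative"
    return "neutral"

label = polarity("service was slow but good".split())  # mixed cues -> "neutral"
```

The explicit threshold is what gives the neutral class its own region of the score space, rather than treating neutrality as the mere absence of a positive or negative decision.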
References
Fan ZP, Che YJ, Chen ZY (2017) Product sales forecasting using online reviews and historical sales data: a method combining the Bass model and sentiment analysis. J Bus Res 74:90–100
Chua AYK, Banerjee S (2016) Helpfulness of user-generated reviews as a function of review sentiment, product type and information quality. Comput Hum Behav 54:547–554
Liu Y, Bi JW, Fan ZP (2017) Ranking products through online reviews: a method based on sentiment analysis technique and intuitionistic fuzzy set theory. Inform Fusion 36:149–161
Felbermayr A, Nanopoulos A (2016) The role of emotions for the perceived usefulness in online customer reviews. J Interact Market 36:60–76
Ma Y, Chen G, Wei Q (2017) Finding users preferences from large-scale online reviews for personalized recommendation. Electron Commer Res 17(1):3–29
Khan FH, Qamar U, Bashir S (2016) SWIMS: semi-supervised subjective feature weighting and intelligent model selection for sentiment analysis. Knowl Based Syst 100:97–111
Jing N, Jiang T, Du J, Sugumaran V (2018) Personalized recommendation based on customer preference mining and sentiment assessment from a Chinese e-commerce website. Electron Commer Res 18(1):159–179
Zhang H, Rao H, Feng J (2018) Product innovation based on online review data mining: a case study of Huawei phones. Electron Commer Res 18(1):3–22
Tripathy A, Agrawal A, Rath SK (2016) Classification of sentiment reviews using n-gram machine learning approach. Expert Syst Appl 57:117–126
Salehan M, Kim DJ (2016) Predicting the performance of online consumer reviews: a sentiment mining approach to big data analytics. Decis Support Syst 81:30–40
Bagheri A, Saraee M, de Jong F (2013) Care more about customers: unsupervised domain-independent aspect detection for sentiment analysis of customer reviews. Knowl Based Syst 52:201–213
Fersini E, Messina E, Pozzi FA (2016) Expressive signals in social media languages to improve polarity detection. Inf Process Manag 52(1):20–35
Devika MD, Sunitha C, Amal G (2016) Sentiment analysis: a comparative study on different approaches. Proc Comput Sci 87:44–49
Basari ASH, Hussin B, Ananta IGP, Zeniarja J (2013) Opinion mining of movie review using hybrid method of support vector machine and particle swarm optimization. Proc Eng 53:453–462
Khan FH, Qamar U, Bashir S (2016) eSAP: a decision support framework for enhanced sentiment analysis and polarity classification. Inf Sci 367–368:862–873
Khan FH, Qamar U, Bashir S (2016) SentiMI: introducing point-wise mutual information with SentiWordNet to improve sentiment polarity detection. Appl Soft Comput 39:140–153
Katz G, Ofek N, Shapira B (2015) ConSent: context-based sentiment analysis. Knowl Based Syst 84:162–178
Agarwal B, Mittal N, Bansal P, Garg S (2015) Sentiment analysis using common-sense and context information. Comput Intell Neurosci 2015:1–9
Araque O, Corcuera-Platas I, Sánchez-Rada JF, Iglesias CA (2017) Enhancing deep learning sentiment analysis with ensemble techniques in social applications. Expert Syst Appl 77:236–246
Bafna K, Toshniwal D (2013) Feature based summarization of customer’s reviews of online products. Proc Comput Sci 22:142–151
Rong W, Nie Y, Ouyang Y, Peng B, Xiong Z (2014) Auto-encoder based bagging architecture for sentiment analysis. J Vis Lang Comput 25(6):840–849
Wang G, Zhang Z, Sun J, Yang S, Larson CA (2015) POS-RS: a random subspace method for sentiment classification based on part-of-speech analysis. Inf Process Manag 51(4):458–479
Abdel Fattah M (2015) New term weighting schemes with combination of multiple classifiers for sentiment analysis. Neurocomputing 167:434–442
Hajmohammadi MS, Ibrahim R, Selamat A, Fujita H (2015) Combination of active learning and self-training for cross-lingual sentiment classification with density analysis of unlabelled samples. Inf Sci 317:67–77
Hung C, Chen SJ (2016) Word sense disambiguation based sentiment lexicons for sentiment classification. Knowl Based Syst 110:224–232
Ikram MT, Butt NA, Afzal MT (2016) Open source software adoption evaluation through feature level sentiment analysis using Twitter data. Turk J Electric Eng Comput Sci 24:4481–4496
Onan A, Korukoğlu S, Bulut H (2016) A multiobjective weighted voting ensemble classifier based on differential evolution algorithm for text sentiment classification. Expert Syst Appl 62:1–16
Vechtomova O (2017) Disambiguating context-dependent polarity of words: an information retrieval approach. Inf Process Manag 53(5):1062–1079
Yousefpour A, Ibrahim R, Hamed HNA (2017) Ordinal-based and frequency-based integration of feature selection methods for sentiment analysis. Expert Syst Appl 75:80–93
Vinodhini G, Chandrasekaran RM (2017) A sampling based sentiment mining approach for e-commerce applications. Inf Process Manag 53(1):223–236
Chen T, Xu R, He Y, Wang X (2017) Improving sentiment analysis via sentence type classification using BiLSTM-CRF and CNN. Expert Syst Appl 72:221–230
Fernández-Gavilanes M, Álvarez-López T, Juncal-Martínez J, Costa-Montenegro E, Javier González-Castaño F (2016) Unsupervised method for sentiment analysis in online texts. Expert Syst Appl 58:57–75
Nowlis SM, Kahn BE, Dhar R (2002) Coping with ambivalence: the effect of removing a neutral option on consumer attitude and preference judgments. J Consum Res 29(3):319–334
Tang T, Fang E, Feng W (2014) Is neutral really neutral? The effects of neutral user-generated content on product sales. J Market 78(4):41–58
Gasper K, Hackenbracht J (2014) Too busy to feel neutral: reducing cognitive resources attenuates neutral affective states. Motiv Emot 39(3):458–466
Koppel M, Schler J (2006) The importance of neutral examples for learning sentiment. Comput Intell 22(2):100–109
Baccianella S, Esuli A, Sebastiani F (2010) SentiWordNet 3.0: an enhanced lexical resource for sentiment analysis and opinion mining. In: Proceedings of the Seventh International Conference on Language Resources and Evaluation, Valletta, Malta, 2010, pp 2200–2204
Cambria E, Poria S, Bajpai R, Schuller B (2016) SenticNet 4: a semantic resource for sentiment analysis based on conceptual primitives. In: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, Osaka, Japan, 2016, pp 2666–2677
Manning CD, Raghavan P, Schuetze H (2008) Naïve Bayes text classification. Introduction to information retrieval. Cambridge University Press, Cambridge, pp 234–265
Bramer M (2007) Nearest neighbour classification. Principles of data mining. Springer, London, pp 31–38
Menard S (2010) Logistic regression: from introductory to advanced concepts and applications. SAGE, Los Angeles
Crammer K, Dekel O, Keshet J, Shalev-Shwartz S, Singer Y (2006) Online passive-aggressive algorithms. J Mach Learn Res 7:551–585
Rokach L, Maimon O (2007) Data mining with decision trees: theory and applications. World Scientific Publishing, Singapore
Campbell C, Ying Y (2011) Learning with support vector machines. Morgan & Claypool, San Rafael
Rumelhart DE, Hinton GE, Williams RJ (1986) Learning internal representations by error propagation. In: Rumelhart DE, McClelland JL (eds) Parallel distributed processing: explorations in the microstructure of cognition, vol 1. MIT Press, Cambridge, pp 318–362
Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140
Breiman L (2001) Random forests. Mach Learn 45(1):5–32
Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29(5):1189–1232
Geurts P, Ernst D, Wehenkel L (2006) Extremely randomized trees. Mach Learn 63(1):3–42
Zhu J, Zou H, Rosset S, Hastie T (2009) Multi-class AdaBoost. Stat Interface 2:349–360
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. Presented at the 3rd international conference on learning representations, San Diego
Yelp (2017) Yelp dataset challenge: round 9 of the Yelp dataset challenge: our largest yet! https://www.yelp.com.au/dataset_challenge
McAuley J (2014) Amazon product data. http://jmcauley.ucsd.edu/data/amazon/links.html
Maas AL, Daly RE, Pham PT, Huang D, Ng AY, Potts C (2011) Learning word vectors for sentiment analysis. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics. Human Language Technologies, Portland, 2011, pp 142–150
Budhi GS, Chiong R, Pranata I, Hu Z (2017) Predicting rating polarity through automatic classification of review texts. In: Proceedings of the 2017 IEEE Conference on Big Data and Analytics (ICBDA), Kuching, Malaysia, 2017, pp 19–24
Wang X, Xu G, Zhang J, Sun X, Wang L, Huang T (2019) Syntax-directed hybrid attention network for aspect-level sentiment analysis. IEEE Access 7:5014–5025
López M, Valdivia A, Martínez-Cámara E, Luzón MV, Herrera F (2019) E2SAM: evolutionary ensemble of sentiment analysis methods for domain adaptation. Inf Sci 480:273–286
Hur M, Kang P, Cho S (2016) Box-office forecasting based on sentiments of movie reviews and independent subspace method. Inf Sci 372:608–624
Zhang L, Jiang L, Li C, Kong G (2016) Two feature weighting approaches for naïve Bayes text classifiers. Knowl Based Syst 100:137–144
Gui L, Zhou Y, Xu R, He Y, Lu Q (2017) Learning representations from heterogeneous network for sentiment classification of product reviews. Knowl Based Syst 124:34–45
Zhang JD, Chow CY (2019) MOCA: multi-objective, collaborative, and attentive sentiment analysis. IEEE Access 7:10927–10936
Pranata I, Susilo W (2016) Are the most popular users always trustworthy? The case of Yelp. Electron Commer Res Appl 20:30–41
NLTK (2019) NLTK package. http://www.nltk.org/api/nltk.html
Bhadane C, Dalal H, Doshi H (2015) Sentiment analysis: measuring opinions. Proc Comput Sci 45:808–814
Scikit-Learn (2019) API reference. http://scikit-learn.org/stable/modules/classes.html
Keras (2019) Keras: the python deep learning library. https://keras.io/
Wang Z, Liu K, Li J, Zhu Y, Zhang Y (2019) Various frameworks and libraries of machine learning and deep learning: a survey. Arch Comput Methods Eng 6:1–24
Hameg S, Lazri M, Ameur S (2016) Using naive bayes classifier for classification of convective rainfall intensities based on spectral characteristics retrieved from SEVIRI. J Earth Syst Sci 125(5):945–955
Zhang H et al (2017) Development of novel in silico model for developmental toxicity assessment by using naïve Bayes classifier method. Reprod Toxicol 71:8–15
Wang S, Jiang L, Li C (2015) Adapting naïve bayes tree for text classification. Knowl Inf Syst 44:77–89
Hu Z, Chiong R, Pranata I, Susilo W, Bao Y (2016) Identifying malicious web domains using machine learning techniques with online credibility and performance data. In: Proceedings of Congress on Evolutionary Computation (CEC), Vancouver, Canada, 2016, pp 5186–5194
Jiang L, Li C, Wang S, Zhang L (2016) Deep feature weighting for naïve bayes and its application to text classification. Eng Appl Artif Intell 52:26–39
Chan TF, Golub GH, LeVeque RJ (1979) Updating formulae and a pairwise algorithm for computing sample variances. Stanford University, Stanford
Peterson LE (2009) K-nearest neighbor. Scholarpedia 4(2):1883
Dramé K, Mougin F, Diallo G (2016) Large scale biomedical texts classification: a kNN and an ESA-based approaches. J Biomed Semant 7:40–53
Hu LY, Huang MW, Ke SW, Tsai CF (2016) The distance function effect on k-nearest neighbor classification for medical datasets. SpringerPlus 5:1304–1314
Mengesh TM, Cho HJ, Song HJ, Sungsoo K, Chung TS (2016) New approach to continuous k-nearest neighbor monitoring in a directed road network. Adhoc Sens Wirel Netw 34(1–4):307–321
Pan Z, Wang Y, Ku W (2017) A new general nearest neighbor classification based on the mutual neighborhood information. Knowl Based Syst 121:142–152
Tibshirani R, Hastie T, Narasimhan B, Chu G (2002) Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc Natl Acad Sci USA 99(10):6567–6572
Nelder JA, Wedderburn RWM (1972) Generalized linear models. J R Stat Soc Ser A (General) 135(3):370–384
Hastie TJ, Tibshirani RJ (1990) Generalized additive models. CRC, Boca Raton
Dunteman GH, Ho MHR (2011) Generalized linear models. An introduction to generalized linear models. SAGE Publications Inc., Thousand Oaks, pp 2–6
Dobson AJ, Barnett AG (2008) An introduction to generalized linear models, 3rd edn. CRC Press, Boca Raton
Jurka TP (2012) Maxent: an R package for low-memory multinomial logistic regression with support for semi-automated text classification. R J 4(1):56–59
Bui DDA, Fiol GD, Jonnalagadda S (2016) PDF text classification to leverage information extraction from publication reports. J Biomed Inform 61:141–148
Lu J, Zhao P, Hoi SCH (2016) Online passive-aggressive active learning. Mach Learn 103(2):141–183
Ruhwinaningsih L, Djatna T (2016) A sentiment knowledge discovery model in Twitter’s TV content using stochastic gradient descent algorithm. Telkomnika 14(3):1067–1076
Guo F, Zhang L, Jin S, Tigabu M, Su Z, Wang W (2016) Modeling anthropogenic fire occurrence in the boreal forest of China using logistic regression and random forests. Forests 7(11):250
Murphy KP (2012) Machine learning. MIT Press, Cambridge
Bottou L, Bousquet O (2008) The tradeoffs of large scale learning. Adv Neural Inf Process Syst 20:161–168
Quinlan JR (1986) Induction of decision trees. Mach Learn 1(1):81–106
Hunt EB, Marin J, Stone PJ (1966) Experiments in induction. Academic Press, New York
Luo B, Zeng J, Duan J (2016) Emotion space model for classifying opinions in stock message board. Expert Syst Appl 44:138–146
Xu Z, Li P, Wang Y (2012) Text classifier based on an improved SVM decision tree. Phys Proc 33:1986–1991
Abhishek S, Sugumaran V, Babu DS (2014) Misfire detection in an IC engine using vibration signal and decision tree algorithms. Measurement 50:370–380
Izydorczyk B, Wojciechowski B (2016) Differential diagnosis of eating disorders with the use of classification trees (decision algorithm). Arch Psychiat Psychother 18(4):53–62
Yu D, Mu Y, Jin Y (2017) Rating prediction using review texts with underlying sentiments. Inf Process Lett 117:10–18
Shah YS, Hernandez-Garcia L, Jahanian H, Peltier SJ (2016) Support vector machine classification of arterial volume-weighted arterial spin tagging images. Brain Behav 6:1–8
Sun J, Fujita H, Chen P, Li H (2017) Dynamic financial distress prediction with concept drift based on time weighting combined with Adaboost support vector machine ensemble. Knowl Based Syst 120:4–14
Chiong R, Fan Z, Hu Z, Chiong F (2021) Using an improved relative error support vector machine for body fat prediction. Comput Methods Programs Biomed 198:105749
Lo SL, Chiong R, Cornforth D (2015) Using support vector machine ensembles for target audience classification on Twitter. PLoS ONE 10(4):e0122855
Lo SL, Cornforth D, Chiong R (2014) Identifying the high-value social audience from Twitter through text-mining methods. In: Proceedings of the 18th Asia Pacific Symposium on Intelligent and Evolutionary Systems (IES 2014), Singapore, 2014, pp 325–339
Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2(3):1–27
Glorot X, Bengio Y (2010) Understanding the difficulty of training deep feedforward neural networks. J Mach Learn Res 9:249–256
Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. CoRR abs/1412.6980
Adipranata R, Budhi GS, Setiahadi B (2013) Automatic classification of sunspot groups for space weather analysis. Int J Multimed Ubiquit Eng 8(3):41–54
Budhi GS, Adipranata R (2015) Handwritten Javanese character recognition using several artificial neural network methods. J ICT Res Appl 8(3):195–212
Budhi GS, Adipranata R (2014) Java characters recognition using evolutionary neural network and combination of Chi2 and backpropagation neural network. Int J Appl Eng Res 9(22):18025–18036
Lee S, Choeh JY (2014) Predicting the helpfulness of online reviews using multilayer perceptron neural networks. Expert Syst Appl 41(6):3041–3046
Gaspar R, Pedro C, Panagiotopoulos P, Seibt B (2016) Beyond positive or negative: qualitative sentiment analysis of social media reactions to unexpected stressful events. Comput Hum Behav 56:179–191
Wu Y et al (2017) Dysphonic voice pattern analysis of patients in Parkinson's disease using minimum interclass probability risk feature selection and bagging ensemble learning methods. Comput Math Methods Med 2017:1–11
Wu Q, Ye Y, Zhang H, Ng MK, Ho SS (2014) ForesTexter: an efficient random forest algorithm for imbalanced text categorization. Knowl Based Syst 67:105–116
Asbai N, Amrouche A (2017) Boosting scores fusion approach using front-end diversity and Adaboost algorithm, for speaker verification. Comput Electr Eng 62:648–662
Lee W, Jun CH, Lee JS (2017) Instance categorization by support vector machines to adjust weights in AdaBoost for imbalanced data classification. Inf Sci 381:92–103
González-Recio O, Jiménez-Montero JA, Alenda R (2013) The gradient boosting algorithm and random boosting for genome-assisted evaluation in large data sets. J Dairy Sci 96(1):614–624
Napolitano G, Sting JC, Schmid M, Viviani R (2017) Predicting CYP2D6 phenotype from resting brain perfusion images by gradient boosting. Psychiatry Res Neuroimaging 259:16–24
Dargan S, Kumar M, Ayyagari MR, Kumar G (2019) A survey of deep learning and its applications: a new paradigm to machine learning. Arch Comput Methods Eng 6:1–22
Rojas-Barahona LM (2016) Deep learning for sentiment analysis. Lang Linguist Comp 10(12):701–719
Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Commun ACM 60(6):84–90
Dewa CK, Fadhilah AL, Afiahayati A (2018) Convolutional neural networks for handwritten Javanese character recognition. Indones J Comput Cybern Syst 12(1):83–94
Yu Y, Lin H, Meng J, Zhao Z (2016) Visual and textual sentiment analysis of a microblog using deep convolutional neural networks. Algorithms 9:2
Vieira A, Ribeiro B (2018) Deep neural network models. In: Introduction to deep learning business applications for developers: from conversational bots in customer service to medical image processing. Apress
Maas A (2011) Large movie review dataset. http://ai.stanford.edu/~amaas/data/sentiment/
Preston K (2013) Why are grown women typing like thiiiiiiiiiiiiis? https://www.mamamia.com.au/why-do-so-many-people-text-likeee-thiiis/
Norvig P (2016) How to write a spelling corrector. https://norvig.com/spell-correct.html
Lee S, Ha J, Zokhirova M, Moon H, Lee J (2017) Background information of deep learning for structural engineering. Arch Comput Methods Eng 25(1):121–129
Mader K (2019) Simple CNN. https://www.kaggle.com/kmader/simple-cnn
Baraldi L (2019) VGG-16 pre-trained model for Keras. https://gist.github.com/baraldilorenzo/07d7802847aaad0a35d3
Mikolov T, Sutskever I, Chen K, Corrado G, Dean J (2013) Distributed representations of words and phrases and their compositionality. Adv Neural Inf Process Syst 26:3111–3119
Le Q, Mikolov T (2014) Distributed representations of sentences and documents. In: Proceedings of the 31st International Conference on Machine Learning, Beijing, China, vol 32, no. 2, pp 1188–1196
Acknowledgements
The first author would like to acknowledge financial support from the Indonesian Endowment Fund for Education (LPDP), Ministry of Finance, and the Directorate General of Higher Education (DIKTI), Ministry of Education and Culture, The Republic of Indonesia.
Funding
The authors confirm that there is no source of funding for this study.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Human Participants and/or Animals
None.
Cite this article
Budhi, G.S., Chiong, R., Pranata, I. et al. Using Machine Learning to Predict the Sentiment of Online Reviews: A New Framework for Comparative Analysis. Arch Computat Methods Eng 28, 2543–2566 (2021). https://doi.org/10.1007/s11831-020-09464-8