Abstract
Online reviews are becoming increasingly important for decision-making. Consumers often refer to online reviews for opinions before making a purchase. Marketers also acknowledge the importance of online reviews and use them to improve product success. However, the massive amount of online review data, as well as its unstructured nature, is a challenge for anyone wanting to derive a conclusion quickly. In this paper, we propose a novel framework for gauging the ratings of online reviews using machine learning techniques. This framework uses a combination of text pre-processing and feature extraction methods. Here, we investigate four different aspects of the new framework. First, we assess the performance of single and ensemble classifiers in predicting sentiment—positive or negative—initially on a specific dataset (Yelp), but subsequently also on two other datasets (Amazon's product reviews and a movie review dataset). Second, using the best identified classifiers, we improve the accuracy with which neutral polarity can be predicted, an ability largely overlooked in the literature. Third, we further improve the performance of these classifiers by testing different pre-processing and feature extraction methods. Finally, we measure how well our deep learning approach performs on the same task compared to the best previously identified classifiers. Our extensive testing shows that the linear-kernel support vector machine, logistic regression and multilayer perceptron are the three best single classifiers in terms of accuracy, precision, recall, and F-measure. Their performance could be further improved if they were used as base classifiers for ensemble models. We also observe that several text pre-processing techniques—negation word identification, word elongation correction, and part-of-speech lemmatisation (combined with term frequency and n-gram features)—can increase accuracy.
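To make the pre-processing steps named above concrete, the following is a minimal pure-Python sketch of negation word identification, word elongation correction, and n-gram term-frequency feature extraction. The negation word list, the "NEG_" tagging convention, and the two-character collapse rule are illustrative assumptions, not the paper's exact implementation.

```python
import re
from collections import Counter

# A small, illustrative set of negation cues (assumption, not exhaustive).
NEGATORS = {"not", "no", "never", "n't", "cannot"}

def correct_elongation(token):
    # Collapse any character repeated three or more times down to two
    # ("goooood" -> "good"), a simple word-elongation correction.
    return re.sub(r"(\w)\1{2,}", r"\1\1", token)

def mark_negation(tokens):
    # Prefix tokens that follow a negation word with "NEG_" until the
    # next punctuation boundary, so "not good" and "good" yield
    # different features.
    out, negated = [], False
    for tok in tokens:
        if tok in NEGATORS:
            negated = True
            out.append(tok)
        elif tok in {".", ",", "!", "?", ";"}:
            negated = False
            out.append(tok)
        else:
            out.append("NEG_" + tok if negated else tok)
    return out

def ngram_term_frequencies(tokens, n=2):
    # Unigram counts plus n-gram (here bigram) counts as term-frequency
    # features for a downstream classifier.
    feats = Counter(tokens)
    feats.update(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return feats

text = "the food was not goooood at all ."
tokens = mark_negation([correct_elongation(t) for t in text.split()])
features = ngram_term_frequencies(tokens)
```

In a full pipeline, `features` would be vectorised and fed to one of the classifiers mentioned above, such as a linear-kernel support vector machine or logistic regression.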
In addition, we demonstrate that general-purpose sentiment lexicons such as SentiWordNet 3.0 and SenticNet 4 can be used to generate features with good results, although deep learning models can perform equally well. Experiments with different datasets confirm that our framework provides consistent outcomes. In particular, we have focused on improving the accuracy of neutral sentiment, and we conclude by showing how this can be achieved without sacrificing the accuracy of positive or negative ratings.
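The lexicon-based, three-way polarity idea can be sketched as follows. The miniature lexicon and the neutral threshold below are hypothetical stand-ins; a real system would draw prior polarity scores from resources such as SentiWordNet 3.0 or SenticNet 4.

```python
# Hypothetical miniature lexicon of prior polarity scores in [-1, 1].
LEXICON = {
    "good": 0.7, "great": 0.9, "tasty": 0.6,
    "bad": -0.7, "awful": -0.9, "slow": -0.4,
}

def polarity(tokens, threshold=0.2):
    # Average the prior polarities of the known words; scores near zero
    # are labelled neutral instead of being forced into a binary class.
    scores = [LEXICON[t] for t in tokens if t in LEXICON]
    mean = sum(scores) / len(scores) if scores else 0.0
    if mean > threshold:
        return "positive"
    if mean < -threshold:
        return "negative"
    return "neutral"

label = polarity("service was slow but good".split())  # mixed cues -> "neutral"
```

The explicit threshold is what gives the neutral class its own region of the score space, rather than treating neutrality as the mere absence of a positive or negative decision.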
References
Fan ZP, Che YJ, Chen ZY (2017) Product sales forecasting using online reviews and historical sales data: a method combining the Bass model and sentiment analysis. J Bus Res 74:90–100
Chua AYK, Banerjee S (2016) Helpfulness of user-generated reviews as a function of review sentiment, product type and information quality. Comput Hum Behav 54:547–554
Liu Y, Bi JW, Fan ZP (2017) Ranking products through online reviews: a method based on sentiment analysis technique and intuitionistic fuzzy set theory. Inform Fusion 36:149–161
Felbermayr A, Nanopoulos A (2016) The role of emotions for the perceived usefulness in online customer reviews. J Interact Market 36:60–76
Ma Y, Chen G, Wei Q (2017) Finding users preferences from large-scale online reviews for personalized recommendation. Electron Commer Res 17(1):3–29
Khan FH, Qamar U, Bashir S (2016) SWIMS: semi-supervised subjective feature weighting and intelligent model selection for sentiment analysis. Knowl Based Syst 100:97–111
Jing N, Jiang T, Du J, Sugumaran V (2018) Personalized recommendation based on customer preference mining and sentiment assessment from a Chinese e-commerce website. Electron Commer Res 18(1):159–179
Zhang H, Rao H, Feng J (2018) Product innovation based on online review data mining: a case study of Huawei phones. Electron Commer Res 18(1):3–22
Tripathy A, Agrawal A, Rath SK (2016) Classification of sentiment reviews using n-gram machine learning approach. Expert Syst Appl 57:117–126
Salehan M, Kim DJ (2016) Predicting the performance of online consumer reviews: a sentiment mining approach to big data analytics. Decis Support Syst 81:30–40
Bagheri A, Saraee M, de Jong F (2013) Care more about customers: unsupervised domain-independent aspect detection for sentiment analysis of customer reviews. Knowl Based Syst 52:201–213
Fersini E, Messina E, Pozzi FA (2016) Expressive signals in social media languages to improve polarity detection. Inf Process Manag 52(1):20–35
Devika MD, Sunitha C, Amal G (2016) Sentiment analysis: a comparative study on different approaches. Proc Comput Sci 87:44–49
Basari ASH, Hussin B, Ananta IGP, Zeniarja J (2013) Opinion mining of movie review using hybrid method of support vector machine and particle swarm optimization. Proc Eng 53:453–462
Khan FH, Qamar U, Bashir S (2016) eSAP: a decision support framework for enhanced sentiment analysis and polarity classification. Inf Sci 367–368:862–873
Khan FH, Qamar U, Bashir S (2016) SentiMI: introducing point-wise mutual information with SentiWordNet to improve sentiment polarity detection. Appl Soft Comput 39:140–153
Katz G, Ofek N, Shapira B (2015) ConSent: context-based sentiment analysis. Knowl Based Syst 84:162–178
Agarwal B, Mittal N, Bansal P, Garg S (2015) Sentiment analysis using common-sense and context information. Comput Intell Neurosci 2015:1–9
Araque O, Corcuera-Platas I, Sánchez-Rada JF, Iglesias CA (2017) Enhancing deep learning sentiment analysis with ensemble techniques in social applications. Expert Syst Appl 77:236–246
Bafna K, Toshniwal D (2013) Feature based summarization of customer’s reviews of online products. Proc Comput Sci 22:142–151
Rong W, Nie Y, Ouyang Y, Peng B, Xiong Z (2014) Auto-encoder based bagging architecture for sentiment analysis. J Vis Lang Comput 25(6):840–849
Wang G, Zhang Z, Sun J, Yang S, Larson CA (2015) POS-RS: a random subspace method for sentiment classification based on part-of-speech analysis. Inf Process Manag 51(4):458–479
Abdel Fattah M (2015) New term weighting schemes with combination of multiple classifiers for sentiment analysis. Neurocomputing 167:434–442
Hajmohammadi MS, Ibrahim R, Selamat A, Fujita H (2015) Combination of active learning and self-training for cross-lingual sentiment classification with density analysis of unlabelled samples. Inf Sci 317:67–77
Hung C, Chen SJ (2016) Word sense disambiguation based sentiment lexicons for sentiment classification. Knowl Based Syst 110:224–232
Ikram MT, Butt NA, Afzal MT (2016) Open source software adoption evaluation through feature level sentiment analysis using Twitter data. Turk J Electric Eng Comput Sci 24:4481–4496
Onan A, Korukoğlu S, Bulut H (2016) A multiobjective weighted voting ensemble classifier based on differential evolution algorithm for text sentiment classification. Expert Syst Appl 62:1–16
Vechtomova O (2017) Disambiguating context-dependent polarity of words: an information retrieval approach. Inf Process Manag 53(5):1062–1079
Yousefpour A, Ibrahim R, Hamed HNA (2017) Ordinal-based and frequency-based integration of feature selection methods for sentiment analysis. Expert Syst Appl 75:80–93
Vinodhini G, Chandrasekaran RM (2017) A sampling based sentiment mining approach for e-commerce applications. Inf Process Manag 53(1):223–236
Chen T, Xu R, He Y, Wang X (2017) Improving sentiment analysis via sentence type classification using BiLSTM-CRF and CNN. Expert Syst Appl 72:221–230
Fernández-Gavilanes M, Álvarez-López T, Juncal-Martínez J, Costa-Montenegro E, Javier González-Castaño F (2016) Unsupervised method for sentiment analysis in online texts. Expert Syst Appl 58:57–75
Nowlis SM, Kahn BE, Dhar R (2002) Coping with ambivalence: the effect of removing a neutral option on consumer attitude and preference judgments. J Consum Res 29(3):319–334
Tang T, Fang E, Feng W (2014) Is neutral really neutral? The effects of neutral user-generated content on product sales. J Market 78(4):41–58
Gasper K, Hackenbracht J (2014) Too busy to feel neutral: reducing cognitive resources attenuates neutral affective states. Motiv Emot 39(3):458–466
Koppel M, Schler J (2006) The importance of neutral examples for learning sentiment. Comput Intell 22(2):100–109
Baccianella S, Esuli A, Sebastiani F (2010) SentiWordNet 3.0: an enhanced lexical resource for sentiment analysis and opinion mining. In: Proceedings of the Seventh International Conference on Language Resources and Evaluation, Valletta, Malta, 2010, pp 2200–2204
Cambria E, Poria S, Bajpai R, Schuller B (2016) SenticNet 4: a semantic resource for sentiment analysis based on conceptual primitives. In: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, Osaka, Japan, 2016, pp 2666–2677
Manning CD, Raghavan P, Schuetze H (2008) Naïve Bayes text classification. Introduction to information retrieval. Cambridge University Press, Cambridge, pp 234–265
Bramer M (2007) Nearest neighbour classification. Principles of data mining. Springer, London, pp 31–38
Menard S (2010) Logistic regression: from introductory to advanced concepts and applications. SAGE, Los Angeles
Crammer K, Dekel O, Keshet J, Shalev-Shwartz S, Singer Y (2006) Online passive-aggressive algorithms. J Mach Learn Res 7:551–585
Rokach L, Maimon O (2007) Data mining with decision trees: theory and applications. World Scientific Publishing, Singapore
Campbell C, Ying Y (2011) Learning with support vector machines. Morgan & Claypool, San Rafael
Rumelhart DE, Hinton GE, Williams RJ (1986) Learning internal representations by error propagation. In: Rumelhart DE, McClelland JL (eds) Parallel distributed processing: explorations in the microstructure of cognition, vol 1. MIT Press, Cambridge, pp 318–362
Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140
Breiman L (2001) Random forests. Mach Learn 45(1):5–32
Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29(5):1189–1232
Geurts P, Ernst D, Wehenkel L (2006) Extremely randomized trees. Mach Learn 63(1):3–42
Zhu J, Zou H, Rosset S, Hastie T (2009) Multi-class AdaBoost. Stat Interface 2:349–360
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. Presented at the 3rd international conference on learning representations, San Diego
Yelp (2017) Yelp dataset challenge: round 9 of the Yelp dataset challenge: our largest yet! https://www.yelp.com.au/dataset_challenge
McAuley J (2014) Amazon product data. http://jmcauley.ucsd.edu/data/amazon/links.html
Maas AL, Daly RE, Pham PT, Huang D, Ng AY, Potts C (2011) Learning word vectors for sentiment analysis. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics. Human Language Technologies, Portland, 2011, pp 142–150
Budhi GS, Chiong R, Pranata I, Hu Z (2017) Predicting rating polarity through automatic classification of review texts. In: Proceedings of the 2017 IEEE Conference on Big Data and Analytics (ICBDA), Kuching, Malaysia, 2017, pp 19–24
Wang X, Xu G, Zhang J, Sun X, Wang L, Huang T (2019) Syntax-directed hybrid attention network for aspect-level sentiment analysis. IEEE Access 7:5014–5025
López M, Valdivia A, Martínez-Cámara E, Luzón MV, Herrera F (2019) E2SAM: evolutionary ensemble of sentiment analysis methods for domain adaptation. Inf Sci 480:273–286
Hur M, Kang P, Cho S (2016) Box-office forecasting based on sentiments of movie reviews and independent subspace method. Inf Sci 372:608–624
Zhang L, Jiang L, Li C, Kong G (2016) Two feature weighting approaches for naïve Bayes text classifiers. Knowl Based Syst 100:137–144
Gui L, Zhou Y, Xu R, He Y, Lu Q (2017) Learning representations from heterogeneous network for sentiment classification of product reviews. Knowl Based Syst 124:34–45
Zhang JD, Chow CY (2019) MOCA: multi-objective, collaborative, and attentive sentiment analysis. IEEE Access 7:10927–10936
Pranata I, Susilo W (2016) Are the most popular users always trustworthy? The case of Yelp. Electron Commer Res Appl 20:30–41
NLTK (2019) NLTK package. http://www.nltk.org/api/nltk.html
Bhadane C, Dalal H, Doshi H (2015) Sentiment analysis: measuring opinions. Proc Comput Sci 45:808–814
Scikit-Learn (2019) API reference. http://scikit-learn.org/stable/modules/classes.html
Keras (2019) Keras: the python deep learning library. https://keras.io/
Wang Z, Liu K, Li J, Zhu Y, Zhang Y (2019) Various frameworks and libraries of machine learning and deep learning: a survey. Arch Comput Methods Eng 6:1–24
Hameg S, Lazri M, Ameur S (2016) Using naive bayes classifier for classification of convective rainfall intensities based on spectral characteristics retrieved from SEVIRI. J Earth Syst Sci 125(5):945–955
Zhang H et al (2017) Development of novel in silico model for developmental toxicity assessment by using naïve Bayes classifier method. Reprod Toxicol 71:8–15
Wang S, Jiang L, Li C (2015) Adapting naïve bayes tree for text classification. Knowl Inf Syst 44:77–89
Hu Z, Chiong R, Pranata I, Susilo W, Bao Y (2016) Identifying malicious web domains using machine learning techniques with online credibility and performance data. In: Proceedings of Congress on Evolutionary Computation (CEC), Vancouver, Canada, 2016, pp 5186–5194
Jiang L, Li C, Wang S, Zhang L (2016) Deep feature weighting for naïve bayes and its application to text classification. Eng Appl Artif Intell 52:26–39
Chan TF, Golub GH, LeVeque RJ (1979) Updating formulae and a pairwise algorithm for computing sample variances. Stanford University, Stanford
Peterson LE (2009) K-nearest neighbor. Scholarpedia 4(2):1883
Dramé K, Mougin F, Diallo G (2016) Large scale biomedical texts classification: a kNN and an ESA-based approaches. J Biomed Semant 7:40–53
Hu LY, Huang MW, Ke SW, Tsai CF (2016) The distance function effect on k-nearest neighbor classification for medical datasets. SpringerPlus 5:1304–1314
Mengesh TM, Cho HJ, Song HJ, Sungsoo K, Chung TS (2016) New approach to continuous k-nearest neighbor monitoring in a directed road network. Adhoc Sens Wirel Netw 34(1–4):307–321
Pan Z, Wang Y, Ku W (2017) A new general nearest neighbor classification based on the mutual neighborhood information. Knowl Based Syst 121:142–152
Tibshirani R, Hastie T, Narasimhan B, Chu G (2002) Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc Natl Acad Sci USA 99(10):6567–6572
Nelder JA, Wedderburn RWM (1972) Generalized linear models. J R Stat Soc Ser A (General) 135(3):370–384
Hastie TJ, Tibshirani RJ (1990) Generalized additive models. CRC, Boca Raton
Dunteman GH, Ho MHR (2011) Generalized linear models. An introduction to generalized linear models. SAGE Publications Inc., Thousand Oaks, pp 2–6
Dobson AJ, Barnett AG (2008) An introduction to generalized linear models, 3rd edn. CRC Press, Boca Raton
Jurka TP (2012) Maxent: an R package for low-memory multinomial logistic regression with support for semi-automated text classification. R J 4(1):56–59
Bui DDA, Fiol GD, Jonnalagadda S (2016) PDF text classification to leverage information extraction from publication reports. J Biomed Inform 61:141–148
Lu J, Zhao P, Hoi SCH (2016) Online passive-aggressive active learning. Mach Learn 103(2):141–183
Ruhwinaningsih L, Djatna T (2016) A sentiment knowledge discovery model in Twitter’s TV content using stochastic gradient descent algorithm. Telkomnika 14(3):1067–1076
Guo F, Zhang L, Jin S, Tigabu M, Su Z, Wang W (2016) Modeling anthropogenic fire occurrence in the boreal forest of China using logistic regression and random forests. Forests 7(11):250
Murphy KP (2012) Machine learning. MIT Press, Cambridge
Bottou L, Bousquet O (2008) The tradeoffs of large scale learning. Adv Neural Inf Process Syst 20:161–168
Quinlan JR (1986) Induction of decision trees. Mach Learn 1(1):81–106
Hunt EB, Marin J, Stone PJ (1966) Experiments in induction. Academic Press, New York
Luo B, Zeng J, Duan J (2016) Emotion space model for classifying opinions in stock message board. Expert Syst Appl 44:138–146
Xu Z, Li P, Wang Y (2012) Text classifier based on an improved SVM decision tree. Phys Proc 33:1986–1991
Abhishek S, Sugumaran V, Babu DS (2014) Misfire detection in an IC engine using vibration signal and decision tree algorithms. Measurement 50:370–380
Izydorczyk B, Wojciechowski B (2016) Differential diagnosis of eating disorders with the use of classification trees (decision algorithm). Arch Psychiat Psychother 18(4):53–62
Yu D, Mu Y, Jin Y (2017) Rating prediction using review texts with underlying sentiments. Inf Process Lett 117:10–18
Shah YS, Hernandez-Garcia L, Jahanian H, Peltier SJ (2016) Support vector machine classification of arterial volume-weighted arterial spin tagging images. Brain Behav 6:1–8
Sun J, Fujita H, Chen P, Li H (2017) Dynamic financial distress prediction with concept drift based on time weighting combined with Adaboost support vector machine ensemble. Knowl Based Syst 120:4–14
Chiong R, Fan Z, Hu Z, Chiong F (2021) Using an improved relative error support vector machine for body fat prediction. Comput Methods Programs Biomed 198:105749
Lo SL, Chiong R, Cornforth D (2015) Using support vector machine ensembles for target audience classification on Twitter. PLoS ONE 10(4):e0122855
Lo SL, Cornforth D, Chiong R (2014) Identifying the high-value social audience from Twitter through text-mining methods. In: Proceedings of the 18th Asia Pacific Symposium on Intelligent and Evolutionary Systems (IES 2014), Singapore, 2014, pp 325–339
Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2(3):1–27
Glorot X, Bengio Y (2010) Understanding the difficulty of training deep feedforward neural networks. J Mach Learn Res 9:249–256
Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. CoRR abs/1412.6980
Adipranata R, Budhi GS, Setiahadi B (2013) Automatic classification of sunspot groups for space weather analysis. Int J Multimed Ubiquit Eng 8(3):41–54
Budhi GS, Adipranata R (2015) Handwritten Javanese character recognition using several artificial neural network methods. J ICT Res Appl 8(3):195–212
Budhi GS, Adipranata R (2014) Java characters recognition using evolutionary neural network and combination of Chi2 and backpropagation neural network. Int J Appl Eng Res 9(22):18025–18036
Lee S, Choeh JY (2014) Predicting the helpfulness of online reviews using multilayer perceptron neural networks. Expert Syst Appl 41(6):3041–3046
Gaspar R, Pedro C, Panagiotopoulos P, Seibt B (2016) Beyond positive or negative: qualitative sentiment analysis of social media reactions to unexpected stressful events. Comput Hum Behav 56:179–191
Wu Y et al (2017) Dysphonic voice pattern analysis of patients in Parkinson's disease using minimum interclass probability risk feature selection and bagging ensemble learning methods. Comput Math Methods Med 2017:1–11
Wu Q, Ye Y, Zhang H, Ng MK, Ho SS (2014) ForesTexter: an efficient random forest algorithm for imbalanced text categorization. Knowl Based Syst 67:105–116
Asbai N, Amrouche A (2017) Boosting scores fusion approach using front-end diversity and Adaboost algorithm, for speaker verification. Comput Electr Eng 62:648–662
Lee W, Jun CH, Lee JS (2017) Instance categorization by support vector machines to adjust weights in AdaBoost for imbalanced data classification. Inf Sci 381:92–103
González-Recio O, Jiménez-Montero JA, Alenda R (2013) The gradient boosting algorithm and random boosting for genome-assisted evaluation in large data sets. J Dairy Sci 96(1):614–624
Napolitano G, Sting JC, Schmid M, Viviani R (2017) Predicting CYP2D6 phenotype from resting brain perfusion images by gradient boosting. Psychiatry Res Neuroimaging 259:16–24
Dargan S, Kumar M, Ayyagari MR, Kumar G (2019) A survey of deep learning and its applications: a new paradigm to machine learning. Arch Comput Methods Eng 6:1–22
Rojas-Barahona LM (2016) Deep learning for sentiment analysis. Lang Linguist Comp 10(12):701–719
Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Commun ACM 60(6):84–90
Dewa CK, Fadhilah AL, Afiahayati A (2018) Convolutional neural networks for handwritten Javanese character recognition. Indones J Comput Cybern Syst 12(1):83–94
Yu Y, Lin H, Meng J, Zhao Z (2016) Visual and textual sentiment analysis of a microblog using deep convolutional neural networks. Algorithms 9:2
Vieira A, Ribeiro B (2018) Deep neural network models. In: Introduction to deep learning business applications for developers: from conversational bots in customer service to medical image processing. Apress
Maas A (2011) Large movie review dataset. http://ai.stanford.edu/~amaas/data/sentiment/
Preston K (2013) Why are grown women typing like thiiiiiiiiiiiiis? https://www.mamamia.com.au/why-do-so-many-people-text-likeee-thiiis/
Norvig P (2016) How to write a spelling corrector. https://norvig.com/spell-correct.html
Lee S, Ha J, Zokhirova M, Moon H, Lee J (2017) Background information of deep learning for structural engineering. Arch Comput Methods Eng 25(1):121–129
Mader K (2019) Simple CNN. https://www.kaggle.com/kmader/simple-cnn
Baraldi L (2019) VGG-16 pre-trained model for Keras. https://gist.github.com/baraldilorenzo/07d7802847aaad0a35d3
Mikolov T, Sutskever I, Chen K, Corrado G, Dean J (2013) Distributed representations of words and phrases and their compositionality. Adv Neural Inf Process Syst 26:3111–3119
Le Q, Mikolov T (2014) Distributed representations of sentences and documents. In: Proceedings of the 31st International Conference on Machine Learning, Beijing, China, vol 32, no. 2, pp 1188–1196
Acknowledgements
The first author would like to acknowledge financial support from the Indonesian Endowment Fund for Education (LPDP), Ministry of Finance, and the Directorate General of Higher Education (DIKTI), Ministry of Education and Culture, The Republic of Indonesia.
Funding
The authors confirm that there is no source of funding for this study.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Human Participants and/or Animals
None.
Cite this article
Budhi, G.S., Chiong, R., Pranata, I. et al. Using Machine Learning to Predict the Sentiment of Online Reviews: A New Framework for Comparative Analysis. Arch Computat Methods Eng 28, 2543–2566 (2021). https://doi.org/10.1007/s11831-020-09464-8