Using Machine Learning to Predict the Sentiment of Online Reviews: A New Framework for Comparative Analysis

  • Original Paper
Archives of Computational Methods in Engineering

Abstract

Online reviews are becoming increasingly important for decision-making. Consumers often consult online reviews before making a purchase, and marketers use them to improve product success. However, the sheer volume of online review data and its unstructured nature make it difficult to draw conclusions quickly. In this paper, we propose a novel framework for gauging the ratings of online reviews using machine learning techniques. The framework combines text pre-processing and feature extraction methods. We investigate four aspects of the new framework. First, we assess the performance of single and ensemble classifiers in predicting sentiment (positive or negative), initially on a specific dataset (Yelp) and subsequently on two other datasets (Amazon's product reviews and a movie review dataset). Second, using the best identified classifiers, we improve the accuracy with which neutral polarity can be predicted, an ability largely overlooked in the literature. Third, we further improve the performance of these classifiers by testing different pre-processing and feature extraction methods. Finally, we measure how well our deep learning approach performs on the same task compared to the best previously identified classifiers. Our extensive testing shows that the linear-kernel support vector machine, logistic regression, and the multilayer perceptron are the three best single classifiers in terms of accuracy, precision, recall, and F-measure. Their performance can be further improved when they are used as base classifiers for ensemble models. We also observe that several text pre-processing techniques (negation word identification, word elongation correction, and part-of-speech lemmatisation), combined with term frequency and word n-gram features, can increase accuracy. In addition, we demonstrate that general sentiment lexicons such as SentiWordNet 3.0 and SenticNet 4 can be used to generate features with good results, although deep learning models can perform equally well. Experiments with different datasets confirm that our framework provides consistent outcomes. In particular, we have focused on improving the accuracy of neutral sentiment, and we conclude by showing how this can be achieved without sacrificing the accuracy of positive or negative ratings.
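
The pre-processing and feature extraction steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the negation list, the rule of collapsing repeated characters to two, the `NOT_` prefix convention, and all function names are assumptions chosen for illustration, and lemmatisation is omitted because it needs an external lexical resource.

```python
import re
from collections import Counter

# Hypothetical, deliberately small negation list (an assumption, not from the paper).
NEGATIONS = {"not", "no", "never", "cannot"}
PUNCTUATION = set(".,!?;")


def fix_elongation(token):
    """Collapse any run of three or more identical characters to two.

    'goooood' becomes 'good'; 'thiiiiiis' becomes 'thiis', so a spelling
    corrector would still be needed to recover the intended word.
    """
    return re.sub(r"(.)\1{2,}", r"\1\1", token)


def tokenize(text):
    """Lowercase the text and keep word tokens plus clause punctuation."""
    return re.findall(r"[a-z']+|[.,!?;]", text.lower())


def mark_negation(tokens):
    """Prefix tokens after a negation word with 'NOT_' until punctuation."""
    marked, negated = [], False
    for tok in tokens:
        if tok in PUNCTUATION:
            negated = False  # the negation scope ends at the clause boundary
            continue         # punctuation itself is not kept as a feature
        if tok in NEGATIONS or tok.endswith("n't"):
            negated = True
            marked.append(tok)
            continue
        marked.append("NOT_" + tok if negated else tok)
    return marked


def ngram_term_frequency(tokens, n_max=2):
    """Count every word n-gram of length 1..n_max (raw term frequency)."""
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def extract_features(text):
    """Run elongation correction, negation marking, then n-gram counting."""
    tokens = [fix_elongation(t) for t in tokenize(text)]
    return ngram_term_frequency(mark_negation(tokens))


features = extract_features("The food was not goooood, but service was great!")
```

In this sketch, "goooood" is normalised to "good" and then recorded as the feature `NOT_good` because it follows "not", while "great" stays unmarked because the comma closes the negation scope. The resulting counts could be passed to any of the classifiers the abstract compares.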

Acknowledgements

The first author would like to acknowledge financial support from the Indonesian Endowment Fund for Education (LPDP), Ministry of Finance, and the Directorate General of Higher Education (DIKTI), Ministry of Education and Culture, The Republic of Indonesia.

Funding

The authors confirm that there is no source of funding for this study.

Author information

Corresponding author

Correspondence to Raymond Chiong.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Human Participants and/or Animals

None.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Budhi, G.S., Chiong, R., Pranata, I. et al. Using Machine Learning to Predict the Sentiment of Online Reviews: A New Framework for Comparative Analysis. Arch Computat Methods Eng 28, 2543–2566 (2021). https://doi.org/10.1007/s11831-020-09464-8

  • DOI: https://doi.org/10.1007/s11831-020-09464-8
