BERT: a sentiment analysis odyssey

Alaparthi, Shivaji; Mishra, Manit

doi:10.1057/s41270-021-00109-8

Original Article
Published: 26 February 2021

Volume 9, pages 118–126, (2021)
Journal of Marketing Analytics

Abstract

The study investigates the relative effectiveness of four sentiment analysis techniques: (1) an unsupervised lexicon-based model using SentiWordNet, (2) a traditional supervised machine learning model using logistic regression, (3) a supervised deep learning model using Long Short-Term Memory (LSTM), and (4) an advanced supervised deep learning model using Bidirectional Encoder Representations from Transformers (BERT). A publicly available labeled corpus of 50,000 movie reviews originally posted on the Internet Movie Database (IMDb) was analyzed. Sentiment classification performance was evaluated on accuracy, precision, recall, and F1 score. The study puts forth two key insights: (1) the relative efficacy of the four sentiment analysis algorithms and (2) the undisputed superiority of the pre-trained advanced supervised deep learning algorithm BERT in sentiment classification from text. The study is of value to analytics professionals and academicians working on text analysis, as it offers critical insight into the sentiment classification performance of key algorithms, including the recently developed BERT.
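
For readers unfamiliar with the four evaluation measures named in the abstract, the following is a minimal sketch of how accuracy, precision, recall, and F1 score can be computed for a binary sentiment classifier. It is an illustration only, not the authors' code: the function name `classification_metrics`, the label encoding (1 = positive review, 0 = negative review), and the toy labels are all assumptions made for this example.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels.

    Hypothetical helper for illustration; `positive` marks the label
    treated as the positive sentiment class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (assumed, not taken from the IMDb corpus).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this toy case there are three true positives, three true negatives, one false positive, and one false negative, so all four measures come out to 0.75. On a balanced corpus accuracy alone is informative, but precision and recall show whether errors lean toward false positives or false negatives.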

Acknowledgements

The authors are grateful to the editors and the two anonymous reviewers for their constructive suggestions which helped in vastly improving the manuscript.

Author information

Authors and Affiliations

CenturyLink, Bengaluru, Karnataka, India
Shivaji Alaparthi
International Management Institute Bhubaneswar, IDCO Plot # 1, Gothapatna, Bhubaneswar, Odisha, 751003, India
Manit Mishra

Corresponding author

Correspondence to Manit Mishra.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Alaparthi, S., Mishra, M. BERT: a sentiment analysis odyssey. J Market Anal 9, 118–126 (2021). https://doi.org/10.1057/s41270-021-00109-8

Revised: 17 September 2020
Accepted: 08 February 2021
Published: 26 February 2021
Issue Date: June 2021
DOI: https://doi.org/10.1057/s41270-021-00109-8
