Exploring the Performance of Ensemble Machine Learning Classifiers for Sentiment Analysis of COVID-19 Tweets

Rahman, Md. Mahbubar; Islam, Muhammad Nazrul

doi:10.1007/978-981-16-5157-1_30

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1408)

Abstract

Since the beginning of the global COVID-19 pandemic, measuring public opinion has been considered one of the most critical issues for decision-makers fighting the pandemic, informing measures such as implementing a national lockdown, introducing quarantine procedures, and providing health services. During the pandemic, decision-makers in several countries made a number of critical decisions based on public opinion to combat the coronavirus. In natural language processing, sentiment analysis has emerged as a technique for mining public opinion, and machine learning (ML) algorithms are commonly used to analyze sentiment. In this research, approximately 12 thousand tweets from the United Kingdom (UK) were rigorously annotated by three independent reviewers, and based on the labeled tweets, three different ensemble ML models were proposed to classify the tweet data into three sentiment labels: positive, negative, and neutral. The study found that the stacking classifier (SC) achieved the highest F1-score (83.5%), followed by the voting classifier (VC) (83.3%) and the bagging classifier (BC) (83.2%).
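
As a rough illustration of two ideas named in the abstract, the sketch below combines the predictions of three base classifiers by majority vote and scores the result with an F1-score over the three sentiment labels. It is a minimal sketch, not the chapter's implementation: the hard-voting rule, the macro averaging of F1, the tie-breaking rule, the toy predictions, and all function names are assumptions made for illustration, since the abstract does not say how votes are combined or how the F1-score is averaged.

```python
from collections import Counter

# The three sentiment labels named in the abstract.
LABELS = ("positive", "negative", "neutral")


def majority_vote(predictions_per_model):
    """Hard voting (assumed): per sample, keep the label most models predicted.

    Ties are broken in favor of the earliest model whose label reaches
    the top count (an arbitrary, illustrative rule).
    """
    voted = []
    for sample_preds in zip(*predictions_per_model):
        counts = Counter(sample_preds)
        top = max(counts.values())
        voted.append(next(p for p in sample_preds if counts[p] == top))
    return voted


def f1_for_label(y_true, y_pred, label):
    """F1-score of a single label, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-label F1-scores (macro averaging, assumed)."""
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical toy data: six labeled tweets and three imperfect base models.
y_true = ["positive", "negative", "neutral", "positive", "negative", "neutral"]
model_a = ["positive", "negative", "neutral", "neutral", "negative", "positive"]
model_b = ["positive", "neutral", "neutral", "positive", "negative", "neutral"]
model_c = ["negative", "negative", "positive", "positive", "negative", "neutral"]

ensemble = majority_vote([model_a, model_b, model_c])

for name, preds in [("A", model_a), ("B", model_b), ("C", model_c),
                    ("vote", ensemble)]:
    print(f"{name}: macro F1 = {macro_f1(y_true, preds):.3f}")
```

In this toy data each base model makes errors on different tweets, so the vote corrects them; on real data the gain from an ensemble depends on how correlated the base models' errors are.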

Notes

1. https://www.tensorflow.org/hub

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Military Institute of Science and Technology, Mirpur Cantonment, Dhaka, 1216, Bangladesh
Md. Mahbubar Rahman & Muhammad Nazrul Islam

Corresponding author

Correspondence to Md. Mahbubar Rahman.

Editor information

Editors and Affiliations

Institute of Engineering, Tribhuvan University, Pulchowk Campus, Lalitpur, Nepal
Subarna Shakya
Intelligent Systems Research Centre, Aurel Vlaicu University of Arad, Arad, Romania
Valentina Emilia Balas
Songkla University, Songkhla, Thailand
Sinchai Kamolphiwong
Department of Electrical and Computer Engineering, Concordia University, Montreal, QC, Canada
Ke-Lin Du

About this paper

Cite this paper

Rahman, M.M., Islam, M.N. (2022). Exploring the Performance of Ensemble Machine Learning Classifiers for Sentiment Analysis of COVID-19 Tweets. In: Shakya, S., Balas, V.E., Kamolphiwong, S., Du, KL. (eds) Sentimental Analysis and Deep Learning. Advances in Intelligent Systems and Computing, vol 1408. Springer, Singapore. https://doi.org/10.1007/978-981-16-5157-1_30

DOI: https://doi.org/10.1007/978-981-16-5157-1_30
Published: 26 October 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-5156-4
Online ISBN: 978-981-16-5157-1
eBook Packages: Intelligent Technologies and Robotics (R0)
