Abstract
Recent NLP breakthroughs have significantly advanced the state of emotion classification (EC) over text data. However, current treatments guide learning by traditional performance metrics, such as classification error rate, which are not suitable for highly imbalanced EC problems; in fact, EC models are predominantly evaluated by variations of the F-measure, in recognition of the data imbalance. This paper addresses the dissonance between the learning objective and the performance evaluation for EC with moderate to severe data imbalance. We propose a series of increasingly powerful algorithms for F-measure improvement. An ablation study demonstrates the benefit of learning an optimal class decision threshold. Performance improves further when joint learning is carried out over both the representation and the class decision thresholds. Thorough empirical evaluation on benchmark EC datasets that span the spectrum of number of classes and class imbalance shows clear F-measure improvements over baseline models, with moderate gains over pre-trained deep models and larger gains over untrained deep architectures.
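The threshold-learning idea in the abstract can be illustrated with a minimal sketch: rather than predicting a class whenever its probability exceeds 0.5, pick, per class, the one-vs-rest decision threshold that maximizes that class's F1 on held-out data. This is only an illustrative ingredient under assumed names (`best_thresholds`, a fixed candidate grid), not the paper's joint learning procedure, which also updates the representation.

```python
import numpy as np

def best_thresholds(probs, labels, grid=np.linspace(0.05, 0.95, 19)):
    """For each class, pick the one-vs-rest decision threshold that
    maximizes that class's F1 on held-out (probability, label) pairs.

    probs  : (n_samples, n_classes) predicted class probabilities
    labels : (n_samples,) integer gold labels
    """
    n_classes = probs.shape[1]
    thresholds = np.full(n_classes, 0.5)
    for c in range(n_classes):
        y = (labels == c).astype(int)          # one-vs-rest gold labels
        best_f1 = -1.0
        for t in grid:
            pred = (probs[:, c] >= t).astype(int)
            tp = np.sum(pred & y)
            fp = np.sum(pred & (1 - y))
            fn = np.sum((1 - pred) & y)
            denom = 2 * tp + fp + fn
            f1 = 2 * tp / denom if denom > 0 else 0.0
            if f1 > best_f1:                   # keep the best-scoring threshold
                best_f1 = f1
                thresholds[c] = t
    return thresholds
```

On imbalanced data the learned thresholds for rare classes typically fall well below 0.5, trading a little precision for substantially better recall, which is exactly the trade-off the F-measure rewards.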
Acknowledgement
All experiments were run on ARGO, a computing cluster provided by the Office of Research Computing at George Mason University, VA (URL: http://orc.gmu.edu).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Inan, T.T., Liu, M., Shehu, A. (2022). F-Measure Optimization for Multi-class, Imbalanced Emotion Classification Tasks. In: Pimenidis, E., Angelov, P., Jayne, C., Papaleonidas, A., Aydin, M. (eds) Artificial Neural Networks and Machine Learning – ICANN 2022. ICANN 2022. Lecture Notes in Computer Science, vol 13529. Springer, Cham. https://doi.org/10.1007/978-3-031-15919-0_14
Print ISBN: 978-3-031-15918-3
Online ISBN: 978-3-031-15919-0