
F-Measure Optimization for Multi-class, Imbalanced Emotion Classification Tasks

  • Conference paper
  • In: Artificial Neural Networks and Machine Learning – ICANN 2022

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13529)

Abstract

Recent NLP breakthroughs have significantly advanced the state of emotion classification (EC) over text data. However, current treatments guide learning with traditional performance metrics, such as classification error rate, that are ill-suited to highly imbalanced EC problems; indeed, in recognition of this imbalance, EC models are predominantly evaluated with variations of the F-measure. This paper addresses the dissonance between the learning objective and the performance evaluation for EC with moderate to severe data imbalance. We propose a series of increasingly powerful algorithms for F-measure improvement. An ablation study demonstrates the benefit of learning an optimal class decision threshold, and performance improves further when the representation and the class decision thresholds are learned jointly. A thorough empirical evaluation on benchmark EC datasets spanning a range of class counts and degrees of imbalance shows clear F-measure improvements over baseline models, with good gains for pre-trained deep models and larger gains for untrained deep architectures.
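To make the threshold-learning idea concrete, the following is a minimal sketch (not the authors' implementation): it assumes a trained classifier that outputs per-class probabilities and runs a simple coordinate search for one decision threshold per class on a held-out validation set, scoring candidates by macro-averaged F1. The tune_thresholds helper, the candidate grid, and the margin-based prediction rule are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the paper's implementation):
# per-class decision-threshold tuning to improve macro-F1,
# assuming a classifier that already outputs class probabilities.
import numpy as np
from sklearn.metrics import f1_score

def tune_thresholds(probs, y_true, grid=np.linspace(0.05, 0.95, 19)):
    """Coordinate search for one threshold per class on validation data.

    probs:  (n_samples, n_classes) predicted class probabilities
    y_true: (n_samples,) integer gold labels
    Returns an (n_classes,) array of thresholds.
    """
    n_classes = probs.shape[1]
    thresholds = np.full(n_classes, 0.5)

    def predict(th):
        # Score each class by how far its probability exceeds its own
        # threshold, then predict the class with the largest margin.
        return (probs - th[None, :]).argmax(axis=1)

    for c in range(n_classes):
        best_t = thresholds[c]
        best_f1 = f1_score(y_true, predict(thresholds), average="macro")
        for t in grid:
            candidate = thresholds.copy()
            candidate[c] = t
            f1 = f1_score(y_true, predict(candidate), average="macro")
            if f1 > best_f1:
                best_t, best_f1 = t, f1
        thresholds[c] = best_t
    return thresholds
```

At inference time, the same margin rule applies: subtract the tuned thresholds from the predicted probabilities and take the argmax, which allows a minority class to be predicted even when its raw probability does not exceed that of a dominant class.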



Acknowledgement

All experiments were run on ARGO, a computing cluster provided by the Office of Research Computing at George Mason University, VA (URL: http://orc.gmu.edu).

Author information

Corresponding author

Correspondence to Toki Tahmid Inan.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Inan, T.T., Liu, M., Shehu, A. (2022). F-Measure Optimization for Multi-class, Imbalanced Emotion Classification Tasks. In: Pimenidis, E., Angelov, P., Jayne, C., Papaleonidas, A., Aydin, M. (eds) Artificial Neural Networks and Machine Learning – ICANN 2022. ICANN 2022. Lecture Notes in Computer Science, vol 13529. Springer, Cham. https://doi.org/10.1007/978-3-031-15919-0_14


  • DOI: https://doi.org/10.1007/978-3-031-15919-0_14


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15918-3

  • Online ISBN: 978-3-031-15919-0

  • eBook Packages: Computer Science, Computer Science (R0)
