Abstract
We propose ASTra (Asymmetric Sigmoid Transfer function), a novel output-layer activation function that makes the classification of minority examples more tractable in scenarios of high imbalance. We combine this with a loss function that effectively targets minority misclassification. The two methods can be used together or separately, with their combination recommended for the most severely imbalanced cases. The proposed approach is tested on datasets with imbalance ratios (IRs) from 588.24 to 4000 and very few minority examples (in some datasets, as few as five). Results using neural networks with two to 12 hidden units are shown to be comparable to, or better than, equivalent results obtained in a recent study that deployed a wide range of complex, hybrid data-level ensemble classifiers.
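The exact definition of the ASTra activation is not reproduced in this excerpt. As a purely illustrative sketch of what an asymmetric sigmoid looks like, the generalized logistic (Richards-type) curve below uses a shape parameter `gamma` (a name chosen here, not taken from the paper) to break the symmetry of the standard sigmoid about 0.5:

```python
import math

def asym_sigmoid(x: float, gamma: float = 2.0) -> float:
    """Generalized logistic curve: (1 + e^{-x})^{-gamma}.

    gamma = 1 recovers the ordinary sigmoid (value 0.5 at x = 0);
    gamma > 1 pulls the midpoint below 0.5, making the transfer
    function asymmetric. This is an illustrative stand-in, not the
    ASTra definition from the paper.
    """
    return (1.0 + math.exp(-x)) ** (-gamma)

# The output still lies in (0, 1) and is monotonically increasing,
# but for gamma = 2 the curve passes through 0.25 rather than 0.5
# at x = 0, skewing the decision region.
```

Such asymmetry is one generic way an output activation can be biased toward one class; how ASTra actually achieves this is specified in the full paper.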
Notes
1. All derivatives, including those associated with the use of the ASTra transform, may be obtained from the authors on request; space precludes their inclusion here.
2. Modified, where necessary, to record a target of 1 for the minority class and 0 for the majority, a requirement for the application of the ASTra activation function.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Twomey, D., Gorse, D. (2022). ASTra: A Novel Algorithm-Level Approach to Imbalanced Classification. In: Pimenidis, E., Angelov, P., Jayne, C., Papaleonidas, A., Aydin, M. (eds) Artificial Neural Networks and Machine Learning – ICANN 2022. ICANN 2022. Lecture Notes in Computer Science, vol 13531. Springer, Cham. https://doi.org/10.1007/978-3-031-15934-3_47
Print ISBN: 978-3-031-15933-6
Online ISBN: 978-3-031-15934-3