Abstract
We propose ASTra (Asymmetric Sigmoid Transfer function), a novel output-layer activation function that makes the classification of minority examples more tractable in scenarios of high imbalance. We combine this with a loss function that effectively targets minority misclassification. The two methods can be used together or separately, with their combination recommended for the most severely imbalanced cases. The proposed approach is tested on datasets with imbalance ratios (IRs) from 588.24 to 4000 and very few minority examples (in some datasets, as few as five). Results using neural networks with two to 12 hidden units are shown to be comparable to, or better than, equivalent results obtained in a recent study that deployed a wide range of complex, hybrid data-level ensemble classifiers.
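The exact definition of the ASTra activation is not reproduced in this excerpt. As a purely illustrative sketch of what an asymmetric sigmoid looks like, the generalized logistic (Richards-type) curve below uses a shape parameter `gamma` (a name chosen here, not taken from the paper) to break the symmetry of the standard sigmoid about 0.5:

```python
import math

def asym_sigmoid(x: float, gamma: float = 2.0) -> float:
    """Generalized logistic curve: (1 + e^{-x})^{-gamma}.

    gamma = 1 recovers the ordinary sigmoid (value 0.5 at x = 0);
    gamma > 1 pulls the midpoint below 0.5, making the transfer
    function asymmetric. This is an illustrative stand-in, not the
    ASTra definition from the paper.
    """
    return (1.0 + math.exp(-x)) ** (-gamma)

# The output still lies in (0, 1) and is monotonically increasing,
# but for gamma = 2 the curve passes through 0.25 rather than 0.5
# at x = 0, skewing the decision region.
```

Such asymmetry is one generic way an output activation can be biased toward one class; how ASTra actually achieves this is specified in the full paper.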
Notes
1. All derivatives, including those associated with the use of the ASTra transform, may be obtained from the authors on request; space precludes their inclusion here.
2. Modified, where necessary, to record a target of 1 for the minority class and 0 for the majority, a requirement for the application of the ASTra activation function.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Twomey, D., Gorse, D. (2022). ASTra: A Novel Algorithm-Level Approach to Imbalanced Classification. In: Pimenidis, E., Angelov, P., Jayne, C., Papaleonidas, A., Aydin, M. (eds) Artificial Neural Networks and Machine Learning – ICANN 2022. ICANN 2022. Lecture Notes in Computer Science, vol 13531. Springer, Cham. https://doi.org/10.1007/978-3-031-15934-3_47
Print ISBN: 978-3-031-15933-6
Online ISBN: 978-3-031-15934-3