Classification of Imbalanced Data by Combining the Complementary Neural Network and SMOTE Algorithm

Jeatrakul, Piyasak; Wong, Kok Wai; Fung, Chun Che

doi:10.1007/978-3-642-17534-3_19

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6444)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

In classification, when the distribution of the training data among classes is uneven, the learning algorithm is generally dominated by the features of the majority classes, and the features of the minority classes are difficult to recognize fully. This paper proposes a method to improve classification accuracy for the minority classes. The proposed method combines the Synthetic Minority Over-sampling Technique (SMOTE) and the Complementary Neural Network (CMTNN) to handle the problem of classifying imbalanced data. To demonstrate that the proposed technique can assist the classification of imbalanced data, several classification algorithms are used: Artificial Neural Network (ANN), k-Nearest Neighbor (k-NN) and Support Vector Machine (SVM). The benchmark data sets, with various ratios between the minority class and the majority class, are obtained from the University of California Irvine (UCI) machine learning repository. The results show that the proposed combination techniques can improve performance on the class imbalance problem.
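
As a rough illustration of the over-sampling half of the approach, the sketch below shows the interpolation step commonly associated with SMOTE: a synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. This is a minimal sketch, not the authors' implementation. The function name `smote`, its parameters, and the toy data are assumptions made for illustration, and the CMTNN step is not shown because the abstract does not describe it in enough detail.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority samples.

    Hypothetical helper for illustration only; names and defaults are assumptions.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base sample.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # random position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy data (assumed): 4 minority samples against 12 majority samples.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
majority = [[5.0 + 0.1 * i, 5.0 - 0.1 * i] for i in range(12)]

# Over-sample the minority class until both classes have the same size.
new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
```

Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the region already occupied by the minority class. Balancing to a 1:1 ratio is a choice made for this example only; the abstract does not state which target ratio the paper uses.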

Author information

Authors and Affiliations

School of Information Technology, Murdoch University, South Street, Murdoch, Western Australia, 6150
Piyasak Jeatrakul, Kok Wai Wong & Chun Che Fung

Editor information

Editors and Affiliations

School of Information Technology, Murdoch University, 6150, Murdoch, WA, Australia
Kok Wai Wong
The Australian National University, 0200, Canberra, ACT, Australia
B. Sumudu U. Mendis
School of Electrical, Computer and Telecommunications Engineering, University of Wollongong, Northfields Avenue, 2522, P.O. Box, Wollongong, NSW, Australia
Abdesselam Bouzerdoum

About this paper

Cite this paper

Jeatrakul, P., Wong, K.W., Fung, C.C. (2010). Classification of Imbalanced Data by Combining the Complementary Neural Network and SMOTE Algorithm. In: Wong, K.W., Mendis, B.S.U., Bouzerdoum, A. (eds) Neural Information Processing. Models and Applications. ICONIP 2010. Lecture Notes in Computer Science, vol 6444. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17534-3_19

DOI: https://doi.org/10.1007/978-3-642-17534-3_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17533-6
Online ISBN: 978-3-642-17534-3
eBook Packages: Computer Science, Computer Science (R0)
