Abstract
In recent years, classification has become a common machine learning task in many real-life applications. One such application is credit scoring, where accurately distinguishing creditworthy from non-creditworthy applicants is critical because incorrect predictions can cause major financial loss. This paper focuses on the skewed class distribution that credit scoring systems face. To reduce the imbalance between the classes, we preprocess the dataset using a combination of random re-sampling and dimensionality reduction. Experimental results on the Australian and German credit datasets show that the presented preprocessing technique significantly improves performance in terms of AUC and F-measure.
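The preprocessing idea described above can be illustrated with a minimal sketch. The abstract does not specify which re-sampling scheme, which dimensionality reduction method, or which order the two are applied in, so everything here is an assumption made for illustration: random oversampling of the minority class, followed by PCA computed with an SVD. The function names (`random_oversample`, `pca_project`, `preprocess`) and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np


def random_oversample(X, y, rng):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    keep = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(y == cls)
        # Draw the missing rows with replacement (zero rows for the largest class).
        extra = rng.choice(idx, size=target - count, replace=True)
        keep.append(np.concatenate([idx, extra]))
    keep = np.concatenate(keep)
    return X[keep], y[keep]


def pca_project(X, n_components):
    """Project centred data onto its top principal components, obtained via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T


def preprocess(X, y, n_components=2, seed=0):
    """Assumed ordering: balance the classes first, then reduce dimensionality."""
    rng = np.random.default_rng(seed)
    X_bal, y_bal = random_oversample(X, y, rng)
    return pca_project(X_bal, n_components), y_bal


# Synthetic stand-in for a credit dataset: 90 majority rows, 10 minority rows.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(100, 6))
y = np.array([0] * 90 + [1] * 10)

X_new, y_new = preprocess(X, y, n_components=3)
print(X_new.shape, np.bincount(y_new))
```

In practice the re-sampling should be fitted on the training split only, so that duplicated minority rows do not leak into the data used to compute AUC and F-measure.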
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Salunkhe, U.R., Mali, S.N. (2018). A Hybrid Approach for Preprocessing of Imbalanced Data in Credit Scoring Systems. In: Bhalla, S., Bhateja, V., Chandavale, A., Hiwale, A., Satapathy, S. (eds) Intelligent Computing and Information and Communication. Advances in Intelligent Systems and Computing, vol 673. Springer, Singapore. https://doi.org/10.1007/978-981-10-7245-1_10
DOI: https://doi.org/10.1007/978-981-10-7245-1_10
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7244-4
Online ISBN: 978-981-10-7245-1
eBook Packages: Engineering, Engineering (R0)