A Hybrid Approach for Preprocessing of Imbalanced Data in Credit Scoring Systems

Salunkhe, Uma R.; Mali, Suresh N.

doi:10.1007/978-981-10-7245-1_10

Conference paper
First Online: 20 January 2018

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 673)

Abstract

Over the last few years, classification has become a machine learning task commonly used in real-life applications. One common application is credit scoring, where accurately predicting whether an applicant is creditworthy or non-creditworthy is critically important because incorrect predictions can cause major financial loss. In this paper, we focus on the skewed data distribution problem faced by credit scoring systems. To reduce the imbalance between the classes, we preprocess the dataset using a combination of random re-sampling and dimensionality reduction. Experimental results on the Australian and German credit datasets show that the presented preprocessing technique yields a significant performance improvement in terms of AUC and F-measure.
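The abstract does not say which re-sampling or dimensionality reduction method the paper uses, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes random over-sampling of the minority class followed by a variance-threshold feature filter as a simple stand-in for dimensionality reduction. The function names, the threshold, the order of the two steps, and the toy data are all hypothetical.

```python
import random
from statistics import pvariance


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample with replacement to fill the gap to the largest class size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def drop_low_variance(rows, threshold=0.01):
    """Keep only the columns whose population variance exceeds the threshold."""
    columns = list(zip(*rows))
    keep = [i for i, col in enumerate(columns) if pvariance(col) > threshold]
    return [[row[i] for i in keep] for row in rows], keep


# Toy data, made up for illustration: 6 majority rows (label 0), 2 minority rows (label 1).
# The middle feature is constant and carries no information.
X = [[1.0, 0.5, 3.0], [2.0, 0.5, 1.0], [3.0, 0.5, 2.0], [4.0, 0.5, 5.0],
     [5.0, 0.5, 4.0], [6.0, 0.5, 6.0], [7.0, 0.5, 8.0], [8.0, 0.5, 7.0]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_bal, y_bal = random_oversample(X, y, seed=42)   # assumed step 1: balance the classes
X_red, kept = drop_low_variance(X_bal)            # assumed step 2: reduce the feature set
```

On the toy data, the two classes end up with six rows each and the constant middle feature is dropped. In practice, re-sampling should be applied to the training split only, so that duplicated rows do not leak into the evaluation data.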

Author information

Authors and Affiliations

Department of Computer Engineering, Smt. Kashibai Navale College of Engineering, Pune, Maharashtra, India
Uma R. Salunkhe
Sinhgad Institute of Technology and Science, Pune, Maharashtra, India
Suresh N. Mali

Corresponding author

Correspondence to Uma R. Salunkhe.

Editor information

Editors and Affiliations

Department of Computer Software, University of Aizu, Aizuwakamatsu, Fukushima, Japan
Subhash Bhalla
Department of Electronics and Communication Engineering, Shri Ramswaroop Memorial Group of Professional Colleges, Lucknow, Uttar Pradesh, India
Vikrant Bhateja
Department of Information Technology, MIT College of Engineering, Pune, Maharashtra, India
Anjali A. Chandavale
Department of Information Technology, MIT College of Engineering, Pune, Maharashtra, India
Anil S. Hiwale
Department of Computer Science and Engineering, Anil Neerukonda Institute of Technology and Sciences, Visakhapatnam, Andhra Pradesh, India
Suresh Chandra Satapathy

About this paper

Cite this paper

Salunkhe, U.R., Mali, S.N. (2018). A Hybrid Approach for Preprocessing of Imbalanced Data in Credit Scoring Systems. In: Bhalla, S., Bhateja, V., Chandavale, A., Hiwale, A., Satapathy, S. (eds) Intelligent Computing and Information and Communication. Advances in Intelligent Systems and Computing, vol 673. Springer, Singapore. https://doi.org/10.1007/978-981-10-7245-1_10

DOI: https://doi.org/10.1007/978-981-10-7245-1_10
Published: 20 January 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7244-4
Online ISBN: 978-981-10-7245-1
eBook Packages: Engineering; Engineering (R0)