A Hybrid Approach for Preprocessing of Imbalanced Data in Credit Scoring Systems

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 673)

Abstract

In recent years, classification has become one of the most widely used machine learning tasks in real-life applications. A common application is credit scoring, where accurately predicting whether an applicant is creditworthy or non-creditworthy is critically important because incorrect predictions can cause major financial loss. In this paper, we focus on the skewed class distribution faced by credit scoring systems. To reduce the imbalance between the classes, we preprocess the dataset using a combination of random re-sampling and dimensionality reduction. Experimental results on the Australian and German credit datasets show that the presented preprocessing technique significantly improves performance in terms of AUC and F-measure.
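The combination of random re-sampling and dimensionality reduction described in the abstract can be illustrated with a minimal sketch. The abstract does not say which re-sampling variant, which reduction method, or which order of steps the paper uses, so everything below is an assumption chosen for illustration: random oversampling of the minority class, followed by PCA computed via SVD, on synthetic data. The function names `random_oversample` and `pca_reduce` are hypothetical and are not taken from the paper.

```python
import numpy as np


def random_oversample(X, y, rng):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    parts_X, parts_y = [], []
    for label, count in zip(classes, counts):
        idx = np.flatnonzero(y == label)
        if count < target:
            # Assumption: sample with replacement from the under-represented class.
            extra = rng.choice(idx, size=target - count, replace=True)
            idx = np.concatenate([idx, extra])
        parts_X.append(X[idx])
        parts_y.append(y[idx])
    return np.vstack(parts_X), np.concatenate(parts_y)


def pca_reduce(X, n_components):
    """Project centered data onto its top principal components (PCA via SVD)."""
    X_centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ vt[:n_components].T


# Synthetic stand-in for a credit dataset: 90 majority rows, 10 minority rows, 6 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
y = np.array([0] * 90 + [1] * 10)

X_bal, y_bal = random_oversample(X, y, rng)
X_red = pca_reduce(X_bal, n_components=3)
print(np.bincount(y_bal), X_red.shape)
```

In a real evaluation, both steps would normally be fitted on the training split only, so that duplicated rows and the learned projection do not leak into the test data used to compute AUC and F-measure.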

Author information

Corresponding author

Correspondence to Uma R. Salunkhe.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Salunkhe, U.R., Mali, S.N. (2018). A Hybrid Approach for Preprocessing of Imbalanced Data in Credit Scoring Systems. In: Bhalla, S., Bhateja, V., Chandavale, A., Hiwale, A., Satapathy, S. (eds) Intelligent Computing and Information and Communication. Advances in Intelligent Systems and Computing, vol 673. Springer, Singapore. https://doi.org/10.1007/978-981-10-7245-1_10

  • DOI: https://doi.org/10.1007/978-981-10-7245-1_10

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-7244-4

  • Online ISBN: 978-981-10-7245-1

  • eBook Packages: Engineering, Engineering (R0)
