A Feature Selection Based on Network Structure for Credit Card Default Prediction

  • Conference paper
Computer Supported Cooperative Work and Social Computing (ChineseCSCW 2019)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1042)

Abstract

Credit card default prediction is important in finance and electronic commerce, and it has attracted growing attention. Existing work on the problem generally applies a classification model directly to historical data to train a predictor, and rarely explores the data in depth. In this paper, we approach credit card default prediction in an unconventional way. First, we study credit card consumption records from a network perspective to uncover the relationships among features and those between features and the label. Second, based on the network structure, we propose a new feature selection algorithm named NSFSA. Finally, we apply NSFSA to five machine learning models to train predictors on a real dataset of credit card consumption records, and compare it with four existing feature selection algorithms. Experimental results show that the proposed NSFSA performs very well, which demonstrates the potential of our approach to the credit card default problem.

Acknowledgements

This work is supported by the Natural Science Foundation of China under Grant No. 61802034.

Author information

Corresponding author

Correspondence to Yanmei Hu.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Hu, Y., Ren, Y., Wang, Q. (2019). A Feature Selection Based on Network Structure for Credit Card Default Prediction. In: Sun, Y., Lu, T., Yu, Z., Fan, H., Gao, L. (eds) Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2019. Communications in Computer and Information Science, vol 1042. Springer, Singapore. https://doi.org/10.1007/978-981-15-1377-0_20

  • DOI: https://doi.org/10.1007/978-981-15-1377-0_20

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-1376-3

  • Online ISBN: 978-981-15-1377-0

  • eBook Packages: Computer Science (R0)
