Abstract
Performance tuning of machine learning models is very important, especially in the banking domain. Keeping pace with the times, banks are moving away from conventional methods of targeting customers for products such as credit cards and loans. The transactional data and customer information collected over the years offer huge scope for applying data mining techniques to extract useful information for maximizing return on investment, optimizing cost, and detecting fraud. For a successful deployment of a machine learning model, multiple out-of-time validations are performed and stability is strictly evaluated. Here, we propose a cardinal method for intuitive, use-case-related feature engineering, hyperparameter tuning, best-model selection, and model diagnosis for further improvement. Model stability is a major factor, as we expect the deployed model to work well for the next 6–8 months, after which it is due for re-tuning based on the data distribution and model performance. Statistical and data-driven methods are used to develop sophisticated features, achieving minimal accuracy variation across time periods. We discuss the implementation of our methods for two use cases: customer attrition from the bank in the next 6 months, and detection of in-bound calls from customers to the call centre to enquire about balance, transaction details, etc. We achieved recall scores of 47% and 55% in the top 2 deciles, respectively, for these use cases.
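The recall-in-top-deciles metric reported above can be illustrated with a minimal sketch. It assumes the usual definition: customers are ranked by model score, and the metric is the share of all positive cases (e.g. attriters) that fall within the top 20% of the ranked base. The function name `recall_in_top_deciles` and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np

def recall_in_top_deciles(y_true, y_score, n_deciles=2):
    """Share of all positives that fall in the top `n_deciles` score deciles.

    Illustrative helper (an assumption, not the authors' implementation).
    """
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    # Rank customers from highest to lowest score (stable sort for ties).
    order = np.argsort(-y_score, kind="stable")
    # Number of customers in the top n_deciles (each decile is 10% of the base).
    cutoff = int(np.ceil(len(y_true) * n_deciles / 10))
    total_pos = y_true.sum()
    if total_pos == 0:
        raise ValueError("y_true contains no positive cases")
    # Positives captured in the top slice, divided by all positives.
    return y_true[order][:cutoff].sum() / total_pos
```

Computing this value separately on each out-of-time validation window, and comparing the results across windows, is one simple way to inspect the kind of stability the abstract refers to.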
CIMB Bank.
Acknowledgment
The complete list of the cross-functional team members who worked on these projects is: Ang E Mei, Megan Azreen Ehsan, NG Wai Keat, Abhishek Prakash, Anoop Sharma, Ashish Chauhan, Gajanan Thenge, Ganapathy K, Hylish James, Rajeev Reddy, Saikat Kumar, Shaik Imran, Shilpana Sathyanarayana, Suraj Shukla, Somnath Ojha, Ujjwal Gupta, Uttam Kumar Kushwaha, Varsha Vishwakarma, †Decision Management, †Consumer Banking.
†Organization & Team.
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Teja, S. et al. (2021). Intuitive Feature Engineering and Machine Learning Performance Improvement in the Banking Domain. In: Garg, D., Wong, K., Sarangapani, J., Gupta, S.K. (eds) Advanced Computing. IACC 2020. Communications in Computer and Information Science, vol 1367. Springer, Singapore. https://doi.org/10.1007/978-981-16-0401-0_28
DOI: https://doi.org/10.1007/978-981-16-0401-0_28
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-0400-3
Online ISBN: 978-981-16-0401-0
eBook Packages: Computer Science, Computer Science (R0)