Similarity Aggregation a New Version of Rank Aggregation Applied to Credit Scoring Case
Credit scoring is one of the most challenging research topics in banking and has been the source of many innovative works. Choosing an appropriate set of features is among its most interesting and difficult tasks, and it has a key effect on the performance of credit scoring models. Given the large number of feature selection techniques, and especially of ranking techniques for feature selection, rank aggregation techniques have become indispensable tools for fusing individual ranked lists into a single consensus list with better performance. However, in some cases the obtained rankings may be noisy or incomplete, which leads to an unsatisfactory final rank. We investigate this issue by proposing a similarity-based algorithm that extends two standard rank aggregation methods, namely majority vote and mean aggregation, using the similarity between the features in the dataset. Evaluations on four credit datasets show that feature subsets selected by the similarity-based aggregation technique give results superior to those selected by individual filters and by the standard aggregation techniques.
Keywords: Feature selection, filter, mutual information