Selection bias in credit scorecard evaluation
Selection bias is a perennial problem when constructing and evaluating scorecards. It is familiar in the context of reject inference, but it also arises in many other situations. In this paper, we examine how accepting or rejecting customers using one scorecard leads to biased comparisons of performance between that scorecard and others. This has important implications for organisations seeking to improve or replace scorecards.
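The mechanism can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the analysis carried out in the paper: two hypothetical scorecards, A and B, are equally noisy measurements of a latent customer quality, customers are accepted when score A exceeds its median, and both scorecards are then compared using the area under the ROC curve on the accepted customers only. All names, distributions and parameter values are illustrative choices.

```python
import random


def auc(scores, is_good):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) formula:
    the probability that a random good customer outscores a random bad one.
    Assumes continuous scores, so ties are not handled."""
    order = sorted(range(len(scores)), key=scores.__getitem__)
    n_good = sum(is_good)
    n_bad = len(is_good) - n_good
    rank_sum = sum(rank for rank, i in enumerate(order, start=1) if is_good[i])
    return (rank_sum - n_good * (n_good + 1) / 2) / (n_good * n_bad)


def simulate(n=40000, seed=0):
    rng = random.Random(seed)
    # Latent quality, unobserved by either scorecard (assumption).
    quality = [rng.gauss(0, 1) for _ in range(n)]
    # Two scorecards with the same noise level, so neither is truly better.
    score_a = [q + rng.gauss(0, 1) for q in quality]
    score_b = [q + rng.gauss(0, 1) for q in quality]
    # Good/bad outcome driven by quality plus independent noise.
    is_good = [q + rng.gauss(0, 0.5) > 0 for q in quality]

    # Accept roughly the top half of customers according to scorecard A only.
    cutoff = sorted(score_a)[n // 2]
    accepted = [i for i in range(n) if score_a[i] > cutoff]

    def subset(values):
        return [values[i] for i in accepted]

    return {
        "full": (auc(score_a, is_good), auc(score_b, is_good)),
        "accepted": (
            auc(subset(score_a), subset(is_good)),
            auc(subset(score_b), subset(is_good)),
        ),
    }


results = simulate()
for population, (auc_a, auc_b) in results.items():
    print(f"{population:>8}: AUC(A) = {auc_a:.3f}, AUC(B) = {auc_b:.3f}")
```

In this setup the two scorecards perform almost identically on the full population, yet scorecard A looks worse than scorecard B among the accepted customers, because acceptance truncated the range of A's scores but not B's. The size of the gap depends on the assumed noise levels and cutoff.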
Keywords: credit scoring; Kolmogorov–Smirnov statistic; area under the ROC curve; selection bias
We are grateful to the UK bank that provided the UPL data for the real example, and to the anonymous referees for deep and helpful comments.