PAKDD 2017: Advances in Knowledge Discovery and Data Mining, pp. 210–222
Cost Matters: A New Example-Dependent Cost-Sensitive Logistic Regression Model
Abstract
Connectivity and automation are increasingly part of today’s cars. To provide automation, many gauges are integrated into cars to collect physical readings. In the automobile industry, the multiple datasets gathered this way can be used to predict whether a car repair will be needed soon. Such predictions help drivers and retailers take action early. However, prediction in real use cases poses new challenges: misclassified instances do not all carry the same cost. For example, the cost of failing to predict a necessary tire change is usually higher than the cost of predicting a tire change even though the car could still drive thousands of kilometers. To tackle this problem, we introduce a new example-dependent cost-sensitive prediction model that extends the well-established idea of logistic regression. Our model allows a different cost for each misclassified instance and produces predictions that lead to lower overall cost. Our method consistently outperforms the state of the art in example-dependent cost-sensitive logistic regression on various datasets. Applying our method to vehicle data from a large European car manufacturer, we show cost savings of about 10%.
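To make the notion of example-dependent costs concrete, the following minimal sketch compares a plain 0.5 probability threshold with a standard expected-cost decision rule. It is not the model proposed in the paper, whose loss function is not given in this abstract. The function names, the probabilities, and the cost figures are hypothetical, and correct predictions are assumed to cost nothing.

```python
def total_cost(y_true, y_pred, cost_fp, cost_fn):
    """Sum the example-dependent costs of all misclassified instances.

    Assumption: correct predictions cost 0; instance i costs cost_fp[i]
    if it is a false positive and cost_fn[i] if it is a false negative.
    """
    total = 0.0
    for yt, yp, c_fp, c_fn in zip(y_true, y_pred, cost_fp, cost_fn):
        if yp == 1 and yt == 0:
            total += c_fp  # predicted a repair that was not needed
        elif yp == 0 and yt == 1:
            total += c_fn  # missed a repair that was needed
    return total


def min_expected_cost_decision(probs, cost_fp, cost_fn):
    """Predict 1 when the expected cost of predicting 1, (1 - p) * cost_fp,
    is lower than the expected cost of predicting 0, p * cost_fn."""
    return [
        1 if (1.0 - p) * c_fp < p * c_fn else 0
        for p, c_fp, c_fn in zip(probs, cost_fp, cost_fn)
    ]


# Hypothetical data: 1 = repair needed, 0 = no repair needed.
y_true = [1, 0, 1, 0]
probs = [0.3, 0.4, 0.8, 0.1]             # predicted probability of class 1
cost_fp = [50.0, 50.0, 50.0, 50.0]       # cost of an unnecessary repair
cost_fn = [400.0, 100.0, 300.0, 200.0]   # cost of a missed repair, per instance

y_threshold = [1 if p >= 0.5 else 0 for p in probs]
y_cost_aware = min_expected_cost_decision(probs, cost_fp, cost_fn)

print(total_cost(y_true, y_threshold, cost_fp, cost_fn))   # 400.0
print(total_cost(y_true, y_cost_aware, cost_fp, cost_fn))  # 50.0
```

In this toy example the fixed threshold misses the first instance, whose false-negative cost is high, while the cost-aware rule accepts one cheap false positive instead and ends up with a lower total cost.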
Keywords
Logistic Regression, Loss Function, Average Loss, Misclassification Cost, Tire Wear