Data Mining, pp. 53–74

Building Acceptable Classification Models

  • David Martens
  • Bart Baesens
Part of the Annals of Information Systems book series (AOIS, volume 8)


Classification (Carvalho et al., 2005) is an important data mining task in which the value of a discrete (dependent) variable is predicted from the values of a number of independent variables. A classification model should provide correct predictions on new, unseen data instances, and this accuracy measure is often the only performance requirement used. However, comprehensibility of the model is an equally important requirement in any domain where the model must be validated before it can be implemented. Whenever comprehensibility is needed, justifiability is required as well: the model should be in line with existing domain knowledge. Although academic research has acknowledged the importance of comprehensibility in recent years, justifiability is often neglected. By providing comprehensible and justifiable classification models, data mining becomes acceptable in domains where such models were previously deemed too theoretical and incomprehensible, and new application opportunities emerge. A classification model that is accurate, comprehensible, and intuitive is defined as acceptable for implementation.
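To make these acceptability criteria concrete, the sketch below trains a small decision tree on synthetic credit-style data, prints the resulting rules (comprehensibility), and probes whether the predicted default risk ever increases with income (justifiability with respect to an assumed piece of domain knowledge). This is a minimal illustration assuming NumPy and scikit-learn are available; the data, the feature names, and the monotonicity constraint are hypothetical and not taken from the chapter.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
n = 1000
income = rng.uniform(1, 10, n)   # hypothetical applicant income (in 10k units)
debt = rng.uniform(0, 5, n)      # hypothetical outstanding debt (in 10k units)
# Synthetic target: default is more likely with low income and high debt.
default = (debt - 0.5 * income + rng.normal(0, 0.5, n) > 0).astype(int)

X = np.column_stack([income, debt])
model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, default)

# Comprehensibility: a shallow tree can be read as a handful of if-then rules.
print(export_text(model, feature_names=["income", "debt"]))

# Justifiability: assumed domain knowledge says default risk should not rise
# with income when debt is held fixed; count violations on a probe grid.
violations = 0
for debt_level in np.linspace(0, 5, 11):
    incomes = np.linspace(1, 10, 50)
    grid = np.column_stack([incomes, np.full_like(incomes, debt_level)])
    p_default = model.predict_proba(grid)[:, 1]
    if np.any(np.diff(p_default) > 1e-9):  # risk increased as income rose
        violations += 1
print("debt levels with monotonicity violations:", violations)
```

Whether a particular constraint applies is for the domain expert to decide; the check above only illustrates how justifiability can be verified mechanically once such a constraint has been stated.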


Keywords: Support Vector Machine, Data Mining, Domain Expert, Univariate Constraint, Decision Table





Acknowledgments

We extend our gratitude to the guest editor and the anonymous reviewers, whose many constructive and detailed remarks contributed much to the quality of this chapter. Further, we would like to thank the Flemish Research Council (FWO, Grant G.0615.05) for financial support.


References

  1. D. W. Aha, D. F. Kibler, and M. K. Albert. Instance-based learning algorithms. Machine Learning, 6:37–66, 1991.
  2. E. Altendorf, E. Restificar, and T. G. Dietterich. Learning from sparse data by exploiting monotonicity constraints. In Proceedings of the 21st Conference on Uncertainty in Artificial Intelligence, Edinburgh, Scotland, 2005.
  3. I. Askira-Gelman. Knowledge discovery: Comprehensibility of the results. In HICSS '98: Proceedings of the Thirty-First Annual Hawaii International Conference on System Sciences, Volume 5, p. 247, Washington, DC, USA, 1998. IEEE Computer Society.
  4. B. Baesens. Developing intelligent systems for credit scoring using machine learning techniques. PhD thesis, K.U. Leuven, 2003.
  5. B. Baesens, R. Setiono, C. Mues, and J. Vanthienen. Using neural network rule extraction and decision tables for credit-risk evaluation. Management Science, 49(3):312–329, 2003.
  6. B. Baesens, T. Van Gestel, S. Viaene, M. Stepanova, J. Suykens, and J. Vanthienen. Benchmarking state-of-the-art classification algorithms for credit scoring. Journal of the Operational Research Society, 54(6):627–635, 2003.
  7. A. Ben-David. Monotonicity maintenance in information-theoretic machine learning algorithms. Machine Learning, 19(1):29–43, 1995.
  8. D. Billman and D. Davila. Consistency is the hobgoblin of human minds: People care but concept learning models do not. In Proceedings of the 17th Annual Conference of the Cognitive Science Society, pp. 188–193, 1995.
  9. C. M. Bishop. Neural Networks for Pattern Recognition. Oxford University Press, Oxford, UK, 1996.
  10. L. Breiman, J. Friedman, R. Olshen, and C. Stone. Classification and Regression Trees. Chapman & Hall, New York, 1984.
  11. D. R. Carvalho, A. A. Freitas, and N. F. F. Ebecken. Evaluating the correlation between objective rule interestingness measures and real human interest. In Alípio Jorge, Luís Torgo, Pavel Brazdil, Rui Camacho, and João Gama, editors, PKDD, volume 3721 of Lecture Notes in Computer Science, pp. 453–461. Springer, 2005.
  12. P. Clark and T. Niblett. The CN2 induction algorithm. Machine Learning, 3(4):261–283, 1989.
  13. W. W. Cohen. Fast effective rule induction. In Armand Prieditis and Stuart Russell, editors, Proceedings of the 12th International Conference on Machine Learning, pp. 115–123, Tahoe City, CA, 1995. Morgan Kaufmann.
  14. N. Cristianini and J. Shawe-Taylor. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, New York, 2000.
  15. B. Cumps, D. Martens, M. De Backer, S. Viaene, G. Dedene, R. Haesen, M. Snoeck, and B. Baesens. Inferring rules for business/ICT alignment using ants. Information and Management, 46(2):116–124, 2009.
  16. H. Daniels and M. Velikova. Derivation of monotone decision models from noisy data. IEEE Transactions on Systems, Man and Cybernetics, Part C: Applications and Reviews, 36(5):705–710, 2006.
  17. P. Domingos. The role of Occam's razor in knowledge discovery. Data Mining and Knowledge Discovery, 3(4):409–425, 1999.
  18. R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. John Wiley and Sons, New York, second edition, 2001.
  19. Federal Trade Commission for the Consumer. Facts for consumers: Equal credit opportunity. Technical report, FTC, March 1998.
  20. A. J. Feelders. Prior knowledge in economic applications of data mining. In Proceedings of the Fourth European Conference on Principles and Practice of Knowledge Discovery in Databases, volume 1910 of Lecture Notes in Computer Science, pp. 395–400. Springer, 2000.
  21. A. J. Feelders and M. Pardoel. Pruning for monotone classification trees. In Advances in Intelligent Data Analysis V, volume 2810, pp. 1–12. Springer, 2003.
  22. D. Hand. Pattern detection and discovery. In D. Hand, N. Adams, and R. Bolton, editors, Pattern Detection and Discovery, volume 2447 of Lecture Notes in Computer Science, pp. 1–12. Springer, 2002.
  23. D. Hand. Protection or privacy? Data mining and personal data. In Advances in Knowledge Discovery and Data Mining, 10th Pacific-Asia Conference, PAKDD 2006, Singapore, April 9–12, volume 3918 of Lecture Notes in Computer Science, pp. 1–10. Springer, 2006.
  24. T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, New York, 2001.
  25. J. Huysmans, B. Baesens, D. Martens, K. Denys, and J. Vanthienen. New trends in data mining. In Tijdschrift voor Economie en Management, volume L, pp. 697–711, 2005.
  26. J. Huysmans, C. Mues, B. Baesens, and J. Vanthienen. An empirical evaluation of the comprehensibility of decision table, tree and rule based predictive models. 2007.
  27. D. G. Kleinbaum, L. L. Kupper, K. E. Muller, and A. Nizam. Applied Regression Analysis and Multivariable Methods. Duxbury Press, North Scituate, MA, 1997.
  28. Y. Kodratoff. The comprehensibility manifesto. KDD Nuggets (94:9), 1994.
  29. D. Martens, B. Baesens, and T. Van Gestel. Decompositional rule extraction from support vector machines by active learning. IEEE Transactions on Knowledge and Data Engineering, 21(2):178–191, 2009.
  30. D. Martens, B. Baesens, T. Van Gestel, and J. Vanthienen. Comprehensible credit scoring models using rule extraction from support vector machines. European Journal of Operational Research, 183(3):1466–1476, 2007.
  31. D. Martens, L. Bruynseels, B. Baesens, M. Willekens, and J. Vanthienen. Predicting going concern opinion with data mining. Decision Support Systems, 45(4):765–777, 2008.
  32. D. Martens, M. De Backer, R. Haesen, B. Baesens, C. Mues, and J. Vanthienen. Ant-based approach to the knowledge fusion problem. In Proceedings of the Fifth International Workshop on Ant Colony Optimization and Swarm Intelligence, Lecture Notes in Computer Science, pp. 85–96. Springer, 2006.
  33. D. Martens, M. De Backer, R. Haesen, M. Snoeck, J. Vanthienen, and B. Baesens. Classification with ant colony optimization. IEEE Transactions on Evolutionary Computation, 11(5):651–665, 2007.
  34. R. S. Michalski. A theory and methodology of inductive learning. Artificial Intelligence, 20(2):111–161, 1983.
  35. O. O. Maimon and L. Rokach. Decomposition Methodology for Knowledge Discovery and Data Mining: Theory and Applications (Machine Perception and Artificial Intelligence). World Scientific Publishing Company, July 2005.
  36. M. Ohsaki, S. Kitaguchi, K. Okamoto, H. Yokoi, and T. Yamaguchi. Evaluation of rule interestingness measures with a clinical dataset on hepatitis. In PKDD '04: Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases, pp. 362–373, New York, NY, USA, 2004. Springer-Verlag New York, Inc.
  37. L. Passmore, J. Goodside, L. Hamel, L. Gonzales, T. Silberstein, and J. Trimarchi. Assessing decision tree models for clinical in-vitro fertilization data. Technical Report TR03-296, Dept. of Computer Science and Statistics, University of Rhode Island, 2003.
  38. M. Pazzani. Influence of prior knowledge on concept acquisition: Experimental and computational results. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17(3):416–432, 1991.
  39. M. Pazzani and S. Bay. The independent sign bias: Gaining insight from multiple linear regression. In Proceedings of the Twenty-First Annual Conference of the Cognitive Science Society, pp. 525–530, 1999.
  40. M. Pazzani, S. Mani, and W. Shankle. Acceptance by medical experts of rules generated by machine learning. Methods of Information in Medicine, 40(5):380–385, 2001.
  41. M. Pazzani. Learning with globally predictive tests. In Discovery Science, pp. 220–231, 1998.
  42. J. R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers Inc., San Francisco, CA, 1993.
  43. J. W. Seifert. Data mining and homeland security: An overview. CRS Report for Congress, 2006.
  44. R. Setiono, B. Baesens, and C. Mues. Risk management and regulatory compliance: A data mining framework based on neural network rule extraction. In Proceedings of the International Conference on Information Systems (ICIS 2006), 2006.
  45. R. Setiono, B. Baesens, and C. Mues. Recursive neural network rule extraction for data with mixed attributes. IEEE Transactions on Neural Networks, forthcoming.
  46. A. Silberschatz and A. Tuzhilin. On subjective measures of interestingness in knowledge discovery. In KDD, pp. 275–281, 1995.
  47. J. Sill. Monotonic networks. In Advances in Neural Information Processing Systems, volume 10. The MIT Press, Cambridge, MA, 1998.
  48. E. Sommer. An approach to quantifying the quality of induced theories. In Claire Nedellec, editor, Proceedings of the IJCAI Workshop on Machine Learning and Comprehensibility, 1995.
  49. J. A. K. Suykens, T. Van Gestel, J. De Brabanter, B. De Moor, and J. Vandewalle. Least Squares Support Vector Machines. World Scientific, Singapore, 2002.
  50. P.-N. Tan, M. Steinbach, and V. Kumar. Introduction to Data Mining. Pearson Education, Boston, MA, 2006.
  51. L. Thomas, D. Edelman, and J. Crook, editors. Credit Scoring and its Applications. SIAM, Philadelphia, PA, 2002.
  52. T. Van Gestel, B. Baesens, and L. Thomas. Introduction to Modern Credit Scoring. Oxford University Press, Oxford, forthcoming.
  53. T. Van Gestel, B. Baesens, P. Van Dijcke, J. Garcia, J. A. K. Suykens, and J. Vanthienen. A process model to develop an internal rating system: Sovereign credit ratings. Decision Support Systems, 42(2):1131–1151, 2006.
  54. T. Van Gestel, B. Baesens, P. Van Dijcke, J. A. K. Suykens, J. Garcia, and T. Alderweireld. Linear and nonlinear credit scoring by combining logistic regression and support vector machines. Journal of Credit Risk, 1(4), 2005.
  55. T. Van Gestel, D. Martens, B. Baesens, D. Feremans, J. Huysmans, and J. Vanthienen. Forecasting and analyzing insurance companies' ratings. International Journal of Forecasting, 23(3):513–529, 2007.
  56. T. Van Gestel, J. A. K. Suykens, B. Baesens, S. Viaene, J. Vanthienen, G. Dedene, B. De Moor, and J. Vandewalle. Benchmarking least squares support vector machine classifiers. Machine Learning, 54(1):5–32, 2004.
  57. O. Vandecruys, D. Martens, B. Baesens, C. Mues, M. De Backer, and R. Haesen. Mining software repositories for comprehensible software fault prediction models. Journal of Systems and Software, 81(5):823–839, 2008.
  58. J. Vanthienen, C. Mues, and A. Aerts. An illustration of verification and validation in the modelling phase of KBS development. Data and Knowledge Engineering, 27(3):337–352, 1998.
  59. V. N. Vapnik. The Nature of Statistical Learning Theory. Springer-Verlag New York, Inc., New York, 1995.
  60. M. Velikova and H. Daniels. Decision trees for monotone price models. Computational Management Science, 1(3–4):231–244, 2004.
  61. M. Velikova, H. Daniels, and A. Feelders. Solving partially monotone problems with neural networks. In Proceedings of the International Conference on Neural Networks, Vienna, Austria, March 2006.
  62. M. P. Wellman. Fundamental concepts of qualitative probabilistic networks. Artificial Intelligence, 44(3):257–303, 1990.
  63. I. H. Witten and E. Frank. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann Publishers Inc., San Francisco, CA, 2000.

Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  1. Department of Decision Sciences and Information Management, K.U.Leuven, Leuven, Belgium
  2. Department of Business Administration and Public Management, Hogeschool Gent, Ghent, Belgium
  3. School of Management, University of Southampton, Highfield, Southampton, UK
