
Rule Extraction from Support Vector Machines: An Overview of Issues and Application in Credit Scoring

  • Chapter in the book Rule Extraction from Support Vector Machines

Part of the book series: Studies in Computational Intelligence (SCI, volume 80)

Summary

Innovative storage technology and the rising popularity of the Internet have generated an ever-growing amount of data, in which much valuable knowledge lies hidden. The Support Vector Machine (SVM) is a state-of-the-art classification technique that generally provides accurate models, as it is able to capture non-linearities in the data. However, this strength is also its main weakness: the resulting non-linear models are typically regarded as incomprehensible black boxes. By extracting rules that mimic the black box as closely as possible, we can provide some insight into the logic of the SVM model. This explanation capability is of crucial importance in any domain where the model needs to be validated before being implemented, such as credit scoring (loan default prediction) and medical diagnosis. If the SVM is regarded as the current state of the art, SVM rule extraction may well become the state of the art of the near future. This chapter provides an overview of recently proposed SVM rule extraction techniques, complemented with pedagogical Artificial Neural Network (ANN) rule extraction techniques that are also suitable for SVMs. Related issues include the different rule outputs and their corresponding expressiveness; the focus on high-dimensional data, on which SVM models typically perform well; and the requirement that the extracted rules be consistent with existing domain knowledge. These issues are explained and further illustrated with a credit scoring case, in which we extract a Trepan tree and a RIPPER rule set from the trained SVM model. The benefit of decision tables in a rule extraction context is also demonstrated. Finally, some interesting alternatives to SVM rule extraction are listed.
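The pedagogical approach mentioned above treats the trained SVM as a black-box oracle: the training data are relabeled with the SVM's own predictions, and an interpretable model is then fit to those labels so that its rules approximate the SVM's logic. A minimal sketch of this idea, using scikit-learn with a synthetic stand-in for a credit-scoring dataset and a plain decision tree as the surrogate (the chapter itself uses Trepan and RIPPER; all hyperparameters here are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in for a credit-scoring dataset (5 applicant features)
X, y = make_classification(n_samples=500, n_features=5, random_state=0)

# Train the opaque black-box model
svm = SVC(kernel="rbf", gamma="scale").fit(X, y)

# Pedagogical step: relabel the data with the SVM's own predictions
oracle_labels = svm.predict(X)

# Fit a small surrogate tree to the SVM's outputs; its rules mimic the SVM
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, oracle_labels)

# Fidelity: how closely the extracted rules reproduce the black box
fidelity = (tree.predict(X) == oracle_labels).mean()
print(f"fidelity to SVM: {fidelity:.2f}")
print(export_text(tree, feature_names=[f"x{i}" for i in range(5)]))
```

Fidelity to the black box, rather than accuracy on the true labels, is the natural quality measure here, since the goal is to explain what the SVM has learned.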




Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Martens, D., Huysmans, J., Setiono, R., Vanthienen, J., Baesens, B. (2008). Rule Extraction from Support Vector Machines: An Overview of Issues and Application in Credit Scoring. In: Diederich, J. (eds) Rule Extraction from Support Vector Machines. Studies in Computational Intelligence, vol 80. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75390-2_2


  • DOI: https://doi.org/10.1007/978-3-540-75390-2_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-75389-6

  • Online ISBN: 978-3-540-75390-2

  • eBook Packages: Engineering
