
Rule Extraction from Support Vector Machines: An Overview of Issues and Application in Credit Scoring

  • Chapter in the book Rule Extraction from Support Vector Machines

Part of the book series: Studies in Computational Intelligence (SCI, volume 80)

Summary

Innovative storage technology and the rising popularity of the Internet have generated an ever-growing amount of data, in which much valuable knowledge lies hidden. The Support Vector Machine (SVM) is a state-of-the-art classification technique that generally provides accurate models, as it is able to capture non-linearities in the data. However, this strength is also its main weakness: the resulting non-linear models are typically regarded as incomprehensible black boxes. By extracting rules that mimic the black box as closely as possible, we can provide some insight into the logic of the SVM model. This explanation capability is of crucial importance in any domain where the model needs to be validated before being implemented, such as credit scoring (loan default prediction) and medical diagnosis. If the SVM is regarded as the current state of the art, SVM rule extraction may well become the state of the art of the near future. This chapter provides an overview of recently proposed SVM rule extraction techniques, complemented with pedagogical Artificial Neural Network (ANN) rule extraction techniques that are also suitable for SVMs. Related issues include the different rule outputs and their corresponding expressiveness; the focus on high-dimensional data, on which SVM models typically perform well; and the requirement that the extracted rules be consistent with existing domain knowledge. These issues are explained and further illustrated with a credit scoring case, in which we extract a Trepan tree and a RIPPER rule set from the trained SVM model. The benefit of decision tables in a rule extraction context is also demonstrated. Finally, some interesting alternatives to SVM rule extraction are listed.
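The pedagogical approach mentioned above treats the trained SVM as a black-box oracle: the training data are relabeled with the SVM's own predictions, and an interpretable model is then fit to those labels so that its rules approximate the SVM's logic. A minimal sketch of this idea, using scikit-learn with a synthetic stand-in for a credit-scoring dataset and a plain decision tree as the surrogate (the chapter itself uses Trepan and RIPPER; all hyperparameters here are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in for a credit-scoring dataset (5 applicant features)
X, y = make_classification(n_samples=500, n_features=5, random_state=0)

# Train the opaque black-box model
svm = SVC(kernel="rbf", gamma="scale").fit(X, y)

# Pedagogical step: relabel the data with the SVM's own predictions
oracle_labels = svm.predict(X)

# Fit a small surrogate tree to the SVM's outputs; its rules mimic the SVM
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, oracle_labels)

# Fidelity: how closely the extracted rules reproduce the black box
fidelity = (tree.predict(X) == oracle_labels).mean()
print(f"fidelity to SVM: {fidelity:.2f}")
print(export_text(tree, feature_names=[f"x{i}" for i in range(5)]))
```

Fidelity to the black box, rather than accuracy on the true labels, is the natural quality measure here, since the goal is to explain what the SVM has learned.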




Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Martens, D., Huysmans, J., Setiono, R., Vanthienen, J., Baesens, B. (2008). Rule Extraction from Support Vector Machines: An Overview of Issues and Application in Credit Scoring. In: Diederich, J. (eds) Rule Extraction from Support Vector Machines. Studies in Computational Intelligence, vol 80. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75390-2_2


  • DOI: https://doi.org/10.1007/978-3-540-75390-2_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-75389-6

  • Online ISBN: 978-3-540-75390-2

  • eBook Packages: Engineering
