Abstract
We study how a symbolic representation of support vector machines (SVMs), specified by means of abstract interpretation, can be exploited for: (1) enhancing the interpretability of SVMs through a novel feature importance measure, called abstract feature importance (AFI), that does not depend on any given dataset or on the accuracy of the SVM and is very fast to compute; and (2) certifying individual fairness of SVMs and producing concrete counterexamples when this verification fails. We implemented our methodology and empirically showed its effectiveness on SVMs based on linear and nonlinear (polynomial and radial basis function) kernels. Our experimental results show that, independently of the accuracy of the SVM, our AFI measure correlates much more strongly with the stability of the SVM to feature perturbations than the major feature importance measures available in machine learning software, such as permutation feature importance, and therefore provides better insight into the trustworthiness of SVMs.
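To make the notions of stability to feature perturbations and of concrete counterexamples more tangible, the following is a minimal Python sketch restricted to a linear decision function. It is not the AFI measure or the abstract domain used in this work: it only applies plain interval reasoning to one perturbed feature at a time, and every name in it (`decision`, `is_stable`, `counterexample`), as well as the weights and the perturbation radius, is a hypothetical choice made for illustration.

```python
from typing import List, Optional, Tuple


def decision(w: List[float], b: float, x: List[float]) -> float:
    """Decision value w.x + b of a linear classifier (hypothetical weights)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(score: float) -> int:
    """Map a decision value to a class label; a score of 0 counts as +1."""
    return 1 if score >= 0 else -1


def score_interval(w: List[float], b: float, x: List[float],
                   i: int, eps: float) -> Tuple[float, float]:
    """Range of the decision value when only feature i varies in
    [x[i] - eps, x[i] + eps]; for a linear function this range is exact
    (up to floating-point rounding)."""
    center = decision(w, b, x)
    radius = abs(w[i]) * eps
    return center - radius, center + radius


def is_stable(w: List[float], b: float, x: List[float],
              i: int, eps: float) -> bool:
    """True if no perturbation of feature i within eps changes the class."""
    lo, hi = score_interval(w, b, x, i, eps)
    return lo >= 0 or hi < 0


def counterexample(w: List[float], b: float, x: List[float],
                   i: int, eps: float) -> Optional[List[float]]:
    """Return a perturbed input classified differently from x, or None."""
    if is_stable(w, b, x, i, eps):
        return None
    # Unstable implies w[i] != 0. Move feature i by eps in the direction
    # that pushes the score toward the opposite class.
    toward_negative = classify(decision(w, b, x)) == 1
    step = -eps if (w[i] > 0) == toward_negative else eps
    y = list(x)
    y[i] = x[i] + step
    return y


# Illustrative values only.
w, b = [2.0, -0.5, 0.1], -1.0
x = [1.0, 1.0, 1.0]
for i in range(len(x)):
    print(i, is_stable(w, b, x, i, 0.5), counterexample(w, b, x, i, 0.5))
```

In this toy setting, a feature whose perturbation can flip the class is one the classifier is sensitive to at that input; treating a sensitive attribute as the perturbed feature turns the same check into a simple individual-fairness test that either succeeds or returns a concrete counterexample. Nonlinear kernels need a sound over-approximation of the decision function rather than this exact interval, which is where an abstract interpretation of the SVM comes in.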
Notes
1. For multiple categorical features, we keep track of the relation between all the categorical features and their corresponding tiers through a global lookup table.
Acknowledgements
Francesco Ranzato and Marco Zanella were partially funded by the Italian MIUR, under the PRIN 2017 project no. 201784YSZ5. Francesco Ranzato was also partially funded by the Italian MUR, under the PRIN 2022 PNRR project no. P2022HXNSC; by Meta (formerly Facebook) Research, under a “Probability and Programming Research Award” and a WhatsApp Research Award on “Privacy-aware Program Analysis”; and by an Amazon Research Award for “AWS Automated Reasoning”. Caterina Urban was partially funded by the French PEPR Intelligence Artificielle SAIF project (ANR-23-PEIA-0006).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Pal, A., Ranzato, F., Urban, C., Zanella, M. (2024). Abstract Interpretation-Based Feature Importance for Support Vector Machines. In: Dimitrova, R., Lahav, O., Wolff, S. (eds) Verification, Model Checking, and Abstract Interpretation. VMCAI 2024. Lecture Notes in Computer Science, vol 14499. Springer, Cham. https://doi.org/10.1007/978-3-031-50524-9_2
DOI: https://doi.org/10.1007/978-3-031-50524-9_2
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50523-2
Online ISBN: 978-3-031-50524-9
eBook Packages: Computer Science, Computer Science (R0)