Abstract Interpretation-Based Feature Importance for Support Vector Machines

Pal, Abhinandan; Ranzato, Francesco; Urban, Caterina; Zanella, Marco

doi:10.1007/978-3-031-50524-9_2

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14499)

Included in the following conference series:

International Conference on Verification, Model Checking, and Abstract Interpretation

Abstract

We study how a symbolic representation for support vector machines (SVMs), specified by means of abstract interpretation, can be exploited for: (1) enhancing the interpretability of SVMs through a novel feature importance measure, called abstract feature importance (AFI), that does not depend in any way on a given dataset or on the accuracy of the SVM and is very fast to compute; and (2) certifying individual fairness of SVMs and producing concrete counterexamples when this verification fails. We implemented our methodology and empirically showed its effectiveness on SVMs based on linear and nonlinear (polynomial and radial basis function) kernels. Our experimental results show that, independently of the accuracy of the SVM, our AFI measure correlates much more strongly with the stability of the SVM to feature perturbations than major feature importance measures available in machine learning software, such as permutation feature importance, therefore providing better insight into the trustworthiness of SVMs.
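As a rough illustration of what "stability of the SVM to feature perturbations" can mean, the sketch below bounds the decision function of a linear SVM when a single feature is allowed to vary within an interval. It is not the AFI measure or the abstract domain used in the paper; the function names (decision_interval, is_stable), the example weights, and the perturbation radius eps are all hypothetical and chosen only for illustration. For a linear kernel the interval bound is exact, whereas nonlinear kernels would need a more refined abstraction.

```python
from typing import Sequence, Tuple


def decision_interval(
    w: Sequence[float], b: float, x: Sequence[float], i: int, eps: float
) -> Tuple[float, float]:
    """Bounds of w.x + b when feature i ranges over [x[i] - eps, x[i] + eps].

    Hypothetical helper: for a linear decision function, only the term
    w[i] * x[i] changes, so the score moves by at most |w[i]| * eps.
    """
    center = sum(wj * xj for wj, xj in zip(w, x)) + b
    radius = abs(w[i]) * eps
    return center - radius, center + radius


def is_stable(
    w: Sequence[float], b: float, x: Sequence[float], i: int, eps: float
) -> bool:
    """True if no perturbation of feature i within eps can reach the boundary.

    The predicted class is the sign of the score, so the prediction is
    certified stable when the whole interval lies strictly on one side of 0.
    """
    lo, hi = decision_interval(w, b, x, i, eps)
    return lo > 0 or hi < 0


if __name__ == "__main__":
    # Assumed toy model and input, not taken from the paper.
    w, b = [2.0, -0.5], -1.0
    x = [1.0, 1.0]
    for feature in range(len(w)):
        print(feature, decision_interval(w, b, x, feature, 0.5),
              is_stable(w, b, x, feature, 0.5))
```

In this toy setting, a feature with a larger absolute weight yields a wider score interval and is therefore more likely to break stability, which gives an intuition for why a dataset-independent, symbolic view of the classifier can say something about feature importance.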

Notes

1. For multiple categorical features, we keep track of the relation between all the categorical features and their corresponding tiers through a global lookup table.

Acknowledgements

Francesco Ranzato and Marco Zanella were partially funded by the Italian MIUR, under the PRIN 2017 project no. 201784YSZ5. Francesco Ranzato was partially funded by the Italian MUR, under the PRIN 2022 PNRR project no. P2022HXNSC; by Meta (formerly Facebook) Research, under a “Probability and Programming Research Award” and under a WhatsApp Research Award on “Privacy-aware Program Analysis”; and by an Amazon Research Award for “AWS Automated Reasoning”. Caterina Urban was partially funded by the French PEPR Intelligence Artificielle SAIF project (ANR-23-PEIA-0006).

Author information

Authors and Affiliations

School of Computer Science, University of Birmingham, Birmingham, UK
Abhinandan Pal
Dipartimento di Matematica, University of Padova, Padova, Italy
Francesco Ranzato & Marco Zanella
INRIA and Ecole Normale Supérieure, Université PSL, Paris, France
Caterina Urban

Authors

Abhinandan Pal
Francesco Ranzato
Caterina Urban
Marco Zanella

Corresponding author

Correspondence to Francesco Ranzato.

Editor information

Editors and Affiliations

CISPA Helmholtz Center for Information Security, Saarbrücken, Germany
Rayna Dimitrova
Tel Aviv University, Tel Aviv, Israel
Ori Lahav
New York University, New York, NY, USA
Sebastian Wolff

About this paper

Cite this paper

Pal, A., Ranzato, F., Urban, C., Zanella, M. (2024). Abstract Interpretation-Based Feature Importance for Support Vector Machines. In: Dimitrova, R., Lahav, O., Wolff, S. (eds) Verification, Model Checking, and Abstract Interpretation. VMCAI 2024. Lecture Notes in Computer Science, vol 14499. Springer, Cham. https://doi.org/10.1007/978-3-031-50524-9_2

DOI: https://doi.org/10.1007/978-3-031-50524-9_2
Published: 30 December 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50523-2
Online ISBN: 978-3-031-50524-9
eBook Packages: Computer Science, Computer Science (R0)
