Abstract
This paper studies the following problem: given an SVM (kernel)-based binary classifier C as a black-box oracle, how much can we learn about its internal workings by querying it? Specifically, we assume that the feature space ℝ^d is known and that the kernel machine has m support vectors with d > m (or d ≫ m); in addition, the classifier C is laconic in the sense that, for a feature vector, it provides only a predicted label (±1) without divulging other information such as margin or confidence level. We formulate the problem of understanding the inner workings of C as that of characterizing its decision boundary, and we introduce the simple notion of bracketing to sample points on the decision boundary to within a prescribed accuracy. For the five most common types of kernel function (linear, quadratic, and cubic polynomial kernels, the hyperbolic tangent kernel, and the Gaussian kernel), we show that O(dm) queries suffice to determine the type of kernel function and the (kernel) subspace spanned by the support vectors. In particular, for polynomial kernels, an additional O(m^3) queries suffice to reconstruct the entire decision boundary, yielding a set of quasi-support vectors with which the deconstructed classifier can be evaluated efficiently. We speculate briefly on the future application potential of deconstructing kernel machines, and we present experimental results validating the proposed method.
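The bracketing step described above can be sketched as a bisection along a line segment whose endpoints the black-box classifier labels oppositely; halving the bracket until its length falls below a tolerance localizes a point on the decision boundary. This is a minimal illustrative sketch under that reading of the abstract, not the authors' implementation; the function name `bracket_boundary_point` and its signature are hypothetical.

```python
import numpy as np

def bracket_boundary_point(classify, x_pos, x_neg, tol=1e-6):
    """Bisect the segment [x_neg, x_pos] until a decision-boundary
    point is localized to within `tol` (Euclidean distance).

    classify: black-box oracle mapping a feature vector to +1 or -1.
    x_pos, x_neg: endpoints with classify(x_pos) == +1, classify(x_neg) == -1.
    """
    assert classify(x_pos) == 1 and classify(x_neg) == -1
    while np.linalg.norm(x_pos - x_neg) > tol:
        mid = 0.5 * (x_pos + x_neg)
        # Keep the half-segment whose endpoints still receive opposite labels.
        if classify(mid) == 1:
            x_pos = mid
        else:
            x_neg = mid
    return 0.5 * (x_pos + x_neg)
```

Each bisection halves the bracket, so reaching accuracy ε from an initial segment of length L costs about log2(L/ε) label queries per boundary sample, which is what makes the O(dm) and O(m^3) query counts above attainable with only ±1 responses.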
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
Cite this paper
Ali, M., Rushdi, M., Ho, J. (2014). Deconstructing Kernel Machines. In: Calders, T., Esposito, F., Hüllermeier, E., Meo, R. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2014. Lecture Notes in Computer Science, vol 8724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44848-9_3
DOI: https://doi.org/10.1007/978-3-662-44848-9_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44847-2
Online ISBN: 978-3-662-44848-9