Abstract
A kernel similarity function allows a Support Vector Machine (SVM) classifier to learn the maximum-margin hyperplane in a higher-dimensional space, where two classes are linearly separable, without explicitly mapping the data into that space. Most existing kernel functions (e.g., RBF) compute the similarity of two data instances from their spatial positions in the input space. These kernels are independent of the data distribution and sensitive to the data representation (i.e., the units/scales used to measure/express the data). Since the representation can be unknown in many real-world applications, a suitable kernel must be selected carefully for each problem. In this paper, we present a new kernel function based on probability data mass that is both data-dependent and scale-invariant. Our empirical results show that the proposed SVM kernel outperforms popular existing kernels.
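The scale sensitivity described in the abstract can be illustrated with a minimal sketch. It uses the standard Gaussian RBF kernel only; it is not the kernel proposed in this paper. The toy points, the feature meanings (height and weight), the `rescale` helper, and the `gamma` value are all assumptions chosen for illustration.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def rescale(point):
    """Hypothetical unit change: express the first feature in cm instead of m."""
    return (point[0] * 100.0, point[1])


# Hypothetical toy instances: (height in metres, weight in kilograms).
a = (1.70, 65.0)
b = (1.80, 66.0)
c = (1.71, 68.0)

gamma = 0.1  # assumed value, for illustration only

# Similarities with height measured in metres.
k_ab_m = rbf_kernel(a, b, gamma)
k_ac_m = rbf_kernel(a, c, gamma)

# The same instances with height measured in centimetres.
k_ab_cm = rbf_kernel(rescale(a), rescale(b), gamma)
k_ac_cm = rbf_kernel(rescale(a), rescale(c), gamma)

# Which of b and c is more similar to a depends on the unit of height.
print("metres:      b closer to a than c?", k_ab_m > k_ac_m)
print("centimetres: b closer to a than c?", k_ab_cm > k_ac_cm)
```

In this toy setting the ranking of `b` and `c` by similarity to `a` changes when only the unit of one feature changes, which is the kind of representation dependence a scale-invariant kernel is meant to avoid.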
V. V. Malgi and S. Aryal contributed equally to this work.
Acknowledgment
This material is based upon work supported by the U.S. Air Force Office of Scientific Research under award number FA2386-20-1-4005.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Malgi, V.V., Aryal, S., Rasool, Z., Tay, D. (2023). Data-dependent and Scale-Invariant Kernel for Support Vector Machine Classification. In: Kashima, H., Ide, T., Peng, WC. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2023. Lecture Notes in Computer Science, vol 13935. Springer, Cham. https://doi.org/10.1007/978-3-031-33374-3_14
DOI: https://doi.org/10.1007/978-3-031-33374-3_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-33373-6
Online ISBN: 978-3-031-33374-3
eBook Packages: Computer Science, Computer Science (R0)