On Metric Correction and Conditionality of Raw Featureless Data in Machine Learning
Raw experimental data in machine learning increasingly appear as direct pairwise comparisons between objects (featureless data). In practice, many different measures of difference or similarity between a pair of objects are used in data mining, image analysis, bioinformatics, and related fields. Nevertheless, such comparisons are often not proper distances or correlations (scalar products), that is, mathematically correct functions defined on a finite set of elements. This problem is referred to as metric violations in ill-posed matrices. It is therefore necessary to recover the violated metric and to provide optimal conditionality of the corresponding matrices of pairwise distances and similarities. This gives a correct basis for applying modern machine learning algorithms.
Keywords: metrics, similarity, dissimilarity, distance, scalar product, condition number, determinant, principal minor, eigenvalue
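As an illustration of what a metric violation looks like in practice, the following minimal sketch checks a small matrix of pairwise dissimilarities in two ways: directly, through the triangle inequality, and indirectly, by converting the distances to scalar products relative to a chosen origin element and inspecting the leading principal minors of the resulting matrix. It is not the authors' recovery procedure. The function names, the tolerance, the choice of origin, and the toy matrices are all assumptions made for this example.

```python
from itertools import permutations


def triangle_violations(d, tol=1e-12):
    """Return triples (i, j, k), i < j, where d[i][j] > d[i][k] + d[k][j]."""
    n = len(d)
    return [(i, j, k) for i, j, k in permutations(range(n), 3)
            if i < j and d[i][j] > d[i][k] + d[k][j] + tol]


def scalar_products(d, origin=0):
    """Convert distances to scalar products with one element taken as origin:
    s_ij = (d_io^2 + d_jo^2 - d_ij^2) / 2 for all i, j other than the origin."""
    idx = [i for i in range(len(d)) if i != origin]
    return [[(d[i][origin] ** 2 + d[j][origin] ** 2 - d[i][j] ** 2) / 2
             for j in idx] for i in idx]


def determinant(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in a]
    n = len(m)
    det = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[p][c]) < 1e-15:
            return 0.0
        if p != c:
            m[c], m[p] = m[p], m[c]
            det = -det  # a row swap flips the sign
        det *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return det


def leading_minors(s):
    """Leading principal minors of a square matrix, from order 1 to n."""
    return [determinant([row[:k] for row in s[:k]])
            for k in range(1, len(s) + 1)]


# Toy data (hypothetical): three points on a line, then the same matrix
# with one comparison inflated so that it is no longer a distance.
good = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
bad = [[0, 1, 5], [1, 0, 1], [5, 1, 0]]

for name, d in (("good", good), ("bad", bad)):
    print(name, triangle_violations(d), leading_minors(scalar_products(d)))
```

A negative leading principal minor is enough to show that the matrix of scalar products is not positive semidefinite, so the comparisons cannot be realized as Euclidean distances. The converse needs more care: nonnegative leading minors alone do not guarantee positive semidefiniteness, which is why a full check would examine all principal minors or the eigenvalues.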