Abstract
Finding the set of nearest neighbors for a query point of interest appears in a variety of algorithms for machine learning and pattern recognition. Examples include k nearest neighbor classification, information retrieval, case-based reasoning, manifold learning, and nonlinear dimensionality reduction. In this work, we propose a new approach that determines a distance metric from the data for finding such neighboring points. For a query point of interest, our approach learns a generalized quadratic distance (GQD) metric based on the statistical properties of a “small” neighborhood around that point. The locally learned GQD metric captures information such as the density, curvature, and intrinsic dimensionality of the points falling in this particular neighborhood. Unfortunately, learning the GQD parameters under such a local learning mechanism is a challenging problem with high computational overhead. To address these challenges, we estimate the GQD parameters using the minimum volume covering ellipsoid (MVCE) of a set of points. The advantage of the MVCE is twofold. First, the MVCE together with the local learning approach approximates the functionality of a well-known robust estimator for covariance matrices. Second, computing the MVCE is a convex optimization problem which, in addition to having a unique global solution, can be solved efficiently using a first-order optimization algorithm. We validate our metric learning approach on a large variety of datasets and show that the proposed metric yields promising results when compared with five algorithms from the literature for supervised metric learning.
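The MVCE computation mentioned in the abstract admits a simple first-order algorithm of the Khachiyan/Frank–Wolfe type. The sketch below is illustrative only and does not reproduce the authors' implementation; the function names, tolerance, and iteration cap are our own assumptions. It computes an approximate MVCE of a point set and the quadratic distance induced by the resulting shape matrix.

```python
import numpy as np

def mvce(P, tol=1e-7, max_iter=1000):
    """Approximate minimum volume covering ellipsoid
    (x - c)^T A (x - c) <= 1 of the rows of P, via a
    Khachiyan-style first-order (Frank-Wolfe) iteration.
    NOTE: illustrative sketch, not the authors' implementation."""
    n, d = P.shape
    Q = np.column_stack([P, np.ones(n)]).T        # (d+1) x n lifted points
    u = np.full(n, 1.0 / n)                       # weights on the points
    for _ in range(max_iter):
        X = Q @ np.diag(u) @ Q.T                  # (d+1) x (d+1) moment matrix
        M = np.einsum('ij,ji->i', Q.T, np.linalg.solve(X, Q))
        j = int(np.argmax(M))                     # most violating point
        step = (M[j] - d - 1.0) / ((d + 1.0) * (M[j] - 1.0))
        if step < tol:
            break
        u = (1.0 - step) * u                      # shift weight toward point j
        u[j] += step
    c = P.T @ u                                   # ellipsoid center
    S = P.T @ np.diag(u) @ P - np.outer(c, c)     # weighted scatter matrix
    A = np.linalg.inv(S) / d                      # ellipsoid shape matrix
    return A, c

def gqd(x, y, A):
    """Generalized quadratic distance induced by a PSD matrix A."""
    diff = x - y
    return float(np.sqrt(diff @ A @ diff))
```

Because each iteration only moves weight toward the single point that most violates the current ellipsoid, the method has the cheap per-iteration cost typical of first-order schemes, which is what makes the local (per-query) learning described above computationally feasible.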
Notes
The 60 cases are: 20 (datasets) \(\times \) 3 (k values).
Acknowledgements
We would like to thank our Coordinating Editor and our anonymous Reviewers for their rigorous comments that improved various sections of the manuscript. We also would like to thank Michael Smith, Prasun Lala, Catherine Laporte, and Mathew Toews for reading earlier versions of this manuscript. The authors also acknowledge the support of the Natural Sciences and Engineering Research Council of Canada, under Discovery Grant RGPIN-2016-04638.
Cite this article
Abou-Moustafa, K., Ferrie, F.P. Local generalized quadratic distance metrics: application to the k-nearest neighbors classifier. Adv Data Anal Classif 12, 341–363 (2018). https://doi.org/10.1007/s11634-017-0286-x
Keywords
- Query-based operations
- k Nearest neighbors
- Distance metric learning
- Minimum volume covering ellipsoid
- Minimum volume ellipsoid estimator