
Local generalized quadratic distance metrics: application to the k-nearest neighbors classifier

  • Regular Article
  • Published in Advances in Data Analysis and Classification

Abstract

Finding the set of nearest neighbors for a query point of interest appears in a variety of algorithms for machine learning and pattern recognition. Examples include k-nearest neighbor classification, information retrieval, case-based reasoning, manifold learning, and nonlinear dimensionality reduction. In this work, we propose a new approach for determining a distance metric from the data for finding such neighboring points. For a query point of interest, our approach learns a generalized quadratic distance (GQD) metric based on the statistical properties of a “small” neighborhood around the point of interest. The locally learned GQD metric captures information such as the density, curvature, and intrinsic dimensionality of the points falling in this particular neighborhood. Unfortunately, learning the GQD parameters under such a local learning mechanism is a challenging problem with a high computational overhead. To address these challenges, we estimate the GQD parameters using the minimum volume covering ellipsoid (MVCE) for a set of points. The advantage of the MVCE is twofold. First, the MVCE together with the local learning approach approximates the functionality of a well-known robust estimator for covariance matrices. Second, computing the MVCE is a convex optimization problem which, in addition to having a unique global solution, can be efficiently solved using a first-order optimization algorithm. We validate our metric learning approach on a large variety of datasets and show that the proposed metric yields promising results when compared with five algorithms from the literature for supervised metric learning.
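The abstract's key computational step — fitting a minimum volume covering ellipsoid to a local neighborhood and using its shape matrix as a quadratic (Mahalanobis-type) metric — can be sketched generically. The following is not the authors' implementation; it is a minimal sketch of Khachiyan's classical first-order algorithm for the MVCE, with illustrative function names (`mvce`, `local_gqd`) and a simple update-norm stopping rule chosen here as assumptions:

```python
import numpy as np

def mvce(points, tol=1e-4):
    """Minimum-volume covering ellipsoid via Khachiyan's first-order
    algorithm. Returns (A, c) such that every row x of `points` satisfies
    (x - c)^T A (x - c) <= 1, up to the tolerance."""
    n, d = points.shape
    Q = np.vstack([points.T, np.ones(n)])   # lift to (d+1) x n homogeneous coords
    u = np.full(n, 1.0 / n)                 # uniform initial weights on the points
    err = tol + 1.0
    while err > tol:
        X = Q @ (u[:, None] * Q.T)          # weighted scatter, (d+1) x (d+1)
        M = np.einsum('ij,ji->i', Q.T @ np.linalg.inv(X), Q)  # leverage of each point
        j = np.argmax(M)                    # point sticking furthest out of the ellipsoid
        step = (M[j] - d - 1.0) / ((d + 1.0) * (M[j] - 1.0))  # Frank-Wolfe step size
        new_u = (1.0 - step) * u            # shift weight toward the worst point
        new_u[j] += step
        err = np.linalg.norm(new_u - u)
        u = new_u
    c = points.T @ u                                       # ellipsoid center
    cov = points.T @ (u[:, None] * points) - np.outer(c, c)
    A = np.linalg.inv(cov) / d                             # ellipsoid shape matrix
    return A, c

def local_gqd(query, neighbors, x):
    """Hypothetical GQD-style distance: a Mahalanobis-type distance whose
    matrix is the MVCE shape of the query point's local neighborhood."""
    A, _ = mvce(neighbors)
    diff = x - query
    return float(np.sqrt(diff @ A @ diff))
```

Because the MVCE problem is convex with a unique solution, this greedy weight-shifting scheme converges regardless of initialization; in a kNN pipeline, `local_gqd` would be evaluated against candidate points using an ellipsoid fitted to a small neighborhood of the query.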


Notes

  1. The 60 cases are: 20 (datasets) × 3 (k values).


Acknowledgements

We would like to thank our Coordinating Editor and our anonymous Reviewers for their rigorous comments, which improved various sections of the manuscript. We would also like to thank Michael Smith, Prasun Lala, Catherine Laporte, and Mathew Toews for reading earlier versions of this manuscript. The authors also acknowledge the support of the Natural Sciences and Engineering Research Council of Canada under Discovery Grant RGPIN-2016-04638.

Author information

Correspondence to Karim Abou-Moustafa.


About this article


Cite this article

Abou-Moustafa, K., Ferrie, F.P. Local generalized quadratic distance metrics: application to the k-nearest neighbors classifier. Adv Data Anal Classif 12, 341–363 (2018). https://doi.org/10.1007/s11634-017-0286-x

