
Local generalized quadratic distance metrics: application to the k-nearest neighbors classifier

  • Regular Article
  • Published in Advances in Data Analysis and Classification

Abstract

Finding the set of nearest neighbors for a query point of interest appears in a variety of algorithms for machine learning and pattern recognition. Examples include k-nearest neighbor classification, information retrieval, case-based reasoning, manifold learning, and nonlinear dimensionality reduction. In this work, we propose a new approach for determining a distance metric from the data for finding such neighboring points. For a query point of interest, our approach learns a generalized quadratic distance (GQD) metric based on the statistical properties of a “small” neighborhood around the point of interest. The locally learned GQD metric captures information such as the density, curvature, and intrinsic dimensionality of the points falling in this particular neighborhood. Unfortunately, learning the GQD parameters under such a local learning mechanism is a challenging problem with a high computational overhead. To address these challenges, we estimate the GQD parameters using the minimum volume covering ellipsoid (MVCE) for a set of points. The advantage of the MVCE is twofold. First, the MVCE together with the local learning approach approximates the functionality of a well-known robust estimator for covariance matrices. Second, computing the MVCE is a convex optimization problem which, in addition to having a unique global solution, can be efficiently solved using a first-order optimization algorithm. We validate our metric learning approach on a large variety of datasets and show that the proposed metric yields promising results when compared with five algorithms from the literature for supervised metric learning.
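The abstract's key computational step — fitting a minimum volume covering ellipsoid to a local neighborhood and using its shape matrix as a quadratic (Mahalanobis-type) metric — can be sketched generically. The following is not the authors' implementation; it is a minimal sketch of Khachiyan's classical first-order algorithm for the MVCE, with illustrative function names (`mvce`, `local_gqd`) and a simple update-norm stopping rule chosen here as assumptions:

```python
import numpy as np

def mvce(points, tol=1e-4):
    """Minimum-volume covering ellipsoid via Khachiyan's first-order
    algorithm. Returns (A, c) such that every row x of `points` satisfies
    (x - c)^T A (x - c) <= 1, up to the tolerance."""
    n, d = points.shape
    Q = np.vstack([points.T, np.ones(n)])   # lift to (d+1) x n homogeneous coords
    u = np.full(n, 1.0 / n)                 # uniform initial weights on the points
    err = tol + 1.0
    while err > tol:
        X = Q @ (u[:, None] * Q.T)          # weighted scatter, (d+1) x (d+1)
        M = np.einsum('ij,ji->i', Q.T @ np.linalg.inv(X), Q)  # leverage of each point
        j = np.argmax(M)                    # point sticking furthest out of the ellipsoid
        step = (M[j] - d - 1.0) / ((d + 1.0) * (M[j] - 1.0))  # Frank-Wolfe step size
        new_u = (1.0 - step) * u            # shift weight toward the worst point
        new_u[j] += step
        err = np.linalg.norm(new_u - u)
        u = new_u
    c = points.T @ u                                       # ellipsoid center
    cov = points.T @ (u[:, None] * points) - np.outer(c, c)
    A = np.linalg.inv(cov) / d                             # ellipsoid shape matrix
    return A, c

def local_gqd(query, neighbors, x):
    """Hypothetical GQD-style distance: a Mahalanobis-type distance whose
    matrix is the MVCE shape of the query point's local neighborhood."""
    A, _ = mvce(neighbors)
    diff = x - query
    return float(np.sqrt(diff @ A @ diff))
```

Because the MVCE problem is convex with a unique solution, this greedy weight-shifting scheme converges regardless of initialization; in a kNN pipeline, `local_gqd` would be evaluated against candidate points using an ellipsoid fitted to a small neighborhood of the query.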


Notes

  1. The 60 cases are: 20 (datasets) × 3 (k values).


Acknowledgements

We would like to thank our Coordinating Editor and our anonymous Reviewers for their rigorous comments, which improved various sections of the manuscript. We would also like to thank Michael Smith, Prasun Lala, Catherine Laporte, and Mathew Toews for reading earlier versions of this manuscript. The authors also acknowledge the support of the Natural Sciences and Engineering Research Council of Canada under Discovery Grant RGPIN-2016-04638.

Author information

Correspondence to Karim Abou-Moustafa.


About this article


Cite this article

Abou-Moustafa, K., Ferrie, F.P. Local generalized quadratic distance metrics: application to the k-nearest neighbors classifier. Adv Data Anal Classif 12, 341–363 (2018). https://doi.org/10.1007/s11634-017-0286-x

