
Manifold Based Local Classifiers: Linear and Nonlinear Approaches

Journal of Signal Processing Systems

Abstract

When the data samples are insufficient in high-dimensional classification problems, the sparse scatter of samples tends to contain many ‘holes’: regions that have few or no nearby training samples from the class. When such regions lie close to inter-class boundaries, the nearest neighbors of a query may lie in the wrong class, leading to errors in the Nearest Neighbor classification rule. The K-local hyperplane distance nearest neighbor (HKNN) algorithm tackles this problem by approximating each class with a smooth nonlinear manifold, which is considered to be locally linear. The method takes advantage of the local linearity assumption by using the distances from a query sample to the affine hulls of the query’s nearest neighbors for decision making. However, HKNN is limited to the Euclidean distance metric, which is a significant limitation in practice. In this paper we reformulate HKNN in terms of subspaces and propose a variant, the Local Discriminative Common Vector (LDCV) method, that is better suited to classification tasks where the classes have similar intra-class variations. We then extend both methods to the nonlinear case by mapping the nearest neighbors into a higher-dimensional space where the linear manifolds are constructed. This procedure allows a wide variety of distance functions to be used, while computing distances between the query sample and the nonlinear manifolds remains straightforward owing to the linear nature of the manifolds in the mapped space. We tested the proposed methods on several classification tasks, obtaining better results than both the Support Vector Machines (SVMs) and their local counterpart SVM-KNN on the USPS and Image Segmentation databases, and outperforming the local SVM-KNN on the Caltech visual recognition database.
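As an illustration of the local-hyperplane rule summarized above, the following minimal NumPy sketch computes the Euclidean distance from a query to the affine hull spanned by its K nearest neighbors within each class and assigns the query to the class whose local hyperplane is closest. This is not the authors' implementation; the function names, the default value of K, and the least-squares parameterization of the affine hull are illustrative assumptions.

import numpy as np

def local_hyperplane_distance(query, neighbors):
    # Distance from `query` to the affine hull of the rows of `neighbors`.
    # The hull is parameterized around the neighbor mean; the closest point
    # is found by unconstrained least squares in the centered coordinates.
    mu = neighbors.mean(axis=0)
    M = (neighbors - mu).T                        # columns span the local hyperplane directions
    beta, *_ = np.linalg.lstsq(M, query - mu, rcond=None)
    return np.linalg.norm((query - mu) - M @ beta)

def hknn_classify(query, X, y, K=5):
    # Assign `query` to the class whose local affine hull, built from its K
    # nearest same-class training samples, is closest in Euclidean distance.
    best_label, best_dist = None, np.inf
    for label in np.unique(y):
        Xc = X[y == label]
        idx = np.argsort(np.linalg.norm(Xc - query, axis=1))[:min(K, len(Xc))]
        dist = local_hyperplane_distance(query, Xc[idx])
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

For a test vector x and training data (X_train, y_train), hknn_classify(x, X_train, y_train, K=5) returns the predicted label; smaller values of K keep the hull approximation local, while larger values move it toward a global linear model of each class.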



References

  1. Simard, P., Le Cun, Y., Denker, J., & Victorri, B. (1998). Transformation invariance in pattern recognition—tangent distance and tangent propagation. Lecture Notes in Computer Science (vol. 1524, pp. 239–274). Berlin: Springer.


  2. Peng, J., Heisterkamp, D. R., & Dai, H. K. (2003). LDA/SVM driven nearest neighbor classification. IEEE Trans Neural Netw, 14, 940–942. doi:10.1109/TNN.2003.813835.


  3. Hastie, T., & Tibshirani, R. (1996). Discriminant adaptive nearest neighbor classification. IEEE Trans. PAMI, 18(6), 607–616.


  4. Vincent, P., & Bengio, Y. (2001). K-local hyperplane and convex distance nearest neighbor algorithms. Adv Neural Inf Process Syst, 14, 985–992.


  5. Domeniconi, C., & Gunopulos, D. (2002). Efficient local flexible nearest neighbor classification. In Proceedings of the 2nd SIAM International Conference on Data Mining.

  6. Zhang, H., Berg, A. C., Maire, M., & Malik, J. (2006). SVM-KNN: Discriminative nearest neighbor classification for visual category recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 2126–2136).

  7. Peng, J., Heisterkamp, D. R., & Dai, H. K. (2004). Adaptive quasiconformal kernel nearest neighbor classification. IEEE Trans Pattern Anal Mach Intell, 28, 656–661. doi:10.1109/TPAMI.2004.1273978.


  8. Domeniconi, C., Peng, J., & Gunopulos, D. (2002). Locally adaptive metric nearest-neighbor classification. IEEE Trans Pattern Anal Mach Intell, 24, 1281–1285. doi:10.1109/TPAMI.2002.1033219.


  9. Okun, O. (2004). Protein fold recognition with K-local hyperplane distance nearest neighbor algorithm. In Proceedings of the 2nd European Workshop on Data Mining and Text Mining in Bioinformatics (pp. 51–57).

  10. Hinton, G. E., Dayan, P., & Revow, M. (1997). Modeling the manifolds of images of handwritten digits. IEEE Trans Neural Netw, 18, 65–74. doi:10.1109/72.554192.


  11. Roweis, S. T., & Saul, L. K. (2000). Nonlinear dimensionality reduction by locally linear embedding. Science, 290, 2323–2326. doi:10.1126/science.290.5500.2323.


  12. Verbeek, J. (2006). Learning non-linear image manifolds by global alignment of local linear models. IEEE Trans PAMI, 28, 1236–1250.


  13. Cevikalp, H., Neamtu, M., & Wilkes, M. (2005). Discriminative common vectors for face recognition. IEEE Trans PAMI, 27, 4–13.


  14. Kim, T.-K., & Kittler, J. (2005). Locally linear discriminant analysis for multimodally distributed classes for face recognition with a single model image. IEEE Trans PAMI, 27, 318–327.


  15. Fitzgibbon, A. W., & Zisserman, A. (2003). Joint manifold distance: a new approach to appearance based clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  16. Zhang, J., Marszalek, M., Lazebnik, S., & Schmid, C. (2006). Local features and kernels for classification of texture and object categories: A comprehensive study. In Proceedings of the Computer Vision and Pattern Recognition Workshop.

  17. Tenenbaum, J. B., de Silva, V., & Langford, J. C. (2000). A global geometric framework for nonlinear dimensionality reduction. Science, 290, 2319–2323. doi:10.1126/science.290.5500.2319.


  18. Gulmezoglu, M. B., Dzhafarov, V., & Barkana, A. (2001). The common vector approach and its relation to principal component analysis. IEEE Trans Speech Audio Process, 9(6), 655–662. doi:10.1109/89.943343.


  19. Boyd, S., & Vandenberghe, L. (2004). Convex optimization (pp. 399–401). Cambridge, UK: Cambridge University Press.

  20. Schölkopf, B., Smola, A. J., & Muller, K. R. (1998). Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput, 10, 1299–1319. doi:10.1162/089976698300017467.


  21. Cevikalp, H., Neamtu, M., & Wilkes, M. (2006). Discriminative common vector method with kernels. IEEE Trans Neural Netw, 17, 1550–1565. doi:10.1109/TNN.2006.881485.


  22. Xu, J., & Zikatanov, L. (2002). The method of alternating projections and the method of subspace corrections in Hilbert space. J Am Math Soc, 15, 573–597. doi:10.1090/S0894-0347-02-00398-3.


  23. Fei-Fei, L., Fergus, R., & Perona, P. (2004). Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories. In Proceedings of the IEEE CVPR Workshop on Generative Model Based Vision.

  24. USPS dataset of handwritten characters created by the US Postal Service. Retrieved from ftp://ftp.kyb.tuebingen.mpg.de/pub/bs/data.

  25. Keysers, D., Dahmen, J., Theiner, T., & Ney, H. (2000). Experiments with an extended tangent distance. In Proceedings of the 15th International Conference on Pattern Recognition (vol. 2, pp. 38–42).

  26. C codes for computing tangent distances. Retrieved from http://www-i6.informatik.rwth-aachen.de/∼keysers/td/.

  27. Golub, G. H., & Van Loan, C. F. (1996). Matrix computations (3rd ed.). Baltimore, MD: Johns Hopkins University Press.


  28. UCI—benchmark repository—a huge collection of artificial and real world data sets. University of California Irvine. Retrieved from http://www.ics.edu/∼mlearn/MLRepository.html.

  29. Csurka, G., Dance, C., Fan, L., Willamowski, J., & Bray, C. (2004). Visual categorization with bags of keypoints. In Proceedings of the ECCV Workshop on Statistical Learning for Computer Vision.

  30. Lazebnik, S., Schmid, C., & Ponce, J. (2005). A sparse texture representation using local affine regions. IEEE Trans PAMI, 27(8), 1265–1278.


  31. Fowlkes, C., Belongie, S., Chung, F., & Malik, J. (2004). Spectral grouping using the Nyström method. IEEE Trans PAMI, 26, 1–12.


  32. Saul, L. K., & Roweis, S. T. (2003). Think globally, fit locally: unsupervised learning of low dimensional manifolds. J Mach Learn Res, 4, 119–155.


  33. Levina, E., & Bickel, P. J. (2005). Maximum likelihood estimation of intrinsic dimension. In L. K. Saul, Y. Weiss, & L. Bottou (Eds.), Advances in neural information processing systems 17 (pp. 777–784). Cambridge, MA: MIT Press.


  34. Camastra, F., & Vinciarelli, A. (2002). Estimating the intrinsic dimension of data with a fractal-based method. IEEE Trans Pattern Anal Mach Intell, 24(10), 1404–1407. doi:10.1109/TPAMI.2002.1039212.


  35. Fukunaga, K., & Olsen, D. R. (1971). An algorithm for finding intrinsic dimensionality of data. IEEE Trans Comput, C-20, 176–183. doi:10.1109/T-C.1971.223208.



Author information


Corresponding author

Correspondence to Hakan Cevikalp.

Appendix

  1. Theorem 1:

    Let P and \(P_{{\text{NS}}}^{\left( i \right)} \) be the projection matrices of the subspaces \(R\left( {S_T^K } \right)\) and \(N\left( {S_i^K } \right)\), \(i = 1, \ldots ,C\), respectively. Then P and \(P_{{\text{NS}}}^{\left( i \right)} \) commute, i.e.:

    $$P_{{\text{NS}}}^{\left( i \right)} P = PP_{{\text{NS}}}^{\left( i \right)} ,\quad i = 1,...,C.$$

    Proof of the theorem is omitted since it can be derived as in the proof of Theorem 1 in [21].

  2. Theorem 2:

    Assume that there are C classes in the training set. For a query sample \(x_q \), the inequality \(\left\| {P_{{\text{NS}}}^{\left( i \right)} \left( {x_q - \mu _i } \right)} \right\| \leqslant \left\| {P_{{\text{NS}}}^{\left( j \right)} \left( {x_q - \mu _j } \right)} \right\|\) implies that \(\left\| {P_{{\text{int}}}^{\left( i \right)} \left( {x_q - \mu _i } \right)} \right\| \leqslant \left\| {P_{{\text{int}}}^{\left( j \right)} \left( {x_q - \mu _j } \right)} \right\|\) for \(i,j = 1, \ldots ,C\) and \(i \ne j\).

Proof: We first recall several facts from [13] (see Lemma 1 of [13]). For each \(i = 1, \ldots ,C\), we have \(N\left( {S_T^K } \right) \subset N\left( {S_i^K } \right)\), where N(A) denotes the null space of a matrix A. Consequently, \(N\left( {S_T^K } \right)\) and \(R\left( {S_i^K } \right)\) are orthogonal, where \(R\left( {S_i^K } \right)\) denotes the range of \(S_i^K \). This implies the identity \(\left( {I - P} \right)\left( {I - P_{{\text{NS}}}^{\left( i \right)} } \right) = 0\), or equivalently \(\left( {I - P} \right) = \left( {I - P} \right)P_{{\text{NS}}}^{\left( i \right)} \).

Thus, since the ranges of P and I − P are orthogonal, we can write:

$$\begin{array}{*{20}c} {\left\| {P_{{\text{NS}}}^{\left( i \right)} \left( {x_q - \mu _i } \right)} \right\|^2 = \left\| {PP_{{\text{NS}}}^{\left( i \right)} \left( {x_q - \mu _i } \right) + \left( {I - P} \right)P_{{\text{NS}}}^{\left( i \right)} \left( {x_q - \mu _i } \right)} \right\|^2 = \left\| {PP_{{\text{NS}}}^{\left( i \right)} \left( {x_q - \mu _i } \right)} \right\|^2 + } \\ {\left\| {\left( {I - P} \right)P_{{\text{NS}}}^{\left( i \right)} \left( {x_q - \mu _i } \right)} \right\|^2 = \left\| {PP_{{\text{NS}}}^{\left( i \right)} \left( {x_q - \mu _i } \right)} \right\|^2 + \left\| {\left( {I - P} \right)\left( {x_q - \mu _i } \right)} \right\|^2 ,} \\ \end{array} $$
(25)

where the last equality uses the identity \(\left( {I - P} \right) = \left( {I - P} \right)P_{{\text{NS}}}^{\left( i \right)} \) established above.

We now note that the vector \(\left( {I - P} \right)\left( {x_q - \mu _i } \right)\) is the same for each class (i.e., it does not depend on the class index i), since we have shown in [21] that \(\left( {I - P} \right)\mu _i \) is a so-called common vector for the class consisting of all samples in \(V = \left\{ {x_m^i } \right\}_{m = 1,i = 1}^{K,C} \), and that in fact \(\left( {I - P} \right)x\) is the same vector for all x in the affine hull of V.

Thus, we have shown that:

$$\left\| {P_{{\text{NS}}}^{\left( i \right)} \left( {x_q - \mu _i } \right)} \right\|^2 = \left\| {PP_{{\text{NS}}}^{\left( i \right)} \left( {x_q - \mu _i } \right)} \right\|^2 + \left\| v \right\|^2 ,$$
(26)

for some vector v independent of the class index i. The assertion of Theorem 2 now immediately follows from this fact. □
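
Both the commutation property of Theorem 1 and the decomposition in Eqs. 25–26 can be checked numerically on synthetic data. The sketch below is illustrative only; the random data, the rank tolerance, and the variable names are assumptions and not part of the paper. It forms the scatter matrices of a small pooled neighbor set, builds the projectors P and P_NS^(i), and verifies that they commute, that (I − P)(x_q − μ_i) is the same vector for every class, and that the squared-norm identity of Eq. 26 holds.

import numpy as np

rng = np.random.default_rng(0)

def range_projector(S, tol=1e-10):
    # Orthogonal projector onto the range (column space) of a symmetric PSD matrix S.
    U, s, _ = np.linalg.svd(S)
    B = U[:, s > tol * s.max()]
    return B @ B.T

# Synthetic local data: K neighbors per class in d dimensions, with d larger
# than the total sample count so that the null spaces are nontrivial.
d, K, C = 20, 4, 3
X = [rng.normal(size=(K, d)) for _ in range(C)]
x_q = rng.normal(size=d)

mu = [Xi.mean(axis=0) for Xi in X]
S_i = [(Xi - m).T @ (Xi - m) for Xi, m in zip(X, mu)]      # class scatters S_i^K
pooled = np.vstack(X)
mu_all = pooled.mean(axis=0)
S_T = (pooled - mu_all).T @ (pooled - mu_all)               # total scatter S_T^K

P = range_projector(S_T)                                     # projector onto R(S_T^K)
P_NS = [np.eye(d) - range_projector(S) for S in S_i]         # projectors onto N(S_i^K)

# Theorem 1: P and P_NS^(i) commute.
for Q in P_NS:
    assert np.allclose(Q @ P, P @ Q, atol=1e-8)

# The residual (I - P)(x_q - mu_i) is the same common vector v for every class i.
v = [(np.eye(d) - P) @ (x_q - m) for m in mu]
for r in v[1:]:
    assert np.allclose(r, v[0], atol=1e-8)

# Eq. 26: ||P_NS^(i)(x_q - mu_i)||^2 = ||P P_NS^(i)(x_q - mu_i)||^2 + ||v||^2.
for Q, m in zip(P_NS, mu):
    z = Q @ (x_q - m)
    assert np.isclose(z @ z, (P @ z) @ (P @ z) + v[0] @ v[0], atol=1e-8)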


About this article

Cite this article

Cevikalp, H., Larlus, D., Neamtu, M. et al. Manifold Based Local Classifiers: Linear and Nonlinear Approaches. J Sign Process Syst 61, 61–73 (2010). https://doi.org/10.1007/s11265-008-0313-4

