Journal of Signal Processing Systems, Volume 61, Issue 1, pp 61–73

Manifold Based Local Classifiers: Linear and Nonlinear Approaches

  • Hakan Cevikalp
  • Diane Larlus
  • Marian Neamtu
  • Bill Triggs
  • Frederic Jurie


Abstract

In high-dimensional classification problems with insufficient training data, the sparse scatter of samples within each class tends to have many 'holes': regions that contain few or no nearby training samples from the class. When such regions lie close to inter-class boundaries, the nearest neighbors of a query may belong to the wrong class, leading to errors in the Nearest Neighbor classification rule. The K-local hyperplane distance nearest neighbor (HKNN) algorithm tackles this problem by approximating each class with a smooth nonlinear manifold that is assumed to be locally linear. The method exploits this assumption by basing its decisions on the distances from a query sample to the affine hulls of the query's nearest neighbors in each class. However, HKNN is restricted to the Euclidean distance metric, a significant limitation in practice. In this paper we reformulate HKNN in terms of subspaces and propose a variant, the Local Discriminative Common Vector (LDCV) method, that is better suited to classification tasks in which the classes have similar intra-class variations. We then extend both methods to the nonlinear case by mapping the nearest neighbors into a higher-dimensional space where the linear manifolds are constructed. This allows a wide variety of distance functions to be used, while computing distances between the query sample and the nonlinear manifolds remains straightforward owing to the linear nature of the manifolds in the mapped space. We tested the proposed methods on several classification tasks, obtaining better results than both Support Vector Machines (SVMs) and their local counterpart SVM-KNN on the USPS and Image Segmentation databases, and outperforming the local SVM-KNN on the Caltech visual recognition database.
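To make the HKNN decision rule above concrete, the following minimal Python sketch computes the distance from a query to the affine hull of its k nearest neighbors within each class and predicts the closest class. The function names (affine_hull_distance, hknn_classify) and the least-squares parameterization are illustrative assumptions, not the authors' implementation; the sketch covers only the linear, Euclidean case described in the abstract.

import numpy as np

def affine_hull_distance(query, neighbors):
    """Distance from `query` (shape (d,)) to the affine hull of the
    rows of `neighbors` (shape (k, d)).

    The hull {sum_i a_i n_i : sum_i a_i = 1} is parameterized relative
    to the first neighbor; the closest hull point is a least-squares fit.
    """
    n0 = neighbors[0]
    V = (neighbors[1:] - n0).T                      # (d, k-1) hull directions
    beta, *_ = np.linalg.lstsq(V, query - n0, rcond=None)
    projection = n0 + V @ beta                      # closest point on the hull
    return float(np.linalg.norm(query - projection))

def hknn_classify(query, X, y, k=5):
    """Assign `query` to the class whose local affine hull is nearest:
    for each class, take its k nearest training samples, measure the
    query-to-hull distance, and predict the arg-min class."""
    classes = np.unique(y)
    dists = []
    for c in classes:
        Xc = X[y == c]
        nearest = np.argsort(np.linalg.norm(Xc - query, axis=1))[:k]
        dists.append(affine_hull_distance(query, Xc[nearest]))
    return classes[int(np.argmin(dists))]

The paper's nonlinear variants follow the same scheme after mapping the neighbors into a kernel-induced feature space, so the inner products underlying the least-squares step are replaced by kernel evaluations while the hulls remain linear in the mapped space.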


Keywords

Affine hull · Common vector · Convex hull · Distance learning · Image categorization · Local classifier · Manifold learning · Object recognition



Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Hakan Cevikalp (1) (corresponding author)
  • Diane Larlus (2)
  • Marian Neamtu (3)
  • Bill Triggs (4)
  • Frederic Jurie (5)

  1. Electrical and Electronics Engineering Department, Eskisehir Osmangazi University, Eskisehir, Turkey
  2. Learning and Recognition in Vision (LEAR), INRIA, Grenoble, France
  3. Department of Mathematics, Vanderbilt University, Nashville, USA
  4. Laboratoire Jean Kuntzmann, Grenoble, France
  5. University of Caen, Caen, France
