Journal of Classification

, Volume 34, Issue 1, pp 85–107 | Cite as

Fusing Vantage Point Trees and Linear Discriminants for Fast Feature Classification

  • Hugo ProençaEmail author
  • João C. Neves


This paper describes a classification strategy that can be regarded as a more general form of nearest-neighbor classification. It fuses the concepts of nearest neighbor, linear discriminant and Vantage-Point trees, yielding an efficient indexing data structure and classification algorithm. In the learning phase, we define a set of disjoint subspaces of reduced complexity that can be separated by linear discriminants, ending up with an ensemble of simple (weak) classifiers that work locally. In classification, the closest centroids to the query determine the set of classifiers considered, which responses are weighted. The algorithm was experimentally validated in datasets widely used in the field, attaining error rates that are favorably comparable to the state-of-the-art classification techniques. Lastly, the proposed solution has a set of interesting properties for a broad range of applications: 1) it is deterministic; 2) it classifies in time approximately logarithmic with respect to the size of the learning set, being far more efficient than nearest neighbor classification in terms of computational cost; and 3) it keeps the generalization ability of simple models.


Vantage-point tree Linear discriminants Nearest neighbor classification 


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. ALPAYDIN, E. (1999), “Combined 5 × 2 cv F-Test for Comparing Supervised Classification Learning Algorithms”, Neural Computation, 11(8), 1885–1892.CrossRefGoogle Scholar
  2. BANFIELD, R., HALL, L., BOWYER, K., and KEGELMEYER,W. (2007), “A Comparison of Decision Tree Ensemble Creation Techniques”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(1), 173–180.CrossRefGoogle Scholar
  3. BAUER, E., and KOHAVI, R. (1999), “An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants”, Machine Learning, 36(1-2), 105–139.CrossRefGoogle Scholar
  4. BOCK, K., COUSSEMENT, K., and POEL, D. (2010), “Ensemble Classification Based on Generalized Additive Models”, Computational Statistics and Data Analysis, 54, 1535–1546.MathSciNetCrossRefzbMATHGoogle Scholar
  5. BREIMAN, L. (1996), “Bagging Predictors”, Machine Learning, 24(2), 123–140.zbMATHGoogle Scholar
  6. BREIMAN, L. (2001), “Random Forests”, Machine Learning, 45(1), 5–32.CrossRefzbMATHGoogle Scholar
  7. BRYLL, R., GUTIERREZ-OSUNA,R., and QUEK, F. (2003), “Attribute Bagging: Improving Accuracy of Classifier Ensembles by Using Random Feature Subsets”, Pattern Recognition, 36(6), 291–1302,CrossRefzbMATHGoogle Scholar
  8. CAI, D., HE, X., and HAN, J. (2008), “Training Linear Discriminant Analysis in Linear Time”, in Proceedings of the 2008 IEEE 24th International Conference on Data Engineering, pp. 209–217.Google Scholar
  9. CANUTO, A., ABREU, M., OLIVEIRA, L., XAVIER JR., J., and SANTOS, A. (2007), “Investigating the Influence of the Choice of the Ensemble Members in Accuracy and Diversity of Selection-Based and Fusion-Based Methods for Ensembles”, Pattern Recognition Letters, 28(4), 472–486.Google Scholar
  10. CECI, M., APPICE, A., and MALERBA, D. (2003), “Comparing Simplification Methods for Model Trees with Regression and Splitting Nodes”, in Proceedings of the Fourteenth International Symposium on Methodologies for Intelligent Systems, Lecture Notes in Artificial Intelligence Vol. 2871, pp. 49–56.Google Scholar
  11. CORTES, C., and VAPNIK, V. (1995), “Support Vector Networks”, Machine Learning, 20, 1–25.zbMATHGoogle Scholar
  12. COVER, T., and HART, P. (1967), “Nearest Neighbor Pattern Classification”, IEEE Transactions on Information Theory, 13(1), 21–27.CrossRefzbMATHGoogle Scholar
  13. DEMSAR, J. (2006), “Statistical Comparisons of Classifiers over Multiple Data Sets”, Journal of Machine Learning Research, 7, 1–30.MathSciNetzbMATHGoogle Scholar
  14. DIETTERICH, T. (2000), “An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization”, Machine Learning, 49(2), 139–157.CrossRefGoogle Scholar
  15. DOMINGOS, P. (1996), “Unifying Instance-Based and Rule-Based Induction”, Machine Learning, 24, 141–168.Google Scholar
  16. DUDA, R., HART, P., and STORK, D. (2000), Pattern Classification (2nd ed.), Wiley Interscience, ISBN 0-471-05669-3.Google Scholar
  17. FRANK, E., HALL, M., and PFAHRINGER, B. (2003), “Locally Weighted Naive Bayes”, in Proceedings of the 19th conference on Uncertainty in Artificial Intelligence, San Mateo, pp. 249–256.Google Scholar
  18. FREUND, Y., and SCHAPIRE, R. (1995), “A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting”, in Proceedings of the 2nd European Conference on Computational Learning Theory, pp. 23–37.Google Scholar
  19. HO, T.K. (1995), “Random Decision Forests”, in Proceedings of the 3rd International Conference on Document Analysis and Recognition, pp. 278–282.Google Scholar
  20. HO, T.K. (1998), “The Random Subspace Method for Constructing Decision Forests” IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(8), 832–844.Google Scholar
  21. HOTHORN, T., and LAUSEN, B. (2005), “Building Classifiers by Bagging Trees”, Computational Statistics and Data Analysis, 49, 1068–1078.MathSciNetCrossRefzbMATHGoogle Scholar
  22. JIRINA, M., and JIRINA JR., M. (2013), “Utilization of Singularity Exponent in Nearest Neighbor Based Classifier”, Journal of Classification, 30(1), 3–29.Google Scholar
  23. JOHNSON, R., and WICHERN, D. (1988), Applied Multivariate Statistic Analysis (2nd. ed.), Englewood Cliffs NJ: Prentice Hall Inc.Google Scholar
  24. KLEINBERG, E.M. (1990), “Stochastic Discrimination”, Annals of Mathematics and Artificial Intelligence, 1, 207–239.CrossRefzbMATHGoogle Scholar
  25. KUMAR, A. (2008), “Combining Pattern Classifiers: Methods and Algorithms”, IEEE Transactions on Industrial Electronics, 55(1), 348–363.CrossRefGoogle Scholar
  26. KUNCHEVA, L. (2004), Combining Pattern Classifiers: Methods and Algorithms, Hoboken NJ: John Wiley & Sons.CrossRefzbMATHGoogle Scholar
  27. KUNCHEVA, L., and RODRÍGUEZ, J. (2007), “Classifier Ensembles with a Random Linear Oracle”, IEEE Transactions on Knowledge and Data Engineering, 19(4), 500–508.CrossRefGoogle Scholar
  28. LU, J., and TAN, Y-P. (2011), Nearest Feature Space Analysis for Classification”, IEEE Signal Processing Letters, 18(1), 55–58.Google Scholar
  29. MOLLER,M. (1993), “A Scaled Conjugate Gradient Algorithm for Fast Supervised Learning”, Neural Networks, 6, 525–533.CrossRefGoogle Scholar
  30. OJALA, T., PIETIKAINEN, M., and HARWOOD, D. (1996), “A Comparative Study of TextureMeasures with Classification Based on Feature Distributions”, Pattern Recognition, 29, 51–59.CrossRefGoogle Scholar
  31. GARCÍA-PEDRAJAS, N. (2009), “Constructing Ensembles of Classifiers by Means of Weighted Instance Selection”, IEEE Transactions on Neural Networks, 20(2), 258–277.CrossRefGoogle Scholar
  32. SCHAPIRE, R. (1990), “The Strength of Weak Learnability”, Machine Learning, 5(2), 197–227.Google Scholar
  33. SEIFFERT, C., KHOSHGOFTAAR, T., HULSE, J., and NAPOLITANO, A. (2008), “RUSBoost: Improving Classification Performance When Training Data Is Skewed”, in Proceedings of the 19th International Conference on Pattern Recognition, pp. 1-4.Google Scholar
  34. STALLKAMP, J., SCHLIPSING, M., SALMEN, J., and IGEL, C. (2012), “Man vs. Computer: Benchmarking Machine Learning Algorithms for Traffic Sign Recognition”, Neural Networks, 32, 323–332.CrossRefGoogle Scholar
  35. TING, K.M., WELLS, J.R., TAN, S.C., TENG, S.W., and WEBB, G.I. (2011), “Feature-Subspace Aggregating: Ensembles for Stable and Unstable Learners”, Machine Learning, 82, 375–397.MathSciNetCrossRefGoogle Scholar
  36. VIOLA, P.A., and JONES, M.J. (2004), “Robust Real-Time Face Detection”, International Journal of Computer Vision, 57(2), 137–154.CrossRefGoogle Scholar
  37. YAN, R., and TEŠIĆ, J. (2007), “Model-Shared Subspace Boosting for Multi-Label Classification”, in Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 834–843.Google Scholar
  38. YIANILOS, P. (1993), “Data Structures and Algorithms for Nearest Neighbor Search in GeneralMetric Spaces”, in Proceedings of the Fourth Annual ACM-SIAMSymposium on Discrete Algorithms, Society for Industrial and Applied Mathematics, pp. 311–321.Google Scholar
  39. YU, G., ZHANG, G., YU, Z., DOMENICONI, C., YOUC, J., and HANA, G. (2012), “Semi-Supervised Ensemble Classification in Subspaces”, Applied Soft Computing, 12, 1511–1522.Google Scholar
  40. ZAMAN, M., and HIROSE, H. (2013), “DF-SVM: A Decision Forest Constructed on Artificially Enlarged Feature Space by Support Vector Machine”, Artificial Intelligence Review, 40, 467–494.CrossRefGoogle Scholar

Copyright information

© Classification Society of North America 2017

Authors and Affiliations

  1. 1.Instituto de Telecomunicações (IT)University of Beira InteriorCovilhãPortugal
  2. 2.University of Beira InteriorCovilhãPortugal

Personalised recommendations