
Proximity-Graph Instance-Based Learning, Support Vector Machines, and High Dimensionality: An Empirical Comparison

  • Conference paper
Machine Learning and Data Mining in Pattern Recognition (MLDM 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7376)

Abstract

Previous experiments with low-dimensional data sets have shown that Gabriel graph methods for instance-based learning are among the best machine learning algorithms for pattern classification applications. However, as the dimensionality of the data grows large, all data points in the training set tend to become Gabriel neighbors of each other, bringing the efficacy of this method into question. Indeed, it has been conjectured that, for high-dimensional data, proximity graph methods would have to employ sparser graphs, such as relative neighborhood graphs (RNGs) and minimum spanning trees (MSTs), in order to maintain their privileged status. Here the performance of proximity graph methods for instance-based learning that employ Gabriel graphs, relative neighborhood graphs, and minimum spanning trees is compared experimentally on high-dimensional data sets. These methods are also compared empirically against the traditional k-NN rule and support vector machines (SVMs), the leading competitors of proximity graph methods.
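
The densification effect described in the abstract can be illustrated with a small sketch. It assumes the standard definitions, which the abstract itself does not state: two points p and q are Gabriel neighbors when no third point lies strictly inside the ball having segment pq as its diameter, and relative neighbors when no third point is strictly closer to both p and q than they are to each other. The function names, the uniform random data, and the sample sizes below are illustrative assumptions, not the experimental setup of the paper.

```python
import itertools
import random


def squared_distances(points):
    """Return the full matrix of squared Euclidean distances."""
    return [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points]
            for p in points]


def gabriel_edges(points):
    """Pairs (i, j) with no third point strictly inside the ball of diameter ij.

    A point k is strictly inside that ball exactly when
    d(i,k)^2 + d(j,k)^2 < d(i,j)^2 (the angle at k is obtuse).
    """
    d2 = squared_distances(points)
    n = len(points)
    return {
        (i, j)
        for i, j in itertools.combinations(range(n), 2)
        if all(d2[i][k] + d2[j][k] >= d2[i][j]
               for k in range(n) if k != i and k != j)
    }


def rng_edges(points):
    """Pairs (i, j) with no third point strictly closer to both i and j."""
    d2 = squared_distances(points)
    n = len(points)
    return {
        (i, j)
        for i, j in itertools.combinations(range(n), 2)
        if all(max(d2[i][k], d2[j][k]) >= d2[i][j]
               for k in range(n) if k != i and k != j)
    }


def edge_density(edges, n):
    """Fraction of all n*(n-1)/2 possible pairs that are edges."""
    return len(edges) / (n * (n - 1) / 2)


def random_points(n, dim, seed):
    """Hypothetical test data: n points drawn uniformly from the unit cube."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(dim)] for _ in range(n)]


if __name__ == "__main__":
    n = 40
    for dim in (2, 10, 100):
        pts = random_points(n, dim, seed=0)
        gg = edge_density(gabriel_edges(pts), n)
        rn = edge_density(rng_edges(pts), n)
        print(f"dim={dim:3d}  Gabriel density={gg:.3f}  RNG density={rn:.3f}")
```

On data like this, the Gabriel density is expected to climb toward 1 as the dimension grows. The relative neighborhood graph is a subgraph of the Gabriel graph and so can never be denser. Exact values depend on the seed and sample size, and this brute-force check is cubic in the number of points, so it is suitable only for small illustrations.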




Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Toussaint, G.T., Berzan, C. (2012). Proximity-Graph Instance-Based Learning, Support Vector Machines, and High Dimensionality: An Empirical Comparison. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2012. Lecture Notes in Computer Science, vol 7376. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31537-4_18


  • DOI: https://doi.org/10.1007/978-3-642-31537-4_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31536-7

  • Online ISBN: 978-3-642-31537-4

  • eBook Packages: Computer Science, Computer Science (R0)
