Data Characterization for Effective Prototype Selection

  • Ramón A. Mollineda
  • J. Salvador Sánchez
  • José M. Sotoca
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3523)


The Nearest Neighbor classifier is one of the most popular supervised classification methods. It is simple, intuitive, and accurate in a wide variety of real-world applications. Despite its simplicity and effectiveness, practical use of this rule has historically been limited by its high storage requirements and computational cost, as well as by its sensitivity to outliers. To overcome these drawbacks, a suitable prototype selection scheme can be employed to reduce storage and computing time, and it usually yields some increase in classification accuracy as well. Nevertheless, in some practical cases prototype selection may even degrade the effectiveness of the classifier. From an empirical point of view, it is still difficult to know a priori when this method will behave appropriately. The present paper tries to predict how appropriate a prototype selection algorithm will be for a particular problem by characterizing the data with a set of complexity measures.
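As a rough illustration of the idea, the sketch below pairs one prototype selection scheme with one complexity measure. The abstract does not say which algorithms or measures the paper uses, so both choices are assumptions made here for illustration: Wilson-style editing (drop every training instance that its k nearest neighbours misclassify) stands in for "a prototype selection algorithm", and the maximum Fisher discriminant ratio over features stands in for "a complexity measure". The function names and the toy data are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, pvariance


def nn_predict(train, query, k=1, skip=None):
    """Majority vote among the k training points nearest to `query`.

    `train` is a list of (feature_tuple, label) pairs; `skip` is an
    optional index to leave out (used for leave-one-out checks).
    """
    neighbours = sorted(
        (math.dist(x, query), label)
        for i, (x, label) in enumerate(train)
        if i != skip
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def wilson_editing(train, k=3):
    """Assumed selection scheme: keep only the instances that their
    k nearest neighbours (excluding themselves) classify correctly."""
    return [
        (x, label)
        for i, (x, label) in enumerate(train)
        if nn_predict(train, x, k=k, skip=i) == label
    ]


def fisher_ratio(train):
    """Assumed complexity measure for a two-class problem: the largest
    per-feature value of (mean_a - mean_b)^2 / (var_a + var_b).
    Higher values suggest classes that are easier to separate."""
    first, second = sorted({label for _, label in train})
    a = [x for x, label in train if label == first]
    b = [x for x, label in train if label == second]
    best = 0.0
    for j in range(len(train[0][0])):
        col_a = [x[j] for x in a]
        col_b = [x[j] for x in b]
        spread = pvariance(col_a) + pvariance(col_b)
        if spread > 0:
            best = max(best, (mean(col_a) - mean(col_b)) ** 2 / spread)
    return best


# Toy data: two compact clusters plus one class-1 point inside cluster 0.
train = [
    ((0.0, 0.0), 0), ((0.1, 0.2), 0), ((0.2, 0.1), 0),
    ((0.15, 0.15), 1),  # outlier
    ((1.0, 1.0), 1), ((1.1, 0.9), 1), ((0.9, 1.1), 1),
]
edited = wilson_editing(train, k=3)

print(len(train), len(edited))
print(fisher_ratio(train), fisher_ratio(edited))
```

On this toy set the editing step removes the point embedded in the other class, and the measure rises on the edited set. A study of the kind the abstract describes would compute such measures before selection and relate them to the classifier's accuracy afterwards, across many data sets.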


Complexity Measure; Training Instance; Lower Error Rate; Neighbor Rule; Prototype Selection
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Ramón A. Mollineda (1)
  • J. Salvador Sánchez (1)
  • José M. Sotoca (1)
  1. Dept. Llenguatges i Sistemes Informàtics, Universitat Jaume I, Castelló de la Plana, Spain
