Recursive Prototype Reduction Schemes Applicable for Large Data Sets

  • Sang-Woon Kim
  • B. J. Oommen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2396)


Most of the Prototype Reduction Schemes (PRS) reported in the literature process the data in its entirety to yield a subset of prototypes that are useful in nearest-neighbour-like classification. Foremost among these are the Prototypes for Nearest Neighbour (PNN) classifiers, the Vector Quantization (VQ) technique, and Support Vector Machines (SVM). These methods suffer from a major disadvantage, namely the excessive computational burden incurred by processing all the data. In this paper, we suggest a recursive and computationally superior mechanism. Rather than process all the data using a PRS, we propose that the data be recursively subdivided into smaller subsets. This recursive subdivision can be arbitrary, and need not utilize any underlying clustering philosophy. The advantage of this is that the PRS processes subsets of data points that effectively sample the entire space to yield smaller subsets of prototypes. These prototypes are then, in turn, gathered and processed by the PRS to yield more refined prototypes. Our experimental results demonstrate that the proposed recursive mechanism yields classification accuracy comparable to that of the best prototype condensation schemes reported to date, for both artificial data sets and real-life data sets. The results especially demonstrate the computational advantage of using such a recursive strategy for large data sets, such as those involved in data mining and text categorization applications.
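
The recursive strategy described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: Hart's condensed nearest neighbour rule stands in for the PRS (the paper considers PNN, VQ and SVM), the subdivision is a random split, and the names recursive_prs, condense, max_size and n_parts are hypothetical choices made for this sketch.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nn_label(protos, x):
    """Label of the prototype nearest to x; protos is a list of (vector, label)."""
    return min(protos, key=lambda p: sq_dist(p[0], x))[1]

def condense(points):
    """Stand-in PRS (assumption): Hart's condensed nearest neighbour rule.

    Keeps adding misclassified points until a full pass adds nothing.
    """
    if not points:
        return []
    protos = [points[0]]
    kept = {0}  # indices already selected, so no point is added twice
    changed = True
    while changed:
        changed = False
        for i, (x, y) in enumerate(points):
            if i not in kept and nn_label(protos, x) != y:
                protos.append(points[i])
                kept.add(i)
                changed = True
    return protos

def recursive_prs(points, max_size=200, n_parts=4, rng=None):
    """Recursively subdivide, reduce each subset, then reduce the gathered prototypes.

    max_size and n_parts (must be >= 2) are illustrative parameters.
    """
    if rng is None:
        rng = random.Random(0)
    if len(points) <= max_size:
        return condense(points)
    shuffled = points[:]
    rng.shuffle(shuffled)  # arbitrary subdivision: no clustering is used
    parts = [shuffled[i::n_parts] for i in range(n_parts)]
    gathered = []
    for part in parts:
        gathered.extend(recursive_prs(part, max_size, n_parts, rng))
    # The gathered prototypes are processed by the PRS once more.
    return condense(gathered)

# Usage on synthetic two-class Gaussian data (made up for illustration only).
_rng = random.Random(42)
data = [((_rng.gauss(0, 1), _rng.gauss(0, 1)), 0) for _ in range(1000)]
data += [((_rng.gauss(6, 1), _rng.gauss(6, 1)), 1) for _ in range(1000)]
protos = recursive_prs(data, max_size=200, n_parts=4)
print(len(data), "points reduced to", len(protos), "prototypes")
```

Because each call to the PRS only ever sees a small subset or the gathered prototypes, the costly reduction step never runs on the full data set, which is the computational advantage the abstract refers to.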





Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Sang-Woon Kim (1)
  • B. J. Oommen (2)
  1. IEEE. Div. of Computer Science and Engineering, Myongji University, Yongin, Korea
  2. IEEE. School of Computer Science, Carleton University, Ottawa, Canada
