SSPR/SPR 2002: Structural, Syntactic, and Statistical Pattern Recognition, pp. 528–537
Recursive Prototype Reduction Schemes Applicable for Large Data Sets
Abstract
Most of the Prototype Reduction Schemes (PRS) reported in the literature process the data in its entirety to yield a subset of prototypes that are useful in nearest-neighbour-like classification. Foremost among these are the Prototypes for Nearest Neighbour (PNN) classifiers, the Vector Quantization (VQ) technique, and Support Vector Machines (SVM). These methods suffer from a major disadvantage, namely the excessive computational burden incurred by processing all the data. In this paper, we suggest a recursive and computationally superior mechanism. Rather than process all the data using a PRS, we propose that the data be recursively subdivided into smaller subsets. This recursive subdivision can be arbitrary, and need not utilize any underlying clustering philosophy. The advantage of this is that the PRS processes subsets of data points that effectively sample the entire space to yield smaller subsets of prototypes. These prototypes are then, in turn, gathered and processed by the PRS to yield more refined prototypes. Our experimental results demonstrate that the proposed recursive mechanism yields classification accuracy comparable to that of the best prototype condensation schemes reported to date, for both artificial data sets and real-life data sets. The results especially demonstrate the computational advantage of using such a recursive strategy for large data sets, such as those involved in data mining and text categorization applications.
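The recursive strategy described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: Hart's condensed nearest neighbour rule is used only as a stand-in PRS, the subdivision is a random split (the abstract states that it may be arbitrary), and the names `cnn_reduce`, `recursive_prs`, `max_size`, and `parts` are hypothetical, chosen for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_label(prototypes, point):
    """Label of the prototype closest to `point` (1-NN rule)."""
    return min(prototypes, key=lambda p: sq_dist(p[0], point))[1]


def cnn_reduce(samples):
    """Stand-in PRS (assumption): Hart's condensed nearest neighbour rule.

    `samples` is a list of (vector, label) tuples. Samples are added to the
    store until every sample is classified correctly by the store.
    """
    if not samples:
        return []
    store = [samples[0]]
    changed = True
    while changed:
        changed = False
        for s in samples:
            # Skip stored samples so duplicates with conflicting labels
            # cannot cause an endless loop.
            if s not in store and nearest_label(store, s[0]) != s[1]:
                store.append(s)
                changed = True
    return store


def recursive_prs(samples, prs=cnn_reduce, max_size=50, parts=2, rng=None):
    """Recursively split the data, reduce each part, then reduce the union.

    `max_size` (largest subset handed directly to the PRS) and `parts`
    (number of subsets per split, at least 2) are illustrative parameters.
    """
    rng = rng or random.Random(0)
    if len(samples) <= max_size:
        return prs(samples)
    shuffled = samples[:]
    rng.shuffle(shuffled)  # arbitrary subdivision, no clustering involved
    chunks = [shuffled[i::parts] for i in range(parts)]
    gathered = []
    for chunk in chunks:
        gathered.extend(recursive_prs(chunk, prs, max_size, parts, rng))
    # The gathered prototypes are processed once more by the same PRS.
    return prs(gathered)
```

In this sketch the PRS only ever sees subsets of at most `max_size` points, or the much smaller union of already reduced prototypes, which is where the computational saving for large data sets would come from. Any other PRS with the same list-in, list-out interface could be passed as `prs`.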
Keywords
Support Vector Machine · Travelling Salesman Problem · Near Neighbour · Vector Quantization · Support Vector Machine Method