A Fast Computation of Inter-class Overlap Measures Using Prototype Reduction Schemes
In most Pattern Recognition (PR) applications, it is advantageous if the accuracy (or error rate) of the classifier can be evaluated or bounded before it is tested in a real-life setting. It is also well known that if the two class-conditional distributions have a large overlapping volume, the classification accuracy is poor. This is because, if we intend to use the classification accuracy as a criterion for evaluating a PR system, the points within the overlapping volume tend to have less significance in determining the prototypes. Unfortunately, computing the indices that quantify the overlapping volume is expensive. In this vein, we propose using a Prototype Reduction Scheme (PRS) to compute these indices approximately. In this paper, we show that by completely discarding the points not selected by the PRS, we obtain a reduced set of sample points from which the measures of the overlapping volume can, in turn, be computed. The resulting values are comparable to those obtained with the original training set (i.e., the one which considers all the data points), even though the computations required to obtain the prototypes and the corresponding measures are significantly fewer. The proposed method has been rigorously tested on artificial and real-life data sets, and the results obtained are, in our opinion, quite impressive: the computation is sometimes faster by two orders of magnitude.
Keywords: Prototype Reduction Schemes (PRS), k-Nearest Neighbor (k-NN) Classifier, Data Complexity, Class-Overlapping
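The strategy summarized in the abstract (reduce the training set with a PRS, then compute the overlap measure on the prototypes alone) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, since the abstract names neither a specific PRS nor a specific measure: Hart's Condensed Nearest Neighbor rule stands in for the PRS, a two-class volume-of-overlap index in the style of Ho and Basu's F2 (clipped at zero per feature) stands in for the overlap measure, and the function names `condensed_nn` and `overlap_volume_f2` are hypothetical.

```python
import numpy as np


def condensed_nn(X, y):
    """Hart's Condensed NN rule, used here only as a stand-in PRS (assumption).

    Returns the sorted indices of the retained prototypes.
    """
    keep = [0]                        # seed the store with the first sample
    kept = np.zeros(len(X), dtype=bool)
    kept[0] = True
    changed = True
    while changed:                    # repeat passes until nothing is added
        changed = False
        for i in range(len(X)):
            if kept[i]:
                continue
            # 1-NN classification of sample i using the current store
            d = np.linalg.norm(X[keep] - X[i], axis=1)
            nearest = keep[int(np.argmin(d))]
            if y[nearest] != y[i]:    # misclassified: absorb into the store
                keep.append(i)
                kept[i] = True
                changed = True
    return np.array(sorted(keep))


def overlap_volume_f2(X, y):
    """Two-class volume-of-overlap index (F2-style, clipped at zero).

    Assumes exactly two classes and no feature with zero overall range.
    """
    classes = np.unique(y)
    a, b = X[y == classes[0]], X[y == classes[1]]
    # Per-feature overlap interval of the two class bounding boxes
    lo = np.maximum(a.min(axis=0), b.min(axis=0))
    hi = np.minimum(a.max(axis=0), b.max(axis=0))
    # Per-feature overall range covered by both classes
    span = (np.maximum(a.max(axis=0), b.max(axis=0))
            - np.minimum(a.min(axis=0), b.min(axis=0)))
    return float(np.prod(np.clip(hi - lo, 0.0, None) / span))


if __name__ == "__main__":
    # Synthetic two-class Gaussian data (illustrative only)
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 1.0, (200, 2)),
                   rng.normal(1.5, 1.0, (200, 2))])
    y = np.array([0] * 200 + [1] * 200)

    idx = condensed_nn(X, y)
    print("prototypes kept:", len(idx), "of", len(X))
    print("F2 on full set:   ", overlap_volume_f2(X, y))
    print("F2 on prototypes: ", overlap_volume_f2(X[idx], y[idx]))
```

The sketch only shows the shape of the pipeline. How closely the value computed on the prototypes tracks the value on the full set depends on the PRS and the measure chosen, and nothing here reproduces the experiments reported in the paper.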