Abstract
The k-Nearest Neighbour rule is one of the most popular non-parametric classification techniques in Pattern Recognition. This technique requires a set of good prototypes to represent the pattern classes. One possibility is to use the whole training set as the set of prototypes, but this approach has a high computational cost when the training set is large. Alternatively, clustering techniques allow a training corpus to be described in terms of clusters, where a cluster is formed by patterns that share certain similarities [1]. These clusters can then be represented by a set of prototypes. The selection of adequate prototypes is one of the most important problems in Pattern Recognition.
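As a rough illustration of the idea, the sketch below classifies strings with a k-NN rule over a reduced prototype set. It rests on several assumptions that the abstract does not state: unit-cost edit distance as the dissimilarity measure, one prototype per class rather than per cluster, and the set median (the training string with the smallest summed distance to the others) as a simple stand-in for a true median string. All function names are hypothetical.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance computed by dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def set_median(strings):
    """Member of `strings` with the smallest summed distance to all members.

    Assumption: this is only an approximation of a median string, which
    in general need not belong to the set.
    """
    return min(strings, key=lambda s: sum(edit_distance(s, t) for t in strings))


def extract_prototypes(training):
    """Build one (prototype, label) pair per class.

    `training` maps each class label to a list of strings.
    """
    return [(set_median(samples), label) for label, samples in training.items()]


def knn_classify(x, prototypes, k=1):
    """Label `x` by majority vote among its k nearest prototypes."""
    nearest = sorted(prototypes, key=lambda p: edit_distance(x, p[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


training = {
    "a": ["aaab", "aaaa", "aaba"],
    "b": ["bbbb", "bbab", "bbba"],
}
prototypes = extract_prototypes(training)
print(knn_classify("abaa", prototypes, k=1))
```

With prototypes in place, each query is compared against one string per class instead of every training string, which is the cost reduction the abstract refers to.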
This work was partially supported by the Spanish MCT under projects TIC2000-1703-C03-01 and TIC2000-1599-CO2-01.
References
R. O. Duda, P. Hart, and D. G. Stork, Pattern Classification, (John Wiley, 2001).
D. L. Wilson, Asymptotic properties of nearest neighbour rules using edited data, IEEE Transactions on Systems, Man and Cybernetics, 2 (1972) pp. 408–421.
P. Devijver and J. Kittler, Pattern Recognition: A Statistical Approach. Prentice Hall, Inc., London, 1982.
P. E. Hart, The condensed nearest neighbour rule, IEEE Transactions on Information Theory, 14 (1968), pp. 515–516.
K. S. Fu, Syntactic Pattern Recognition, (Prentice-Hall, 1982).
C. de la Higuera and F. Casacuberta, The topology of strings: two NP-complete problems, Theoretical Computer Science 230 (2000) pp. 39–48.
T. Kohonen, Median strings, Pattern Recognition Letters 3 (1985) pp. 309–313.
F. Kruzslicz, A greedy algorithm to look for median strings, in: Abstracts of the Conference on PhD Students in Computer Science, (Institute of Informatics of the József Attila University, 1988).
I. Fischer and A. Zell, String averages and self-organizing maps for strings, in: Proceedings of the Second ICSC Symposium on Neural Computation, (2000) pp. 208–215.
X. Jiang, A. Munger, and H. Bunke, On Median Graphs: Properties, Algorithms, and Applications, IEEE Transactions on Pattern Analysis and Machine Intelligence 23 (10) (2001) pp. 1144–1151.
F. Casacuberta and M. de Antonio, A greedy algorithm for computing approximate median strings, in: Proceedings of the VII Simposium Nacional de Reconocimiento de Formas y Análisis de Imágenes, (1997) pp. 193–198.
C. D. Martínez, A. Juan and F. Casacuberta, Use of Median String for Classification, in: Proceedings of the 15th International Conference on Pattern Recognition, (Vol. 2, Barcelona, Spain, 2000) pp. 907–910.
C. Martínez, A. Juan and F. Casacuberta, Improving classification using median string and nn rules, in: Proceedings of IX Simposium Nacional de Reconocimiento de Formas y Análisis de Imágenes, (2001) pp. 391–394.
R. Wagner and M. Fischer, The string-to-string correction problem, Journal of the ACM 21 (1974) pp. 168–178.
E. Vidal, A. Marzal and P. Aibar, Fast computation of normalized edit distances, IEEE Transactions on Pattern Analysis and Machine Intelligence 17 (9) (1995) pp. 899–902.
A. Juan and E. Vidal, On the Use of Edit Distances and an Efficient k-NN Search Technique (k-AESA) for Fast and Accurate String Classification, in: Proceedings of the 15th International Conference on Pattern Recognition, (Vol. 2, Barcelona, Spain, 2000) pp. 680–683.
C. Lundsteen, J. Philip and E. Granum, Quantitative Analysis of 6895 Digitized Trypsin G-banded Human Metaphase Chromosomes, Clinical Genetics 18 (1980) pp. 355–370.
E. Granum and M. Thomason, Automatically Inferred Markov Network Models for Classification of Chromosomal Band Pattern Structures, Cytometry 11 (1990) pp. 26–39.
J. Gregor and M. G. Thomason, A Disagreement Count Scheme for Inference of Constrained Markov Networks, in: L. Miclet and C. de la Higuera (Eds.), Grammatical Inference: Learning Syntax from Sentences, (Vol. 1147 of Lecture Notes in Computer Science, Springer, 1996) pp. 168–178.
E. Vidal and M. J. Castro, Classification of Banded Chromosomes using Error-Correcting Grammatical Inference (ECGI) and Multilayer Perceptron (MLP), in: Proceedings of the VII Simposium Nacional de Reconocimiento de Formas y Análisis de Imágenes, (Vol. 1, Bellaterra, Spain, 1997) pp. 31–36.
E. Vidal, M. J. Castro and J. A. Sanchez, Classification of Banded Chromosomes, (tech. rep., DSIC, Universidad Politécnica de Valencia, Spain, 1997).
C. D. Martínez-Hinarejos, A. Juan and F. Casacuberta, Median String for k-Nearest Neighbour classification, Pattern Recognition Letters, accepted for revision.
Copyright information
© 2003 Kluwer Academic Publishers
About this chapter
Cite this chapter
Martínez-Hinarejos, C.D., Juan, A., Casacuberta, F. (2003). Prototype Extraction for k-NN Classifiers using Median Strings. In: Chen, D., Cheng, X. (eds) Pattern Recognition and String Matching. Combinatorial Optimization, vol 13. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-0231-5_18
DOI: https://doi.org/10.1007/978-1-4613-0231-5_18
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-7952-2
Online ISBN: 978-1-4613-0231-5
eBook Packages: Springer Book Archive