Reducing the Computational Cost of Computing Approximated Median Strings

  • Carlos D. Martínez-Hinarejos
  • Alfonso Juan
  • Francisco Casacuberta
  • Ramón Mollineda
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2396)

Abstract

The k-Nearest Neighbour (k-NN) rule is one of the most popular techniques in Pattern Recognition. It requires good prototypes to achieve good results at a reasonable computational cost. When objects are represented by strings, the Median String of a set of strings could be the best prototype for representing the whole set (i.e., the class of the objects). However, obtaining the Median String is an NP-Hard problem, and only approximations to it can be computed at a reasonable computational cost. Although the algorithms proposed for approximating the Median String are polynomial, their computational cost is quite high (cubic order), so obtaining the prototypes remains very costly. In this work, we propose several techniques to reduce this computational cost without degrading the classification performance of the Nearest Neighbour rule.
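
To make the notion of a string prototype concrete, the following is a minimal Python sketch of a simpler, related quantity: the set median, i.e. the member of the set whose summed edit distance to all members is smallest. It is not the algorithm proposed in this paper, and it is not the Median String itself, which minimises the same sum over all possible strings rather than only over the members of the set. The function names `edit_distance` and `set_median` and the unit edit costs are assumptions made for illustration.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit costs (an assumption), computed row by row."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def set_median(strings: Sequence[str]) -> str:
    """Return the member of `strings` with the smallest summed distance to all members."""
    return min(strings, key=lambda s: sum(edit_distance(s, t) for t in strings))
```

For a set of n strings this sketch evaluates on the order of n² edit distances, each of which is quadratic in the string length, which gives a rough sense of why prototype computation becomes expensive even before the search is widened beyond the set.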

Keywords

Edit Distance; Free Monoid; Neighbour Rule; Pattern Recognition Letter; Reasonable Computational Cost

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Carlos D. Martínez-Hinarejos (1)
  • Alfonso Juan (1)
  • Francisco Casacuberta (1)
  • Ramón Mollineda (1)
  1. Departament de Sistemes Informàtics i Computació, Institut Tecnològic d’Informàtica, Universitat Politècnica de València, València, Spain
