Pattern Analysis and Applications, Volume 11, Issue 3–4, pp 353–363

Prototype reduction using an artificial immune model

Theoretical Advances

Abstract

The artificial immune system (AIS)-based approach to pattern classification is relatively new in the field of pattern recognition. This study explores the potential of this paradigm for the prototype selection task, which is primarily effective in improving the classification performance of the nearest-neighbor (NN) classifier and also, in part, in reducing its storage and computing time requirements. The clonal selection model of immunology is used to condense the original prototype set, and performance is verified by applying the proposed technique in a practical optical character recognition (OCR) system as well as by training and testing on a set of benchmark databases available in the public domain. The effect of the control parameters is analyzed, and the efficiency of the method is compared with that of other existing techniques often used for prototype selection. For the OCR system, the empirical study shows that the proposed approach generalizes very well, generating a smaller prototype library from a larger one while substantially improving the classification accuracy of the underlying NN classifier. The improvement in performance has been statistically verified. Results on both the OCR data and the public domain datasets demonstrate that the proposed method performs better than, or at least comparably to, some existing techniques.
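The abstract does not spell out the algorithm, so the following is only a minimal illustrative sketch of the general idea: a small pool of prototypes is repeatedly cloned and "mutated", and a clone is kept when it improves 1-NN accuracy on the training data. It is not the authors' method. The function names (`clonal_reduce`, `nn_predict`, `accuracy`), the parameters (`per_class`, `generations`, `n_clones`), and the mutation rule (swapping a prototype for another training sample of the same class) are all assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nn_predict(protos, x):
    """Label of the prototype nearest to x; protos is a list of (vector, label)."""
    return min(protos, key=lambda p: sq_dist(p[0], x))[1]


def accuracy(protos, data):
    """Fraction of (vector, label) pairs in data that the 1-NN rule gets right."""
    hits = sum(nn_predict(protos, x) == y for x, y in data)
    return hits / len(data)


def clonal_reduce(data, per_class=1, generations=20, n_clones=5, seed=0):
    """Hypothetical clonal-selection-style prototype reduction (assumed scheme).

    Starts from a few random prototypes per class, then repeatedly clones each
    prototype, mutates the clone by swapping in another training sample of the
    same class, and keeps the clone only if training accuracy improves.
    """
    rng = random.Random(seed)
    by_class = {}
    for x, y in data:
        by_class.setdefault(y, []).append((x, y))

    # Initial pool: per_class random samples drawn from each class.
    pool = [p for items in by_class.values()
            for p in rng.sample(items, min(per_class, len(items)))]
    best = accuracy(pool, data)

    for _ in range(generations):
        for i in range(len(pool)):
            label = pool[i][1]
            for _ in range(n_clones):
                candidate = list(pool)
                candidate[i] = rng.choice(by_class[label])  # mutated clone
                score = accuracy(candidate, data)
                if score > best:  # selection: keep only improving clones
                    pool, best = candidate, score
    return pool, best


# Toy data (made up for this example): two well-separated classes.
toy = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
       ((5.0, 5.0), "b"), ((5.1, 4.9), "b"), ((4.9, 5.2), "b")]
reduced, train_acc = clonal_reduce(toy, per_class=1)
```

On the toy data the six training samples are condensed to one prototype per class. Scoring candidates on the training set itself is a simplification; a held-out validation set would give a fairer estimate of generalization.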

Keywords

Nearest neighbor classification, Prototype selection, Artificial immune system, Clonal selection algorithm, Statistical significance

Copyright information

© Springer-Verlag London Limited 2008

Authors and Affiliations

  1. Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India
