
MOPG: a multi-objective evolutionary algorithm for prototype generation

Theoretical Advances. Published in Pattern Analysis and Applications.

Abstract

Prototype generation deals with the problem of generating a small set of instances, from a large data set, to be used by KNN for classification. Two key aspects must be considered when developing a prototype generation method: (1) the generalization performance of a KNN classifier when using the prototypes; and (2) the amount of data set reduction, given by the number of prototypes. The two factors are in conflict because, in general, maximizing data set reduction implies decreasing accuracy, and vice versa. The problem can therefore be naturally approached with multi-objective optimization techniques. This paper introduces a novel multi-objective evolutionary algorithm for prototype generation whose objectives are precisely the amount of reduction and an estimate of the generalization performance achieved by the selected prototypes. Through a comprehensive experimental study, we show that the proposed approach outperforms most of the prototype generation methods proposed so far. Specifically, it obtains prototypes that offer a better trade-off between accuracy and reduction than alternative methodologies.
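To make the two conflicting objectives concrete: a candidate prototype set can be scored by the error of a 1-NN classifier that uses only the prototypes, together with the fraction of the original data it retains (both to be minimized). The following is a minimal illustrative sketch of such an evaluation; the function names and exact formulation are ours, not the paper's implementation.

```python
import numpy as np

def nearest_neighbor_accuracy(prototypes, proto_labels, X, y):
    """Accuracy of a 1-NN classifier that uses only the prototypes."""
    # Pairwise squared Euclidean distances between samples and prototypes.
    d = ((X[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    # Each sample is assigned the label of its nearest prototype.
    pred = proto_labels[d.argmin(axis=1)]
    return (pred == y).mean()

def objectives(prototypes, proto_labels, X, y):
    """The two conflicting objectives, both to be minimized:
    classification error and the fraction of the data retained."""
    error = 1.0 - nearest_neighbor_accuracy(prototypes, proto_labels, X, y)
    retained = len(prototypes) / len(X)  # lower means more reduction
    return error, retained
```

A multi-objective evolutionary algorithm would evolve a population of prototype sets scored this way and return the non-dominated front, from which a user can pick the preferred accuracy/reduction trade-off.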


Notes

  1. Note that there are works comparing a few prototype selection (PS) and prototype generation (PG) methods over a small number of data sets [19, 22].

  2. Note that the considered data sets include both numeric and nominal attributes. For simplicity, we have deliberately transformed nominal attributes into integers and applied MOPG without any modification.

  3. Note that, in general, large populations in evolutionary algorithms do not necessarily yield better performance. With a smaller population the search space is explored less extensively, which is beneficial for avoiding overfitting.

  4. Note that all 25 methods were evaluated on small data sets, but only 20 of them were evaluated on large data sets [28]. The remaining five were excluded from the large data sets because they were too computationally expensive; see [28] for details.

  5. See also http://sci2s.ugr.es/pgtax/.

  6. This is the statistical test recommended by Demsar for comparing classification methods over multiple data sets [11].
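For comparing multiple classifiers over multiple data sets, Demšar [11] recommends the Friedman test, which ranks the methods on each data set and tests whether the average ranks differ significantly. As a rough illustration (our own sketch, not the paper's code, and ignoring tied scores for simplicity):

```python
def friedman_statistic(scores):
    """Friedman chi-square statistic for k methods over N data sets.
    scores[i][j] is the performance of method j on data set i
    (higher is better). Illustrative sketch; ties are not handled."""
    n = len(scores)
    k = len(scores[0])
    # Sum of ranks per method (rank 1 = best on that data set).
    rank_sums = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: row[j], reverse=True)
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    avg_ranks = [s / n for s in rank_sums]
    chi2 = (12.0 * n / (k * (k + 1))) * (
        sum(r * r for r in avg_ranks) - k * (k + 1) ** 2 / 4.0
    )
    return chi2, avg_ranks
```

When the statistic is significant, a post-hoc test (e.g. the Nemenyi test) is then used to compare pairs of methods.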

References

  1. Aler R, Handl J, Knowles JD (2013) Comparing multi-objective and threshold-moving roc curve generation for a prototype-based classifier. In: Proceedings of the fifteenth annual conference on Genetic and evolutionary computation conference. ACM, pp 1029–1036

  2. Cervantes A, Galvan IM, Isasi P (2009) AMPSO: a new particle swarm method for nearest neighborhood classification. IEEE Trans. Sys. Man Cybern. B 39(5):1082–1091


  3. Chatelain C, Adam S, Lecourtier Y, Heutte L, Paquet T (2010) A multi-model selection framework for unknown and/or evolutive misclassification cost problems. Pattern Recogn. 43(3):815–823


  4. Chen JH, Chen HM, Ho SY (2005) Design of nearest neighbor classifiers: multi-objective approach. Int. J. Approx. Reason. 40:3–22


  5. Coello Coello CA, Lamont GB, Veldhuizen DAV (2007) Evolutionary algorithms for solving multi-objective problems. Genetic and evolutionary computation, 2nd edn. Springer, USA

  6. Cover T, Hart P (1967) Nearest neighbor pattern classification. IEEE Trans. Inform. Theory 13(1):21–27


  7. Cruz-Vega I, Garcia-Limon M, Escalante HJ (2014) Adaptive surrogates with a neuro-fuzzy network and granular computing. In: Proceedings of GECCO 2014. ACM Press, pp 761–768

  8. Deb K (2001) Multi-objective optimization using evolutionary algorithms. Wiley

  9. Deb K, Pratap A, Agarwal S, Meyarivan T (2002) A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6(2):182–197


  10. Decaestecker C (1997) Finding prototypes for nearest neighbour classification by means of gradient descent and deterministic annealing. Pattern Recogn. 30(2):281–288


  11. Demsar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30


  12. Dos-Santos EM, Sabourina R, Maupinb P (2008) A dynamic overproduce-and-choose strategy for the selection of classifier ensembles. Pattern Recogn. 41:2993–3009


  13. Eiben AE, Smith JE (2010) Introduction to evolutionary computing. Natural computing. Springer

  14. Escalante HJ, Mendoza KM, Graff M, Morales-Reyes A (2013) Genetic programming of prototypes for pattern classification. In: Proceedings of IbPRIA 2013, vol. 7887 of LNCS. Springer, pp 100–107

  15. Fernandez F, Isasi P (2004) Evolutionary design of nearest prototype classifiers. J. Heuristics 10:431–454


  16. Garain U (2008) Prototype reduction using an artificial immune system. Pattern Anal. Appl. 11(3–4):353–363


  17. García S, Derrac J, Cano JR, Herrera F (2012) Prototype selection for nearest neighbor classification: Taxonomy and empirical study. IEEE Trans. Pattern Anal. Mach. Intell. 34(3):417–435


  18. Hastie T, Tibshirani R, Friedman J (2001) The elements of statistical learning. Springer, New York


  19. Kim SW, Oommen BJ (2003) A brief taxonomy and ranking of creative prototype reduction schemes. Pattern Anal. Appl. 6:232–244


  20. Koplowitz J, Brown T (1981) On the relation of performance to editing in nearest neighbor rules. Pattern Recogn. 13(3):251–255


  21. Li J, Wang Y (2013) A nearest prototype selection algorithm using multi-objective optimization and partition. In: Proceedings of the 9th International Conference on Computational Intelligence and Security. IEEE, pp. 264–268

  22. Lozano M, Sotoca JM, Sánchez JS, Pla F, Pkalska E, Duin RPW (2006) Experimental study on prototype optimisation algorithms for prototype-based classification in vector spaces. Pattern Recogn. 39(10):1827–1838


  23. Nanni L, Lumini A (2008) Particle swarm optimization for prototype reduction. Neurocomputing 72(4–6):1092–1097


  24. Olvera A, Carrasco-Ochoa JA, Martinez-Trinidad JF, Kittler J (2010) A review of instance selection methods. Artif. Intell. Rev. 34:133–143


  25. Storn R, Price KV (1997) Differential evolution: a simple and efficient heuristic for global optimization over continuous spaces. J. Global Optim. 11(4):341–359


  26. Rosales A, Coello CA, Gonzalez J, Reyes CA, Escalante HJ (2013) A hybrid surrogate-based approach for evolutionary multi-objective optimization. In: Proceedings of Congress on Evolutionary Computation 2013. IEEE, pp 2548–2555

  27. Rosales A, Gonzalez J, Coello CA, Escalante HJ, Reyes CA (2014) Surrogate-assisted multi-objective model selection for support vector machines. Neurocomputing (in press)

  28. Triguero I, Derrac J, García S, Herrera F (2012) A taxonomy and experimental study on prototype generation for nearest neighbor classification. IEEE Trans. Sys. Man Cybern. C 42(1):86–100


  29. Triguero I, Peralta D, Bacardit J, Garcia S, Herrera F (2014) MRPR: a mapreduce solution for prototype reduction in big data classification. Neurocomputing (in press)

  30. Triguero I, Garcia S, Herrera F (2011) Differential evolution for optimizing the positioning of prototypes in nearest neighbor classification. Pattern Recogn. 44:901–916


  31. Wu X, Kumar V, Quinlan JR, Ghosh J, Yang Q, Motoda H, McLachlan GJ, Ng A, Liu B, Yu PS, Zhou ZH, Steinbach M, Hand DJ, Steinberg D (2007) Top 10 algorithms in data mining. Knowl. Inf. Syst. 14(1):1–37

  32. Xia H, Zhuang J, Yu D (2013) Novel soft subspace clustering with multi-objective evolutionary approach for high-dimensional data. Pattern Recogn. 46:2562–2575



Acknowledgments

This work was partially supported by the LACCIR programme under project ID R1212LAC006. Hugo Jair Escalante was supported by the internships programme of CONACyT under grant No. 234415.

Author information

Correspondence to Hugo Jair Escalante.


Cite this article

Escalante, H.J., Marin-Castro, M., Morales-Reyes, A. et al. MOPG: a multi-objective evolutionary algorithm for prototype generation. Pattern Anal Applic 20, 33–47 (2017). https://doi.org/10.1007/s10044-015-0454-6
