Hybrid genetic algorithm for dual selection

Ros, Frederic; Guillaume, Serge; Pintore, Marco; Chrétien, Jacques R.

doi:10.1007/s10044-007-0089-3


Theoretical Advances
Published: 19 October 2007

Volume 11, pages 179–198, (2008)

Pattern Analysis and Applications


Abstract

This paper proposes a hybrid genetic approach for designing a sub-database of an original database that offers the highest classification performance, the lowest number of features and the highest number of patterns. The method treats the dual problem of editing instance patterns and selecting features as a single optimization problem, and thereby aims to provide a better level of information. The search is optimized by dividing the algorithm into self-controlled phases managed by a combination of a pure genetic process and dedicated local approaches. Heuristics such as an adapted chromosome structure and an evolutionary memory are introduced to promote diversity and elitism in the genetic population. They particularly facilitate the resolution of real applications in the chemometric field, where databases have large feature sizes and medium cardinalities. The study pursues the double objective of enhancing the reliability of the results while reducing computation time, combining genetic exploration with a local approach in such a way that excessive CPU costs are avoided. The usefulness of the method is demonstrated on artificial and real data, and its performance is compared with that of other approaches.
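To make the idea of dual selection concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes a chromosome made of two bit masks (one over features, one over patterns), a leave-one-out 1-NN accuracy as the classification criterion, a weighted-sum fitness, and a single pass of feature bit flips as the local step. All function names, the weights `w_feat` and `w_pat`, and the toy data are hypothetical choices made for illustration; the self-controlled phases and evolutionary memory mentioned in the abstract are not modelled.

```python
import random

def nn_accuracy(X, y, feat_mask, pat_mask):
    """Leave-one-out 1-NN accuracy using only the selected features and patterns."""
    feats = [j for j, keep in enumerate(feat_mask) if keep]
    refs = [i for i, keep in enumerate(pat_mask) if keep]
    if not feats or not refs:
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        candidates = [r for r in refs if r != i]  # a pattern is never its own neighbour
        if not candidates:
            continue
        nearest = min(candidates,
                      key=lambda r: sum((xi[j] - X[r][j]) ** 2 for j in feats))
        correct += y[nearest] == y[i]
    return correct / len(X)

def fitness(X, y, chrom, w_feat=0.1, w_pat=0.1):
    """Assumed weighted sum: reward accuracy and kept patterns, penalise kept features."""
    feat_mask, pat_mask = chrom
    acc = nn_accuracy(X, y, feat_mask, pat_mask)
    return (acc - w_feat * sum(feat_mask) / len(feat_mask)
                + w_pat * sum(pat_mask) / len(pat_mask))

def random_chrom(n_feat, n_pat, rng):
    return ([rng.randint(0, 1) for _ in range(n_feat)],
            [rng.randint(0, 1) for _ in range(n_pat)])

def crossover(a, b, rng):
    # Uniform crossover applied separately to the feature mask and the pattern mask.
    return tuple([rng.choice(pair) for pair in zip(ma, mb)] for ma, mb in zip(a, b))

def mutate(chrom, rate, rng):
    # Flip each bit independently with probability `rate`.
    return tuple([bit ^ (rng.random() < rate) for bit in mask] for mask in chrom)

def local_search(X, y, chrom, score):
    """Local step: try flipping each feature bit once, keep only improving flips."""
    feat_mask, pat_mask = list(chrom[0]), list(chrom[1])
    for j in range(len(feat_mask)):
        feat_mask[j] ^= 1
        new_score = fitness(X, y, (feat_mask, pat_mask))
        if new_score > score:
            score = new_score
        else:
            feat_mask[j] ^= 1  # revert a non-improving flip
    return (feat_mask, pat_mask), score

def hybrid_ga(X, y, pop_size=20, generations=15, mut_rate=0.05, seed=0):
    rng = random.Random(seed)
    n_feat, n_pat = len(X[0]), len(X)
    pop = [random_chrom(n_feat, n_pat, rng) for _ in range(pop_size)]
    best, best_score = None, float("-inf")
    for _ in range(generations):
        scored = sorted(((fitness(X, y, c), c) for c in pop),
                        key=lambda t: t[0], reverse=True)
        top_score, top = scored[0]
        top, top_score = local_search(X, y, top, top_score)  # refine the current leader
        if top_score > best_score:
            best, best_score = top, top_score
        parents = [c for _, c in scored[:pop_size // 2]]
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents), rng),
                           mut_rate, rng) for _ in range(pop_size - 1)]
        pop = [best] + children  # elitism: the best chromosome found so far survives
    return best, best_score

def make_toy(n=30, seed=1):
    """Toy data: feature 0 separates the two classes, features 1 and 2 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([label + rng.gauss(0, 0.2), rng.uniform(-5, 5), rng.uniform(-5, 5)])
        y.append(label)
    return X, y

if __name__ == "__main__":
    X, y = make_toy()
    (feat_mask, pat_mask), score = hybrid_ga(X, y)
    print("features kept:", feat_mask, "patterns kept:", sum(pat_mask), "fitness:", score)
```

Encoding both masks in one chromosome is what lets a single fitness value trade accuracy against the number of features and patterns; restricting the local step to the feature mask is only one way to keep its cost low in this sketch.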



Author information

Authors and Affiliations

GEMALTO, avenue de la Pomme de Pin, St. Cyr en Val, 45060, Orléans Cedex, France
Frederic Ros
Cemagref, 34000, Montpellier, France
Serge Guillaume
BioChemics Consulting, 16 rue Leonard de Vinci, 45074, Orléans Cedex 2, France
Marco Pintore & Jacques R. Chrétien


Corresponding author

Correspondence to Frederic Ros.


About this article

Cite this article

Ros, F., Guillaume, S., Pintore, M. et al. Hybrid genetic algorithm for dual selection. Pattern Anal Applic 11, 179–198 (2008). https://doi.org/10.1007/s10044-007-0089-3


Received: 15 August 2006
Accepted: 21 August 2007
Published: 19 October 2007
Issue Date: June 2008
DOI: https://doi.org/10.1007/s10044-007-0089-3
