Pattern Analysis and Applications, Volume 13, Issue 3, pp 273–288

Domain-independent feature extraction for multi-classification using multi-objective genetic programming

Theoretical Advances


We propose three model-free feature extraction approaches for solving the multi-class classification problem. We use multi-objective genetic programming (MOGP) to derive (near-)optimal feature extraction stages as a precursor to classification with a simple and fast-to-train classifier. Statistically founded comparisons are made between our three proposed approaches and seven conventional classifiers over seven datasets from the UCI Machine Learning database. We also make comparisons with other reported evolutionary computation techniques. On almost all the benchmark datasets, the MOGP approaches give performance better than or identical to that of the best of the conventional methods. Of our proposed MOGP-based algorithms, we conclude that hierarchical feature extraction performs best on multi-classification problems.


Keywords: Feature extraction; Multi-classification; Multi-objective optimization; Genetic programming; Confidence measurement; Feature partitioning



One of us (YZ) is grateful for the financial support of a Universities UK Overseas Research Student Award Scheme (ORSAS) scholarship and the Henry Lester Trust.



Copyright information

© Springer-Verlag London Limited 2009

Authors and Affiliations

  1. Laboratory for Image and Vision Engineering, Department of Electronic and Electrical Engineering, University of Sheffield, Sheffield, UK
