Skip to main content

Advertisement

Log in

Multi-objective genetic programming for feature extraction and data visualization

  • Methodologies and Application
  • Published:
Soft Computing Aims and scope Submit manuscript

Abstract

Feature extraction transforms high-dimensional data into a new subspace of lower dimensionality while keeping the classification accuracy. Traditional algorithms do not consider the multi-objective nature of this task. Data transformations should improve the classification performance on the new subspace, as well as to facilitate data visualization, which has attracted increasing attention in recent years. Moreover, new challenges arising in data mining, such as the need to deal with imbalanced data sets call for new algorithms capable of handling this type of data. This paper presents a Pareto-based multi-objective genetic programming algorithm for feature extraction and data visualization. The algorithm is designed to obtain data transformations that optimize the classification and visualization performance both on balanced and imbalanced data. Six classification and visualization measures are identified as objectives to be optimized by the multi-objective algorithm. The algorithm is evaluated and compared to 11 well-known feature extraction methods, and to the performance on the original high-dimensional data. Experimental results on 22 balanced and 20 imbalanced data sets show that it performs very well on both types of data, which is its significant advantage over existing feature extraction algorithms.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8

Similar content being viewed by others

References

  • Alcalá R, Alcalá-Fdez J, Gacto MJ, Herrera F (2008) On the use of multiobjective genetic algorithms to improve the accuracy-interpretability trade-off of fuzzy rule-based systems. In: Multi-objective evolutionary algorithms for knowledge discovery from data bases, vol 98. Springer, New York, pp 91–107

  • Alcalá-Fdez J, Fernández A, Luengo J, Derrac J, García S, Sánchez L, Herrera F (2011) KEEL data-mining software tool: data set repository, integration of algorithms and experimental analysis framework. Anal Framew J Mult-Valued Log Soft Comput 17:255–287

    Google Scholar 

  • Bae SH, Choi JY, Qiu J, Fox GC (2010) Dimension reduction and visualization of large high-dimensional data via interpolation. In: Proceedings of the 19th ACM international symposium on high performance distributed computing, pp 203–214

  • Ben-David A (2008) About the relationship between ROC curves and Cohen’s kappa. Eng Appl Artif Intell 21(6):874–882

    Article  MathSciNet  Google Scholar 

  • Bertini E, Tatu A, Keim D (2011) Quality metrics in high-dimensional data visualization: an overview and systematization. IEEE Trans Vis Comput Graph 17(12):2203–2212

    Article  Google Scholar 

  • Bezdek JC, Pal NR (1998) Some new indexes of cluster validity. IEEE Trans Syst Man Cybern Part B Cybern 28(3):301–315

    Article  Google Scholar 

  • Biber D (1992) The multi-dimensional approach to linguistic analyses of genre variation: an overview of methodology and findings. Comput Humanit 26(5):331–345

    Google Scholar 

  • Borg I, Groenen PJF (2005) Modern multidimensional scaling: theory and applications. In: Springer series in statistics. Springer, New York

  • Cai D (2012) Matlab codes for dimensionality reduction (subspace learning). http://www.cad.zju.edu.cn/home/dengcai/Data/DimensionReduction.html

  • Cai D, He X, Han J (2007a) Spectral regression for efficient regularized subspace learning. In: Proceedings of the IEEE international conference on computer vision, pp 1–8

  • Cai D, He X, Zhou K, Han J, Bao H (2007b) Locality sensitive discriminant analysis. In: Proceedings of the international joint conference on artificial intelligence, pp 1713–1726

  • Cano A, Ventura S (2014) Gpu-parallel subtree interpreter for genetic programming. In: Proceedings of the conference on genetic and evolutionary computation, pp 887–894

  • Cano A, Zafra A, Ventura S (2012) Speeding up the evaluation phase of GP classification algorithms on GPUs. Soft Comput 16(2):187–202

    Article  Google Scholar 

  • Cano A, Zafra A, Ventura S (2015a) Speeding up multiple instance learning classification rules on GPUs. Knowl Inf Syst 44(1):127–145

  • Cano A, Luna JM, Zafra A, Ventura S (2015b) A classification module for genetic programming algorithms in JCLEC. J Mach Learn Res 16:491–494

  • Comon P (1994) Independent component analysis, a new concept? Signal Process 36(3):287–314

    Article  MATH  Google Scholar 

  • Davies DL, Bouldin DW (1979) A cluster separation measure. IEEE Trans Pattern Anal Mach Intell 1(2):224–227

    Article  Google Scholar 

  • Deb K, Pratap A, Agarwal S, Meyarivan T (2002) A fast elitist multi-objective genetic algorithm: NSGA-II. IEEE Trans Evol Comput 6(2):182–197

    Article  Google Scholar 

  • Derrac J, García S, Hui S, Nagaratnam Suganthan P, Herrera F (2014) Analyzing convergence performance of evolutionary algorithms: a statistical approach. Inf Sci 289:41–58

  • Dhir CS, Lee J, Lee SY (2012) Extraction of independent discriminant features for data with asymmetric distribution. Knowl Inf Syst 30(2):359–375

    Article  Google Scholar 

  • Espejo PG, Ventura S, Herrera F (2010) A survey on the application of genetic programming to classification. IEEE Trans Syst Man Cybern Part C (Appl Rev) 40(2):121–144

  • Fayyad U, Grinstein GG, Wierse A (2001) Information visualization in data mining and knowledge discovery. Morgan Kaufmann, San Francisco

  • Fernández A, González AM, Díaz J, Dorronsoro JR (2015) Diffusion maps for dimensionality reduction and visualization of meteorological data. Neurocomputing 163:25–37

    Article  Google Scholar 

  • Fernández-Blanco E, Rivero D, Gestal M, Dorado J (2013) Classification of signals by means of genetic programming. Soft Comput 17(10):1929–1937

    Article  Google Scholar 

  • Ferreira de Oliveira MC, Levkowitz H (2003) From visual data exploration to visual data mining: a survey. IEEE Trans Vis Comput Graph 9(3):378–394

    Article  Google Scholar 

  • Ferri C, Hernandez-Orallo J, Modroiu R (2009) An experimental comparison of performance measures for classification. Pattern Recognit Lett 30(1):27–38

    Article  Google Scholar 

  • Fisher RA (1936) The use of multiple measurements in taxonomic problems. Ann Eugen 7:179–188

    Article  Google Scholar 

  • Fradkin D, Madigan D (2003) Experiments with random projections for machine learning. In: Proceedings of the SIGKDD international conference on knowledge discovery and data mining, pp 517–522

  • García S, Herrera F (2008) An extension on “statistical comparisons of classifiers over multiple data sets” for all pairwise comparisons. J Mach Learn Res 9:2677–2694

    MATH  Google Scholar 

  • García S, Molina D, Lozano M, Herrera F (2009) Study on the use of non-parametric tests for analyzing the evolutionary algorithms’ behaviour: a case study. J Heuristics 15:617–644

    Article  MATH  Google Scholar 

  • García S, Fernández A, Luengo J, Herrera F (2010) Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power. Inf Sci 180(10):2044–2064

    Article  Google Scholar 

  • Gisbrecht A, Hammer B (2015) Data visualization by nonlinear dimensionality reduction. Wiley Interdiscip Rev Data Min Knowl Discov 5(2):51–73

    Article  Google Scholar 

  • Guo H, Jack LB, Nandi AK (2005) Feature generation using genetic programming with application to fault classification. IEEE Trans Syst Man Cybern Part B Cybern 35(1):89–99

    Article  Google Scholar 

  • Guyon I, Gunn S, Nikravesh M, Zadeh LA (2006) Feature extraction: foundations and applications. In: Studies in fuzziness and soft computing. Springer, New York

  • He H, Garcia EA (2009) Learning from imbalanced data. IEEE Trans Knowl Data Eng 21(9):1263–1284

    Article  Google Scholar 

  • Huang J, Ling CX (2005) Using AUC and accuracy in evaluating learning algorithms. IEEE Trans Knowl Data Eng 17(3):299–310

    Article  Google Scholar 

  • Hubert LJ, Levin JR (1976) A general statistical framework for assessing categorical clustering in free recall. Psychol Bull 78(6):1072–1080

    Article  Google Scholar 

  • Icke I, Rosenberg A (2011) Multi-objective genetic programming for visual analytics. In: Proceedings of the European conference on genetic programming, pp 322–334

  • Kohonen T (1982) Self-organized formation of topologically correct feature maps. Biol Cybern 43(1):59–69

    Article  MathSciNet  MATH  Google Scholar 

  • Krawiec K (2002) Genetic programming-based construction of features for machine learning and knowledge discovery tasks. Genet Program Evol Mach 3:329–343

    Article  MATH  Google Scholar 

  • Lee JA, Verleysen M (2010) Unsupervised dimensionality reduction: overview and recent advances. In: Proceedings of the IJCNN IEEE world congress on computational intelligence, pp 4163–4170

  • Liu H, Motoda H (1998) Feature extraction, construction and selection: a data mining perspective. Kluwer Academic Publishers, Norwell

    Book  MATH  Google Scholar 

  • Liu B, Xiao Y, Yu PS, Hao Z, Cao L (2014) An efficient orientation distance-based discriminative feature extraction method for multi-classification. Knowl Inf Syst 39(2):409–433

    Article  Google Scholar 

  • López V, Fernández A, García S, Palade V, Herrera F (2013) An insight into classification with imbalanced data: empirical results and current trends on using data intrinsic characteristics. Inf Sci 250:113–141

    Article  Google Scholar 

  • Mckay RI, Hoai NX, Whigham PA, Shan Y, O’Neill M (2010) Grammar-based genetic programming: a survey. Genet Program Evol Mach 11(3–4):365–396

    Article  Google Scholar 

  • Mukhopadhyay A, Maulik U, Bandyopadhyay S, Coello CA (2014) A survey of multiobjective evolutionary algorithms for data mining: part I. IEEE Trans Evol Comput 18(1):4–19

    Article  Google Scholar 

  • Neshatian K, Zhang M, Johnston M (2007) Feature construction and dimension reduction using genetic programming. In: Orgun MA, Thornton J (eds) AI 2007: advances in artificial intelligence. Lecture notes in computer science, vol 4830, pp 160–170

  • Pearson K (1901) On lines and planes of closest fit to systems of points in space. Philos Mag 2(6):559–572

    Article  MATH  Google Scholar 

  • Sammon JW (1969) A nonlinear mapping for data structure analysis. IEEE Trans Comput 18(5):401–409

    Article  Google Scholar 

  • Sanger TD (1989) Optimal unsupervised learning in a single-layer linear feedforward neural network. Neural Netw 2(6):459–473

    Article  Google Scholar 

  • Schölkopf B, Smola A, Müller KR (1998) Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput 10(5):1299–1319

    Article  Google Scholar 

  • van der Maaten L, Postma EO, van den Herik HJ (2009) Dimensionality reduction: a comparative review. Technical report, Tilburg University Technical Report, TiCC-TR 2009–005

  • Venna J, Peltonen J, Nybo K, Aidos H, Kaski S (2010) Information retrieval perspective to nonlinear dimensionality reduction for data visualization. J Mach Learn Res 11:451–490

    MathSciNet  MATH  Google Scholar 

  • Verleysen M, Franois D (2005) The curse of dimensionality in data mining and time series prediction. In: Cabestany J, Prieto A, Sandoval F (eds) Computational intelligence and bioinspired systems. Lecture notes in computer science, vol 3512. Springer, Berlin, pp 758–770

  • Wilcoxon F (1945) Individual comparisons by ranking methods. Biom Bull 1(6):80–83

    Article  Google Scholar 

  • Yeh TT, Chen TY, Chen YC, Wei HW (2011) Parallel non-linear dimension reduction algorithm on GPU. Int J GranuL Comput Rough Sets Intell Syst 2(2):149–165

    Article  Google Scholar 

  • Zhang Y, Rockett PI (2006) Feature extraction using multi-objective genetic programming. In: Jin Y (ed) Multi-objective machine learning. Studies in computational intelligence, vol 16, chapter 4. Springer, New York, pp 79–106

  • Zhang Y, Rockett PI (2007) Multiobjective genetic programming feature extraction with optimized dimensionality. In: Soft computing in industrial applications. Advances in soft computing, vol 39. Springer, New York, pp 159–168

  • Zhang Y, Rockett PI (2009) A generic multi-dimensional feature extraction method using multiobjective genetic programming. Evol Comput 17(1):89–115

    Article  Google Scholar 

  • Zhang Y, Rockett PI (2010) Domain-independent feature extraction for multi-classification using multi-objective genetic programming. Pattern Anal Appl 13:273–288

    Article  MathSciNet  Google Scholar 

Download references

Acknowledgments

This work has been supported by the National Institutes of Health Grant 1R01HD056235-01A1 (KJC), the Spanish Ministry of Economy and Competitiveness Project TIN2014-55252-P and FEDER funds (SV).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Sebastián Ventura.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Additional information

Communicated by V. Loia.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Cano, A., Ventura, S. & Cios, K.J. Multi-objective genetic programming for feature extraction and data visualization. Soft Comput 21, 2069–2089 (2017). https://doi.org/10.1007/s00500-015-1907-y

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00500-015-1907-y

Keywords

Navigation