A novel crossover operator based on variable importance for evolutionary multi-objective optimization with tree representation

Journal of Heuristics

Abstract

Selecting reliable predictors has always been crucial in classification. Decision trees in particular are popular for solving supervised variable selection and classification problems. Variable selection sometimes has to account for acquisition costs, which are incurred whenever the respective variable is measured for a new observation. Balancing the predictive power of the model against these costs is then a multi-objective optimization problem, which can be solved with meta-heuristics such as evolutionary multi-objective algorithms. In this paper, we present a non-hierarchical evolutionary multi-objective tree learner (NHEMOtree) that is based on genetic programming and uses a binary decision tree representation to handle multi-objective optimization problems with equitable optimization criteria. We apply this tree learner to a multi-objective classification problem from medicine and to simulated data, and evaluate its performance relative to two wrapper approaches that are based on either NSGA-II or SMS-EMOA with a bitstring representation and that use CART as the enclosed classification algorithm. Moreover, we introduce a novel crossover operator based on a multi-objective variable importance measure. NHEMOtree's performance can be improved with this crossover operator.
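The trade-off between predictive power and acquisition cost described above can be illustrated with a small Pareto filter. The following is only a minimal sketch and not the NHEMOtree implementation: it assumes each candidate tree is summarized by a pair (misclassification rate, acquisition cost), that both objectives are minimized, and the function names and numbers are hypothetical values chosen for illustration.

```python
from typing import List, Tuple

# Assumption: a candidate tree is summarized as
# (misclassification_rate, acquisition_cost), and both are minimized.
Candidate = Tuple[float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Hypothetical trees; the values are invented for illustration only.
trees = [(0.10, 50.0), (0.15, 20.0), (0.12, 60.0), (0.30, 5.0), (0.15, 25.0)]
front = pareto_front(trees)
print(front)
```

Here the tree (0.12, 60.0) is dropped because (0.10, 50.0) is both more accurate and cheaper, while the three remaining trees each represent a different compromise between error and cost. An evolutionary multi-objective algorithm searches for such a set of compromises instead of a single best tree.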

Acknowledgments

This work was supported by the Federal Office for Radiation Protection, Neuherberg, Germany [StSch 4528]; Deutsche Forschungsgemeinschaft (DFG) [SCHW1508/3-1 to H.S.]; and DFG within the Collaborative Research Center SFB 876 “Providing Information by Resource-Constrained Analysis”, project C4, to K.I.

Author information

Corresponding author

Correspondence to Swaantje Casjens.

About this article

Cite this article

Casjens, S., Schwender, H., Brüning, T. et al. A novel crossover operator based on variable importance for evolutionary multi-objective optimization with tree representation. J Heuristics 21, 1–24 (2015). https://doi.org/10.1007/s10732-014-9269-7

  • DOI: https://doi.org/10.1007/s10732-014-9269-7