Robust Pareto Front Genetic Programming Parameter Selection Based on Design of Experiments and Industrial Data

  • Flor Castillo
  • Arthur Kordon
  • Guido Smits
Part of the Genetic and Evolutionary Computation book series (GEVO)


Symbolic regression based on Pareto-Front GP is a very effective approach for generating high-performance, parsimonious empirical models acceptable for industrial applications. The chapter addresses the issue of finding the optimal parameter settings of Pareto-Front GP that direct the simulated evolution toward simple models with acceptable prediction error. A generic methodology based on statistical design of experiments is proposed. It includes determining the number of replicates from half-width confidence intervals, identifying the significant factors with a fractional factorial design of experiments, approaching the optimum by steepest ascent/descent, and exploring locally around the optimum with a Box-Behnken design of experiments. The results from applying the proposed methodology to different types of industrial data sets show that the statistically significant factors are the number of cascades, the number of generations, and the population size. The optimal values for the three parameters have been defined based on second-order regression models with R² higher than 0.97 for small, medium, and large-sized data sets. The robustness of the optimal parameters toward the types of data sets was explored, and a robust setting for the three significant parameters was obtained. It reduces the calculation time by 30% to 50% without a statistically significant reduction in the mean response.
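Two of the steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The replicate count uses a normal approximation in place of the t distribution, which is an assumption made here to keep the example within the standard library. The Box-Behnken construction is the usual pairwise one for three coded factors, and the factor ranges passed to the hypothetical helper `decode` are placeholders, not values from the chapter.

```python
from itertools import combinations, product
from math import ceil
from statistics import NormalDist


def replicates_for_half_width(std_dev, half_width, confidence=0.95):
    """Replicates needed so that the confidence interval on the mean
    response has at most the requested half width.

    Assumption: normal approximation, h = z * s / sqrt(n).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * std_dev / half_width) ** 2)


def box_behnken(n_factors=3, n_center=3):
    """Box-Behnken design in coded units (-1, 0, +1).

    Every pair of factors takes all four (+/-1, +/-1) combinations while
    the remaining factors stay at 0; center runs are appended at the end.
    """
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0] * n_factors
            run[i], run[j] = a, b
            runs.append(tuple(run))
    runs.extend([(0,) * n_factors] * n_center)
    return runs


def decode(run, lows, highs):
    """Map a coded run to real factor settings (hypothetical helper)."""
    return tuple(
        (lo + hi) / 2 + x * (hi - lo) / 2
        for x, lo, hi in zip(run, lows, highs)
    )


if __name__ == "__main__":
    # Placeholder ranges for (cascades, generations, population size).
    lows, highs = (10, 100, 100), (50, 500, 500)
    print("replicates:", replicates_for_half_width(0.1, 0.05))
    for run in box_behnken():
        print(run, "->", decode(run, lows, highs))
```

For three factors the design has 12 edge runs plus the chosen number of center runs. The responses measured at these settings would then be fitted with a second-order regression model, which is outside the scope of this sketch.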


Genetic programming, symbolic regression, industrial applications, design of experiments, parameter selection





Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  • Flor Castillo (1)
  • Arthur Kordon (1)
  • Guido Smits (2)

  1. The Dow Chemical Company, Freeport
  2. Dow Benelux, Terneuzen, The Netherlands
