“Better solutions faster” is the reality of the industrial modeling world, now more than ever. Efficiency requirements, market pressures, and ever-changing data force us to use symbolic regression via genetic programming (GP) in a highly automated fashion. This is why we want our GP system to produce simple solutions of the highest possible quality with the lowest computational effort, and with high consistency across the results of independent GP runs.
In this chapter, we show that genetic programming with a focus on ranking in combination with goal softening is a very powerful way to improve the efficiency and effectiveness of the evolutionary search. Our strategy consists of partial fitness evaluations of individuals on random subsets of the original data set, with a gradual increase in the subset size in consecutive generations. From a series of experiments performed on three test problems, we observed that those evolutions that started from the smallest subset sizes (10%) consistently led to results that are superior in terms of the goodness of fit, consistency between independent runs, and computational effort. Our experience indicates that solutions obtained using this approach are also less complex and more robust against over-fitting.
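The partial-evaluation strategy described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the linear growth schedule, the use of one shared random subset per generation, the mean-squared-error fitness, and the names `subset_fraction`, `draw_subset`, and `mse` are all assumptions made for the example. The abstract states only that the subset size starts small (10%) and increases gradually.

```python
import random


def subset_fraction(generation, n_generations, start=0.10, end=1.0):
    """Fraction of the data used at a given generation.

    Assumption: linear growth from `start` to `end`; the source only says
    the subset size increases gradually over consecutive generations.
    """
    if n_generations <= 1:
        return end
    t = generation / (n_generations - 1)
    return start + t * (end - start)


def draw_subset(data, fraction, rng):
    """Draw a random subset of the records, without replacement."""
    k = min(len(data), max(1, round(fraction * len(data))))
    return rng.sample(data, k)


def mse(model, records):
    """Mean squared error of `model` on a list of (x, y) records."""
    return sum((model(x) - y) ** 2 for x, y in records) / len(records)


rng = random.Random(0)
data = [(x, 2.0 * x + 1.0) for x in range(100)]  # toy data set
candidate = lambda x: 2.0 * x + 1.0              # a hypothetical individual

n_generations = 10
schedule = [subset_fraction(g, n_generations) for g in range(n_generations)]
first_subset = draw_subset(data, schedule[0], rng)   # 10% of the records
partial_error = mse(candidate, first_subset)         # fitness on the subset only
```

Because selection in this setting depends on how individuals rank against each other rather than on exact error values, evaluating all individuals of a generation on the same subset keeps their scores comparable; that choice is also an assumption of this sketch.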
We find that a near-optimal strategy for allocating the computational budget over a GP run is to distribute it evenly over all generations. This implies that initially, more individuals can be evaluated using small subset sizes, promoting better exploration. Exploitation becomes more important towards the end of the run, when all individuals are evaluated using the full data set with correspondingly smaller population sizes.
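The trade-off between subset size and population size under an evenly distributed budget can be made concrete with a small calculation. The sketch below assumes that the budget is counted in single-record evaluations (one individual evaluated on one record costs one unit); that cost model, the function name `population_sizes`, and the numbers used are illustrative assumptions rather than values from the chapter.

```python
def population_sizes(total_budget, n_records, fractions):
    """Population size per generation under an evenly split budget.

    Assumption: cost of a generation = population size * subset size,
    so a fixed per-generation budget buys fewer individuals as the
    evaluated subset grows.
    """
    per_generation = total_budget // len(fractions)
    sizes = []
    for fraction in fractions:
        subset_size = max(1, round(fraction * n_records))
        sizes.append(max(1, per_generation // subset_size))
    return sizes


# Hypothetical run: 4 generations, 1000 records, subset grows from 10% to 100%.
fractions = [0.1, 0.4, 0.7, 1.0]
sizes = population_sizes(total_budget=400_000, n_records=1000, fractions=fractions)
print(sizes)  # [1000, 250, 142, 100]
```

With these numbers the first generation can afford ten times as many individuals as the last one, which is the exploration-then-exploitation pattern the paragraph describes.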
© 2008 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Vladislavleva, E., Smits, G., Kotanchek, M. (2008). Better Solutions Faster: Soft Evolution of Robust Regression Models in Pareto Genetic Programming. In: Riolo, R., Soule, T., Worzel, B. (eds) Genetic Programming Theory and Practice V. Genetic and Evolutionary Computation Series. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-76308-8_2
DOI: https://doi.org/10.1007/978-0-387-76308-8_2
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-76307-1
Online ISBN: 978-0-387-76308-8
eBook Packages: Computer Science (R0)