Trustable symbolic regression models: using ensembles, interval arithmetic and pareto fronts to develop robust and trust-aware models

  • Chapter
Genetic Programming Theory and Practice V

Trust is a major issue in deploying empirical models in the real world, since changes in the underlying system or use of the model in new regions of parameter space can produce incorrect, and potentially dangerous, predictions. The trepidation involved with model usage can be mitigated by assembling ensembles of diverse models and using their consensus as a trust metric: these models are constrained to agree in the data region used for model development and will tend to disagree outside that region. The problem is to define an appropriate model complexity (since the ensemble should consist of models of similar complexity) and to identify diverse models from the candidate model set.
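The consensus idea can be illustrated with a minimal sketch. This is not the chapter's method: the chapter builds ensembles from symbolic regression models, whereas here three hand-picked basis sets fitted by least squares stand in for a diverse set of models of similar complexity. The names `BASIS_SETS`, `fit_ensemble` and `predict_with_trust`, the sine target, and the use of the standard deviation across models as the disagreement measure are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical stand-ins for diverse models of similar complexity:
# each model is a linear combination of four basis functions.
BASIS_SETS = [
    [np.ones_like, lambda x: x, lambda x: x**2, lambda x: x**3],
    [np.ones_like, lambda x: x, lambda x: x**3, lambda x: x**4],
    [np.ones_like, lambda x: x, lambda x: x**2, lambda x: np.exp(-x)],
]


def design_matrix(basis, x):
    """Evaluate every basis function at x and stack the results as columns."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([f(x) for f in basis])


def fit_ensemble(x, y):
    """Fit each basis set to the same data by ordinary least squares."""
    return [
        np.linalg.lstsq(design_matrix(basis, x), y, rcond=None)[0]
        for basis in BASIS_SETS
    ]


def predict_with_trust(coefs, x):
    """Return the ensemble median and the spread (std) across models.

    A small spread means the models agree; a large spread flags a region
    where the prediction should not be trusted.
    """
    preds = np.array([
        design_matrix(basis, x) @ c for basis, c in zip(BASIS_SETS, coefs)
    ])
    return np.median(preds, axis=0), preds.std(axis=0)


# Assumed toy data: a noisy sine sampled only on [0, 3].
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 3.0, 50)
y_train = np.sin(x_train) + rng.normal(scale=0.01, size=x_train.size)

coefs = fit_ensemble(x_train, y_train)

# Spread inside the development region versus at an extrapolation point.
_, spread_inside = predict_with_trust(coefs, np.linspace(0.0, 3.0, 20))
_, spread_outside = predict_with_trust(coefs, np.array([6.0]))
print("max spread inside [0, 3]:", spread_inside.max())
print("spread at x = 6:", spread_outside[0])
```

All three models are fitted to the same points, so they are forced to agree there; nothing ties them together at x = 6, so their spread grows and can serve as a warning that the ensemble is extrapolating.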

In this chapter we discuss strategies for the development and selection of robust models and model ensembles, and we demonstrate those strategies on industrial data sets. An important benefit of this approach is that all available data may be used in model development rather than being partitioned into training, test and validation subsets. As a result, the constituent models are more accurate without risk of over-fitting, the ensemble predictions are more accurate, and those predictions carry a meaningful trust metric.

© 2008 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Kotanchek, M., Smits, G., Vladislavleva, E. (2008). Trustable symbolic regression models: using ensembles, interval arithmetic and pareto fronts to develop robust and trust-aware models. In: Riolo, R., Soule, T., Worzel, B. (eds) Genetic Programming Theory and Practice V. Genetic and Evolutionary Computation Series. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-76308-8_12

  • DOI: https://doi.org/10.1007/978-0-387-76308-8_12

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-76307-1

  • Online ISBN: 978-0-387-76308-8

  • eBook Packages: Computer Science, Computer Science (R0)
