
Loss-Based Estimation with Evolutionary Algorithms and Cross-Validation

Chapter in: Computational Intelligence in Expensive Optimization Problems

Part of the book series: Adaptation, Learning, and Optimization (ALO, volume 2)


Abstract

Statistical estimation in multivariate data sets presents myriad challenges when the form of the regression function linking the outcome and explanatory variables is unknown. Our study seeks to understand the computational challenges of the optimization problem underlying regression estimation and to design intelligent procedures for this setting. We begin by analyzing the size of the parameter space in polynomial regression in terms of the number of variables, the constraint on the polynomial degree, and the constraint on the number of interacting explanatory variables. We then propose a new procedure for statistical estimation that relies upon cross-validation to select the optimal parameter subspace and an evolutionary algorithm to minimize risk within this subspace based upon the available data. This general-purpose procedure performs well in a variety of challenging multivariate estimation settings. It is sufficiently flexible to allow the user to incorporate known causal structures into the estimate and to adjust computational parameters, such as the population mutation rate, according to the problem's specific challenges. Furthermore, the procedure can be shown to converge asymptotically to the globally optimal estimate. We compare this evolutionary algorithm to a variety of competitors in simulation studies and in a study of disease progression in diabetes patients.
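The abstract describes a two-level procedure: cross-validation selects among parameter subspaces (indexed here by a bound d on polynomial degree and a bound k on the number of interacting variables), while an evolutionary algorithm searches each subspace for risk-minimizing coefficients. As a point of reference, without the interaction constraint the subspace for degree bound d over p variables already contains C(p+d, d) = (p+d)!/(p! d!) coefficients, which motivates the constrained search. The following Python sketch illustrates the pairing under simple assumptions; the truncation selection, Gaussian mutation, fixed population size, and candidate (d, k) grid are illustrative choices, not the authors' implementation:

    import itertools
    import numpy as np

    rng = np.random.default_rng(0)

    def design(X, d, k):
        """Expand X into polynomial features of total degree <= d with at most
        k interacting variables; (d, k) indexes the candidate parameter subspace."""
        n, p = X.shape
        cols = [np.ones(n)]  # intercept
        for exps in itertools.product(range(d + 1), repeat=p):
            if 0 < sum(exps) <= d and sum(e > 0 for e in exps) <= k:
                cols.append(np.prod(X ** np.array(exps), axis=1))
        return np.column_stack(cols)

    def ea_minimize(Z, y, pop=40, gens=200, mut=0.1):
        """Toy evolutionary search (truncation selection + Gaussian mutation)
        for coefficients minimizing empirical squared-error risk on (Z, y)."""
        population = rng.normal(size=(pop, Z.shape[1]))
        for _ in range(gens):
            risk = ((Z @ population.T - y[:, None]) ** 2).mean(axis=0)
            elite = population[np.argsort(risk)[: pop // 4]]     # keep best quarter
            parents = elite[rng.integers(len(elite), size=pop)]  # resample parents
            population = parents + mut * rng.normal(size=population.shape)
        risk = ((Z @ population.T - y[:, None]) ** 2).mean(axis=0)
        return population[np.argmin(risk)]

    def cv_select(X, y, subspaces, folds=5):
        """V-fold cross-validation over candidate (d, k) subspaces: fit with the
        EA on training folds, score on held-out folds, refit the winner on all data."""
        fold_id = np.arange(len(y)) % folds
        best, best_risk = None, np.inf
        for d, k in subspaces:
            Z = design(X, d, k)
            risks = [((Z[fold_id == v] @ ea_minimize(Z[fold_id != v], y[fold_id != v])
                       - y[fold_id == v]) ** 2).mean() for v in range(folds)]
            if np.mean(risks) < best_risk:
                best, best_risk = (d, k), np.mean(risks)
        d, k = best
        return best, ea_minimize(design(X, d, k), y)

    # Hypothetical demo: recover an interaction and a quadratic term from noisy data.
    X = rng.normal(size=(200, 3))
    y = 2 * X[:, 0] * X[:, 1] - X[:, 2] ** 2 + rng.normal(scale=0.1, size=200)
    print(cv_select(X, y, subspaces=[(1, 1), (2, 1), (2, 2)])[0])

In this sketch, cross-validation plays the role the abstract assigns to it (selection across subspaces), while the evolutionary algorithm handles risk minimization within each subspace; swapping in a different loss function or selection scheme only requires changing ea_minimize.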





Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Shilane, D., Liang, R.H., Dudoit, S. (2010). Loss-Based Estimation with Evolutionary Algorithms and Cross-Validation. In: Tenne, Y., Goh, CK. (eds) Computational Intelligence in Expensive Optimization Problems. Adaptation Learning and Optimization, vol 2. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10701-6_18


  • DOI: https://doi.org/10.1007/978-3-642-10701-6_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-10700-9

  • Online ISBN: 978-3-642-10701-6

  • eBook Packages: Engineering, Engineering (R0)
