
Loss-Based Estimation with Evolutionary Algorithms and Cross-Validation

Chapter in: Computational Intelligence in Expensive Optimization Problems

Part of the book series: Adaptation, Learning, and Optimization (ALO, volume 2)


Abstract

Statistical estimation in multivariate data sets presents myriad challenges when the form of the regression function linking the outcome and explanatory variables is unknown. Our study seeks to understand the computational challenges of the optimization problem underlying regression estimation and to design intelligent procedures for this setting. We begin by analyzing the size of the parameter space in polynomial regression in terms of the number of variables, the constraint on the polynomial degree, and the constraint on the number of interacting explanatory variables. We then propose a new procedure for statistical estimation that relies upon cross-validation to select the optimal parameter subspace and an evolutionary algorithm to minimize risk within this subspace based upon the available data. This general-purpose procedure performs well in a variety of challenging multivariate estimation settings. It is sufficiently flexible to allow the user to incorporate known causal structures into the estimate and to adjust computational parameters, such as the population mutation rate, according to the problem's specific challenges. Furthermore, the procedure can be shown to converge asymptotically to the globally optimal estimate. We compare this evolutionary algorithm to a variety of competitors in simulation studies and in a study of disease progression in diabetes patients.
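The abstract describes a two-level procedure: cross-validation selects among parameter subspaces (indexed here by a bound d on polynomial degree and a bound k on the number of interacting variables), while an evolutionary algorithm searches each subspace for risk-minimizing coefficients. As a point of reference, without the interaction constraint the subspace for degree bound d over p variables already contains C(p+d, d) = (p+d)!/(p! d!) coefficients, which motivates the constrained search. The following Python sketch illustrates the pairing under simple assumptions; the truncation selection, Gaussian mutation, fixed population size, and candidate (d, k) grid are illustrative choices, not the authors' implementation:

    import itertools
    import numpy as np

    rng = np.random.default_rng(0)

    def design(X, d, k):
        """Expand X into polynomial features of total degree <= d with at most
        k interacting variables; (d, k) indexes the candidate parameter subspace."""
        n, p = X.shape
        cols = [np.ones(n)]  # intercept
        for exps in itertools.product(range(d + 1), repeat=p):
            if 0 < sum(exps) <= d and sum(e > 0 for e in exps) <= k:
                cols.append(np.prod(X ** np.array(exps), axis=1))
        return np.column_stack(cols)

    def ea_minimize(Z, y, pop=40, gens=200, mut=0.1):
        """Toy evolutionary search (truncation selection + Gaussian mutation)
        for coefficients minimizing empirical squared-error risk on (Z, y)."""
        population = rng.normal(size=(pop, Z.shape[1]))
        for _ in range(gens):
            risk = ((Z @ population.T - y[:, None]) ** 2).mean(axis=0)
            elite = population[np.argsort(risk)[: pop // 4]]     # keep best quarter
            parents = elite[rng.integers(len(elite), size=pop)]  # resample parents
            population = parents + mut * rng.normal(size=population.shape)
        risk = ((Z @ population.T - y[:, None]) ** 2).mean(axis=0)
        return population[np.argmin(risk)]

    def cv_select(X, y, subspaces, folds=5):
        """V-fold cross-validation over candidate (d, k) subspaces: fit with the
        EA on training folds, score on held-out folds, refit the winner on all data."""
        fold_id = np.arange(len(y)) % folds
        best, best_risk = None, np.inf
        for d, k in subspaces:
            Z = design(X, d, k)
            risks = [((Z[fold_id == v] @ ea_minimize(Z[fold_id != v], y[fold_id != v])
                       - y[fold_id == v]) ** 2).mean() for v in range(folds)]
            if np.mean(risks) < best_risk:
                best, best_risk = (d, k), np.mean(risks)
        d, k = best
        return best, ea_minimize(design(X, d, k), y)

    # Hypothetical demo: recover an interaction and a quadratic term from noisy data.
    X = rng.normal(size=(200, 3))
    y = 2 * X[:, 0] * X[:, 1] - X[:, 2] ** 2 + rng.normal(scale=0.1, size=200)
    print(cv_select(X, y, subspaces=[(1, 1), (2, 1), (2, 2)])[0])

In this sketch, cross-validation plays the role the abstract assigns to it (selection across subspaces), while the evolutionary algorithm handles risk minimization within each subspace; swapping in a different loss function or selection scheme only requires changing ea_minimize.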





Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Shilane, D., Liang, R.H., Dudoit, S. (2010). Loss-Based Estimation with Evolutionary Algorithms and Cross-Validation. In: Tenne, Y., Goh, CK. (eds) Computational Intelligence in Expensive Optimization Problems. Adaptation Learning and Optimization, vol 2. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10701-6_18


  • DOI: https://doi.org/10.1007/978-3-642-10701-6_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-10700-9

  • Online ISBN: 978-3-642-10701-6

  • eBook Packages: Engineering, Engineering (R0)
