Abstract
This paper proposes the use of the bootstrap in penalized model selection for possibly dependent and heterogeneous data. The results show that a direct relationship can be established, at least asymptotically, between the estimation error and a data-based complexity penalty. This requires redefining the target function as the sum of the individual expected predicted risks. In this framework, the wild bootstrap and related approaches can be used to estimate the penalty without explicitly accounting for the dependence and heterogeneity in the data. The methodology is illustrated by a simulation study, whose results are encouraging.
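As a rough illustration of the idea in the abstract, the following Python sketch selects a polynomial degree by minimizing empirical risk plus a penalty estimated with wild-bootstrap weights. It is not the paper's estimator: the squared loss, the polynomial model classes, the Rademacher-based weights `1 + sigma_i`, and the helper names `design`, `wild_bootstrap_penalty`, and `select_degree` are all assumptions made for this example. The supremum over each model class is approximated by the weighted least-squares fit.

```python
import numpy as np


def design(x, degree):
    """Polynomial design matrix with columns 1, x, ..., x**degree."""
    return np.vander(x, degree + 1, increasing=True)


def wild_bootstrap_penalty(X, y, n_boot, rng):
    """Average gap between unweighted and reweighted empirical risk.

    Assumption for this sketch: weights are 1 + sigma_i with sigma_i
    Rademacher, so each weight is 0 or 2 and has mean 1.
    """
    n = len(y)
    gaps = np.empty(n_boot)
    for b in range(n_boot):
        w = 1.0 + rng.choice([-1.0, 1.0], size=n)
        sw = np.sqrt(w)
        # Weighted least squares: minimizes sum_i w_i * (y_i - x_i'beta)^2.
        beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        loss = (y - X @ beta) ** 2
        # Empirical risk minus wild-bootstrap (reweighted) risk.
        gaps[b] = loss.mean() - (w * loss).mean()
    return gaps.mean()


def select_degree(x, y, degrees, n_boot=200, seed=0):
    """Return the degree minimizing empirical risk plus estimated penalty."""
    rng = np.random.default_rng(seed)
    scores = {}
    for d in degrees:
        X = design(x, d)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        risk = np.mean((y - X @ beta) ** 2)
        scores[d] = risk + wild_bootstrap_penalty(X, y, n_boot, rng)
    best = min(scores, key=scores.get)
    return best, scores
```

Because each weight is drawn independently for each observation, the resampling step itself involves no blocking or other modelling of the dependence structure, which is the practical appeal the abstract points to. Whether such a penalty is valid for a given data-generating process depends on conditions established in the paper, not on this sketch.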
Additional information
I thank the associate editor and the referee for comments that improved the quality and presentation of the paper.
About this article
Cite this article
Sancetta, A. Bootstrap model selection for possibly dependent and heterogeneous data. Ann Inst Stat Math 62, 515–546 (2010). https://doi.org/10.1007/s10463-008-0183-3
DOI: https://doi.org/10.1007/s10463-008-0183-3