Bootstrap model selection for possibly dependent and heterogeneous data

Sancetta, Alessio

doi:10.1007/s10463-008-0183-3

Published: 16 July 2008

Annals of the Institute of Statistical Mathematics, Volume 62, pages 515–546 (2010)

Abstract

This paper proposes the use of the bootstrap in penalized model selection for possibly dependent and heterogeneous data. The results show that we can establish (at least asymptotically) a direct relationship between estimation error and a data-based complexity penalization. This requires redefining the target function as the sum of the individual expected predicted risks. In this framework, the wild bootstrap and related approaches can be used to estimate the penalty without having to account explicitly for the dependence or heterogeneity of the data. The methodology is illustrated by a simulation study whose results are particularly encouraging.
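
As a rough illustration of the idea of a bootstrap-estimated penalty, the sketch below selects a polynomial degree by minimizing empirical risk plus a wild-bootstrap penalty. It is not the estimator studied in the paper: the squared loss, the polynomial model family, the Rademacher multipliers, the optimism-style penalty, the simulated data, and the names `wild_bootstrap_penalty` and `select_degree` are all assumptions made for this example.

```python
import numpy as np


def wild_bootstrap_penalty(X, y, n_boot=200, rng=None):
    """Return (empirical risk, wild-bootstrap penalty) for least squares on X.

    The penalty is an optimism-style estimate (an assumption of this sketch):
    the average gap between the risk of a bootstrap fit evaluated on the
    original responses and its risk on the bootstrap responses.
    """
    rng = np.random.default_rng(rng)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    resid = y - fitted
    optimism = np.empty(n_boot)
    for b in range(n_boot):
        # Rademacher multipliers: each residual keeps its size, sign is random.
        w = rng.choice([-1.0, 1.0], size=len(y))
        y_star = fitted + w * resid
        beta_star, *_ = np.linalg.lstsq(X, y_star, rcond=None)
        pred_star = X @ beta_star
        risk_on_original = np.mean((y - pred_star) ** 2)
        risk_on_bootstrap = np.mean((y_star - pred_star) ** 2)
        optimism[b] = risk_on_original - risk_on_bootstrap
    return float(np.mean(resid ** 2)), float(optimism.mean())


def select_degree(x, y, max_degree=6, n_boot=200, seed=0):
    """Pick the polynomial degree minimizing empirical risk plus penalty."""
    rng = np.random.default_rng(seed)
    risks, penalties = [], []
    for k in range(max_degree + 1):
        X = np.vander(x, k + 1, increasing=True)  # columns 1, x, ..., x^k
        risk, pen = wild_bootstrap_penalty(X, y, n_boot, rng)
        risks.append(risk)
        penalties.append(pen)
    risks, penalties = np.array(risks), np.array(penalties)
    return int(np.argmin(risks + penalties)), risks, penalties


# Simulated data (hypothetical): quadratic signal with autocorrelated noise
# whose scale depends on x, to mimic dependence and heterogeneity.
data_rng = np.random.default_rng(1)
n = 300
x = np.linspace(-1.0, 1.0, n)
noise = np.zeros(n)
for t in range(1, n):
    scale = 0.2 + 0.3 * abs(x[t])
    noise[t] = 0.5 * noise[t - 1] + scale * data_rng.standard_normal()
y = 1.0 + 2.0 * x - 1.5 * x ** 2 + noise

best, risks, penalties = select_degree(x, y)
print("selected degree:", best)
print("empirical risks:", np.round(risks, 4))
print("penalties:      ", np.round(penalties, 4))
```

Because the multipliers only flip the sign of each residual, the resampling keeps each observation's own error magnitude, which is why wild-bootstrap schemes are commonly used when the error variance is not constant. Whether this simple penalty behaves well under serial dependence is exactly the kind of question the paper addresses; the sketch makes no such claim.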

Author information

Authors and Affiliations

Faculty of Economics, University of Cambridge, Austin Robinson Building, Sidgwick Avenue, Cambridge, CB3 9DD, UK
Alessio Sancetta

Corresponding author

Correspondence to Alessio Sancetta.

Additional information

I thank the associate editor and the referee for comments that improved the quality and presentation of the paper.

About this article

Cite this article

Sancetta, A. Bootstrap model selection for possibly dependent and heterogeneous data. Ann Inst Stat Math 62, 515–546 (2010). https://doi.org/10.1007/s10463-008-0183-3

Received: 28 August 2006
Revised: 20 February 2008
Published: 16 July 2008
Issue Date: June 2010
DOI: https://doi.org/10.1007/s10463-008-0183-3
