Abstract
The problem of prediction is revisited with a view towards going beyond the typical nonparametric setting and reaching a fully model-free environment for predictive inference, i.e., point predictors and predictive intervals. A basic principle of model-free prediction is laid out based on the notion of transforming a given setup into one that is easier to work with, namely i.i.d. or Gaussian. As an application, the problem of nonparametric regression is addressed in detail; the model-free predictors are worked out, and shown to be applicable under minimal assumptions. Interestingly, model-free prediction in regression is a totally automatic technique that does not necessitate the search for an optimal data transformation before model fitting. The resulting model-free predictive distributions and intervals are compared to their corresponding model-based analogs, and the use of cross-validation is extensively discussed. As an aside, improved prediction intervals in linear regression are also obtained.
Notes
The qualitative difference is that the interest of the MF practitioner lies in observable quantities, i.e., current and future data, as opposed to unobservable model parameters and estimates thereof. In this sense, despite being frequentist in nature, the MF principle is in concordance with Bruno de Finetti's statistical philosophy; see e.g. Dawid (2004) and the references therein.
Rather than performing a two-dimensional search over h and q to minimize PRESS, the simple constraint q=h will be imposed in what follows; this has the additional advantage of ensuring \(M_{x}\geq m^{2}_{x}\), as needed for a well-defined estimator \(s^{2}_{x}\) in Eq. (11). Note that the choice q=h is not necessarily optimal; see e.g. Wang et al. (2008). Furthermore, these are global bandwidths; techniques for picking local bandwidths, i.e., a different optimal bandwidth for each x, are widely available but will not be discussed further here so as not to obscure the paper's main focus. Similarly, several recent variations on the cross-validation theme, such as the one-sided cross-validation of Hart and Yi (1998) and the far casting cross-validation for dependent data of Carmack et al. (2009), present attractive alternatives. Nevertheless, our discussion will focus on the well-known standard form of cross-validation for concreteness, especially since our aim is to show how the Model-Free prediction principle applies in nonparametric regression with any type of kernel smoother and any type of bandwidth selector.
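The grid search under the constraint q=h can be sketched as follows; this is a minimal illustration (not from the paper), assuming a Gaussian kernel and a Nadaraya–Watson smoother as the running example, with an arbitrary simulated data set and bandwidth grid.

```python
import numpy as np

def nw_fit(x0, x, y, h):
    # Nadaraya-Watson estimate at x0 with a Gaussian kernel
    # (one of many valid kernel choices)
    k = np.exp(-0.5 * ((x0 - x) / h) ** 2)
    return k @ y / k.sum()

def press(h, x, y):
    # leave-one-out PREdiction Sum of Squares for bandwidth h
    n = len(x)
    idx = np.arange(n)
    loo = [y[t] - nw_fit(x[t], x[idx != t], y[idx != t], h) for t in range(n)]
    return float(np.sum(np.square(loo)))

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 80))
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(80)

# one-dimensional search under q = h (a single global bandwidth)
grid = np.linspace(0.02, 0.5, 40)
h_opt = grid[int(np.argmin([press(h, x, y) for h in grid]))]
```

Any other kernel smoother or bandwidth selector could be substituted in `nw_fit` and `press` without changing the structure of the search.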
Here, and for the remainder of Sect. 3, we will assume that the estimator \(m_{x}\) is linear in the Y data; our running example of a kernel smoother obviously satisfies this requirement, as do other popular methods such as local polynomial fitting.
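Linearity in the Y data means the fitted values can be written as a "hat" matrix times Y, with weights that depend only on the design points and the bandwidth. A small sketch under the assumption of a Gaussian-kernel Nadaraya–Watson smoother:

```python
import numpy as np

def nw_weights(x0, x, h):
    # weight vector w(x0): the estimate at x0 is w(x0) @ y;
    # the weights depend on the design points only, not on the Y data
    k = np.exp(-0.5 * ((x0 - x) / h) ** 2)
    return k / k.sum()

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 1.0, 50))
y = x ** 2 + 0.1 * rng.standard_normal(50)

# hat matrix of the linear smoother: row t holds the weights at x_t
H = np.vstack([nw_weights(xi, x, 0.1) for xi in x])
fitted = H @ y
```

Local polynomial fits admit the same representation with a different weight construction.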
Strictly speaking, the \(W_{t}\)'s are not exactly independent because of the dependence of \(m_{x_{t}}\) and \(s_{x_{t}}\) on \(m_{x_{k}}\) and \(s_{x_{k}}\). However, under typical conditions, \(m_{x}\stackrel{P}{\longrightarrow}E(Y|x)\) and \(s^{2}_{x}\stackrel {P}{\longrightarrow}\mathit{Var}(Y|x)\) as n→∞; therefore, the \(W_{t}\)'s are, at least, asymptotically independent.
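The studentization step can be sketched numerically; this is an illustrative construction (bandwidth, kernel, and simulated heteroskedastic data are assumptions, not from the paper), using the same bandwidth q = h for both conditional moments so that \(s^{2}_{x}=M_{x}-m^{2}_{x}\) is automatically nonnegative:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x = np.sort(rng.uniform(0.0, 1.0, n))
# heteroskedastic data: the noise scale grows with x
y = np.sin(2 * np.pi * x) + (0.2 + 0.3 * x) * rng.standard_normal(n)
h = 0.1

K = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
Hm = K / K.sum(axis=1, keepdims=True)
m = Hm @ y             # kernel estimate of E(Y|x)
M = Hm @ y ** 2        # kernel estimate of E(Y^2|x), same bandwidth (q = h)
s2 = M - m ** 2        # nonnegative by Jensen's inequality when q = h
W = (y - m) / np.sqrt(s2)   # studentized data: approximately i.i.d.
```

With consistent estimates of the two conditional moments, the \(W_{t}\)'s approach an i.i.d. sequence as n grows.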
If \(\sigma^{2}(x)\) is not assumed constant, then \(\tilde{e}_{t}= e_{t} C_{t}/(1-\delta_{x_{t}})\), where \(C_{t}=s_{x_{t}}/s_{x_{t}}^{(t)}\).
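In the homoskedastic case, the predictive (deleted) residuals \(\tilde{e}_{t}= e_{t}/(1-\delta_{x_{t}})\) can be computed from the hat-matrix diagonal without refitting; the sketch below (assuming a Gaussian-kernel Nadaraya–Watson smoother, for which the shortcut is an exact identity) checks this against explicit leave-one-out refits.

```python
import numpy as np

def nw_fit(x0, x, y, h):
    k = np.exp(-0.5 * ((x0 - x) / h) ** 2)
    return k @ y / k.sum()

rng = np.random.default_rng(3)
n = 60
x = np.sort(rng.uniform(0.0, 1.0, n))
y = np.cos(3 * x) + 0.2 * rng.standard_normal(n)
h = 0.15

K = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
H = K / K.sum(axis=1, keepdims=True)
e = y - H @ y                 # ordinary residuals
delta = np.diag(H)            # delta_{x_t}: the t-th diagonal hat element
e_tilde = e / (1.0 - delta)   # predictive residuals, homoskedastic case

# explicit leave-one-out refits for comparison
idx = np.arange(n)
loo = np.array([y[t] - nw_fit(x[t], x[idx != t], y[idx != t], h)
                for t in range(n)])
```

The heteroskedastic correction factor \(C_{t}\) would additionally require the deleted variance estimate \(s_{x_{t}}^{(t)}\).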
For \(\bar{D}_{x_{\mathrm{f}}}^{-1}\) to be an accurate estimator of \(D_{x_{\mathrm{f}}}^{-1}\), the value \(x_{\mathrm{f}}\) must have an appreciable number of h-close neighbors among the original predictors \(x_{1},\ldots,x_{n}\), as discussed in Remark 4.1. As an extreme example, note that prediction of \(Y_{\mathrm{f}}\) when \(x_{\mathrm{f}}\) lies outside the range of the original predictors \(x_{1},\ldots,x_{n}\), i.e., extrapolation, is not feasible in the model-free paradigm.
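A simple feasibility check along these lines can be sketched as follows; the counting rule and the simulated design are illustrative assumptions, not a prescription from the paper.

```python
import numpy as np

def h_close_neighbors(x_f, x, h):
    # count design points within bandwidth h of the prediction point x_f
    return int(np.sum(np.abs(x - x_f) <= h))

rng = np.random.default_rng(4)
x = np.sort(rng.uniform(0.0, 1.0, 100))
h = 0.1

inside = h_close_neighbors(0.5, x, h)    # interior point: many neighbors
outside = h_close_neighbors(1.5, x, h)   # extrapolation: no neighbors at all
```

A prediction point with zero (or very few) h-close neighbors signals that the local estimates underlying the model-free predictor cannot be formed reliably.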
References
Altman NS (1992) An introduction to kernel and nearest-neighbor nonparametric regression. Am Stat 46(3):175–185
Atkinson AC (1985) Plots, transformations and regression. Clarendon, Oxford
Beran R (1990) Calibrating prediction regions. J Am Stat Assoc 85:715–723
Bickel P, Li B (2006) Regularization in statistics. Test 15(2):271–344
Box GEP, Cox DR (1964) An analysis of transformations. J R Stat Soc, Ser B, Stat Methodol 26:211–252
Breiman L, Friedman J (1985) Estimating optimal transformations for multiple regression and correlation. J Am Stat Assoc 80:580–597
Carmack PS, Schucany WR, Spence JS, Gunst RF, Lin Q, Haley RW (2009) Far casting cross-validation. J Comput Graph Stat 18(4):879–893
Carroll RJ, Ruppert D (1988) Transformations and weighting in regression. Chapman & Hall, New York
Carroll RJ, Ruppert D (1991) Prediction and tolerance intervals with transformation and/or weighting. Technometrics 33:197–210
Cox DR (1975) Prediction intervals and empirical Bayes confidence intervals. In: Gani J (ed) Perspectives in probability and statistics. Academic Press, London, pp 47–55
Dai J, Sperlich S (2010) Simple and effective boundary correction for kernel densities and regression with an application to the world income and Engel curve estimation. Comput Stat Data Anal 54(11):2487–2497
DasGupta A (2008) Asymptotic theory of statistics and probability. Springer, New York
Davison AC, Hinkley DV (1997) Bootstrap methods and their applications. Cambridge University Press, Cambridge
Dawid AP (2004) Probability, causality, and the empirical world: a Bayes–de Finetti–Popper–Borel synthesis. Stat Sci 19(1):44–57
Draper NR, Smith H (1998) Applied regression analysis, 3rd edn. Wiley, New York
Efron B (1979) Bootstrap methods: another look at the jackknife. Ann Stat 7:1–26
Efron B (1983) Estimating the error rate of a prediction rule: improvement on cross-validation. J Am Stat Assoc 78:316–331
Efron B, Tibshirani RJ (1993) An introduction to the bootstrap. Chapman & Hall, New York
Fan J, Gijbels I (1996) Local polynomial modelling and its applications. Chapman & Hall, London
Freedman DA (1981) Bootstrapping regression models. Ann Stat 9:1218–1228
Gangopadhyay AK, Sen PK (1990) Bootstrap confidence intervals for conditional quantile functions. Sankhya, Ser A 52(3):346–363
Goldberger AS (1962) Best linear unbiased prediction in the generalized linear regression model. J Am Stat Assoc 57:369–375
Geisser S (1993) Predictive inference: an introduction. Chapman & Hall, New York
Hahn J (1995) Bootstrapping quantile regression estimators. Econom Theory 11(1):105–121
Hall P (1992) The bootstrap and Edgeworth expansion. Springer, New York
Hall P (1993) On Edgeworth expansion and bootstrap confidence bands in nonparametric curve estimation. J R Stat Soc, Ser B, Stat Methodol 55:291–304
Hall P, Wehrly TE (1991) A geometrical method for removing edge effects from kernel type nonparametric regression estimators. J Am Stat Assoc 86:665–672
Härdle W (1990) Applied nonparametric regression. Cambridge University Press, Cambridge
Härdle W, Bowman AW (1988) Bootstrapping in nonparametric regression: local adaptive smoothing and confidence bands. J Am Stat Assoc 83:102–110
Härdle W, Marron JS (1991) Bootstrap simultaneous error bars for nonparametric regression. Ann Stat 19:778–796
Hart JD (1997) Nonparametric smoothing and lack-of-fit tests. Springer, New York
Hart JD, Yi S (1998) One-sided cross-validation. J Am Stat Assoc 93(442):620–631
Hong Y (1999) Hypothesis testing in time series via the empirical characteristic function: a generalized spectral density approach. J Am Stat Assoc 94:1201–1220
Hong Y, White H (2005) Asymptotic distribution theory for nonparametric entropy measures of serial dependence. Econometrica 73(3):837–901
Horowitz J (1998) Bootstrap methods for median regression models. Econometrica 66(6):1327–1351
Koenker R (2005) Quantile regression. Cambridge University Press, Cambridge
Li Q, Racine JS (2007) Nonparametric econometrics. Princeton University Press, Princeton
Linton OB, Sperlich S, van Keilegom I (2008) Estimation of a semiparametric transformation model. Ann Stat 36(2):686–718
Loader C (1999) Local regression and likelihood. Springer, New York
McCullagh P, Nelder J (1983) Generalized linear models. Chapman & Hall, London
McMurry T, Politis DN (2008) Bootstrap confidence intervals in nonparametric regression with built-in bias correction. Stat Probab Lett 78:2463–2469
McMurry T, Politis DN (2010) Banded and tapered estimates of autocovariance matrices and the linear process bootstrap. J Time Ser Anal 31:471–482
Nadaraya EA (1964) On estimating regression. Theory Probab Appl 9:141–142
Neumann M, Polzehl J (1998) Simultaneous bootstrap confidence bands in nonparametric regression. J Nonparametr Stat 9:307–333
Olive DJ (2007) Prediction intervals for regression models. Comput Stat Data Anal 51:3115–3122
Pagan A, Ullah A (1999) Nonparametric econometrics. Cambridge University Press, Cambridge
Patel JK (1989) Prediction intervals: a review. Commun Stat, Theory Methods 18:2393–2465
Politis DN (2003) A normalizing and variance-stabilizing transformation for financial time series. In: Akritas MG, Politis DN (eds) Recent advances and trends in nonparametric statistics. Elsevier, Amsterdam, pp 335–347
Politis DN (2007a) Model-free vs. model-based volatility prediction. J Financ Econom 5(3):358–389
Politis DN (2007b) Model-free prediction. In: Bulletin of the international statistical institute—volume LXII, Lisbon, 22–29 Aug 2007, pp 1391–1397
Politis DN (2010) Model-free model-fitting and predictive distributions. Discussion Paper, Department of Economics, Univ of California—San Diego. Retrieved from: http://escholarship.org/uc/item/67j6s174
Politis DN, Romano JP, Wolf M (1999) Subsampling. Springer, New York
Rosenblatt M (1952) Remarks on a multivariate transformation. Ann Math Stat 23:470–472
Ruppert D, Cline DH (1994) Bias reduction in kernel density estimation by smoothed empirical transformations. Ann Stat 22:185–210
Schmoyer RL (1992) Asymptotically valid prediction intervals for linear models. Technometrics 34:399–408
Schucany WR (2004) Kernel smoothers: an overview of curve estimators for the first graduate course in nonparametric statistics. Stat Sci 19:663–675
Seber GAF, Lee AJ (2003) Linear regression analysis. Wiley, New York
Shao J, Tu D (1995) The jackknife and bootstrap. Springer, New York
Shi SG (1991) Local bootstrap. Ann Inst Stat Math 43:667–676
Stine RA (1985) Bootstrap prediction intervals for regression. J Am Stat Assoc 80:1026–1031
Tibshirani R (1988) Estimating transformations for regression via additivity and variance stabilization. J Am Stat Assoc 83:394–405
Tibshirani R (1996) Regression shrinkage and selection via the Lasso. J R Stat Soc, Ser B, Stat Methodol 58(1):267–288
Wang L, Brown LD, Cai TT, Levine M (2008) Effect of mean on variance function estimation in nonparametric regression. Ann Stat 36:646–664
Watson GS (1964) Smooth regression analysis. Sankhya, Ser A 26:359–372
Wolfowitz J (1957) The minimum distance method. Ann Math Stat 28:75–88
Acknowledgements
A preliminary version of this paper was presented as a Plenary Talk at the 10th International Vilnius Conference on Probability and Mathematical Statistics, June 28–July 3, 2010, and as a Special Invited Talk at the 28th European Meeting of Statisticians, August 17–22, 2010; the author is grateful to the audiences in these two—and several other—occasions for their helpful feedback. Many thanks are due to Arthur Berg, Wilson Cheung and Tim McMurry for invaluable help with R functions and computing, and to Richard Davis, Jeff Racine, Bill Schucany, Dimitrios Thomakos and Slava Vasiliev for helpful discussions. The author is also grateful to the Editors, Ricardo Cao and Domingo Morales, for their support and encouragement, and to six (!) anonymous referees for their very detailed and constructive comments; one of the referees deserves special thanks for an astute observation that helped shed light on the workings of the ‘uniformize’ algorithm of Sect. 4. This work has been partially supported by NSF grants DMS-07-06732 and DMS-10-07513, and by a fellowship from the Guggenheim Foundation.
Cite this article
Politis, D.N. Model-free model-fitting and predictive distributions. TEST 22, 183–221 (2013). https://doi.org/10.1007/s11749-013-0317-7
Keywords
- Bootstrap
- Cross-validation
- Frequentist prediction
- Heteroskedasticity
- Nonparametric estimation
- Prediction intervals
- Regression
- Smoothing
- Transformations