Metrika, Volume 69, Issue 2–3, pp 283–304

Selecting local models in multiple regression by maximizing power

Abstract

This paper considers multiple regression procedures for analyzing the relationship between a response variable and a vector of d covariates in a nonparametric setting where tuning parameters need to be selected. We introduce an approach which handles the dilemma that with high-dimensional data the sparsity of data in regions of the sample space makes estimation of nonparametric curves and surfaces virtually impossible. This is accomplished by abandoning the goal of estimating true underlying curves and instead estimating measures of dependence that can identify important relationships between variables. These dependence measures are based on local parametric fits on subsets of the covariate space that vary in both dimension and size within each dimension. The subset which maximizes a signal-to-noise ratio is chosen, where the signal is a local estimate of a dependence parameter which depends on the subset dimension and size, and the noise is an estimate of the standard error (SE) of the estimated signal. This approach of choosing the window size to maximize a signal-to-noise ratio lifts the curse of dimensionality because in regions with sparse data the SE is very large. It corresponds to asymptotically maximizing the probability of correctly finding nonspurious relationships between covariates and a response or, more precisely, to maximizing asymptotic power among a class of asymptotic level-α tests indexed by subsets of the covariate space. Subsets that achieve this goal are called features. We investigate the properties of specific procedures based on these ideas using asymptotic theory and Monte Carlo simulations, and find that within a selected dimension, the volume of the optimally selected subset does not tend to zero as n → ∞ unless the volume of the subset of the covariate space where the response depends on the covariate vector tends to zero.
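The selection rule described above can be illustrated with a minimal one-dimensional sketch. This is not the paper's exact procedure (which works over subsets of a d-dimensional covariate space varying in dimension and size); it is an assumed simplification in which the "subset" is an interval around a point x0, the signal is the slope of a local linear fit, and the noise is the OLS standard error of that slope. The function names and the candidate window grid are illustrative.

```python
# Sketch (assumption, not the authors' implementation): choose a local
# window half-width h by maximizing the signal-to-noise ratio
# |slope_hat| / SE(slope_hat) of a local linear fit at x0. Windows with
# sparse data produce large SEs and are automatically penalized.
import numpy as np

def local_fit(x, y, x0, h):
    """OLS line fit on points within [x0-h, x0+h]; returns (slope, SE) or None."""
    mask = np.abs(x - x0) <= h
    xs, ys = x[mask], y[mask]
    n = xs.size
    if n < 5:                      # too few points: SE estimate unreliable
        return None
    X = np.column_stack([np.ones(n), xs - x0])
    beta, _, _, _ = np.linalg.lstsq(X, ys, rcond=None)
    rss = np.sum((ys - X @ beta) ** 2)
    sigma2 = rss / (n - 2)         # residual variance estimate
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1], se

def best_window(x, y, x0, hs):
    """Return (h, t) maximizing the signal-to-noise ratio t = |slope|/SE over hs."""
    best_h, best_t = None, -np.inf
    for h in hs:
        fit = local_fit(x, y, x0, h)
        if fit is None:
            continue
        slope, se = fit
        if abs(slope) / se > best_t:
            best_h, best_t = h, abs(slope) / se
    return best_h, best_t

# The response depends on x only on |x| < 0.5, mimicking a local "feature".
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 400)
y = np.where(np.abs(x) < 0.5, 2 * x, 0.0) + rng.normal(0, 0.3, 400)
h_star, t_star = best_window(x, y, 0.0, [0.1, 0.25, 0.5, 0.75, 1.0])
```

In this toy setup the ratio stays large for windows inside the region where the response actually depends on x, echoing the paper's finding that the optimally selected subset's volume does not shrink to zero unless the dependence region itself shrinks.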

Keywords

Testing · Efficacy · Signal to noise · Curse of dimensionality · Local linear regression · Bandwidth selection



Copyright information

© Springer-Verlag 2008

Authors and Affiliations

  1. Carnegie Mellon University, Pittsburgh, USA
  2. University of Wisconsin, Madison, USA
