Nonparametric Methods for Big Data Analytics

  • Hao Helen Zhang
Part of the Springer Handbooks of Computational Statistics book series (SHCS)


Nonparametric methods provide more flexible tools than parametric methods for modeling complex systems and discovering nonlinear patterns hidden in data. Traditional nonparametric methods are challenged by modern high-dimensional data because of the curse of dimensionality. Over the past two decades, nonparametric methods have advanced rapidly to accommodate the analysis of large-scale and high-dimensional data. A variety of cutting-edge nonparametric methodologies, scalable algorithms, and state-of-the-art computational tools have been designed for model estimation, variable selection, and statistical inference in high-dimensional regression and classification problems. This chapter provides an overview of recent advances in nonparametrics for big data analytics.


Sparsity, Smoothing, Nonparametric estimation, Regularization, GAM, COSSO



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Mathematics, ENR2 S323, University of Arizona, Tucson, USA
