Statistical Papers, Volume 59, Issue 2, pp 813–843

Regression estimation by local polynomial fitting for multivariate data streams

  • Aboubacar Amiri
  • Baba Thiam
Regular Article


Abstract

In this paper we study a local polynomial estimator of the regression function and its derivatives. For the local polynomial estimation problem, we propose a sequential technique based on a multivariate counterpart of the stochastic approximation method for successive experiments. We present our results in a general setting that allows weakly dependent sequences of stream data, for which we provide an asymptotic bias-variance decomposition of the estimator. We also study the asymptotic normality of the estimator and provide algorithms for the practical use of the method in a data stream framework.
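To make the idea of a sequential local polynomial fit concrete, the following is a minimal sketch and not the authors' algorithm, which this page does not reproduce. It assumes a univariate covariate, a local linear (degree 1) fit at a single point `x0`, an Epanechnikov kernel, a step size of 1/n, and a bandwidth schedule h_n = c · n^(-1/5). The class name `RecursiveLocalLinear` and its parameters `c` and `rate` are hypothetical. Each new observation updates running averages of kernel-weighted moments, so the stream never has to be stored.

```python
import random


def epanechnikov(u):
    """Compactly supported kernel K(u) = 0.75 * (1 - u^2) on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


class RecursiveLocalLinear:
    """Streaming local linear fit at a fixed point x0 (univariate sketch)."""

    def __init__(self, x0, c=0.5, rate=0.2):
        self.x0 = x0
        self.c = c        # bandwidth constant (assumed, not from the paper)
        self.rate = rate  # bandwidth schedule h_n = c * n**(-rate) (assumed)
        self.n = 0
        # Running averages of the kernel-weighted moments.
        self.s0 = self.s1 = self.s2 = 0.0  # weights times 1, d, d^2
        self.t0 = self.t1 = 0.0            # weights times y, d*y

    def update(self, x, y):
        """Absorb one observation (x, y) in O(1) time and memory."""
        self.n += 1
        h = self.c * self.n ** (-self.rate)
        d = x - self.x0
        w = epanechnikov(d / h) / h
        g = 1.0 / self.n  # stochastic-approximation step size
        self.s0 += g * (w - self.s0)
        self.s1 += g * (w * d - self.s1)
        self.s2 += g * (w * d * d - self.s2)
        self.t0 += g * (w * y - self.t0)
        self.t1 += g * (w * d * y - self.t1)

    def estimate(self):
        """Return (m(x0), m'(x0)) estimates, or None if not yet identifiable."""
        det = self.s0 * self.s2 - self.s1 * self.s1
        if abs(det) < 1e-12:
            return None
        m = (self.s2 * self.t0 - self.s1 * self.t1) / det
        dm = (self.s0 * self.t1 - self.s1 * self.t0) / det
        return m, dm


# Usage on a simulated stream: m(x) = x^2, so m(0.5) = 0.25 and m'(0.5) = 1.
random.seed(1)
est = RecursiveLocalLinear(x0=0.5)
for _ in range(5000):
    x = random.uniform(0.0, 1.0)
    y = x * x + random.gauss(0.0, 0.1)
    est.update(x, y)
print(est.estimate())
```

The two returned values correspond to the regression function and its first derivative at `x0`, mirroring the abstract's interest in derivatives. A multivariate version would replace the five scalar moments with a matrix and a vector and solve the resulting linear system at each query.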


Keywords

Local polynomial; Data streams; Stochastic approximation; Weakly dependent sequences; Kernel methods

Mathematics Subject Classification

62G05 62G08 62G20 62L12 



Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  1. Université Lille 3, LEM-CNRS (UMR 9221), Domaine universitaire du “pont de bois”, Villeneuve d’Ascq Cedex, France
