Abstract
This article considers nonparametric regression between two variables when the data are sampled through a complex survey. Nonparametric regression has been widely used with data that can be assumed to be generated by independent and identically distributed (iid) random variables, but the methods and asymptotic analyses established for iid data need to be extended to the framework of complex survey designs. We study local polynomial regression estimators, which include as particular cases design-based versions of the Nadaraya–Watson estimator and of the local linear regression estimator; special emphasis is given to the latter. Our estimators incorporate both the sampling weights and the kernel weights. We derive the asymptotic mean squared error (MSE) of the kernel estimators under a combined inference framework, and consistency of the estimators follows as a corollary. The resulting estimators require the selection of a bandwidth; an optimal bandwidth can be determined according to the MSE criterion in the combined mode of inference. Simulation experiments illustrate the proposed methodology, and an application to the Canadian Survey of Labour and Income Dynamics is presented.
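To make the idea of combining sampling weights with kernel weights concrete, the following is a minimal sketch rather than the authors' exact estimator. It assumes that each sampled unit carries a sampling weight `w` (for instance an inverse inclusion probability), that the kernel is Gaussian, and that the bandwidth `h` is supplied by the user; the function names and these choices are illustrative assumptions not taken from the article.

```python
import numpy as np


def combined_weights(x0, x, w, h):
    """Product of sampling weights and Gaussian kernel weights at x0.

    The Gaussian kernel and the multiplicative combination are
    assumptions made for this illustration.
    """
    u = (np.asarray(x, dtype=float) - x0) / h
    return np.asarray(w, dtype=float) * np.exp(-0.5 * u**2)


def nadaraya_watson_survey(x0, x, y, w, h):
    """Weighted local constant (Nadaraya-Watson type) estimate at x0."""
    cw = combined_weights(x0, x, w, h)
    return np.sum(cw * np.asarray(y, dtype=float)) / np.sum(cw)


def local_linear_survey(x0, x, y, w, h):
    """Weighted local linear estimate at x0.

    Solves the weighted least squares problem for an intercept and a
    slope in (x - x0); the intercept is the fitted value at x0.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cw = combined_weights(x0, x, w, h)
    X = np.column_stack([np.ones_like(x), x - x0])  # local design matrix
    XtW = X.T * cw                                  # applies weights row-wise
    beta = np.linalg.solve(XtW @ X, XtW @ y)
    return beta[0]


# Hypothetical toy data: unequal sampling weights, noisy linear trend.
rng = np.random.default_rng(0)
x_s = rng.uniform(0.0, 1.0, size=200)
y_s = 2.0 + 3.0 * x_s + rng.normal(scale=0.1, size=200)
w_s = rng.uniform(1.0, 5.0, size=200)
fit_ll = local_linear_survey(0.5, x_s, y_s, w_s, h=0.2)
fit_nw = nadaraya_watson_survey(0.5, x_s, y_s, w_s, h=0.2)
```

In this sketch the bandwidth is fixed by hand; the MSE-based optimal bandwidth discussed in the abstract is not implemented here.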
Cite this article
Harms, T., Duchesne, P. On kernel nonparametric regression designed for complex survey data. Metrika 72, 111–138 (2010). https://doi.org/10.1007/s00184-009-0244-5
DOI: https://doi.org/10.1007/s00184-009-0244-5