Abstract
This article studies kernel regression estimation in the presence of nonignorable incomplete data, with particular focus on the limiting distribution of the maximal deviation of the proposed estimators. From an applied point of view, such a limiting distribution enables one to construct asymptotically correct uniform confidence bands, or to perform tests of hypotheses, for a regression curve when the available data set suffers from missing (not necessarily at random) response values. Such asymptotic results are also of long-standing theoretical interest in mathematical statistics. We also present numerical results that confirm and complement the theoretical developments of this paper.
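As a rough illustration of how a maximal-deviation limit of this kind is used in practice, the sketch below converts a Gumbel-type limit law, \(P\{a_n(\sup \text {-deviation} - b_n)\le t\}\rightarrow \exp (-2e^{-t})\), into a level-\((1-\alpha )\) critical value for a uniform band. The normalizing sequences \(a_n\) and \(b_n\) used here are the classical Bickel–Rosenblatt choices for a Gaussian kernel; they are illustrative assumptions, not the paper's exact normalization \(\varphi (n)\).

```python
import math

def gumbel_critical_value(h, alpha=0.05):
    """Critical value for a uniform band based on a Gumbel-type limit.

    Inverts exp(-2*exp(-t)) = 1 - alpha for t, then undoes the
    normalization sup-deviation ~ b_n + t / a_n.  The sequences a_n, b_n
    below are the classical Gaussian-kernel choices (an assumption here,
    not the paper's exact phi(n)).
    """
    t = -math.log(-0.5 * math.log(1.0 - alpha))   # Gumbel quantile
    delta = math.sqrt(2.0 * math.log(1.0 / h))    # common normalizing rate
    a_n = delta
    b_n = delta + math.log(1.0 / (2.0 * math.pi)) / (2.0 * delta)
    return b_n + t / a_n
```

A band would then take the form \({\widehat{m}}(x)\pm c_{\alpha }\,{\widehat{\sigma }}(x)/\sqrt{n h_n}\) uniformly over the design interval, where \(c_{\alpha }\) is the value returned above; the exact scaling in the paper depends on its elided normalization.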
Data availability
Enquiries about data availability should be directed to the authors.
References
Al-Sharadqah A, Mojirsheibani M (2020) A simple approach to construct confidence bands for a regression function with incomplete data. AStA Adv Stat Anal 104:81–99
Burke M (1998) A Gaussian bootstrap approach to estimation and tests. In: Szyszkowicz EB (ed) Asymptotic methods in probability and statistics. North-Holland, Amsterdam, pp 697–706
Burke M (2000) Multivariate tests-of-fit and uniform confidence bands using a weighted bootstrap. Stat Probab Lett 46:13–20
Cai T, Low M, Ma Z (2014) Adaptive confidence bands for nonparametric regression functions. J Am Stat Assoc 109:1054–1070
Chen X, Diao G, Qin J (2020) Pseudo likelihood-based estimation and testing of missingness mechanism function in nonignorable missing data problems. Scand J Stat 47:1377–1400
Claeskens G, Van Keilegom I (2003) Bootstrap confidence bands for regression curves and their derivatives. Ann Stat 31:1852–1884
Deheuvels P, Mason D (2004) General asymptotic confidence bands based on kernel-type function estimators. Stat Inference Stoch Process 7:225–277
Devroye L, Györfi L, Lugosi G (1996) A probabilistic theory of pattern recognition. Springer, New York
Eubank R, Speckman P (1993) Confidence bands in nonparametric regression. J Am Stat Assoc 88:1287–1301
Fang F, Zhao J, Shao J (2018) Imputation-based adjusted score equations in generalized linear models with nonignorable missing covariate values. Stat Sin 28:1677–1701
Gardes L (2020) Nonparametric confidence intervals for conditional quantiles with large-dimensional covariates. Electron J Stat 14:661–701
Gu L, Yang L (2015) Oracally efficient estimation for single-index link function with simultaneous confidence band. Electron J Stat 9:1540–1561
Gu L, Wang S, Yang L (2021) Smooth simultaneous confidence band for the error distribution function in nonparametric regression. Comput Stat Data Anal 155:107106
Härdle W (1989) Asymptotic maximal deviation of M-smoothers. J Multivar Anal 29:163–179
Härdle W, Song S (2010) Confidence bands in quantile regression. Econom Theory 26:1–22
Horváth L (2000) Approximations for hybrids of empirical and partial sums processes. J Stat Plan Inference 88:1–18
Horváth L, Kokoszka P, Steinebach J (2000) Approximations for weighted bootstrap processes with an application. Stat Probab Lett 48:59–70
Janssen A (2005) Resampling Student’s t-type statistics. Ann Inst Stat Math 57:507–529
Janssen A, Pauls T (2003) How do bootstrap and permutation tests work? Ann Stat 31:768–806
Johnston G (1982) Probabilities of maximal deviations for nonparametric regression function estimates. J Multivar Anal 12:402–414
Kim JK, Yu C (2011) A semiparametric estimation of mean functionals with nonignorable missing data. J Am Stat Assoc 106:157–165
Kojadinovic I, Yan J (2012) Goodness-of-fit testing based on a weighted bootstrap: a fast large-sample alternative to the parametric bootstrap. Can J Stat 40:480–500
Konakov V, Piterbarg V (1984) On the convergence rate of maximal deviation distribution. J Multivar Anal 15:279–294
Liero H (1982) On the maximal deviation of the kernel regression function estimate. Ser Stat 13:171–182
Liu T, Yuan X (2020) Doubly robust augmented-estimating-equations estimation with nonignorable nonresponse data. Stat Pap 61:2241–2270
Liu Z, Yau CY (2021) Fitting time series models for longitudinal surveys with nonignorable missing data. J Stat Plan Inference 214:1–12
Lu X, Kuriki S (2017) Simultaneous confidence bands for contrasts between several nonlinear regression curves. J Multivar Anal 155:83–104
Lütkepohl H (2013) Reducing confidence bands for simulated impulse responses. Stat Pap 54:1131–1145
Mack Y, Silverman B (1982) Weak and strong uniform consistency of kernel regression estimates. Z Wahrsch Verw Gebiete 61:405–415
Maity A, Pradhan V, Das U (2019) Bias reduction in logistic regression with missing responses when the missing data mechanism is nonignorable. Am Stat 73:340–349
Massé P, Meiniel W (2014) Adaptive confidence bands in the nonparametric fixed design regression model. J Nonparametr Stat 26:451–469
Mojirsheibani M (2021) On classification with nonignorable missing data. J Multivar Anal 184:104775
Morikawa K, Kano Y (2018) Identification problem of transition models for repeated measurement data with nonignorable missing values. J Multivar Anal 165:216–230
Morikawa K, Kim JK (2018) A note on the equivalence of two semiparametric estimation methods for nonignorable nonresponse. Stat Probab Lett 140:1–6
Morikawa K, Kim JK, Kano Y (2017) Semiparametric maximum likelihood estimation with data missing not at random. Can J Stat 45:393–409
Muminov M (2011) On the limit distribution of the maximum deviation of the empirical distribution density and the regression function. I. Theory Probab Appl 55:509–517
Muminov M (2012) On the limit distribution of the maximum deviation of the empirical distribution density and the regression function. II. Theory Probab Appl 56:155–166
Nemouchi N, Mohdeb Z (2010) Asymptotic confidence bands for density and regression functions in the Gaussian case. J Afrika Statistika 5:279–287
Neumann M, Polzehl J (1998) Simultaneous bootstrap confidence bands in nonparametric regression. J Nonparametr Stat 9:307–333
O’Brien J, Gunawardena H, Paulo J, Chen X, Ibrahim J, Gygi S, Qaqish B (2018) The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments. Ann Appl Stat 12:2075–2095
Praestgaard J, Wellner J (1993) Exchangeably weighted bootstraps of the general empirical process. Ann Probab 21:2053–2086
Proksch K (2016) On confidence bands for multivariate nonparametric regression. Ann Inst Stat Math 68:209–236
Racine J, Hayfield T (2008) Nonparametric econometrics: the np package. J Stat Softw 27:1–32
Racine J, Li Q (2004) Cross-validated local linear nonparametric regression. Stat Sin 14:485–512
Rosenblatt M (1952) Remarks on a multivariate transformation. Ann Math Stat 23:470–472
Sabbah C (2014) Uniform confidence bands for local polynomial quantile estimators. ESAIM: PS 18:265–276
Sadinle M, Reiter J (2019) Sequentially additive nonignorable missing data modelling using auxiliary marginal information. Biometrika 106:889–911
Shao J, Wang L (2016) Semiparametric inverse propensity weighting for nonignorable missing data. Biometrika 103:175–187
Song S, Ritov Y, Härdle W (2012) Bootstrap confidence bands and partial linear quantile regression. J Multivar Anal 107:244–262
Sun J, Loader C (1994) Simultaneous confidence bands for linear regression and smoothing. Ann Stat 22:1328–1345
Sun L, Zhou Y (1998) Sequential confidence bands for densities under truncated and censored data. Stat Probab Lett 40:31–41
Tang N, Zhao P, Zhu H (2014) Empirical likelihood for estimating equations with nonignorably missing data. Stat Sin 24:723–747
Uehara M, Kim JK (2018) Semiparametric response model with nonignorable nonresponse. Preprint. arXiv:1810.12519
Wandl H (1980) On kernel estimation of regression functions. Wissenschaftliche Sitzungen zur Stochastik (WSS-03), Berlin
Wang J, Cheng F, Yang L (2013) Smooth simultaneous confidence bands for cumulative distribution functions. J Nonparametr Stat 25:395–407
Withers C, Nadarajah S (2012) Maximum modulus confidence bands. Stat Pap 53:811–819
Wojdyla J, Szkutnik Z (2018) Nonparametric confidence bands in Wicksell’s problem. Stat Sin 28:93–113
Xia Y (1998) Bias-corrected confidence bands in nonparametric regression. J R Stat Soc Ser B Stat Methodol 60:797–811
Yang F, Barber R (2019) Contraction and uniform convergence of isotonic regression. Electron J Stat 13:646–677
Yuan C, Hedeker D, Mermelstein R, Xie H (2020) A tractable method to account for high-dimensional nonignorable missing data in intensive longitudinal data. Stat Med 39:2589–2605
Zhao J, Shao J (2015) Semiparametric pseudo-likelihoods in generalized linear models with nonignorable missing data. J Am Stat Assoc 110:1577–1590
Zhao P, Wang L, Shao J (2019) Empirical likelihood and Wilks phenomenon for data with nonignorable missing values. Scand J Stat 46:1003–1024
Zhou S, Wang D, Zhu J (2020) Construction of simultaneous confidence bands for a percentile hyper-plane with predictor variables constrained in an ellipsoidal region. Stat Pap 61:1335–1346
Acknowledgements
This work was supported by NSF Grant DMS-1916161, awarded to Majid Mojirsheibani.
Funding
The authors have not disclosed any funding.
Ethics declarations
Conflict of interest
The authors have not disclosed any competing interests.
Appendix: Proofs
To prove our main results, we first state a number of lemmas.
Lemma 1
Let \({\widetilde{\pi }}_{{\widehat{\gamma }}}(x, y)\) be the estimator obtained from \({\widetilde{\pi }}_{\gamma }(x, y)\) upon replacing \(\gamma \) by any estimator \({\widehat{\gamma }}\) in (9). Then, under the conditions of Theorem 2, one has
Lemma 2
Let \({\widetilde{m}}_{\pi ,n}(x)\) and \({\widehat{m}}_{n}(x)\) be as in (8) and (11), respectively. Then,
To state our next lemma, we first need to define the following auxiliary quantities, which may be viewed as particular estimates of \(\nu ^2(x)\) defined in (14)
where \({\widetilde{\pi }}_{\gamma }(x, y)\) is as in (9) and
Lemma 3
Let \({\widehat{\nu }}^2_{{\widetilde{\pi }}}(x)\), \(\nu ^2(x)\), \({\widetilde{\nu }}^2_{{\widetilde{\pi }}}(x)\), and \({\widetilde{\nu }}^2_{\pi }(x)\) be as in (13), (14), (29), and (28), respectively. Then
Proof of Theorem 2
To prove Theorem 2, we first consider the following simple decomposition
where the remainder term, \({{\mathcal {R}}}_n\), is given by
To deal with the first term on the r.h.s. of (34), first observe that \({\widetilde{m}}_{\pi ,n}(x)\) and \({\widetilde{\nu }}_{\pi ,n}^2(x)\), which appear in this supremum term, are, respectively, the kernel regression estimator of \(E(Y^*|X=x)\) and the kernel estimator of the conditional variance of \(Y^*\), based on the iid “data” \((X_i, Y^*_i),\) \(i=1,\ldots ,n\), where \(Y^*=\Delta Y\big /\pi _{\gamma }(X,Y) + \varepsilon \); see (12). Furthermore, when Assumptions (A), (F), and (G) hold, we have \(P\{B_L\le Y^* \le B^U\}=1\) for finite constants \(B_L\) and \(B^U\). In fact, one can take \(B_L=\pi ^{-1}_{\mathrm{min}}\min (0,B_1)+a_0\) and \(B^U=\pi ^{-1}_{\mathrm{min}}B_2+b_0\), where \(B_1\) and \(B_2\) are the constants in Assumption (A), the term \(\pi _{\mathrm{min}}\) is as in Assumption (F), and \(a_0\) and \(b_0\) are given in Assumption (G). Therefore, if Assumption (A) holds for the distribution of (X, Y) then, in view of Assumptions (F) and (G), it also holds for the distribution of \((X,Y^*)\), with \(B_1\) and \(B_2\) replaced by \(B_L\) and \(B^U\). Additionally, it is not hard to show that, in view of Assumption (F), if \(\nu _0^2(x) := E[(Y-m(X))^2|X=x]\) satisfies Assumption (C) then so does \(\nu ^2(x)\). Hence, in view of Theorem 1, and under Assumptions (A), (B), (C), (D\('\)), (E\('\)), (F), and (G), the first term on the r.h.s. of (34) satisfies
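For intuition, here is a minimal numerical sketch (not the paper's exact estimator) of the transformed-response construction used above: with a known selection probability \(\pi \), the inverse-propensity-weighted response \(Y^*=\Delta Y/\pi (X,Y)\) satisfies \(E(Y^*|X=x)=m(x)\), so an ordinary Nadaraya–Watson estimate of \(Y^*\) on \(X\) recovers the regression curve despite nonignorable missingness. The Gaussian kernel, bandwidth, logistic missingness model, and sample sizes are illustrative assumptions, and the paper's additional perturbation term \(\varepsilon \) is omitted.

```python
import numpy as np

def nw_estimate(x0, X, Y, Delta, pi_vals, h):
    """Nadaraya-Watson estimate at x0 using Y* = Delta * Y / pi(X, Y)."""
    Ystar = Delta * Y / pi_vals               # inverse-propensity-weighted response
    w = np.exp(-0.5 * ((x0 - X) / h) ** 2)    # Gaussian kernel weights (illustrative)
    return np.sum(w * Ystar) / np.sum(w)

rng = np.random.default_rng(0)
n = 2000
X = rng.uniform(0.0, 1.0, n)
Y = np.sin(2 * np.pi * X) + 0.1 * rng.standard_normal(n)

# Nonignorable missingness: the response probability depends on Y itself.
pi_vals = 1.0 / (1.0 + np.exp(-(1.0 + Y)))
Delta = (rng.uniform(size=n) < pi_vals).astype(float)

est = nw_estimate(0.5, X, Y, Delta, pi_vals, h=0.05)  # true m(0.5) = sin(pi) = 0
```

In practice \(\pi \) is unknown and must itself be estimated, which is precisely why the remainder terms \({{\mathcal {R}}}_n(i)\) and \({{\mathcal {R}}}_n(ii)\) have to be controlled in the proof.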
where \(c_K=\int K^2(t)\,dt\) and \(\varphi (n)\) is as in (15). Now, to finish the proof of Theorem 2, we have to show that \(\sqrt{n h_n \log n}\,{{\mathcal {R}}}_n\rightarrow ^p 0\), as \(n\rightarrow \infty \). However, by (35), it suffices to show that \(\sqrt{n h_n \log n}\,\big |{{\mathcal {R}}}_n(i)\big |\rightarrow ^p 0\) and \(\sqrt{n h_n \log n}\,\big |{{\mathcal {R}}}_n(ii)\big |\rightarrow ^p 0\). To this end, first note that (36) yields
We also note that
However, in view of (32) and (31),
Also, observe that
Now, taking the limit, as \(n\rightarrow \infty \), of both sides of (40) and (41) and taking into account Lemma 3, we arrive at
This together with (39), (38), and (37) yields
from which we arrive at
To deal with the term \({{\mathcal {R}}}_n(i)\) in (35), first note that by Lemma 2
where we have used the fact that \(\beta <\delta \). Furthermore, since by (42), \( \sup _{x\in [0,1]}\, \big | f_n(x)/ {\widehat{\nu }}^2_{{\widetilde{\pi }}} (x)\big | \le \, \big \{\sup _{x\in [0,1]}\, \big | f_n(x)-f(x)\big |+\sup _{x\in [0,1]}f(x)\big \}\big / \inf _{x\in [0,1]}{\widehat{\nu }}^2_{{\widetilde{\pi }}}(x) \,=\, {{\mathcal {O}}}_p(1), \) one finds
This completes the proof of Theorem 2. \(\square \)
Proof of Theorem 3
The proof is similar to that of Theorem 2, but uses a result of Konakov and Piterbarg (1984, Theorem 1.1) instead of that of Liero (1982). \(\square \)
Proof of Lemma 1
We start by defining the following quantities
Then it is straightforward to see
Now, put \(c :=\max (|B_1|, |B_2|)\), where \(B_1\) and \(B_2\) are as in Assumption (A), and observe that a one-term Taylor expansion gives
where the bound does not depend on x. Similarly, we note that
where the bound in (50) does not depend on the particular x or \(Y_i\). Now, observe that
To deal with the right side of (51), first note that
Now, since the bound in (50) does not depend on any particular x or \(Y_i\), one finds
Next, let n be large enough so that \(A h_n <\epsilon \), where \(\epsilon \) is as in Assumption (B), and observe that by the results of Mack and Silverman (1982, Theorem B), one has
where \(c :=\max (|B_1|, |B_2|)\) as before, and \(B_1\) and \(B_2\) are as in assumption (A). Furthermore,
We also need to deal with the infimum of the term \(\big |{\widehat{\phi }}_2(x) \big |\) that appears in the denominator of (52). To this end, we first note that \(\big |{\widehat{\phi }}_2(x) \big |\) can be upper- and lower-bounded as follows
Taking the infimum over \(x\in [-h_n,\, 1+A h_n]\), we find \(\inf _x\left| \phi _2(x)\right| - \sup _x\big |{\widetilde{\phi }}_2(x) - \phi _2(x)\big | -\sup _x\big |{\widehat{\phi }}_2(x) -{\widetilde{\phi }}_2(x)\big |\le ~\inf _x \big |{\widehat{\phi }}_2(x)\big | \,\le \, \sup _x\big |{\widehat{\phi }}_2(x)-{\widetilde{\phi }}_2(x)\big |+ \sup _x\big |{\widetilde{\phi }}_2(x) - \phi _2(x)\big |+ \sup _x\big |\phi _2(x)\big |. \) Therefore, taking the limit as \(n\rightarrow \infty \), one finds
for a positive constant \(\varphi _0\) not depending on n. Here, (56) follows from (49) in conjunction with Theorem B of Mack and Silverman (1982). Furthermore, similar (and in fact easier) arguments can also be used to show that
Now (25) follows from (57), (56), (55), (54), (53), (51), and (48). The proof of (26) is very similar to (and, in fact, easier than) that of (25) and therefore will not be given. \(\square \)
Proof of Lemma 2
Let \({\widetilde{m}}_{{\widetilde{\pi }},n}(x)\) be as in (30), and note that
where \(c_2\) is a positive constant not depending on n. Therefore, in view of (26),
Similarly, one has
which, together with (25), yields
The proof of Lemma 2 now follows from (58) and (59) and the fact that \( \big |{\widehat{m}}_{n}(x) - {\widetilde{m}}_{\pi ,n}(x)\big | \le \big |{\widehat{m}}_{n}(x) - {\widetilde{m}}_{{\widetilde{\pi }},n}(x)\big | + \big |{\widetilde{m}}_{{\widetilde{\pi }},n}(x) - {\widetilde{m}}_{\pi ,n}(x)\big |. \) \(\square \)
Proof of Lemma 3
We start with the proof of (31). First observe that
However, we have
where \(r_n(x) =\sum _{i=1}^n \Delta _i Y^2_i\, {{\mathcal {K}}}((x-X_i)/h_n) / \sum _{i=1}^n {{\mathcal {K}}}((x-X_i)/h_n)\le (|B_1|\vee |B_2|)^2\), and \(B_1\) and \(B_2\) are as in Assumption (A). Therefore, in view of (25) and (26), we obtain
Similarly, we have
Next, to deal with the term \(\big |U_{n,3}(x)\big |\) in (60), we observe that \(\big |U_{n,3}(x)\big | \le \big |{\widetilde{m}}_{{\widetilde{\pi }},n}(x) - {\widehat{m}}_{n}(x)\big |\times \big \{ \big |{\widetilde{m}}_{{\widetilde{\pi }},n}(x) - {\widehat{m}}_{n}(x)\big | +2 \big |{\widetilde{m}}_{{\widetilde{\pi }},n}(x) - {\widetilde{m}}_{\pi ,n}(x)\big | +2 \big |{\widetilde{m}}_{\pi ,n}(x) - m(x)\big | +2|m(x)|\big \}\). Consequently, in view of (58) and (59) and the result of Mack and Silverman (1982, Theorem B), we get
Now, (31) follows from the above bounds together with (60). The proof of (32) is similar and goes as follows.
But
where \(c_3\) is a positive constant not depending on n. Therefore, by (26) and the second part of assumption (F), we have
Similarly, one has \(\sup _{x\in [0,1]}\,\big |T_{n,2}(x)\big |= {{\mathcal {O}}}_p \left( \sqrt{\log n/(n\lambda _n)}\right) \). Furthermore, since
one finds (in view of (58)) \( \sup _{x\in [0,1]}\,\big |T_{n,3}(x)\big | = {{\mathcal {O}}}_p \left( \sqrt{\log n/(n\lambda _n)}\right) . \) Now, (32) follows from (61) together with the above bounds. The proof of (33) is straightforward and, in fact, easier than those of (32) and (31), and hence will not be given. \(\square \)
Mojirsheibani, M. On the maximal deviation of kernel regression estimators with NMAR response variables. Stat Papers 63, 1677–1705 (2022). https://doi.org/10.1007/s00362-022-01293-0