Abstract
Subsampling is an efficient way to deal with massive data. In this paper, we investigate optimal subsampling for linear quantile regression when the covariates are functions. We first derive the asymptotic distribution of the subsampling estimator, and then obtain the optimal subsampling probabilities under the A-optimality criterion. Furthermore, we propose modified subsampling probabilities that avoid estimating the densities of the response variables given the covariates, and are therefore easier to implement in practice. Numerical experiments on synthetic and real data show that the proposed methods consistently outperform uniform subsampling and approximate the full-data results well at a much lower computational cost.
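As a rough illustration of the two-step scheme summarized above, the following Python sketch draws a uniform pilot subsample, forms density-free subsampling probabilities from the pilot residuals and row norms of the design matrix, and refits a weighted quantile regression on the selected rows. This is a minimal sketch, not the paper's exact estimator: it assumes a plain design matrix X (in the functional setting the rows would hold B-spline features of the covariate curves), and the score formula is an L-optimality-style stand-in in the spirit of the modified probabilities.

import numpy as np
from statsmodels.regression.quantile_regression import QuantReg

def subsample_quantile_fit(X, y, tau, r0, r, rng):
    """Two-step subsampled quantile regression (illustrative sketch)."""
    n = X.shape[0]
    # Step 1: pilot estimate from a small uniform subsample.
    idx0 = rng.choice(n, size=r0, replace=True)
    pilot = QuantReg(y[idx0], X[idx0]).fit(q=tau)
    resid = y - X @ pilot.params
    # Step 2: density-free scores |tau - I(residual < 0)| * ||x_i||
    # (an assumed stand-in for the paper's optimal probabilities).
    scores = np.abs(tau - (resid < 0)) * np.linalg.norm(X, axis=1)
    probs = scores / scores.sum()
    # Step 3: sample r rows and refit with weights 1 / (r * pi_i).
    # Since rho_tau(c*u) = c * rho_tau(u) for c > 0, the weights can
    # be absorbed by rescaling the sampled rows.
    idx = rng.choice(n, size=r, replace=True, p=probs)
    w = 1.0 / (r * probs[idx])
    fit = QuantReg(w * y[idx], w[:, None] * X[idx]).fit(q=tau)
    return fit.params

For example, with rng = np.random.default_rng(0) and simulated (X, y), subsample_quantile_fit(X, y, tau=0.5, r0=500, r=2000, rng=rng) returns the subsampling estimate of the coefficient vector.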
Notes
In Figures 1, 2, and 3, the three columns correspond to the three distributions of the basis coefficients (mvNormal, mvT3, mvT2), and the three rows correspond to the three distributions of the random errors (Normal, T1, Hetero). For example, the panel in the first row and first column is for the mvNormal-Normal datasets.
References
Ai M, Wang F, Yu J, Zhang H (2021) Optimal subsampling for large-scale quantile regression. J Complex 62:101512
Ai M, Yu J, Zhang H, Wang H (2021) Optimal subsampling algorithms for big data regression. Stat Sin 31(2):749–772
Atkinson A, Donev AN, Tobias RD (2007) Optimum experimental designs, with SAS. Oxford University Press, New York
Cardot H, Ferraty F, Sarda P (2003) Spline estimators for the functional linear model. Stat Sin 13:571–591
Cardot H, Crambes C, Sarda P (2005) Quantile regression when the covariates are functions. J Nonparametr Stat 17(7):841–856
Cardot H, Crambes C, Sarda P (2004) Conditional quantiles with functional covariates: an application to ozone pollution forecasting. In: Compstat 2004 Proceedings, pp 769–776
Chen K, Müller H (2012) Conditional quantile analysis when covariates are functions, with application to growth data. J R Stat Soc B 74(2):67–89
Chen K, Breitner S, Wolf K et al (2021) Ambient carbon monoxide and daily mortality: a global time-series study in 337 cities. Lancet Planet Health 5(4):e191–e199
Claeskens G, Krivobokova T, Opsomer JD (2009) Asymptotic properties of penalized spline estimators. Biometrika 96(3):529–544
de Boor C (2001) A practical guide to splines. Springer, Berlin
Dobriban E, Liu S (2019) Asymptotics for sketching in least squares regression. In: Advances in Neural Information Processing Systems 32, pp 3675–3685
Drineas P, Magdon-Ismail M, Mahoney MW, Woodruff DP (2012) Fast approximation of matrix coherence and statistical leverage. J Mach Learn Res 13(1):3441–3472
Drineas P, Mahoney MW, Muthukrishnan S (2006) Sampling algorithms for \(l_2\) regression and applications. In: Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithm, pp 1127–1136
Fan Y, Liu Y, Zhu L (2021) Optimal subsampling for linear quantile regression models. Can J Stat 49(4):1039–1057
He S, Yan X (2022) Functional principal subspace sampling for large scale functional data analysis. Electron J Stat 16(1):2621–2682
Hjort NL, Pollard D (2011) Asymptotics for minimisers of convex processes. arXiv preprint arXiv:1107.3806
Homrighausen D, McDonald DJ (2019) Compressed and penalized linear regression. J Comput Graph Stat 29:309–322
Kato K (2012) Estimation in functional linear quantile regression. Ann Stat 40(6):3108–3136
Kinoshita H, Türkan H, Vucinic S et al (2020) Carbon monoxide poisoning. Toxicol Rep 7:169–173
Koenker R (2005) Quantile regression. Cambridge University Press, Cambridge
Koenker R, Bassett G (1978) Regression quantiles. Econometrica 46(1):33–50
Liu C, Yin P, Chen R et al (2018) Ambient carbon monoxide and cardio-vascular mortality: a nationwide time-series analysis in 272 cities in China. Lancet Planet Health 2(1):e12–e18
Liu H, You J, Cao J (2021) Functional L-optimality subsampling for massive data. arXiv preprint arXiv:2104.03446
Ma P, Mahoney MW, Yu B (2015) A statistical perspective on algorithmic leveraging. J Mach Learn Res 16(27):861–911
Mahoney MW (2011) Randomized algorithms for matrices and data. Found Trends Mach Learn 3:123–224
Ma P, Zhang X, Xing X, Ma J, Mahoney MW (2020) Asymptotic analysis of sampling estimators for randomized numerical linear algebra algorithms. In: Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, pp 1026–1035
Moazami S, Noori R, Amiri BJ et al (2016) Reliable prediction of carbon monoxide using developed support vector machine. Atmos Pollut Res 7(3):412–418
Raskutti G, Mahoney MW (2016) A statistical perspective on randomized sketching for ordinary least-squares. J Mach Learn Res 17(213):1–31
Reiss P, Huang L (2012) Smoothness selection for penalized quantile regression splines. Int J Biostat. https://doi.org/10.1515/1557-4679.1381
Ruppert D (2002) Selecting the number of knots for penalized splines. J Comput Graph Stat 11(4):735–757
Sang P, Cao J (2020) Functional single-index quantile regression models. Stat Comput 30(4):771–781
Shams R, Jahani A, Moeinaddini M, Khorasani N (2020) Air carbon monoxide forecasting using an artificial neural network in comparison with multiple regression. Model Earth Syst Environ 6:1467–1475
Shao Y, Wang L (2021) Optimal subsampling for composite quantile regression model in massive data. Stat Pap 63:1139–1161
Shao L, Song S, Zhou Y (2022) Optimal subsampling for large-sample quantile regression with massive data. Can J Stat. https://doi.org/10.1002/cjs.11697
Stone CJ (1985) Additive regression and other nonparametric models. Ann Stat 13(2):689–705
Wang H (2019) More efficient estimation for logistic regression with optimal subsamples. J Mach Learn Res 20(132):1–59
Wang H, Ma Y (2021) Optimal subsampling for quantile regression in big data. Biometrika 108(1):99–112
Wang H, Zhu R, Ma P (2018) Optimal subsampling for large sample logistic regression. J Am Stat Assoc 113(522):829–844
Wang S, Gittens A, Mahoney MW (2018) Sketched ridge regression: optimization perspective, statistical perspective, and model averaging. J Mach Learn Res 18(218):1–50
Yao Y, Wang H (2019) Optimal subsampling for softmax regression. Stat Pap 60(2):585–599
Yoshida T (2013) Asymptotics for penalized spline estimators in quantile regression. Commun Stat Theory Methods. https://doi.org/10.1080/03610926.2013.765477
Yu J, Wang H, Ai M, Zhang H (2020) Optimal distributed subsampling for maximum quasi-likelihood estimators with massive data. J Am Stat Assoc 117(537):265–276
Yuan M (2006) GACV for quantile smoothing splines. Comput Stat Data Anal 50(3):813–829
Yuan X, Li Y, Dong X, Liu T (2022) Optimal subsampling for composite quantile regression in big data. Stat Pap 63:1649–1676
Zhou S, Shen X, Wolfe D (1998) Local asymptotics for regression splines and confidence regions. Ann Stat 26(5):1760–1782
Acknowledgements
This work was supported by the National Natural Science Foundation of China (No. 11671060) and the Natural Science Foundation Project of CQ CSTC (No. cstc2019jcyj-msxmX0267). The authors would like to thank the editor and the anonymous reviewers for their detailed comments and helpful suggestions.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A: proofs for theoretical results
To prove our theorems, we begin with several lemmas. Note that the subsampling estimator involves two sources of randomness, the sampling error and the model error, and both must be accounted for in the calculations.
Lemma 1
Under Assumptions 1 and 5, for any vector \( \varvec{\mu } \in \mathbb {R}^{K+p+1} \), there exist positive constants \( C_3\), \(C_4 \), \( C_5 \) and \( C_6 \) such that
where \( \sigma _{min}(\cdot ) \) and \(\sigma _{max}(\cdot ) \) denote the smallest and largest eigenvalues of a matrix, respectively. In addition, we have \(\Vert \varvec{G}\Vert _{\infty }=O(K^{-1})\) and \(\Vert \varvec{D}_q\Vert _{\infty }=O(K^{2q-1})\).
Proof
These results follow directly from Lemmas S2 and S3 in the supplementary file of Liu et al. (2021). \(\square \)
Lemma 2
Under Assumptions 1 and 3–5, there exist two positive constants \( C_7 \) and \( C_8 \) such that
and \(\Vert \varvec{H}_{\tau }\Vert _{\infty }=O(K^{-1})\).
Proof
By Assumption 3, there exist two positive constants \( c_{\epsilon } \) and \( C_{\epsilon } \) such that \( c_{\epsilon }\le f_{\epsilon \mid \varvec{X}(t)}(0,x(t))\le C_{\epsilon } \). On the other hand, by Lemma 1, we have \(\Vert \varvec{G}_{\tau }\Vert _{\infty }=O(K^{-1})\). Thus, the lemma follows directly by combining Lemma 1 with Assumptions 3 and 4. \(\square \)
Lemma 3
Let \( \psi _\tau (u)=\tau -I(u<0) \) and \( u_i=y_i-\varvec{B}^T_i\varvec{\theta }_0 \). Under the same assumptions as in Theorem 3, for any non-zero \( \varvec{\delta } \in \mathbb {R}^{K+p+1}\), we have
where \( \left\{ \tau (1-\tau )(\varvec{V}_{\pi }+\eta \varvec{G})\right\} ^{-1/2}\varvec{W}\rightarrow {N(\varvec{0},\varvec{I})} \) in distribution.
Proof
Set
To prove the asymptotic normality of \( U_r \), it suffices to verify that \( U_r \) satisfies the Lindeberg-Feller conditions. Firstly, the conditional expectation and conditional variance are given by
From the fact that \( \textrm{P}(y_i<\int ^1_0 x_i(t)\beta (t)\textrm{d}t\mid x_i(t))=\tau \), we have
where \( b_i=\int _{0}^{1}x_i(t)b_a(t)\textrm{d}t \); the third equality follows from the definition of \( \varvec{\theta }_0 \), and the fourth equality follows from a Taylor expansion of the cumulative distribution function of the error \( \epsilon _i \) at 0. As a result, the unconditional expectation of \( U_r \) can be calculated as
More specifically, since the \( x_i(t) \) are square integrable functions, by the Cauchy-Schwarz inequality in integral form, there exists a constant c such that
Similarly, we have
Thus, by the properties of B-spline functions, \( \int _0^1\varvec{B}(t)\textrm{d}t = O(K^{-1}) \), and \( b_a(t)=O(K^{-d})\), we find that \( \Vert \varvec{B}_i \Vert _{\infty }= O(K^{-1})\) and \( b_i=O(K^{-d})\). Putting these together, we obtain (A2).
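For the reader's convenience, the B-spline properties invoked here follow from two standard facts about the basis (see de Boor 2001), stated in generic form: the basis is nonnegative and forms a partition of unity, and each \( B_k \) is supported on \( O(K^{-1}) \) adjacent knot intervals, so that
\[ 0\le B_k(t)\le \sum _{j}B_j(t)=1 \quad \text{and}\quad \int _0^1 B_k(t)\,\textrm{d}t\le \left| \textrm{supp}(B_k)\right| =O(K^{-1}). \]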
On the other hand, according to the law of total variance, the unconditional variance is given by
We first deal with the first term in (A3) as follows
Similarly, the second term in (A3) equals
Thus, substituting (A4) and (A5) into (A3), we have
Denote \(\xi _i = -\sqrt{\frac{K}{r}}\frac{R_i}{n\pi _i}\varvec{B}^T_i\varvec{\delta }\psi _\tau (u_i)\). We now check the Lindeberg-Feller conditions. For every \( \epsilon >0 \),
where
and the last equality holds by combining Assumption 6 with the fact that \( \mid \psi _\tau (u_i)\mid \le 1 \). Thus, by the Lindeberg-Feller central limit theorem, it can be concluded that, as \( n \rightarrow \infty \) and \( r \rightarrow \infty \),
in distribution, which implies that equation (A1) holds because \( \textrm{E}\left[ U_r\right] =O(\sqrt{rK}K^{-(d+1)})=o_P(1) \). This completes the proof. \(\square \)
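For reference, the condition verified above is the classical Lindeberg condition, written here in generic notation (with \( s_r^2=\sum _i\textrm{Var}(\xi _i) \); this is a generic statement, not the authors' exact display):
\[ \frac{1}{s_r^2}\sum _{i=1}^{n}\textrm{E}\left[ \xi _i^2\, I\left( |\xi _i|>\epsilon s_r\right) \right] \rightarrow 0 \quad \text{for every } \epsilon >0, \]
under which \( \sum _{i}\{\xi _i-\textrm{E}[\xi _i]\}/s_r \) converges in distribution to a standard normal random variable.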
Lemma 4
Let \( v_i=\sqrt{K/r}\varvec{B}^T_i\varvec{\delta } \). Under the same assumptions as in Theorem 3,
Proof
Let
Since
we can obtain the total expectation of \( M_r \) as follows
Now we show that the total variance of \( M_r \) satisfies \( \textrm{Var}[M_r]=o_P(1) \). Note that the variance of \( M_r \) can be evaluated as
where the second inequality is from the fact that
Thus, from (A8), (A9) and Assumption 6, and noting \( \textrm{E}\left[ M_r\right] =O(1) \), we have \( \textrm{Var}\left[ M_r\right] =o_P(\sqrt{K/r^3})=o_P(1) \). As a result, Lemma 4 holds by Chebyshev’s inequality. \(\square \)
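The final step is the standard Chebyshev argument: since the variance vanishes, \( M_r \) concentrates at its expectation, that is, for every \( \epsilon >0 \),
\[ \textrm{P}\left( \left| M_r-\textrm{E}[M_r]\right| >\epsilon \right) \le \frac{\textrm{Var}[M_r]}{\epsilon ^2}\rightarrow 0. \]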
In the following, we present the proofs of Theorems 1, 2, 3, 4, and 5 in turn.
Proof of Theorems 1 and 2
Theorem 1 can be proved similarly to Theorem 1 of Yoshida (2013), and Theorem 2 follows directly from Theorem 1 under Assumptions 4 and 5. We omit the details. \(\square \)
Proof of Theorem 3
Let
where \( u_i=y_i-\varvec{B}^T_i\varvec{\theta }_0 \) and \( v_i=\sqrt{K/r}\varvec{B}^T_i\varvec{\delta } \), as in Lemma 4. It is easy to see that this function is convex and is minimized at \(\sqrt{r/K}(\varvec{\tilde{\theta }}-\varvec{\theta }_0) \).
On the other hand, using Knight's identity,
\[ \rho _\tau (u-v)-\rho _\tau (u)=-v\psi _\tau (u)+\int _{0}^{v}\left\{ I(u\le s)-I(u\le 0)\right\} \textrm{d}s, \]
where \( \psi _\tau (u)=\tau -I(u<0) \), we have
where
From Lemma 3, \( Z_{1r}(\varvec{\delta }) \) in (A11) satisfies
where \(\left\{ \tau (1-\tau )(\varvec{V}_{\pi }+\eta \varvec{G})\right\} ^{-1/2}\varvec{W}\rightarrow {N(\varvec{0},\varvec{I})}\) in distribution. Furthermore, applying Lemma 4 to \( Z_{3r}(\varvec{\delta }) \) in (A11) yields
Therefore, from (A11), (A12) and (A13), we can obtain
Since \( Z_{r}(\varvec{\delta })/n\) is convex with respect to \( \varvec{\delta } \) and has a unique minimizer, by the corollary on page 2 of Hjort and Pollard (2011), its minimizer \( \sqrt{r/K}(\varvec{\tilde{\theta }}-\varvec{\theta }_0)\) satisfies
Because \( \varvec{W}\) is the only random vector in the asymptotic form of \( \varvec{\tilde{\theta }} \), and \( \tilde{\beta }(t)-\beta _0(t)=\varvec{B}^{T}(t)(\varvec{\tilde{\theta }}-\varvec{\theta }_0) \), the expectation of \( \tilde{\beta }(t)-\beta _0(t) \) can be written as
where \( b_{\lambda }(t)=-\frac{\lambda }{n}\varvec{B}^{T}(t)\varvec{H}_{\tau }^{-1}\varvec{D}_q\varvec{\theta }_0 \). Together with \( \tilde{\beta }(t)-\beta (t)= \tilde{\beta }(t)-\beta _0(t)+ \beta _0(t)-\beta (t) \), we have the asymptotic bias of \( \tilde{\beta }(t)\) as
Thus, we have
Combining the fact that
by the definition of \( \varvec{W} \) and Slutsky's theorem, we obtain, for \( t\in [0,1] \), as \(r, n\rightarrow \infty \),
Further, from the discussion preceding Theorem 2, both \( b_{\lambda }(t) \) and \( b_a(t) \) are \( o_P(1)\) and hence negligible. Thus, we have
So Theorem 3 is proved. \(\square \)
Proof of Theorem 4
Note that
where the last inequality follows from the Cauchy-Schwarz inequality, with equality if and only if \( \pi _i \propto \Vert \varvec{H}^{-1}_{\tau }\varvec{B}_i\Vert _2 \). The proof is completed by imposing the constraint \(\sum _{i=1}^{n}\pi _i=1 \). \(\square \)
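To make the Cauchy-Schwarz step explicit, write \( a_i \) for \( \Vert \varvec{H}^{-1}_{\tau }\varvec{B}_i\Vert _2 \) (the same computation with \( a_i=\Vert \varvec{B}_i\Vert _2 \) gives Theorem 5). For any probabilities \( \pi _i \) with \( \sum _{i=1}^{n}\pi _i=1 \),
\[ \sum _{i=1}^{n}\frac{a_i^2}{\pi _i}=\left( \sum _{i=1}^{n}\frac{a_i^2}{\pi _i}\right) \left( \sum _{i=1}^{n}\pi _i\right) \ge \left( \sum _{i=1}^{n}a_i\right) ^2, \]
with equality if and only if \( a_i^2/\pi _i\propto \pi _i \), i.e. \( \pi _i\propto a_i \); normalizing yields \( \pi _i=a_i/\sum _{j=1}^{n}a_j \).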
Proof of Theorem 5
Note that
where the last inequality follows from the Cauchy-Schwarz inequality, with equality if and only if \( \pi _i \propto \Vert \varvec{B}_i\Vert _2 \). The proof is completed by imposing the constraint \(\sum _{i=1}^{n}\pi _i=1 \), exactly as in Theorem 4. \(\square \)
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yan, Q., Li, H. & Niu, C. Optimal subsampling for functional quantile regression. Stat Papers 64, 1943–1968 (2023). https://doi.org/10.1007/s00362-022-01367-z