
Robust optimal subsampling based on weighted asymmetric least squares


Abstract

With the development of contemporary science, large amounts of generated data contain heterogeneity and outliers in the response and/or covariates. Subsampling is an effective method to overcome the limitation of computational resources, but when the data contain heterogeneity and outliers, incorrect subsampling probabilities may select inferior subdata, and statistical inference based on such subdata may perform poorly. Combining asymmetric least squares with \(L_2\) estimation, this paper proposes a double-robustness framework (DRF) that simultaneously tackles heterogeneity and outliers in the response and/or covariates. Poisson subsampling is implemented based on the DRF for massive data, and a more robust probability is derived to select the subdata. Under some regularity conditions, we establish the asymptotic properties of the DRF-based subsampling estimator. Numerical studies and real data demonstrate the effectiveness of the proposed method.


References

  • Ai M, Wang F, Yu J, Zhang H (2021) Optimal subsampling for large-scale quantile regression. J Complexity 62:101512
  • Ai M, Yu J, Zhang H, Wang H (2021) Optimal subsampling algorithms for big data regressions. Stat Sin 31(2):749–772
  • Aigner D, Amemiya T, Poirier D (1976) On the estimation of production frontiers: maximum likelihood estimation of the parameters of a discontinuous density function. Int Econ Rev 17(2):377–396
  • Barry A, Oualkacha K, Charpentier A (2022) A new GEE method to account for heteroscedasticity using asymmetric least-square regressions. J Appl Stat 49(14):3564–3590
  • Bowman A (1984) An alternative method of cross-validation for the smoothing of density estimates. Biometrika 71(2):353–360
  • Ciuperca G (2021) Variable selection in high-dimensional linear model with possibly asymmetric errors. Comput Stat Data Anal 155:107112
  • Drineas P, Mahoney M, Muthukrishnan S, Sarlós T (2011) Faster least squares approximation. Numer Math 117:219–249
  • Efron B (1991) Regression percentiles using asymmetric squared error loss. Stat Sin 1(1):93–125
  • Fan J, Han F, Liu H (2014) Challenges of big data analysis. Natl Sci Rev 1(2):293–314
  • Gijbels I, Karim R, Verhasselt A (2019) On quantile-based asymmetric family of distributions: properties and inference. Int Stat Rev 87(3):471–504
  • Gu Y, Zou H (2016) High-dimensional generalizations of asymmetric least squares regression and their applications. Ann Stat 44(6):2661–2694
  • Hájek J (1964) Asymptotic theory of rejective sampling with varying probabilities from a finite population. Ann Math Stat 35(4):1491–1523
  • Hjort N, Pollard D (2011) Asymptotics for minimisers of convex processes. arXiv:1107.3806
  • Koenker R (2005) Quantile regression. Cambridge University Press, Cambridge
  • Koenker R, Bassett G (1978) Regression quantiles. Econometrica 46(1):33–50
  • Liao L, Park C, Choi H (2019) Penalized expectile regression: an alternative to penalized quantile regression. Ann Inst Stat Math 71(2):409–438
  • Lin L, Li F (2019) A global bias-correction DC method for biased estimation under memory constraint. arXiv:1904.07477
  • Ma P, Mahoney M, Yu B (2015) A statistical perspective on algorithmic leveraging. J Mach Learn Res 16(1):861–919
  • Ma P, Zhang X, Xing X, Ma J, Mandal A (2022) Asymptotic analysis of sampling estimators for randomized numerical linear algebra algorithms. J Mach Learn Res 23(177):1–45
  • Meng C, Xie R, Mandal A, Zhang X, Zhong W, Ma P (2021) LowCon: a design-based subsampling approach in a misspecified linear model. J Comput Graph Stat 30:694–708
  • Newey W, Powell J (1987) Asymmetric least squares estimation and testing. Econometrica 55(4):819–847
  • Pollard D (1982) Empirical choice of histogram and kernel density estimators. Scand J Stat 9(2):65–78
  • Pollard D (1991) Asymptotics for least absolute deviation regression estimators. Econom Theory 7(2):186–199
  • Pukelsheim F (2006) Optimal design of experiments. Society for Industrial and Applied Mathematics, Philadelphia
  • Schifano E, Wu J, Wang C, Yan J, Chen H (2016) Online updating of statistical inference in the big data setting. Technometrics 58(3):393–403
  • Scott D (2012) Parametric statistical modeling by minimum integrated square error. Technometrics 43(3):274–285
  • Shao Y, Wang L (2022) Optimal subsampling for composite quantile regression model in massive data. Stat Pap 63:1139–1161
  • van der Vaart A (1998) Asymptotic statistics. Cambridge University Press, Cambridge
  • Wang H, Ma Y (2021) Optimal subsampling for quantile regression in big data. Biometrika 108(1):99–112
  • Wang H, Zhu R, Ma P (2018) Optimal subsampling for large sample logistic regression. J Am Stat Assoc 113(522):829–844
  • Wang H, Yang M, Stufken J (2019) Information-based optimal subdata selection for big data linear regression. J Am Stat Assoc 114(525):393–405
  • Wang L, Elmstedt J, Wong W, Xu H (2021) Orthogonal subsampling for big data linear regression. Ann Appl Stat 15(3):1273–1290
  • Xiong S, Li G (2008) Some results on the convergence of conditional distributions. Stat Probab Lett 78(18):3249–3253
  • Yan Q, Li Y, Niu M (2022) Optimal subsampling for functional quantile regression. Stat Pap. https://doi.org/10.1007/s00362-022-01367-z
  • Yu J, Wang H (2022) Subdata selection algorithm for linear model discrimination. Stat Pap 63:1883–1906
  • Yu J, Wang H, Ai M, Zhang H (2022) Optimal distributed subsampling for maximum quasi-likelihood estimators with massive data. J Am Stat Assoc 117(537):265–276
  • Yu J, Ai M, Ye Z (2023a) A review on design inspired subsampling for big data. Stat Pap. https://doi.org/10.1007/s00362-022-01386-w
  • Yu J, Liu J, Wang H (2023b) Information-based optimal subdata selection for non-linear models. Stat Pap. https://doi.org/10.1007/s00362-023-01430-3
  • Yuan X, Li Y, Dong X, Liu T (2022) Optimal subsampling for composite quantile regression in big data. Stat Pap 63:1649–1676
  • Zhu X, Li F, Wang H (2021) Least-square approximation for a distributed system. J Comput Graph Stat 30(4):1004–1018

Acknowledgements

The authors would like to thank the Editor and two referees for the constructive suggestions that led to a significant improvement of the article. This research is supported in part by the National Natural Science Foundation of China (12171277, 12271294, 12071248).

Author information

Correspondence to Mingqiu Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Technical details

Proof of Proposition 1

Direct calculation yields

$$\begin{aligned}&\int _{-\infty }^{+\infty } f(u)^{2} du\\&\quad =\int _{-\infty }^{+\infty } \frac{4 \tau (1-\tau )}{\pi \sigma ^{2}(\sqrt{\tau }+\sqrt{1-\tau })^{2}} \exp \left\{ -2 \rho _{\tau }\left( \frac{u-\mu _{\tau }}{\sigma }\right) \right\} du \\&\quad =\frac{4 \tau (1-\tau )}{\pi \sigma ^{2}(1+2 \sqrt{\tau (1-\tau )})} \int _{-\infty }^{+\infty } \exp \left\{ -2|\tau -\mathbb {1}\left( u \le \mu _{\tau }\right) |\frac{\left( u-\mu _{\tau }\right) ^{2}}{\sigma ^{2}}\right\} du\\&\quad =\frac{4 \tau (1-\tau )}{\pi \sigma ^{2}(1+2 \sqrt{\tau (1-\tau )})}\left[ \int _{\mu _{\tau }}^{+\infty } \exp \left\{ -2 \tau \left( \frac{u-\mu _{\tau }}{\sigma }\right) ^{2}\right\} du\right. \\&\qquad +\left. \int _{-\infty }^{\mu _{\tau }} \exp \left\{ -2(1-\tau ) \left( \frac{u-\mu _{\tau }}{\sigma }\right) ^{2}\right\} du\right] \\&\quad =\frac{4 \tau (1-\tau )}{\pi \sigma ^{2}(1+2 \sqrt{\tau (1-\tau )})}\times \left( \frac{\sigma \sqrt{\pi }}{2\sqrt{2\tau }}+\frac{\sigma \sqrt{\pi }}{2\sqrt{2(1-\tau )}} \right) \\&\quad =\frac{\sqrt{2(1-\tau )\tau }}{\sigma \sqrt{\pi }(\sqrt{\tau }+\sqrt{1-\tau })} \end{aligned}$$

and

$$\begin{aligned}&\int _{-\infty }^{+\infty } f(u)^{2}du-\frac{2}{N}\sum _{i=1}^{N}f(u_i)\\&\quad =\frac{\sqrt{2(1-\tau )\tau }}{\sigma \sqrt{\pi }(\sqrt{\tau }+\sqrt{1-\tau })}\\&\qquad -\frac{2}{N}\sum _{i=1}^{N}\frac{2}{\sqrt{\pi \sigma ^2}}\frac{\sqrt{\tau (1-\tau )}}{\sqrt{\tau }+\sqrt{1-\tau }}\exp \left\{ -|\tau -\mathbb {1}\left( u_{i} \le \mu _{\tau }\right) |\left( \frac{u_i-\mu _{\tau }}{\sigma } \right) ^{2}\right\} \\&\quad =\frac{1}{N}\sum _{i=1}^{N} \frac{\sqrt{2(1-\tau )\tau }}{\sigma \sqrt{\pi }(\sqrt{\tau }+\sqrt{1-\tau })}\left[ 1-2\sqrt{2}\exp \left\{ -|\tau -\mathbb {1}(u_i\le \mu _{\tau }) |\left( \frac{u_i-\mu _{\tau }}{\sigma }\right) ^2 \right\} \right] . \end{aligned}$$

The desired result follows. \(\square \)
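
As a numerical sanity check of this closed form, one can compare adaptive quadrature with the formula; a minimal Python sketch (NumPy/SciPy assumed; the values of \(\tau \), \(\mu _{\tau }\) and \(\sigma \) are illustrative):

```python
import numpy as np
from scipy.integrate import quad

def f(u, tau, mu, sigma):
    # density in Proposition 1:
    # f(u) = (2/sqrt(pi sigma^2)) sqrt(tau(1-tau))/(sqrt(tau)+sqrt(1-tau))
    #        * exp(-|tau - 1(u <= mu)| ((u - mu)/sigma)^2)
    c = (2.0 / np.sqrt(np.pi * sigma**2)
         * np.sqrt(tau * (1.0 - tau)) / (np.sqrt(tau) + np.sqrt(1.0 - tau)))
    w = np.abs(tau - (u <= mu))                 # |tau - 1(u <= mu_tau)|
    return c * np.exp(-w * ((u - mu) / sigma) ** 2)

tau, mu, sigma = 0.3, 1.0, 2.0                  # illustrative values
f2 = lambda u: f(u, tau, mu, sigma) ** 2
numeric = quad(f2, -np.inf, mu)[0] + quad(f2, mu, np.inf)[0]
closed = (np.sqrt(2.0 * tau * (1.0 - tau))
          / (sigma * np.sqrt(np.pi) * (np.sqrt(tau) + np.sqrt(1.0 - tau))))
print(numeric, closed)  # the two numbers agree to quadrature accuracy
```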

Lemma 1

(Gu and Zou 2016) Denote \(r(v_{i}) = \rho _{\tau }(\varepsilon _{i} - v_{i}) - \rho _{\tau }(\varepsilon _{i}) + 2\varepsilon _{i}v_{i}\psi _{\tau }(\varepsilon _{i})\), \(i = 1, \ldots , N\). The asymmetric squared error loss \(\rho _{\tau }(\cdot )\) is continuously differentiable, but is not twice differentiable at zero when \(\tau \ne 0.5\). Moreover, for any \(\varepsilon _{i}\), \(v_{i} \in {\mathbb {R}}\) and \(\tau \in (0,1)\), we have

$$\begin{aligned} (\tau \wedge (1-\tau ))v_{i}^{2} \le r(v_{i}) \le (\tau \vee (1-\tau ))v_{i}^{2}, \end{aligned}$$

where \(\tau \wedge (1-\tau ) = \min \{\tau , 1-\tau \}\) and \(\tau \vee (1-\tau ) = \max \{\tau , 1-\tau \}\). It follows that \(\rho _{\tau }(\cdot )\) is strongly convex.
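
The two-sided bound is easy to confirm by simulation; a minimal sketch, assuming the standard expectile loss \(\rho _{\tau }(u)=|\tau -\mathbb {1}(u\le 0)|u^{2}\) and weight \(\psi _{\tau }(u)=|\tau -\mathbb {1}(u\le 0)|\), with illustrative Gaussian draws:

```python
import numpy as np

def rho(u, tau):
    # asymmetric squared error loss |tau - 1(u <= 0)| u^2
    return np.abs(tau - (u <= 0)) * u**2

def psi(u, tau):
    # weight |tau - 1(u <= 0)|, so that rho'(u) = 2 u psi(u)
    return np.abs(tau - (u <= 0))

rng = np.random.default_rng(0)
tau = 0.7
eps, v = rng.standard_normal(10**6), rng.standard_normal(10**6)
r = rho(eps - v, tau) - rho(eps, tau) + 2.0 * eps * v * psi(eps, tau)
print(np.all(min(tau, 1 - tau) * v**2 <= r + 1e-12),
      np.all(r <= max(tau, 1 - tau) * v**2 + 1e-12))  # both True
```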

Lemma 2

(Corollary of Hjort and Pollard (2011)) Suppose \(\varvec{Z}_{n}(\varvec{d})\) is convex and can be represented as \(\frac{1}{2} \varvec{d}' \varvec{V} \varvec{d} + \varvec{W}'_{n}\varvec{d} + C_{n} + a_{n}(\varvec{d})\), where \(\varvec{V}\) is symmetric and positive definite, \(\varvec{W}_{n}\) is stochastically bounded, \(C_{n}\) is an arbitrary constant and \(a_{n}(\varvec{d})\) goes to zero in probability for each \(\varvec{d}\). Then \(\varvec{\beta }_{n} = \arg \min \varvec{Z}_{n}\) is only \(o_P(1)\) away from \(\varvec{\alpha }_{n} = -\varvec{V}^{-1}\varvec{W}_{n}\), where \(\varvec{\alpha }_{n} = \arg \min (\frac{1}{2}\varvec{d}'\varvec{V} \varvec{d} + \varvec{W}'_{n}\varvec{d} + C_{n})\). If \(\varvec{W}_{n} \overset{d}{\rightarrow } \varvec{W}\), then \(\varvec{\beta }_{n} \overset{d}{\rightarrow } -\varvec{V}^{-1}\varvec{W}\).
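
The content of Lemma 2 can be illustrated numerically: minimize a convex \(\varvec{Z}_{n}\) with a vanishing perturbation and check that the minimizer approaches \(-\varvec{V}^{-1}\varvec{W}\). A minimal sketch (the quartic perturbation \(a_{n}(\varvec{d})=\sum _j d_j^4/n\) is an illustrative choice, not taken from the lemma):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
p = 3
A = rng.standard_normal((p, p))
V = A @ A.T + p * np.eye(p)            # symmetric positive definite V
W = rng.standard_normal(p)             # plays the role of W_n
alpha = -np.linalg.solve(V, W)         # -V^{-1} W

for n in (10, 10**3, 10**5):
    # convex Z_n(d) = (1/2) d'Vd + W'd + a_n(d), with a_n(d) -> 0
    Zn = lambda d: 0.5 * d @ V @ d + W @ d + np.sum(d**4) / n
    beta_n = minimize(Zn, np.zeros(p)).x
    print(n, np.linalg.norm(beta_n - alpha))   # gap shrinks as n grows
```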

Lemma 3

If Conditions 1, 3, 4 and 5 hold, then as \(n, N \rightarrow \infty \):

  (a) \(\underset{\varvec{\beta }\in \Lambda _\textrm{B}}{\textrm{sup}}|Q_n(\varvec{\beta }) - Q_N(\varvec{\beta })|\rightarrow 0\) in conditional probability given \({\mathcal {F}}_{N}\);

  (b) \(\parallel \tilde{\varvec{\beta }} - \varvec{\beta }_{t}\parallel = o_{P}(1)\).

Proof

Direct calculation yields

$$\begin{aligned} {\mathbb {E}}\left\{ Q_n(\varvec{\beta })\mid {\mathcal {F}}_{N} \right\} = \frac{1}{N}\sum _{i=1}^{N}\frac{{\mathbb {E}}(R_i)\omega _i(\varvec{\beta }_0)\rho _{\tau }(y_i-\varvec{x}_i'\varvec{\beta })}{\pi _i}=Q_N(\varvec{\beta }). \end{aligned}$$

Since \((\varvec{x}_{i}, y_{i})\)’s are i.i.d., we have

$$\begin{aligned}&{\mathbb {E}}\left( Q_n(\varvec{\beta })-Q_N(\varvec{\beta })\mid {\mathcal {F}}_{N}\right) ^{2}\nonumber \\&\quad =\frac{1}{N^2}{\mathbb {V}}\left\{ \sum _{i=1}^{N}\frac{R_i}{\pi _i}\omega _i(\varvec{\beta }_0)\rho _{\tau }(y_i-\varvec{x}_i'\varvec{\beta })\mid {\mathcal {F}}_{N}\right\} \nonumber \\&\quad =\frac{1}{N^2}\sum _{i=1}^{N}\frac{1}{\pi _i}\omega _i^2(\varvec{\beta }_0)\rho _{\tau }^2(y_i-\varvec{x}_i'\varvec{\beta })-\frac{1}{N^2}\sum _{i=1}^{N}\omega _i^2(\varvec{\beta }_0)\rho _{\tau }^2(y_i-\varvec{x}_i'\varvec{\beta })\nonumber \\&\quad \le \underset{1\le i \le N}{\max }\left( \frac{1}{N\pi _i}\right) \left\{ \frac{1}{N}\sum _{i=1}^{N}\omega _i^2(\varvec{\beta }_0)\rho _{\tau }^2(y_i-\varvec{x}_i'\varvec{\beta })\right\} -\frac{1}{N^2}\sum _{i=1}^{N}\omega _i^2(\varvec{\beta }_0)\rho _{\tau }^2(y_i-\varvec{x}_i'\varvec{\beta })\nonumber \\&\quad =O_{P}\left( \frac{1}{n}\right) \left\{ \frac{1}{N}\sum _{i=1}^{N}\omega _i^2(\varvec{\beta }_0)\rho _{\tau }^2(y_i-\varvec{x}_i'\varvec{\beta })\right\} -\frac{1}{N^2}\sum _{i=1}^{N}\omega _i^2(\varvec{\beta }_0)\rho _{\tau }^2(y_i-\varvec{x}_i'\varvec{\beta })\nonumber \\&\quad =O_{P}\left( \frac{1}{n}\right) , \end{aligned}$$
(A.1)

where the last equality is derived by

$$\begin{aligned} \frac{1}{N}\sum _{i=1}^{N}\omega _i^2(\varvec{\beta }_0)\rho _{\tau }^2(y_i-\varvec{x}_i'\varvec{\beta })&\le \frac{1}{N}\sum _{i=1}^{N}\omega _i^2(\varvec{\beta }_0)(\tau \vee (1-\tau ))^2(y_i-\varvec{x}_i'\varvec{\beta })^4\\&\le \frac{2}{N}\sum _{i=1}^{N}(y_i-\varvec{x}_i'\varvec{\beta }_t)^4+\frac{2}{N}\sum _{i=1}^{N}[\varvec{x}_i'(\varvec{\beta }_t-\varvec{\beta })]^4\\&\le \frac{2}{N}\sum _{i=1}^{N}\varepsilon _i^4+\frac{2}{N}\sum _{i=1}^{N}\Vert \varvec{x}_i\Vert ^4\Vert \varvec{\beta }_t-\varvec{\beta }\Vert ^4\\&= \frac{2}{N}\sum _{i=1}^{N}\varepsilon _i^4+2({\mathbb {E}}\Vert \varvec{x}_i\Vert ^4+o_P(1))\Vert \varvec{\beta }_t-\varvec{\beta }\Vert ^4\\&=O_P(1), \end{aligned}$$

where the last equality is due to Conditions 1, 3 and 4. So \({\mathbb {E}}\left\{ Q_n(\varvec{\beta })-Q_N(\varvec{\beta })\mid {\mathcal {F}}_{N} \right\} ^{2}\rightarrow 0\) as \(N\rightarrow \infty \) and \(n\rightarrow \infty \). Combining (A.1) and Chebyshev's inequality, \(Q_n(\varvec{\beta }) - Q_N(\varvec{\beta }) \rightarrow 0\) in conditional probability given \({\mathcal {F}}_{N}\). Since \(Q_n(\varvec{\beta })\) is a convex function of \(\varvec{\beta }\), by the Convexity Lemma of Pollard (1991), \(\underset{\varvec{\beta }\in \Lambda _{\textrm{B}}}{\text {sup}}|Q_n(\varvec{\beta }) - Q_N(\varvec{\beta })|\rightarrow 0\) in conditional probability given \({\mathcal {F}}_{N}\).

\(Q_N(\varvec{\beta })\) has a unique minimizer \(\hat{\varvec{\beta }}_{f}\) by Lemma A of Newey and Powell (1987). Thus, based on Theorem 5.9 of van der Vaart (1998) and the remark following it, we have

$$\begin{aligned} \Vert \tilde{\varvec{\beta }}-\hat{\varvec{\beta }}_{f}\Vert = o_{P\mid {\mathcal {F}}_{N}}(1). \end{aligned}$$

Xiong and Li (2008) showed that if a sequence is bounded in conditional probability, then it is bounded in unconditional probability, so we have \(\Vert \tilde{\varvec{\beta }}-\hat{\varvec{\beta }}_{f}\Vert = o_{P}(1)\). Newey and Powell (1987) proved that \(\Vert \hat{\varvec{\beta }}_{f} - \varvec{\beta }_{t} \Vert = o_{P}(1)\). By the triangle inequality, then

$$\begin{aligned} \Vert \tilde{\varvec{\beta }}-\varvec{\beta }_{t}\Vert \le \Vert \tilde{\varvec{\beta }}-\hat{\varvec{\beta }}_{f}\Vert + \Vert \hat{\varvec{\beta }}_{f} - \varvec{\beta }_{t}\Vert = o_{P}(1). \end{aligned}$$

This completes the proof. \(\square \)
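
For concreteness, the estimator \(\tilde{\varvec{\beta }}\) analyzed above can be sketched as Poisson subsampling followed by minimization of the inverse-probability-weighted objective \(Q_{n}(\varvec{\beta })\). The sketch below takes the DRF weights \(\omega _{i}(\varvec{\beta }_{0})\) and the probabilities \(\pi _{i}\) as precomputed inputs; it is an illustration of the objective, not the paper's complete algorithm:

```python
import numpy as np
from scipy.optimize import minimize

def rho(u, tau):
    # asymmetric squared error loss |tau - 1(u <= 0)| u^2
    return np.abs(tau - (u <= 0)) * u**2

def poisson_subsample_als(X, y, pi, omega, tau, rng):
    N = len(y)
    R = rng.random(N) < pi                   # Poisson sampling indicators R_i
    Xs, ys, ps, ws = X[R], y[R], pi[R], omega[R]

    def Qn(beta):                            # Q_n(beta) over the subsample
        return np.sum(ws * rho(ys - Xs @ beta, tau) / ps) / N

    beta0 = np.linalg.lstsq(Xs, ys, rcond=None)[0]   # OLS warm start
    return minimize(Qn, beta0, method="BFGS").x
```

Since \(\rho _{\tau }\) is continuously differentiable and strongly convex (Lemma 1), a quasi-Newton minimizer such as BFGS converges reliably here.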

Lemma 4

Denote \(\varvec{L}^{*}(\varvec{\beta })=\frac{\partial Q_n(\varvec{\beta })}{\partial \varvec{\beta }}\). If Conditions 1, 3, 4 and 5 hold, then as \(n, N \rightarrow \infty \),

$$\begin{aligned} \varvec{L}^{*}(\varvec{\beta }_{t})=O_{P}\left( \frac{1}{\sqrt{n}}\right) . \end{aligned}$$

Proof

For any \(\varvec{\beta } \in \Lambda _{\textrm{B}}\), direct calculation yields

$$\begin{aligned} \varvec{L}^{*}(\varvec{\beta }) =&-\frac{2}{N}\sum _{i=1}^{N}\frac{\omega _i(\varvec{\beta }_0)\psi _{\tau }(y_i-\varvec{x}_i'\varvec{\beta })(y_i-\varvec{x}_i'\varvec{\beta })R_i\varvec{x}_i}{\pi _i}. \end{aligned}$$
(A.2)

Substituting \(\varvec{\beta }=\varvec{\beta }_{t}\) into (A.2), we have

$$\begin{aligned} \varvec{L}^{*}(\varvec{\beta }_t) =&-\frac{2}{N}\sum _{i=1}^{N}\frac{\omega _i(\varvec{\beta }_0)\psi _{\tau }(\varepsilon _i)R_i\varepsilon _i\varvec{x}_i}{\pi _i}. \end{aligned}$$

Let \(L_{j_1}^{*}(\varvec{\beta }_{t})\), \(L_{j_2}^{*}(\varvec{\beta }_{t})\) be the elements of \(\varvec{L}^{*}(\varvec{\beta }_{t})\) and \(x_{ij_1}\), \(x_{ij_2}\) be the elements of \(\varvec{x}_i\), \(j_1,j_2 = 0, 1, \ldots , p\), then the conditional expectation and conditional covariance are

$$\begin{aligned}&{\mathbb {E}}(\varvec{L}^{*}(\varvec{\beta }_t)\mid {\mathcal {F}}_{N})=-\frac{2}{N}\sum _{i=1}^{N}\omega _i(\varvec{\beta }_0)\psi _{\tau }(\varepsilon _i)\varepsilon _i\varvec{x}_i=O_{P}\left( \frac{1}{\sqrt{N}}\right) , \end{aligned}$$
(A.3)
$$\begin{aligned}&Cov(L_{j_1}^{*}(\varvec{\beta }_t),L_{j_2}^{*}(\varvec{\beta }_t)\mid {\mathcal {F}}_{N})\nonumber \\&\quad =\frac{4}{N^2}\sum _{i=1}^{N}\frac{\omega _i^2(\varvec{\beta }_0)\psi ^2_{\tau }(\varepsilon _i){\mathbb {V}}(R_i)\varepsilon ^2_ix_{ij_1}x_{ij_2}}{\pi _i^2}\nonumber \\&\quad =\frac{4}{N^2}\sum _{i=1}^{N}\frac{\omega _i^2(\varvec{\beta }_0)\psi ^2_{\tau }(\varepsilon _i)\varepsilon ^2_ix_{ij_1}x_{ij_2}}{\pi _i}-\frac{4}{N^2}\sum _{i=1}^{N} \omega _i^2(\varvec{\beta }_0)\psi ^2_{\tau }(\varepsilon _i)\varepsilon ^2_ix_{ij_1}x_{ij_2}\nonumber \\&\quad \le \underset{1\le i\le N}{\max }\left\{ \frac{1}{N\pi _i}\right\} \frac{4}{N}\sum _{i=1}^{N}\omega _i^2(\varvec{\beta }_0)\psi ^2_{\tau }(\varepsilon _i)\varepsilon ^2_ix_{ij_1}x_{ij_2}+O_{P}\left( \frac{1}{N}\right) \nonumber \\&\quad =O_{P}\left( \frac{1}{n}\right) , \end{aligned}$$
(A.4)

where the last equality is due to Conditions 1, 4 and 5. Equation (A.3) holds because

$$\begin{aligned} {\mathbb {E}}\{{\mathbb {E}}(\varvec{L}^{*}(\varvec{\beta }_t)\mid {\mathcal {F}}_{N})\}={\mathbb {E}}\left\{ -\frac{2}{N}\sum _{i=1}^{N}\omega _i(\varvec{\beta }_0)\psi _{\tau }(\varepsilon _i)\varepsilon _i\varvec{x}_i\right\} =\varvec{0}, \end{aligned}$$

and

$$\begin{aligned} Cov\left\{ {\mathbb {E}}(\varvec{L}^{*}(\varvec{\beta }_t)\mid {\mathcal {F}}_{N})\right\}&= \frac{4}{N^2}\sum _{i=1}^{N}{\mathbb {E}}\left( \omega _i^2(\varvec{\beta }_0)\psi _{\tau }^2(\varepsilon _i)\varepsilon _i^2\varvec{x}_i\varvec{x}_i'\right) \\&\quad -\frac{4}{N^2}\sum _{i=1}^{N}\left\{ {\mathbb {E}}\left( \omega _i(\varvec{\beta }_0)\psi _{\tau }(\varepsilon _i)\varepsilon _i\varvec{x}_i\right) {\mathbb {E}}\left( \omega _i(\varvec{\beta }_0)\psi _{\tau }(\varepsilon _i)\varepsilon _i\varvec{x}_i'\right) \right\} \\&=\frac{4}{N^2}\sum _{i=1}^{N}{\mathbb {E}}\left( \omega _i^2(\varvec{\beta }_0)\psi _{\tau }^2(\varepsilon _i)\varepsilon _i^2\varvec{x}_i\varvec{x}_i'\right) \\&=O\left( \frac{1}{N}\right) . \end{aligned}$$

By Chebyshev's inequality, (A.3) follows. Therefore, from (A.3), (A.4) and Chebyshev's inequality, the result is derived. \(\square \)

Lemma 5

Denote \(Z=\frac{1}{N}\sum _{i=1}^{N}[\rho _{\tau }(\varepsilon _{i}-v_{i})-\rho _{\tau }(\varepsilon _{i})]\). Under Conditions 1 and 3, \(Z\) admits the decomposition

$$\begin{aligned} Z \simeq -\frac{2}{N}\sum _{i=1}^{N}v_{i}\varepsilon _{i}\psi _{\tau }(\varepsilon _{i})+\frac{1}{N}\sum _{i=1}^{N}v_{i}^{2}\psi _{\tau }(\varepsilon _{i}), \end{aligned}$$

where \(v_{i}=\varvec{x}_{i}'\left( \varvec{\beta }-\varvec{\beta }_{t} \right) \), \(\varvec{\beta }\in \Lambda _{\textrm{B}}\).

Proof

From Lemma 1, we have

$$\begin{aligned} Z&=\frac{1}{N}\sum _{i=1}^{N}[\rho _{\tau }(\varepsilon _{i}-v_{i})-\rho _{\tau }(\varepsilon _{i})]\\&=\frac{1}{N}\sum _{i=1}^{N}[-2\varepsilon _{i}v_{i}\psi _{\tau }(\varepsilon _{i})+r(v_{i})]\\&=\frac{1}{N}\sum _{i=1}^{N}[-2\varepsilon _{i}v_{i}\psi _{\tau }(\varepsilon _{i})+r(v_{i})-v_{i}^{2}\psi _{\tau }(\varepsilon _{i})+v_{i}^{2}\psi _{\tau }(\varepsilon _{i})]\\&=\frac{1}{N}\sum _{i=1}^{N}[-2v_{i}\varepsilon _{i}\psi _{\tau }(\varepsilon _{i})+v_{i}^{2}\psi _{\tau }(\varepsilon _{i})]+O_P\left( \frac{1}{N}\sum _{i=1}^{N}v_{i}^{2} \right) . \end{aligned}$$

This completes the proof. \(\square \)
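
Numerically, the expansion in Lemma 5 is accurate once the \(v_{i}\) are small; a minimal sketch (the scale of \(v_{i}\) is illustrative):

```python
import numpy as np

def rho(u, tau):
    return np.abs(tau - (u <= 0)) * u**2

def psi(u, tau):
    return np.abs(tau - (u <= 0))

rng = np.random.default_rng(2)
tau, N = 0.3, 10**6
eps = rng.standard_normal(N)
v = 0.01 * rng.standard_normal(N)        # small v_i = x_i'(beta - beta_t)
Z = np.mean(rho(eps - v, tau) - rho(eps, tau))
approx = np.mean(-2.0 * v * eps * psi(eps, tau) + v**2 * psi(eps, tau))
print(Z, approx)   # agree up to the O_P(mean(v_i^2)) remainder
```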

Proof of Theorem 3

The first part of Theorem 3 has been shown in Lemma 3. Now we prove the second part. Let \(\varvec{\xi } = \varvec{\beta }-\varvec{\beta }_t\) and

$$\begin{aligned} \varvec{Z}(\varvec{\xi })=\sum _{i=1}^{N}\frac{\omega _i(\varvec{\beta }_0)\left\{ \rho _{\tau }(\varepsilon _{i}-\varvec{x}_i'\varvec{\xi })-\rho _{\tau }(\varepsilon _{i})\right\} R_i}{N\pi _i}. \end{aligned}$$

Note that \(\varvec{Z}(\varvec{\xi })\) is convex and minimized by \(\tilde{\varvec{\beta }}-\varvec{\beta }_t\). Thus, we focus on \(\varvec{Z}(\varvec{\xi })\) when assessing the properties of \(\tilde{\varvec{\beta }}-\varvec{\beta }_t\). Denote \(\varvec{Z}_{N2}=\sum _{i=1}^{N}\frac{\omega _i(\varvec{\beta }_0)\psi _{\tau }(\varepsilon _{i})R_i\varvec{x}_i\varvec{x}_i'}{N\pi _i}\), then

$$\begin{aligned} \varvec{Z}_{N2}&=\varvec{Z}_{N2}-{\mathbb {E}}(\varvec{Z}_{N2}\mid {\mathcal {F}}_{N})+{\mathbb {E}}(\varvec{Z}_{N2}\mid {\mathcal {F}}_{N}) \\&=o_{P\mid {\mathcal {F}}_{N}}(1)+\varvec{D}_N, \end{aligned}$$

where \(\varvec{Z}_{N2}-{\mathbb {E}}(\varvec{Z}_{N2}\mid {\mathcal {F}}_{N})=o_{P\mid {\mathcal {F}}_{N}}(1)\) can be derived by (A.5), (A.6) and Chebyshev’s inequality. Denote \(\varvec{Z}_{N3}=\varvec{Z}_{N2}-{\mathbb {E}}(\varvec{Z}_{N2}\mid {\mathcal {F}}_{N})\), then

$$\begin{aligned} {\mathbb {E}}(\varvec{Z}_{N3}\mid {\mathcal {F}}_{N})=\varvec{0}, \end{aligned}$$
(A.5)

and let \(Z_{N3j_1}\), \(Z_{N3j_2}\) be the elements of \(\varvec{Z}_{N3}\) and \(x_{ij_1}\), \(x_{ij_2}\) be the elements of \(\varvec{x}_i\), \(j_1,j_2 =0, 1, \ldots , p\),

$$\begin{aligned} Cov(Z_{N3j_1},Z_{N3j_2}\mid {\mathcal {F}}_{N})&\le \sum _{i=1}^{N}\frac{\omega _i^2(\varvec{\beta }_0)\psi _{\tau }^2(\varepsilon _{i})(x_{ij_1}x_{ij_2})^2\pi _i(1-\pi _i)}{N^2\pi _{i}^2}\nonumber \\&\le \sum _{i=1}^{N}\frac{\omega _i^2(\varvec{\beta }_0)\psi _{\tau }^2(\varepsilon _{i})x_{ij_1}x_{ij_2}}{N^2\pi _{i}}\nonumber \\&\le \underset{1\le i\le N}{\max }\left( \frac{1}{N\pi _i}\right) \left( \sum _{i=1}^{N}\frac{\omega _i^2(\varvec{\beta }_0)\psi _{\tau }^2(\varepsilon _{i})(x_{ij_1}x_{ij_2})^2}{N}\right) \nonumber \\&=O_P\left( \frac{1}{n}\right) . \end{aligned}$$
(A.6)

From Lemma 5, we have

$$\begin{aligned} \varvec{Z}(\varvec{\xi })&= -\varvec{\xi }'\sum _{i=1}^{N}\frac{2\omega _i(\varvec{\beta }_0)\varepsilon _{i}\psi _{\tau }(\varepsilon _{i})R_i\varvec{x}_i}{N\pi _i}+\varvec{\xi }'\sum _{i=1}^{N}\frac{ \omega _i(\varvec{\beta }_0)\psi _{\tau }(\varepsilon _{i})\varvec{x}_i\varvec{x}_i'}{N}\varvec{\xi }+o_P(1) \\&= \varvec{\xi }'\varvec{L}^{*}(\varvec{\beta }_t)+\varvec{\xi }'\varvec{D}_N\varvec{\xi }+o_P(1). \end{aligned}$$

Since \(\varvec{Z}(\varvec{\xi })\) is convex, and from Lemma 2,

$$\begin{aligned} \tilde{\varvec{\beta }}-\varvec{\beta }_t=-{\frac{1}{2}}\varvec{D}_N^{-1}\varvec{L}^{*}(\varvec{\beta }_t)+o_P(1). \end{aligned}$$
(A.7)

By Condition 2 and Lemma 4, we have

$$\begin{aligned} \tilde{\varvec{\beta }}-\varvec{\beta }_t=O_{P\mid {\mathcal {F}}_{N}}\left( \frac{1}{\sqrt{n}}\right) . \end{aligned}$$

This completes the proof of Theorem 3. \(\square \)

Proof of Theorem 4

By Lemma 4,

$$\begin{aligned} {\mathbb {E}}\{\varvec{L}^{*}(\varvec{\beta }_t)\mid {\mathcal {F}}_{N}\}=O_{P\mid {\mathcal {F}}_{N}}\left( \frac{1}{\sqrt{N}}\right) , \quad {\mathbb {V}}\{\varvec{L}^{*}(\varvec{\beta }_t)\mid {\mathcal {F}}_{N}\}=4\varvec{V}_{\pi }+o_{P\mid {\mathcal {F}}_{N}}(1). \end{aligned}$$
(A.8)

Now we check the Lindeberg-Feller condition. Note that

$$\begin{aligned} \varvec{L}^{*}(\varvec{\beta }_t)=-\frac{2}{N}\sum _{i=1}^{N}\frac{\omega _i(\varvec{\beta }_0)\psi _{\tau }(\varepsilon _i)R_i\varepsilon _i\varvec{x}_i}{\pi _i}:=-2\sum _{i=1}^{N}\varvec{\eta }_i. \end{aligned}$$

For every \(\epsilon >0\), we have

$$\begin{aligned} \sum _{i=1}^{N}{\mathbb {E}}\{\Vert \varvec{\eta }_i\Vert ^2\mathbb {1}(\Vert \varvec{\eta }_i\Vert >\epsilon )\mid {\mathcal {F}}_{N}\}&\le \frac{1}{\epsilon }\sum _{i=1}^{N}{\mathbb {E}}\{\Vert \varvec{\eta }_i\Vert ^3\mid {\mathcal {F}}_{N}\}\\&\le \frac{1}{\epsilon }\sum _{i=1}^{N}{\mathbb {E}}\left\{ \frac{\Vert \omega _i(\varvec{\beta }_0)R_i\psi _{\tau }(\varepsilon _i)\varepsilon _i\varvec{x}_i\Vert ^3}{N^3\pi _i^3}\mid {\mathcal {F}}_{N}\right\} \\&\le \frac{1}{\epsilon }\underset{1\le i\le N}{\max }\left\{ \frac{1}{(N\pi _i)^2}\right\} \frac{1}{N}\sum _{i=1}^{N}\Vert \omega _i(\varvec{\beta }_0)\psi _{\tau }(\varepsilon _i)\varepsilon _i\varvec{x}_i\Vert ^3\\&\le \frac{1}{\epsilon }O_P\left( \frac{1}{n^2}\right) =o_P(1), \end{aligned}$$

where the last inequality holds by Conditions 1, 4 and 5.

Given \({\mathcal {F}}_{N}\), using (A.8) and Lindeberg-Feller central limit theorem,

$$\begin{aligned} (4\varvec{V}_{\pi })^{-1/2}\{\varvec{L}^{*}(\varvec{\beta }_t)-{\mathbb {E}}(\varvec{L}^{*}(\varvec{\beta }_t)\mid {\mathcal {F}}_{N})\}\overset{d}{\rightarrow } {\mathbb {N}}(\varvec{0},\varvec{I}) \end{aligned}$$
(A.9)

with \(\varvec{V}_{\pi }^{-1/2}{\mathbb {E}}\left( \varvec{L}^{*}(\varvec{\beta }_t)\mid {\mathcal {F}}_{N} \right) = O_{P}\left( \sqrt{{n}/{N}}\right) = o_{P}(1)\), since \(\varvec{V}_{\pi }^{-1/2}=O_{P}(\sqrt{n})\) and \({\mathbb {E}}(\varvec{L}^{*}(\varvec{\beta }_t)\mid {\mathcal {F}}_{N})=O_{P}(1/\sqrt{N})\) by (A.8).

By Theorem 3, (A.7), (A.9) and Slutsky's theorem, we conclude that as \(n\rightarrow \infty \), \(N\rightarrow \infty \), conditional on \({\mathcal {F}}_{N}\), with probability approaching one,

$$\begin{aligned} \{\varvec{D}^{-1}_N\varvec{V}_{\pi }\varvec{D}^{-1}_N\}^{-1/2}(\tilde{\varvec{\beta }}-\varvec{\beta }_t)\overset{d}{\rightarrow } {\mathbb {N}}(\varvec{0},\varvec{I}), \end{aligned}$$

where \(\varvec{D}_N\) and \(\varvec{V}_{\pi }\) are defined in (12) and (13), respectively. \(\square \)
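
In practice the limiting covariance \(\varvec{D}^{-1}_N\varvec{V}_{\pi }\varvec{D}^{-1}_N\) must be estimated. One natural plug-in, sketched below, replaces the full-data sums defining \(\varvec{D}_N\) and \(\varvec{V}_{\pi }\) by inverse-probability-weighted subsample sums; this construction is our illustration and is not spelled out in the appendix:

```python
import numpy as np

def psi(u, tau):
    return np.abs(tau - (u <= 0))

def sandwich_cov(Xs, ys, ps, ws, beta, tau, N):
    """Plug-in D_N^{-1} V_pi D_N^{-1} from a subsample (Xs, ys) drawn with
    probabilities ps, with DRF weights ws; N is the full-data size."""
    e = ys - Xs @ beta
    wp = ws * psi(e, tau)
    # D_N = (1/N) sum_i omega_i psi_tau(eps_i) x_i x_i',
    # estimated by weighting each sampled term with 1/pi_i
    D = (Xs * (wp / ps)[:, None]).T @ Xs / N
    # V_pi = (1/N^2) sum_i omega_i^2 psi_tau^2(eps_i) eps_i^2 x_i x_i' / pi_i,
    # estimated with one extra factor 1/pi_i for the sampling
    g = (wp * e)[:, None] * Xs
    V = (g / ps[:, None] ** 2).T @ g / N**2
    Dinv = np.linalg.inv(D)
    return Dinv @ V @ Dinv
```

Normal-approximation confidence intervals for \(\varvec{\beta }_t\) then follow from the display above with this covariance estimate in place of \(\varvec{D}^{-1}_N\varvec{V}_{\pi }\varvec{D}^{-1}_N\).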

Proof of Theorem 5

Define \(h_i^{\textrm{Aopt}}=\Vert \omega _i(\varvec{\beta }_0)\varepsilon _i\psi _{\tau }(\varepsilon _i)\varvec{D}_{N}^{-1}\varvec{x}_{i}\Vert \), \(i=1,\ldots ,N\). Without loss of generality, assume that \(h_i^{\textrm{Aopt}}>0\) for all \(i\), and set \(h_{N+1}^{\textrm{Aopt}}=+\infty \). Minimizing \(\text {tr}(\varvec{D}^{-1}_N\varvec{V}_{\pi }\varvec{D}^{-1}_N)\) amounts to solving the following optimization problem:

$$\begin{aligned}&\min \ {\tilde{H}}:=\text {tr}(\varvec{D}^{-1}_N\varvec{V}_{\pi }\varvec{D}^{-1}_N) \\&s.t. \ \sum _{i=1}^{N}\pi _i=n, 0\le \pi _i\le 1 \ \text {for } i=1,\ldots ,N. \end{aligned}$$

Without loss of generality, we assume that \(h_1^{\textrm{Aopt}}\le h_2^{\textrm{Aopt}}\le \ldots \le h_N^{\textrm{Aopt}}\),

$$\begin{aligned} \text {tr}(\varvec{D}^{-1}_N\varvec{V}_{\pi }\varvec{D}^{-1}_N)&=\frac{1}{N^2}\sum _{i=1}^{N}\left\{ \frac{1}{\pi _i}\Vert \omega _i(\varvec{\beta }_0)\varepsilon _i\psi _{\tau }(\varepsilon _i)\varvec{D}_{N}^{-1}\varvec{x}_{i}\Vert ^2 \right\} \\&=\frac{1}{N^2}\sum _{i=1}^{N}\frac{1}{\pi _i}(h_i^{\textrm{Aopt}})^2\\&=\frac{1}{N^2}\frac{1}{n}\left( \sum _{i=1}^{N}\pi _i\right) \left( \sum _{i=1}^{N}\frac{1}{\pi _i}(h_i^{\textrm{Aopt}})^2\right) \\&\ge \frac{1}{N^2}\frac{1}{n}\left( \sum _{i=1}^{N}h_i^{\textrm{Aopt}}\right) ^2, \end{aligned}$$

where the last step is from the Cauchy-Schwarz inequality and the equality holds if and only if \(\pi _{i}\propto h_i^{\textrm{Aopt}}\). Now we consider two cases:

Case 1. If all \(\frac{nh_i^{\textrm{Aopt}}}{\sum _{j=1}^{N}h_j^{\textrm{Aopt}}}\le 1\), then \(\pi _i^{\textrm{Aopt}}=\frac{nh_i^{\textrm{Aopt}}}{\sum _{j=1}^{N}h_j^{\textrm{Aopt}}}\), where \(i=1,\ldots ,N\).

Case 2. Assume there exists some \(i\) such that \(\pi _i^{\textrm{Aopt}}=\frac{nh_i^{\textrm{Aopt}}}{\sum _{j=1}^{N}h_j^{\textrm{Aopt}}}>1\); by the definition of \(k\), the number of such \(i\) is \(k\). The original problem then turns into the following optimization problem:

$$\begin{aligned}&\min \ \frac{1}{N^2}\sum _{i=1}^{N-k}\left\{ \frac{1}{\pi _i}\Vert \omega _i(\varvec{\beta }_0)\varepsilon _i\psi _{\tau }(\varepsilon _i)\varvec{D}_{N}^{-1}\varvec{x}_{i}\Vert ^2 \right\} \\&s.t. \ \sum _{i=1}^{N-k}\pi _i=n-k, 0\le \pi _i\le 1 \ \text {for } i=1,\ldots ,N-k,\ \pi _{N-k+1}=\ldots =\pi _{N}=1. \end{aligned}$$

Similarly to the derivation of \(\pi _i^{\textrm{Aopt}}\) in Case 1, by the Cauchy–Schwarz inequality,

$$\begin{aligned}&\frac{1}{N^2}\sum _{i=1}^{N-k}\left\{ \frac{1}{\pi _i}\Vert \omega _i(\varvec{\beta }_0)\varepsilon _i\psi _{\tau }(\varepsilon _i)\varvec{D}_{N}^{-1}\varvec{x}_{i}\Vert ^2 \right\} \\&\quad =\frac{1}{N^2}\frac{1}{(n-k)}\left( \sum _{i=1}^{N-k}\pi _i\right) \left( \sum _{i=1}^{N-k}\frac{1}{\pi _i}(h_i^{\textrm{Aopt}})^2\right) \\&\quad \ge \frac{1}{N^2}\frac{1}{(n-k)}\left( \sum _{i=1}^{N-k}h_i^{\textrm{Aopt}}\right) ^2, \end{aligned}$$

and the equality holds if and only if \(\pi _{i}\propto h_i^{\textrm{Aopt}}\), i.e. \(\pi _{i}^{\textrm{Aopt}}=\frac{(n-k)h_i^{\textrm{Aopt}}}{\sum _{j=1}^{N-k}h_j^{\textrm{Aopt}}}\), \(i=1,\ldots ,N-k\). Assume there exists \(\tilde{\textrm{M}}\) such that

$$\begin{aligned} \underset{1\le i\le N}{\max }\pi _{i}^{\textrm{Aopt}}=\underset{1\le i\le N}{\max }\frac{n(h_i^{\textrm{Aopt}}\wedge \tilde{\textrm{M}})}{\sum _{j=1}^{N}(h_j^{\textrm{Aopt}}\wedge \tilde{\textrm{M}})}=1, \end{aligned}$$

and \(h_{N-k}^{\textrm{Aopt}}\le \tilde{\textrm{M}}\le h_{N-k+1}^{\textrm{Aopt}}\), so \(\sum _{i=1}^{N-k}h_i^{\textrm{Aopt}}=(n-k)\tilde{\textrm{M}}\) holds. Thus, the set \(\{1, \ldots , N\}\) can be divided into two parts, i.e. \(\{1, \ldots , N-k\}\) and \(\{N-k+1, \ldots , N\}\), which correspond to \(\pi _{i}^{\textrm{Aopt}}=\frac{h_i^{\textrm{Aopt}}}{\tilde{\textrm{M}}}\) and \(\pi _{i}^{\textrm{Aopt}}=1\). So we have

$$\begin{aligned} \text {tr}(\varvec{D}^{-1}_N\varvec{V}_{\pi }\varvec{D}^{-1}_N)&=\frac{1}{N^2}\sum _{i=1}^{N}\frac{1}{\pi _i}(h_i^{\textrm{Aopt}})^2\nonumber \\&=\frac{1}{N^2}\left\{ (n-k)\tilde{\textrm{M}}^2+\sum _{i=N-k+1}^{N}(h_i^{\textrm{Aopt}})^2\right\} . \end{aligned}$$
(A.10)

Display (A.10) shows that the lower bound of \(\text {tr}(\varvec{D}^{-1}_N\varvec{V}_{\pi }\varvec{D}^{-1}_N)\) is attained when equality holds in the Cauchy–Schwarz inequality.

Substituting \(\pi _{i}^{\textrm{Aopt}}=\frac{n(h_i^{\textrm{Aopt}}\wedge \tilde{\textrm{M}})}{\sum _{j=1}^{N}(h_j^{\textrm{Aopt}}\wedge \tilde{\textrm{M}})}\) into \({\tilde{H}}\) gives

$$\begin{aligned} {\tilde{H}}&:=\frac{1}{N^2}\sum _{i=1}^{N}\left\{ \frac{1}{\pi _{i}^{\textrm{Aopt}}}\Vert \omega _i(\varvec{\beta }_0)\varepsilon _i\psi _{\tau }(\varepsilon _i)\varvec{D}_{N}^{-1}\varvec{x}_{i}\Vert ^2 \right\} \nonumber \\&=\frac{1}{N^2}\sum _{i=1}^{N}\frac{1}{\pi _{i}^{\textrm{Aopt}}}(h_i^{\textrm{Aopt}})^2\nonumber \\&=\frac{1}{N^2}\left\{ (n-k)\tilde{\textrm{M}}^2+\sum _{i=N-k+1}^{N}(h_i^{\textrm{Aopt}})^2\right\} , \end{aligned}$$
(A.11)

which coincides with the lower bound of \(\text {tr}(\varvec{D}^{-1}_N\varvec{V}_{\pi }\varvec{D}^{-1}_N)\) in (A.10). Therefore, by (A.10) and (A.11), \(\pi _{i}^{\textrm{Aopt}}=\frac{n(h_i^{\textrm{Aopt}}\wedge \tilde{\textrm{M}})}{\sum _{j=1}^{N}(h_j^{\textrm{Aopt}}\wedge \tilde{\textrm{M}})}\) is the optimal solution.

Now we prove the existence and rationality of \(\tilde{\textrm{M}}\) for \(\tilde{\textrm{M}} \in (h_{N-k}^{\textrm{Aopt}}, h_{N-k+1}^{\textrm{Aopt}}]\). The definition of \(k\) implies that

$$\begin{aligned} \frac{(n-k+1)h_{N-k+1}^{\textrm{Aopt}}}{\sum _{i=1}^{N-k+1} h_{i}^{\textrm{Aopt}}} \ge 1 \quad \text{ and } \quad \frac{(n-k) h_{N-k}^{\textrm{Aopt}}}{\sum _{i=1}^{N-k} h_{i}^{\textrm{Aopt}}}<1. \end{aligned}$$

Taking \(\tilde{\textrm{M}}_{1}=h_{N-k+1}^{\textrm{Aopt}}\) and \(\tilde{\textrm{M}}_{2}=h_{N-k}^{\textrm{Aopt}}\), we have

$$\begin{aligned} \frac{(n-k+1) h_{N-k+1}^{\textrm{Aopt}}+(k-1) \tilde{\textrm{M}}_{1}}{\sum _{i=1}^{N-k+1} h_{i}^{\textrm{Aopt}}+(k-1) \tilde{\textrm{M}}_{1}} \ge 1 \quad \text{ and } \quad \frac{(n-k) h_{N-k}^{\textrm{Aopt}}+k \tilde{\textrm{M}}_{2}}{\sum _{i=1}^{N-k} h_{i}^{\textrm{Aopt}}+k \tilde{\textrm{M}}_{2}}<1, \end{aligned}$$

which implies the fact that

$$\begin{aligned} n\frac{h_{i}^{\textrm{Aopt}} \wedge \tilde{\textrm{M}}_{1}}{\sum _{j=1}^{N}(h_{j}^{\textrm{Aopt}} \wedge \tilde{\textrm{M}}_{1})} \ge 1 \text{ and } n\frac{h_{i}^{\textrm{Aopt}} \wedge \tilde{\textrm{M}}_{2}}{\sum _{j=1}^{N}(h_{j}^{\textrm{Aopt}} \wedge \tilde{\textrm{M}}_{2})}<1. \end{aligned}$$

Since \(\underset{1\le i\le N}{\max }\frac{h_{i}^{\textrm{Aopt}}\wedge \tilde{\textrm{M}}}{\sum _{j=1}^{N}(h_{j}^{\textrm{Aopt}} \wedge \tilde{\textrm{M}})}\) is continuous in \(\tilde{\textrm{M}}\), the existence of \(\tilde{\textrm{M}}\) follows.

For the rationality of \(\tilde{\textrm{M}}\), it suffices to show that \(\underset{1\le i\le N}{\max }\frac{h_{i}^{\textrm{Aopt}}\wedge \tilde{\textrm{M}}}{\sum _{j=1}^{N}(h_{j}^{\textrm{Aopt}} \wedge \tilde{\textrm{M}})}=\frac{h_{N}^{\textrm{Aopt}}\wedge \tilde{\textrm{M}}}{\sum _{j=1}^{N}(h_{j}^{\textrm{Aopt}} \wedge \tilde{\textrm{M}})}\) is nondecreasing in \(\tilde{\textrm{M}}\) on \((h_{1}^{\textrm{Aopt}},h_{N}^{\textrm{Aopt}})\), so that it equals \(\frac{1}{n}\) at the threshold. For any \(h_N^{\textrm{Aopt}}\ge \tilde{\textrm{M}}'\ge \tilde{\textrm{M}}\), we have \(\tilde{\textrm{M}}'\wedge h_N^{\textrm{Aopt}}\ge \tilde{\textrm{M}}\wedge h_N^{\textrm{Aopt}}\) and \(\left( \tilde{\textrm{M}}' / \tilde{\textrm{M}}\right) \sum _{i=1}^{N}(h_{i}^{\textrm{Aopt}} \wedge \tilde{\textrm{M}}) \ge \sum _{i=1}^{N}(h_{i}^{\textrm{Aopt}} \wedge \tilde{\textrm{M}}')\). So the rationality is proved. \(\square \)
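
In code, \(\pi _{i}^{\textrm{Aopt}}\) can be computed without locating \(\tilde{\textrm{M}}\) directly: increase \(k\) until the feasibility condition defining \(k\) holds. A minimal sketch (assuming \(n<N\) and strictly positive \(h_{i}^{\textrm{Aopt}}\)):

```python
import numpy as np

def a_optimal_probs(h, n):
    """Capped A-optimal probabilities from Theorem 5:
    pi_i = n (h_i ^ M) / sum_j (h_j ^ M), with max_i pi_i = 1."""
    N = len(h)
    order = np.argsort(h)                    # sort h ascending
    hs = h[order]
    csum = np.cumsum(hs)
    k = 0                                    # number of pi_i truncated at 1
    # smallest k with (n - k) h_{N-k} <= sum_{j <= N-k} h_j
    while k < n and (n - k) * hs[N - k - 1] > csum[N - k - 1]:
        k += 1
    pi = np.empty(N)
    pi[order[:N - k]] = (n - k) * hs[:N - k] / csum[N - k - 1]
    if k > 0:
        pi[order[N - k:]] = 1.0              # the k largest h_i get pi_i = 1
    return pi
```

Poisson subsampling then selects unit \(i\) independently with probability \(\pi _{i}^{\textrm{Aopt}}\), as in the sketch after Lemma 3, so the expected subsample size is \(\sum _{i}\pi _{i}^{\textrm{Aopt}}=n\).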

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ren, M., Zhao, S., Wang, M. et al. Robust optimal subsampling based on weighted asymmetric least squares. Stat Papers 65, 2221–2251 (2024). https://doi.org/10.1007/s00362-023-01480-7
