Adaptive quantile regressions for massive datasets

  • Regular Article
  • Statistical Papers

Abstract

Analyzing massive datasets is challenging because of the limits of a computer's primary memory. Adaptive quantile regression is a robust and efficient estimation method. For computational efficiency, we propose adaptive smoothing quantile regression (ASQR) for the analysis of massive datasets. The proposed approach significantly reduces the amount of primary memory required, and the resulting estimator is as efficient as if the entire dataset were analyzed at once. Simulations and a data analysis illustrate the finite-sample performance of the proposed method.

Acknowledgements

The two anonymous referees provided numerous valuable comments that improved the manuscript. This research was supported by the Shanghai Sailing Program (No. 17YF1400800) and the National Natural Science Foundation of China (No. 11801069 and No. 11871143).

Author information

Corresponding author

Correspondence to Rong Jiang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Proof of Theorem 2.1

By Theorem 4.3 in Chen et al. (2019) and the independence of the error term \(\varepsilon \) and \(\mathbf{X}\), we have

$$\begin{aligned} {\hat{\beta }}_{\tau _{q},h}-{\beta }_0=\frac{1}{N}f^{-1}(b_{\tau _q})\mathbf{C}^{-1}\sum _{i=1}^N\mathbf{x}_i \left( I\{\varepsilon _i\ge 0\}+ \tau _{q}-1 \right) +r_N, \end{aligned}$$

where \(\Vert r_N\Vert _2=O_p\left( (p/N)^{3/4}(\log N)^{1/2}\right) \). Thus, for a given quantile \(\tau _{q}\),

$$\begin{aligned} E[{\hat{\beta }}_{\tau _{q},h}]&=\beta _0+r_N,\\ Var[{\hat{\beta }}_{\tau _{q},h}]&=E[({\hat{\beta }}_{\tau _{q},h}-\beta _0)^2|\tau _{q}] =N^{-1}{} \mathbf{C}^{-1}\tau _{q}(1-\tau _{q})/f^2(b_{\tau _{q}})+r_N^2. \end{aligned}$$

See, for example, Koenker (2005) for details on the two results above. It is also straightforward to verify that, for any two quantile levels \(\tau _{q}\) and \(\tau _{q'}\),

$$\begin{aligned} Cov({\hat{\beta }}_{\tau _{q},h},{\hat{\beta }}_{\tau _{q'},h}) =N^{-1}{} \mathbf{C}^{-1}(\min (\tau _{q},\tau _{q'})-\tau _{q}\tau _{q'})/\{f(b_{\tau _{q}}) f(b_{\tau _{q'}})\}+r_N^2. \end{aligned}$$
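
The covariance formula follows from a short calculation. As a sketch, assume that \(b_{\tau _q}\) denotes the \(\tau _q\)-th quantile of \(\varepsilon \), that \(\mathbf{C}=E[\mathbf{x}\mathbf{x}^{\top }]\), and that the observations are i.i.d.; write \(\xi _{i,q}=\tau _q-I\{\varepsilon _i<b_{\tau _q}\}\) for the score term at level \(\tau _q\). These definitions and the symbol \(\xi _{i,q}\) are assumptions introduced here for illustration, since this appendix does not restate them. Then \(E[\xi _{i,q}]=0\) and

$$\begin{aligned} E[\xi _{i,q}\xi _{i,q'}]&=\tau _q\tau _{q'}-\tau _qP(\varepsilon _i<b_{\tau _{q'}})-\tau _{q'}P(\varepsilon _i<b_{\tau _q})+P\left( \varepsilon _i<\min (b_{\tau _q},b_{\tau _{q'}})\right) \\&=\min (\tau _q,\tau _{q'})-\tau _q\tau _{q'}. \end{aligned}$$

Since \(\varepsilon _i\) is independent of \(\mathbf{x}_i\), the cross terms between different observations vanish, and the leading term of the covariance is

$$\begin{aligned} \frac{1}{N^2f(b_{\tau _q})f(b_{\tau _{q'}})}\mathbf{C}^{-1}\left( \sum _{i=1}^NE[\mathbf{x}_i\mathbf{x}_i^{\top }]E[\xi _{i,q}\xi _{i,q'}]\right) \mathbf{C}^{-1} =N^{-1}\mathbf{C}^{-1}\frac{\min (\tau _q,\tau _{q'})-\tau _q\tau _{q'}}{f(b_{\tau _q})f(b_{\tau _{q'}})}. \end{aligned}$$

Setting \(q=q'\) gives \(\tau _q(1-\tau _q)/f^2(b_{\tau _q})\), the scalar factor in the single-quantile variance.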

In addition, the distributions of \({\hat{\beta }}_{\tau _{q},h}\), \(q=1,\ldots ,Q\), and the joint distributions of \({\hat{\beta }}_{\tau _{q},h}\) and \({\hat{\beta }}_{\tau _{q'},h}\), \(q,q'=1,\ldots ,Q\), are all asymptotically normal. Thus, \({\hat{\beta }}=\sum _{q=1}^Qw_q{\hat{\beta }}_{\tau _{q},h}\) is also asymptotically normal. The mean and variance of \({\hat{\beta }}\) are as follows,

$$\begin{aligned} E[{\hat{\beta }}]&=\sum _{q=1}^Qw_qE[{\hat{\beta }}_{\tau _{q},h}]=\beta _0+r_N,\\ Var[{\hat{\beta }}]&=\sum _{q=1}^Q\sum _{q'=1}^Qw_qw_{q'}Cov({\hat{\beta }}_{\tau _{q},h}, {\hat{\beta }}_{\tau _{q'},h})\\&=N^{-1}{} \mathbf{C}^{-1}\sum _{q=1}^Q\sum _{q'=1}^Qw_qw_{q'} \frac{\min (\tau _{q},\tau _{q'})-\tau _{q}\tau _{q'}}{f(b_{\tau _{q}})f(b_{\tau _{q'}})} +r_N^2. \end{aligned}$$
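
To make the double sum in \(Var[{\hat{\beta }}]\) concrete, the following minimal Python sketch evaluates the scalar factor that multiplies \(N^{-1}\mathbf{C}^{-1}\). It is an illustration only: the function name `variance_factor`, the standard normal error distribution, the choice of quantile levels, and the equal weights summing to one are all assumptions made here, not taken from the paper.

```python
from statistics import NormalDist


def variance_factor(taus, weights, dist=NormalDist()):
    """Scalar multiplying N^{-1} C^{-1} in Var[beta_hat].

    taus    : quantile levels tau_1, ..., tau_Q (each in (0, 1))
    weights : combination weights w_1, ..., w_Q
    dist    : assumed error distribution (standard normal by default)
    """
    # b_{tau_q}: tau_q-th quantile of the assumed error distribution
    b = [dist.inv_cdf(t) for t in taus]
    # f(b_{tau_q}): error density evaluated at that quantile
    f = [dist.pdf(x) for x in b]

    total = 0.0
    for w_q, t_q, f_q in zip(weights, taus, f):
        for w_r, t_r, f_r in zip(weights, taus, f):
            total += w_q * w_r * (min(t_q, t_r) - t_q * t_r) / (f_q * f_r)
    return total


# Median regression alone (Q = 1, tau = 0.5, w = 1).
single = variance_factor([0.5], [1.0])

# Equal-weight combination of three quantile levels (hypothetical choice).
taus = [0.25, 0.5, 0.75]
equal = variance_factor(taus, [1 / 3] * 3)
```

For standard normal errors the single-median factor equals \(\pi /2\), and the equal-weight combination of the three levels gives a smaller factor, which illustrates why combining quantile levels can improve efficiency. Data-driven weights would replace the equal weights used in this sketch.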

Then the density function of \({\hat{\beta }}\) satisfies

$$\begin{aligned} f({\hat{\beta }})\rightarrow \left\{ \det \left( 2\pi N^{-1}\varSigma (\mathbf{w})\right) \right\} ^{-1/2} \exp \left\{ -\frac{N}{2}(\hat{{\beta }}-{\beta }_0)^{\top }\varSigma ^{-1}(\mathbf{w})(\hat{{\beta }}-{\beta }_0) \right\} . \end{aligned}$$

Thus, under the conditions of Theorem 2.1,

$$\begin{aligned} \sqrt{N}(\hat{{\beta }}-{\beta }_0)\xrightarrow {L} N\left( 0,\varSigma (\mathbf{w})\right) . \end{aligned}$$

This completes the proof of Theorem 2.1. \(\square \)

Cite this article

Jiang, R., Chen, Ww. & Liu, X. Adaptive quantile regressions for massive datasets. Stat Papers 62, 1981–1995 (2021). https://doi.org/10.1007/s00362-020-01170-8

  • DOI: https://doi.org/10.1007/s00362-020-01170-8
