Abstract
Analysis of massive datasets is challenging owing to limitations of computer primary memory. Adaptive quantile regression is a robust and efficient estimation method. For computational efficiency, we propose an adaptive smoothing quantile regression (ASQR) method and use it to analyze massive datasets. The proposed approach substantially reduces the amount of primary memory required, and the resulting estimator is as efficient as the one obtained by analyzing the entire dataset at once. Both simulations and a real-data analysis are conducted to illustrate the finite-sample performance of the proposed methods.
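The abstract describes the two ingredients of ASQR: a smoothed quantile-regression fit at each level \(\tau_q\) (with bandwidth \(h\)), and a weighted combination \({\hat{\beta }}=\sum _{q=1}^Qw_q{\hat{\beta }}_{\tau _{q},h}\) across levels. The paper gives no code; the following is a minimal illustrative sketch, not the authors' implementation. The Gaussian choice of smoothing kernel, the gradient-descent solver, and the names `fit_sqr`/`fit_asqr` are all assumptions made for illustration.

```python
import numpy as np
from math import erf

# Standard normal CDF, vectorized (avoids a SciPy dependency).
_norm_cdf = np.vectorize(lambda t: 0.5 * (1.0 + erf(t / np.sqrt(2.0))))

def fit_sqr(X, y, tau, h=0.5, lr=0.05, n_iter=2000):
    """Smoothed quantile regression at level tau.

    The check-loss subgradient tau - I(u < 0) is smoothed by replacing
    the indicator with a Gaussian CDF at bandwidth h, and the resulting
    differentiable loss is minimized by plain gradient descent.
    """
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        u = y - X @ beta
        psi = tau - _norm_cdf(-u / h)   # smoothed check-loss derivative
        beta += lr * X.T @ psi / n      # descent step on the smoothed loss
    return beta

def fit_asqr(X, y, taus, weights, h=0.5):
    """Weighted combination of the per-level smoothed estimates."""
    betas = np.column_stack([fit_sqr(X, y, t, h) for t in taus])
    return betas @ np.asarray(weights)
```

For symmetric errors, equally weighted levels placed symmetrically around 0.5 (e.g. 0.3, 0.5, 0.7) cancel the quantile offsets in the intercept, so the combined estimate targets the regression coefficients themselves.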
References
Bloznelis D, Claeskens G, Zhou J (2019) Composite versus model-averaged quantile regression. J Stat Plan Inference 200:32–46
Chen X, Xie M (2014) A split-and-conquer approach for analysis of extraordinarily large data. Stat Sin 24:1655–1684
Chen X, Liu W, Zhang Y (2019) Quantile regression under memory constraint. arXiv:1810.08264
Draper NR, Smith H (1998) Applied regression analysis, 3rd edn. Wiley, New York
Fan TH, Lin D, Cheng KF (2007) Regression analysis for massive datasets. Data Knowl Eng 61:554–562
Horowitz J (1998) Bootstrap methods for median regression models. Econometrica 66:1327–1351
Jiang R, Hu X, Yu K, Qian W (2018) Composite quantile regression for massive datasets. Statistics 52:980–1004
Jiang X, Li J, Xia T, Yan W (2016) Robust and efficient estimation with weighted composite quantile regression. Physica A 457:413–423
Koenker R (1984) A note on L-estimates for linear models. Stat Prob Lett 2:323–325
Koenker R (2005) Quantile regression. Cambridge University Press, New York
Koenker R, Bassett G (1978) Regression quantiles. Econometrica 46:33–50
Li R, Lin D, Li B (2013) Statistical inference in massive data sets. Appl Stoch Model Bus Ind 29:399–409
Lin N, Xi R (2011) Aggregated estimating equation estimation. Stat Interface 4:73–83
Pang L, Lu W, Wang H (2012) Variance estimation in censored quantile regression via induced smoothing. Comput Stat Data Anal 56:785–796
Schifano ED, Wu J, Wang C, Yan J, Chen MH (2016) Online updating of statistical inference in the big data setting. Technometrics 58:393–403
Tian Y, Zhu Q, Tian M (2016) Estimation of linear composite quantile regression using EM algorithm. Stat Prob Lett 117:183–191
Xu Q, Cai C, Jiang C, Sun F, Huang X (2017) Block average quantile regression for massive dataset. Statistical Papers. https://doi.org/10.1007/s00362-017-0932-6
Yang K, Zhu L, Xu W (2018) Adaptive composite quantile regressions and their asymptotic relative efficiency. J Stat Comput Simul 88:900–919
Zhao K, Lian H (2016) A note on the efficiency of composite quantile regression. J Stat Comput Simul 86:1334–1341
Acknowledgements
The two anonymous referees provided numerous valuable comments that improved the manuscript. This research was supported by the Shanghai Sailing Program (No. 17YF1400800) and the National Natural Science Foundation of China (No. 11801069 and No. 11871143).
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Proof of Theorem 2.1
By Theorem 4.3 in Chen et al. (2019) and the independence of the error term \(\varepsilon \) from \(\mathbf{X}\), we have
where \(\Vert r_N\Vert _2=O_p\left( (p/N)^{3/4}(\log N)^{1/2}\right) \). Thus, for a given quantile level \(\tau _{q}\),
See, for example, Koenker (2005) for details about the above two results. It is also straightforward to verify that, for two distinct quantile levels \(\tau _{q}\) and \(\tau _{q'}\),
In addition, the marginal distributions of \({\hat{\beta }}_{\tau _{q},h}\), \(q=1,\ldots ,Q\), and the joint distributions of \(({\hat{\beta }}_{\tau _{q},h}, {\hat{\beta }}_{\tau _{q'},h})\), \(q,q'=1,\ldots ,Q\), are all asymptotically normal. Thus, \({\hat{\beta }}=\sum _{q=1}^Qw_q{\hat{\beta }}_{\tau _{q},h}\) also asymptotically follows a normal distribution. The mean and variance of \({\hat{\beta }}\) are established as follows,
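The original displays are not reproduced in this extract. The linear-combination step they carry can be sketched generically: if the per-level estimators are jointly asymptotically normal with asymptotic cross-level covariances \(\Sigma _{qq'}\) (this notation is an assumption here, not taken from the original displays), then

\[
\sqrt{N}\Bigl({\hat{\beta }}-\sum _{q=1}^{Q}w_q\beta _{\tau _q}\Bigr)
\xrightarrow{d} N\Bigl(0,\;\sum _{q=1}^{Q}\sum _{q'=1}^{Q}w_qw_{q'}\Sigma _{qq'}\Bigr),
\]

which is the standard rule for the asymptotic mean and variance of a fixed linear combination of jointly asymptotically normal estimators.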
Then, the density function of \({\hat{\beta }}\) is as follows,
Thus, under conditions in Theorem 2.1,
This completes the proof of Theorem 2.1. \(\square \)
Cite this article
Jiang, R., Chen, Ww. & Liu, X. Adaptive quantile regressions for massive datasets. Stat Papers 62, 1981–1995 (2021). https://doi.org/10.1007/s00362-020-01170-8