
Robust reduced rank regression in a distributed setting


Abstract

This paper studies the reduced rank regression problem, which assumes a low-rank structure on the coefficient matrix, in the presence of heavy-tailed noise. To handle the heavy tails, we adopt the quantile loss function in place of the commonly used squared loss. However, the non-smooth quantile loss poses new challenges for both computation and the development of statistical properties, especially when the data are large in size and distributed across different machines. To this end, we first transform the response variable and reformulate the problem as a trace-norm regularized least-squares problem, which greatly facilitates computation. Based on this formulation, we further develop a distributed algorithm. Theoretically, we establish the convergence rate of the resulting estimator and a theoretical guarantee for rank recovery. A simulation study demonstrates the effectiveness of our method.
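
To make the reformulation concrete, the sketch below illustrates the two standard ingredients named in the abstract: the quantile (check) loss rho_tau(u) = u * (tau - 1{u < 0}), which replaces the squared loss, and a trace-norm (nuclear-norm) regularized least-squares problem of the generic form min_B (1/(2n)) ||Y~ - XB||_F^2 + lambda ||B||_*, where Y~ denotes the transformed response. The proximal-gradient solver with singular value thresholding shown here is a minimal sketch of how such a problem is typically solved on a single machine, not the paper's distributed algorithm; all variable names, the toy data, and the solver choice are illustrative assumptions.

```python
import numpy as np

def quantile_loss(u, tau):
    """Quantile (check) loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0).astype(float))

def svt(M, thresh):
    """Singular value thresholding: proximal operator of thresh * ||.||_*."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - thresh, 0.0)) @ Vt

def trace_norm_ls(X, Y, lam, n_iter=500):
    """Proximal gradient for (1/(2n)) ||Y - X B||_F^2 + lam * ||B||_*.

    Y plays the role of the transformed response; lam controls the rank
    of the estimate through nuclear-norm shrinkage.
    """
    n, p = X.shape
    B = np.zeros((p, Y.shape[1]))
    step = n / np.linalg.norm(X, 2) ** 2   # 1/L with L = sigma_max(X)^2 / n
    for _ in range(n_iter):
        grad = X.T @ (X @ B - Y) / n       # gradient of the smooth part
        B = svt(B - step * grad, step * lam)
    return B

# Toy check: recover a rank-2 coefficient matrix from noisy data.
rng = np.random.default_rng(0)
n, p, q, r = 500, 20, 15, 2
B_true = rng.standard_normal((p, r)) @ rng.standard_normal((r, q))
X = rng.standard_normal((n, p))
Y = X @ B_true + 0.1 * rng.standard_normal((n, q))
B_hat = trace_norm_ls(X, Y, lam=0.1)
print(np.linalg.matrix_rank(B_hat, tol=1e-2))  # low rank if lam is large enough
```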



Acknowledgements

The second author was supported by the National Basic Research Program of China (973 Program) (Grant No. 2018AAA0100704), the National Natural Science Foundation of China (Grant Nos. 11825104 and 11690013), the Youth Talent Support Program and the Australian Research Council. The third author was supported by the National Natural Science Foundation of China (Grant No. 12001109), the Shanghai Sailing Program (Grant No. 19YF1402800) and the Science and Technology Commission of Shanghai Municipality (Grant No. 20dz1200600).

Author information

Corresponding author

Correspondence to Xiaojun Mao.


About this article


Cite this article

Chen, X., Liu, W. & Mao, X. Robust reduced rank regression in a distributed setting. Sci. China Math. 65, 1707–1730 (2022). https://doi.org/10.1007/s11425-020-1785-0

