Abstract
A reduced-rank regression with sparse singular value decomposition (RSSVD) approach was proposed by Chen et al. (2012) for conducting variable selection in a reduced-rank model. To jointly model the multivariate response, the method efficiently constructs a prespecified number of latent variables as sparse linear combinations of the predictors. Here, we generalize the method to also perform rank reduction, and we enable its use in reduced-rank vector autoregressive (VAR) modeling for automatic rank determination and order selection. We show that, for stationary time-series data, the generalized approach correctly identifies both the model rank and the sparse dependence structure between the multivariate response and the predictors, with probability one asymptotically. We demonstrate the efficacy of the proposed method through simulations and by analyzing a macroeconomic multivariate time series using a reduced-rank VAR model.
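As a rough illustration of the reduced-rank idea the abstract builds on, the sketch below fits a classical rank-constrained least-squares regression to a simulated VAR(1) series. It is not the RSSVD method or its generalization: the rank is fixed by hand, no sparsity penalty is applied, and the identity weighting is a simplifying assumption. The function names (`reduced_rank_regression`, `simulate_var1`) and all simulation settings are hypothetical and chosen only for this example.

```python
import numpy as np


def reduced_rank_regression(X, Y, rank):
    """Rank-constrained least squares with identity weighting (a simplification)."""
    # Ordinary least-squares coefficient matrix, shape (p, q).
    C_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # SVD of the fitted values: the leading right singular vectors span the
    # response directions that the predictors explain best.
    _, _, Vt = np.linalg.svd(X @ C_ols, full_matrices=False)
    V_r = Vt[:rank].T  # shape (q, rank)
    # Project the OLS estimate onto those directions to enforce the rank.
    return C_ols @ V_r @ V_r.T


def simulate_var1(C, n, rng, noise_sd=0.1):
    """Simulate y_t = y_{t-1} C + e_t, using the row-vector convention."""
    q = C.shape[0]
    y = np.zeros((n + 1, q))
    for t in range(n):
        y[t + 1] = y[t] @ C + noise_sd * rng.standard_normal(q)
    return y


rng = np.random.default_rng(0)

# Hypothetical rank-one transition matrix with a sparse left factor:
# only the first two lagged series drive the single latent variable.
u = np.array([1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)
v = np.ones(5) / np.sqrt(5.0)
C_true = 0.5 * np.outer(u, v)  # spectral radius below 1, so the series is stationary

y = simulate_var1(C_true, n=500, rng=rng)
X, Y = y[:-1], y[1:]  # lagged predictors and current responses

C_hat = reduced_rank_regression(X, Y, rank=1)
print("estimated rank:", np.linalg.matrix_rank(C_hat))
print("Frobenius error:", np.linalg.norm(C_hat - C_true))
```

With `rank` set to the number of response series, the projection is the identity and the estimate reduces to ordinary least squares. The approach described in the abstract differs in that it selects the rank and the sparsity pattern from the data.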
References
An, H., D. Huang, Q. Yao, and C.-H. Zhang. 2008. Stepwise searching for feature variables in high-dimensional linear regression. Technical Report, Department of Statistics, London School of Economics, London, UK.
Anderson, T. W. 1951. Estimating linear restrictions on regression coefficients for multivariate normal distributions. Annals of Mathematical Statistics 22:327–51.
Billingsley, P. 1999. Convergence of probability measures. Wiley series in probability and statistics: Probability and statistics. Hoboken, NJ: Wiley.
Bunea, F., Y. She, and M. Wegkamp. 2011. Optimal selection of reduced rank estimators of high-dimensional matrices. Annals of Statistics 39:1282–309.
Bunea, F., Y. She, and M. Wegkamp. 2012. Joint variable and rank selection for parsimonious estimation of high dimensional matrices. Annals of Statistics 40:2359–88.
Bura, E., and R. Pfeiffer. 2008. On the distribution of the left singular vectors of a random matrix and its applications. Statistics and Probability Letters 78:2275–80.
Chen, K. 2011. Regularized multivariate stochastic regression. Dissertation, University of Iowa, Iowa City, IA.
Chen, K., and K.-S. Chan. 2011. Subset ARMA selection via the adaptive lasso. Statistics and Its Interface 4:197–205.
Chen, K., K.-S. Chan, and N. C. Stenseth. 2012. Reduced rank stochastic regression with a sparse singular value decomposition. Journal of the Royal Statistical Society: Series B 74:203–21.
Chen, K., K.-S. Chan, and N. C. Stenseth. 2014. Source-sink reconstruction through regularized multicomponent regression analysis–With application to assessing whether North Sea cod larvae contributed to local fjord cod in Skagerrak. Journal of the American Statistical Association 109:560–73.
Chen, K., H. Dong, and K.-S. Chan. 2013. Reduced rank regression via adaptive nuclear norm penalization. Biometrika 100:901–20.
Chen, L., and J. Z. Huang. 2012. Sparse reduced-rank regression for simultaneous dimension reduction and variable selection. Journal of the American Statistical Association 107:1533–45.
Efron, B., T. J. Hastie, I. Johnstone, and R. J. Tibshirani. 2004. Least angle regression. Annals of Statistics 32(2):407–99.
Fan, J., and R. Li. 2001. Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association 96:1348–60.
Friedman, J., T. J. Hastie, H. Höfling, and R. Tibshirani. 2007. Pathwise coordinate optimization. Annals of Applied Statistics 1:302–32.
Hsu, P. L. 1941. On the limit distribution of roots of a determinantal equation. Journal of the London Mathematical Society 16:183–94.
Huang, J., P. Breheny, and S. Ma. 2012a. A selective review of group selection in high dimensional models. Statistical Science 27:81–99.
Huang, J., F. Wei, and S. Ma. 2012b. Semiparametric regression pursuit. Statistica Sinica 22:1403–26.
Izenman, A. J. 1975. Reduced-rank regression for the multivariate linear model. Journal of Multivariate Analysis 5:248–64.
Knight, K., and W. Fu. 2000. Asymptotics for lasso-type estimators. Annals of Statistics 28:1356–78.
Lee, M., H. Shen, J. Z. Huang, and J. S. Marron. 2010. Biclustering via sparse singular value decomposition. Biometrics 66:1087–95.
Li, M.-C., and K.-S. Chan. 2007. Multivariate reduced-rank nonlinear time series modeling. Statistica Sinica 17:139–59.
Lütkepohl, H. 1993. Introduction to multiple time series analysis. New York, NY: Springer Verlag.
Ma, X., L. Xiao, and W. H. Wong. 2014. Learning regulatory programs by threshold SVD regression. Proceedings of the National Academy of Sciences of the United States of America 111:15675–80.
Ma, Z., and T. Sun. 2014. Adaptive sparse reduced-rank regression. ArXiv e-prints. http://arxiv.org/abs/1403.1922.
Mukherjee, A., and J. Zhu. 2011. Reduced rank ridge regression and its kernel extensions. Statistical Analysis and Data Mining 4:612–22.
Peng, J., J. Zhu, A. Bergamaschi, W. Han, D.-Y. Noh, J. R. Pollack, and P. Wang. 2010. Regularized multivariate regression for identifying master predictors with application to integrative genomics study of breast cancer. Annals of Applied Statistics 4:53–77.
R Development Core Team. 2014. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
Reinsel, G. C., and R. P. Velu. 1998. Multivariate reduced-rank regression: Theory and applications. New York, NY: Springer.
Schwarz, G. 1978. Estimating the dimension of a model. Annals of Statistics 6:461–64.
She, Y. 2013. Reduced rank vector generalized linear models for feature extraction. Statistics and Its Interface 6:197–209.
Stout, W. F. 1970. The Hartman-Wintner law of the iterated logarithm for martingales. Annals of Mathematical Statistics 41:2158–60.
Tibshirani, R. J. 1996. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B 58:267–88.
van der Vaart, A. W. 2000. Asymptotic statistics (Cambridge series in statistical and probabilistic mathematics). Cambridge, UK: Cambridge University Press.
Witten, D. M., R. J. Tibshirani, and T. J. Hastie. 2009. A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics 10:515–34.
Yee, T., and T. J. Hastie. 2003. Reduced rank vector generalized linear models. Statistical Modeling 3:367–78.
Yuan, M., A. Ekici, Z. Lu, and R. Monteiro. 2007. Dimension reduction and coefficient estimation in multivariate linear regression. Journal of the Royal Statistical Society: Series B 69:329–46.
Yuan, M., and Y. Lin. 2006. Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B 68:49–67.
Zhu, H., Z. Khondker, Z. Lu, and J. G. Ibrahim. 2014. Bayesian generalized low rank regression models for neuroimaging phenotypes and genetic markers. Journal of the American Statistical Association 109:977–90.
Zou, H. 2006. The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101:1418–29.
Zou, H., and T. J. Hastie. 2005. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B 67:301–20.
Zou, H., T. J. Hastie, and R. J. Tibshirani. 2007. On the degrees of freedom of the lasso. Annals of Statistics 35:2173–92.
Color versions of one or more of the figures in the article can be found online at www.tandfonline.com/ujsp.
Cite this article
Chen, K., Chan, KS. A note on rank reduction in sparse multivariate regression. J Stat Theory Pract 10, 100–120 (2016). https://doi.org/10.1080/15598608.2015.1081573
DOI: https://doi.org/10.1080/15598608.2015.1081573