Statistics and Computing

Volume 26, Issue 3, pp 715–724

Adaptive shrinkage of singular values

  • Julie Josse
  • Sylvain Sardy

To recover a low-rank structure from a noisy matrix, truncated singular value decomposition has been extensively used and studied. Recent studies suggested that the signal can be better estimated by shrinking the singular values as well. We pursue this line of research and propose a new estimator offering a continuum of thresholding and shrinking functions. To avoid an unstable and costly cross-validation search, we propose new rules to select two thresholding and shrinking parameters from the data. In particular we propose a generalized Stein unbiased risk estimation criterion that does not require knowledge of the variance of the noise and that is computationally fast. A Monte Carlo simulation reveals that our estimator outperforms the tested methods in terms of mean squared error on both low-rank and general signal matrices across different signal-to-noise ratio regimes. In addition, it accurately estimates the rank of the signal when it is detectable.


Keywords: Denoising · Singular value shrinking and thresholding · Stein's unbiased risk estimate · Adaptive trace norm · Rank estimation
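The estimator described in the abstract applies a two-parameter shrinkage function to the singular values of the noisy matrix, interpolating between soft thresholding and hard thresholding (truncated SVD). A minimal sketch of this kind of shrinkage is below; the function and parameter names (`adaptive_shrink`, `lam`, `gamma`) are chosen here for illustration, and the fixed tuning values stand in for the data-driven selection rules the paper proposes.

```python
import numpy as np

def adaptive_shrink(X, lam, gamma):
    """Shrink the singular values of X with the family
    psi(s) = s * max(1 - (lam / s)**gamma, 0).
    gamma = 1 gives soft thresholding (trace-norm penalty);
    large gamma approaches hard thresholding (truncated SVD)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shrunk = s * np.maximum(1.0 - (lam / s) ** gamma, 0.0)
    return U @ np.diag(s_shrunk) @ Vt

# Toy usage: a rank-2 signal plus Gaussian noise.
rng = np.random.default_rng(0)
signal = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 40))
noisy = signal + 0.1 * rng.standard_normal((50, 40))

# lam chosen above the largest noise singular value, so noise
# directions are zeroed while the strong signal directions are
# barely shrunk; the retained rank then estimates the signal rank.
denoised = adaptive_shrink(noisy, lam=2.0, gamma=5.0)
rank = int((np.linalg.svd(denoised, compute_uv=False) > 1e-8).sum())
```

Because the shrinkage factor `1 - (lam/s)**gamma` tends to 1 for singular values far above `lam`, large signal components pass through almost unchanged, which is the practical advantage over plain soft thresholding.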



The authors are grateful to the editors and to the reviewers for their helpful comments. J. J. is supported by an AgreenSkills fellowship of the European Union Marie-Curie FP7 COFUND People Programme. S. S. is supported by the Swiss National Science Foundation. This work started while both authors were visiting Stanford University, and the authors would like to thank the Department of Statistics for hosting them and for its stimulating seminars.



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Agrocampus Ouest, Rennes, France
  2. Université de Genève, Geneva, Switzerland
