
An Analysis of Sketched IRLS for Accelerated Sparse Residual Regression

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12357)

Abstract

This paper studies the problem of sparse residual regression, i.e., learning a linear model using a norm that favors solutions in which the residuals are sparsely distributed. This is a common problem in a wide range of computer vision applications where a linear system has far more equations than unknowns and we wish to find the maximum feasible set of equations by discarding unreliable ones. We show that one of the most popular solution methods, iteratively reweighted least squares (IRLS), can be significantly accelerated by matrix sketching. We analyze the convergence behavior of the proposed method and demonstrate its efficiency on a range of computer vision applications. The source code for this project can be found at https://github.com/Diwata0909/Sketched_IRLS.
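The combination described in the abstract lends itself to a compact illustration. Below is a minimal Python sketch, not the authors' released implementation: it runs IRLS for the \(\ell_1\) residual objective and compresses each weighted least-squares subproblem with a random sketching matrix before solving. The function name `sketched_irls`, the Gaussian choice of sketch, and the parameter defaults are illustrative assumptions; consult the linked repository for the paper's actual code.

```python
import numpy as np

def sketched_irls(A, b, sketch_size, n_iters=50, eps=1e-6, seed=None):
    """Approximately minimize ||A x - b||_1 by IRLS, compressing each
    weighted least-squares subproblem with a Gaussian sketch.

    Illustrative mock-up only; the paper's method may differ in the
    sketching transform, weighting scheme, and stopping criterion.
    """
    rng = np.random.default_rng(seed)
    m, _ = A.shape
    # Plain least-squares initialization.
    x = np.linalg.lstsq(A, b, rcond=None)[0]
    for _ in range(n_iters):
        r = A @ x - b
        # IRLS weights for the l1 norm, clamped to avoid division by zero.
        w = 1.0 / np.maximum(np.abs(r), eps)
        sw = np.sqrt(w)
        Aw = sw[:, None] * A          # row-weighted design matrix
        bw = sw * b                   # row-weighted right-hand side
        # Gaussian sketch: compress the m-row system to sketch_size rows.
        S = rng.standard_normal((sketch_size, m)) / np.sqrt(sketch_size)
        x = np.linalg.lstsq(S @ Aw, S @ bw, rcond=None)[0]
    return x


# Toy usage: an overdetermined system in which 1% of the equations are
# grossly corrupted, so the residual vector is approximately sparse.
rng = np.random.default_rng(0)
A = rng.standard_normal((5000, 10))
x_true = rng.standard_normal(10)
b = A @ x_true
b[:50] += 100.0 * rng.standard_normal(50)
x_hat = sketched_irls(A, b, sketch_size=200, seed=1)
print(np.linalg.norm(x_hat - x_true))  # small despite the outliers
```

A dense Gaussian sketch is the simplest choice to write down, but forming `S @ Aw` itself costs \(O(k\,m\,n)\); in practice one would presumably use a structured transform such as the subsampled randomized Hadamard transform or CountSketch to realize the acceleration the abstract claims.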

Keywords

Sparse residual regression · \(\ell_1\) minimization · Randomized algorithm · Matrix sketching

Notes

Acknowledgments

This work was supported by JST CREST Grant Number JPMJCR1764, Japan. Michael Waechter was supported through a postdoctoral fellowship by the Japan Society for the Promotion of Science (JP17F17350).


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

1. Graduate School of Information Science and Technology, Osaka University, Osaka, Japan
