Nuclear Norm Optimization and Its Application to Observation Model Specification

Compressed Sensing & Sparse Filtering

Part of the book series: Signals and Communication Technology (SCT)

Abstract

Optimization problems involving the minimization of the rank of a matrix subject to certain constraints are pervasive in a broad range of disciplines, such as control theory [6, 26, 31, 62], signal processing [25], and machine learning [3, 77, 89]. However, solving such rank minimization problems is usually very difficult, as they are NP-hard in general [65, 75]. The nuclear norm of a matrix, as the tightest convex surrogate of the matrix rank, has fueled much of the recent research and has proved to be a powerful tool in many areas. In this chapter, we aim to provide a brief review of some of the state-of-the-art nuclear norm optimization algorithms as they relate to applications. We then propose a novel application of the nuclear norm to the linear model recovery problem, as well as a viable algorithm for solving the recovery problem. The preliminary numerical results presented here motivate further investigation of the proposed idea.
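As a brief illustration of the convex surrogate mentioned above, the nuclear norm of a matrix is the sum of its singular values, and its proximal operator is the singular value thresholding step studied in [14]. The sketch below is only a minimal NumPy example under assumed test data; the function names `nuclear_norm` and `svt`, the matrix sizes, the noise level, and the threshold `tau` are hypothetical choices for illustration and are not the algorithm proposed in the chapter.

```python
import numpy as np


def nuclear_norm(X):
    """Nuclear norm of X: the sum of its singular values."""
    return np.linalg.svd(X, compute_uv=False).sum()


def svt(X, tau):
    """Singular value thresholding: shrink every singular value of X by tau.

    This is the proximal operator of tau * (nuclear norm) evaluated at X.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # soft-threshold the spectrum
    return (U * s_shrunk) @ Vt           # rebuild the matrix from the shrunk spectrum


# Assumed test data: a rank-2 matrix plus a small dense perturbation.
rng = np.random.default_rng(0)
L = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15))
X = L + 0.01 * rng.standard_normal((20, 15))

X_hat = svt(X, tau=0.5)

# Count singular values above a small tolerance before and after thresholding.
rank_before = int((np.linalg.svd(X, compute_uv=False) > 1e-8).sum())
rank_after = int((np.linalg.svd(X_hat, compute_uv=False) > 1e-8).sum())
print("numerical rank before / after:", rank_before, rank_after)
print("nuclear norm before / after:", nuclear_norm(X), nuclear_norm(X_hat))
```

Shrinking the spectrum lowers the nuclear norm and zeroes the small singular values contributed by the perturbation, which is why penalizing the nuclear norm tends to favor low-rank solutions while keeping the problem convex.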

Notes

  1. This assumption is not mandatory for primal-dual interior point methods.

  2. Non-intrusive methods rely on a black-box interface that returns one output per input.

  3. http://atlas.scmr.org/download.html

References

  1. ACM Sigkdd and Netflix (2007) Proceedings of KDD Cup and Workshop

  2. Alizadeh F (1995) Interior point methods in semidefinite programming with applications to combinatorial optimization. SIAM J Opt 5:13–51

  3. Argyriou A, Micchelli CA, Pontil M (2008) Convex multi-task feature learning. Mach Learn http://www.springerlink.com/

  4. Avron H, Kale S, Prasad S, Sindhwani V (2012) Efficient and practical stochastic subgradient descent for nuclear norm regularization. In: Proceedings of the 29th international conference on machine learning

  5. Barrett R, Berry M, Chan TF, Demmel J, Donato J, Dongarra J, Eijkhout V, Pozo R, Romine C, Van der Vorst H (1994) Templates for the solution of linear systems: building blocks for iterative methods. SIAM, 2 edn

  6. Beck C, D’Andrea R (1998) Computational study and comparisons of LFT reducibility methods. In: Proceedings of the American control conference

  7. Belkin M, Niyogi P (2003) Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput 15:1373–1396

  8. Boyd S Subgradients. Lecture notes, EE392o. http://www.stanford.edu/class/ee392o/subgrad.pdf

  9. Boyd S, Parikh N, Chu E, Peleato B, Eckstein J (2011) Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn 3(1):1–122

  10. Boykov Y, Kolmogorov V (2001) An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. IEEE Trans Pattern Anal Mach Intell 26:359–374

  11. Boykov Y, Veksler O, Zabih R (2001) Fast approximate energy minimization via graph cuts. IEEE Trans Pattern Anal Mach Intell 23:2001

  12. Briggs WL, Henson VE, McCormick SF (2000) A multigrid tutorial, 2 edn. Society for Industrial and Applied Mathematics, Philadelphia

  13. Bui-Thanh T, Willcox K, Ghattas O (2008) Model reduction for large-scale systems with high-dimensional parametric input space. SIAM J Sci Comput 30(6):3270–3288

  14. Cai J, Candès EJ, Shen Z (2010) A singular value thresholding algorithm for matrix completion. SIAM J Opt 20(4):1956–1982

  15. Cai JF, Osher S, Shen Z (2009) Convergence of the linearized Bregman iteration for \(\ell _1\)-norm minimization. Math Comp 78:2127–2136

  16. Candès EJ, Li XD, Ma Y, Wright J (2011) Robust principal component analysis? J ACM 58(3):1–37

  17. Chiu JW, Demanet L (2012) Matrix probing and its conditioning. SIAM J Num Anal 50(1):171–193

  18. Combettes PL, Pesquet J-C (2011) Proximal splitting methods in signal processing. In: Bauschke HH, Burachik RS, Combettes PL, Elser V, Luke DR, Wolkowicz H (eds) Fixed-point algorithms for inverse problems in science and engineering, Springer, Berlin, pp 185–212

  19. Conn AR, Gould N, Toint PhL (1993) Improving the decomposition of partially separable functions in the context of large-scale optimization: a first approach

  20. Courtois PJ (1975) Error analysis in nearly-completely decomposable stochastic systems. Econometrica 43(4):691–709

  21. Courtois PJ (1977) Decomposability: queueing and computer system applications. ACM monograph series. Academic Press

  22. Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B 39(1):1–38

  23. Domingos P (1999) Data Min Knowl Discov 3:409–425

  24. Donoho DL, Stark PhB (1989) Uncertainty principles and signal recovery. SIAM J Appl Math 49(3):906–931

  25. Elman JL (1990) Finding structure in time. Cogn Sci 14(2):179–211

  26. Fazel M, Hindi H, Boyd S (2001) A rank minimization heuristic with application to minimum order system approximation. In: Proceedings of the American control conference

  27. Fazel M, Hindi H, Boyd S (2003) Log-Det heuristic for matrix rank minimization with applications to Hankel and Euclidean distance matrices. Proc Am Control Conf 3:2156–2162

  28. Feng C (2010) Localization of wireless sensors via nuclear norm for rank minimization. In: Global telecommunications conference IEEE, pp 1–5

  29. Freund RM, Mizuno S (1996) Interior point methods: current status and future directions. OPTIMA, Mathematical Programming Society Newsletter

  30. Fuchs J-J (2004) On sparse representations in arbitrary redundant bases. IEEE Trans Inf Theory 50:1341–1344

  31. Ghaoui LE, Gahinet P (1993) Rank minimization under LMI constraints: a framework for output feedback problems. In: Proceedings of the European control conference

  32. Globerson A, Chechik G, Pereira F, Tishby N (2007) Euclidean embedding of co-occurrence data. J Mach Learn Res 8:2265–2295

  33. Golub GH, Van Loan CF (1996) Matrix computations, 3 edn. Johns Hopkins University Press

  34. Gonzalez-Vega L, Rúa IF (2009) Solving the implicitization, inversion and reparametrization problems for rational curves through subresultants. Comput Aided Geom Des 26(9):941–961

  35. Grama A, Karypis G, Gupta A, Kumar V (2003) Introduction to parallel computing: design and analysis of algorithms. Addison-Wesley

  36. Griewank A, Toint PhL (1981) On the unconstrained optimization of partially separable objective functions. In: Powell MJD (ed) Nonlinear optimization. Academic press, London, pp 301–312

  37. Griewank A, Walther A (2008) Evaluating derivatives: principles and techniques of algorithmic differentiation. Soc Indus Appl Math

  38. Grossmann C (2009) System identification via nuclear norm regularization for simulated moving bed processes from incomplete data sets. In: Proceedings of the 48th IEEE conference on decision and control, 2009 held jointly with the 28th Chinese control conference. CDC/CCC 2009

  39. Gugercin S, Willcox K (2008) Krylov projection framework for Fourier model reduction

  40. Hansson A, Liu Z, Vandenberghe L (2012) Subspace system identification via weighted nuclear norm optimization. CoRR, abs/1207.0023

  41. Hazan E (2008) Sparse approximation solutions to semidefinite programs. In: LATIN, pp 306–316

  42. Hendrickson B, Leland R (1995) A multilevel algorithm for partitioning graphs. In: Proceedings of the 1995 ACM/IEEE conference on supercomputing, supercomputing 1995, ACM, New York

  43. Horesh L, Haber E (2009) Sensitivity computation of the \(\ell _1\) minimization problem and its application to dictionary design of ill-posed problems. Inverse Prob 25(9):095009

  44. Jaggi M, Sulovsky M (2010) A simple algorithm for nuclear norm regularized problems. In: Proceedings of the 27th international conference on machine learning

  45. Ji S, Ye J (2009) An accelerated gradient method for trace norm minimization. In: Proceedings of the 26th annual international conference on machine learning, ICML 2009, ACM, New York, pp 457–464

  46. Kaipio JP, Kolehmainen V, Vauhkonen M, Somersalo E (1999) Inverse problems with structural prior information. Inverse Prob 15(3):713–729

  47. Kanevsky D, Carmi A, Horesh L, Gurfil P, Ramabhadran B, Sainath TN (2010) Kalman filtering for compressed sensing. In: 13th conference on information fusion (FUSION), pp 1–8

  48. Karypis G, Kumar V (1998) A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J Sci Comput 20(1):359–392

  49. Lee K, Bresler Y (2010) ADMiRA: atomic decomposition for minimum rank approximation. IEEE Trans Inf Theory 56(9):4402–4416

  50. Li HF (2004) Minimum entropy clustering and applications to gene expression analysis. In: Proceedings of IEEE computational systems bioinformatics conference, pp 142–151

  51. Liiv I (2010) Seriation and matrix reordering methods: an historical overview. Stat Anal Data Min 3(2):70–91

  52. Lin ZC, Chen MM, Ma Y (2009) The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. Technical report

  53. Lin ZC, Ganesh A, Wright J, Wu LQ, Chen MM, Ma Y (2009) Fast convex optimization algorithms for exact recovery of a corrupted low-rank matrix. In: Conference version published in international workshop on computational advances in multi-sensor adaptive processing

  54. Liu J, Sycara KP (1995) Exploiting problem structure for distributed constraint optimization. In: Proceedings of the first international conference on multi-agent systems. MIT Press, pp 246–253

  55. Liu Y-J, Sun D, Toh K-C (2012) An implementable proximal point algorithmic framework for nuclear norm minimization. Math Program 133:399–436

  56. Liu Z, Vandenberghe L (2009) Interior-point method for nuclear norm approximation with application to system identification. SIAM J Matrix Anal Appl 31:1235–1256

  57. Ma SQ, Goldfarb D, Chen LF (2011) Fixed point and Bregman iterative methods for matrix rank minimization. Math Program 128:321–353

  58. Majumdar A, Ward RK (2012) Nuclear norm-regularized sense reconstruction. Magn Reson Imaging 30(2):213–221

  59. Mallat SG (1989) A theory for multiresolution signal decomposition: the wavelet representation. IEEE Trans Pattern Anal Mach Intell 11:674–693

  60. Mardani M, Mateos G, Giannakis GB (2012) In-network sparsity-regularized rank minimization: algorithms and applications. CoRR, abs/1203.1570

  61. Mazumder R, Hastie T, Tibshirani R (2010) Spectral regularization algorithms for learning large incomplete matrices. J Mach Learn Res 11:2287–2322

  62. Mesbahi M, Papavassilopoulos GP (1997) On the rank minimization problem over a positive semi-definite linear matrix inequality. IEEE Trans Autom Control 42(2):239–243

  63. Mohan K, Fazel M (2010) Reweighted nuclear norm minimization with application to system identification. In: American control conference (ACC), pp 2953–2959

  64. Moore BC (1981) Principal component analysis in linear systems: controllability, observability, and model reduction. IEEE Trans Autom Cont AC-26:17–32

  65. Natarajan BK (1995) Sparse approximate solutions to linear systems. SIAM J Comput 24:227–234

  66. Neal R, Hinton GE (1998) A view of the EM algorithm that justifies incremental, sparse, and other variants. In: Learning in graphical models. Kluwer Academic Publishers, pp 355–368

  67. Nemirovsky AS, Yudin DB (1983) Problem complexity and method efficiency in optimization. Wiley-Interscience series in discrete mathematics, Wiley

  68. Nesterov Y (1983) A method of solving a convex programming problem with convergence rate \({\cal O}\left(\frac{1}{k^2}\right)\). Sov Math Dokl 27:372–376

  69. Nesterov Y, Nemirovskii A (1994) Interior-point polynomial algorithms in convex programming. In: Studies in applied and numerical mathematics. Soc for Industrial and Applied Math

  70. Olsson C, Oskarsson M (2009) A convex approach to low rank matrix approximation with missing data. In: Proceedings of the 16th Scandinavian conference on image analysis, SCIA ’09, pp 301–309

  71. Peng YG, Ganesh A, Wright J, Xu WL, Ma Y (2010) RASL: robust alignment by sparse and low-rank decomposition for linearly correlated images. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 763–770

  72. Phillips DL (1962) A technique for the numerical solution of certain integral equations of the first kind. J ACM 9(1):84–97

  73. Pichel JC, Rivera FF, Fernández M, Rodríguez A (2012) Optimization of sparse matrix-vector multiplication using reordering techniques on GPUs. Microprocess Microsyst 36(2):65–77

  74. Pothen A, Simon HD, Liou K-P (1990) Partitioning sparse matrices with eigenvectors of graphs. SIAM J Matrix Anal Appl 11(3):430–452

  75. Recht B, Fazel M, Parrilo P (2010) Guaranteed minimum rank solutions to linear matrix equations via nuclear norm minimization. SIAM Rev 52(3):471–501

  76. Recht B, Ré C (2011) Parallel stochastic gradient algorithms for large-scale matrix completion. In: Optimization (Online)

  77. Rennie JDM, Srebro N (2005) Fast maximum margin matrix factorization for collaborative prediction. In: Proceedings of the international conference of Machine Learning

  78. Rudin LI, Osher S, Fatemi E (1992) Nonlinear total variation based noise removal algorithms. Phys D 60:259–268

  79. Shapiro A, Homem de Mello T (2000) On rate of convergence of Monte Carlo approximations of stochastic programs. SIAM J Opt 11:70–86

  80. Shapiro A, Dentcheva D, Ruszczyński A (eds) (2009) Lecture notes on stochastic programming: modeling and theory. SIAM, Philadelphia

  81. Speer T, Kuppe M, Hoschek J (1998) Global reparametrization for curve approximation. Comput Aided Geom Des 15(9):869–877

  82. Strout MM, Hovland PD (2004) Metrics and models for reordering transformations. In: Proceedings of the 2004 workshop on Memory system performance, MSP ’04, ACM, New York, pp 23–34

  83. Tikhonov AN (1963) Solution of incorrectly formulated problems and the regularization method. Sov Math Dokl 4:1035–1038

  84. Toh K-C, Yun S (2010) An accelerated proximal gradient algorithm for nuclear norm regularized least squares problems. Pacific J Optim 6:615–640

  85. Johansen TA (1997) On Tikhonov regularization, bias and variance in nonlinear system identification. Automatica 33:441–446

  86. Tropp JA (2004) Greed is good: algorithmic results for sparse approximation. IEEE Trans Inform Theory 50:2231–2242

  87. Trottenberg U, Oosterlee CW, Schüller A (2000) Multigrid. Academic Press, London

  88. Tseng P (2008) On accelerated proximal gradient methods for convex-concave optimization. SIAM J Optim (submitted)

  89. Weinberger KQ, Saul LK (2006) Unsupervised learning of image manifolds by semidefinite programming. Int J Comput Vis 70(1):77–90

  90. Willcox K, Peraire J (2002) Balanced model reduction via the proper orthogonal decomposition. AIAA J 40:2323–2330

  91. Yang JF, Yuan XM (2013) Linearized augmented Lagrangian and alternating direction methods for nuclear norm minimization. Math Comp 82(281):301–329

  92. Yin W, Osher S, Goldfarb D, Darbon J (2008) Bregman iterative algorithms for l1-minimization with applications to compressed sensing. SIAM J Imaging Sci 1:143–168

  93. Yomdin Y (2008) Analytic reparametrization of semi-algebraic sets. J Complex 24(1):54–76

  94. Zhang J (1999) A multilevel dual reordering strategy for robust incomplete LU factorization of indefinite matrices

Acknowledgments

The authors wish to acknowledge the valuable advice and insights of David Nahamoo and Raya Horesh. In addition, the authors wish to thank Jayant Kalagnanam and Ulisses Mello for the infrastructural support in fostering the collaboration between Tufts University and IBM Research.

Corresponding author

Correspondence to Ning Hao.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Hao, N., Horesh, L., Kilmer, M. (2014). Nuclear Norm Optimization and Its Application to Observation Model Specification. In: Carmi, A., Mihaylova, L., Godsill, S. (eds) Compressed Sensing & Sparse Filtering. Signals and Communication Technology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38398-4_4

  • DOI: https://doi.org/10.1007/978-3-642-38398-4_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38397-7

  • Online ISBN: 978-3-642-38398-4

  • eBook Packages: Engineering (R0)
