Signal, Image and Video Processing, Volume 8, Issue 1, pp 95–110

Primal-dual algorithms for audio decomposition using mixed norms

  • İlker Bayram
  • Ö. Deniz Akyıldız
Original Paper


We consider the problem of decomposing audio into components that have different time-frequency characteristics. To this end, we model the components using different transforms and mixed norms applied to the transform-domain coefficients. We formulate the problem as a search for a saddle point and derive algorithms through a primal-dual framework. We also discuss how to modify the primal-dual algorithms to obtain a simpler heuristic scheme.
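As a rough illustration of what a mixed norm on transform coefficients looks like, the sketch below computes the standard \(\ell_{2,1}\) norm over groups of coefficients and applies its proximal operator (group soft-thresholding), a building block commonly used in primal-dual and related proximal schemes. The abstract does not specify which mixed norms, groupings, or transforms the paper uses, so the grouping here (one group per row) and the function names `mixed_norm_l21` and `group_soft_threshold` are assumptions made for illustration only.

```python
import math

def mixed_norm_l21(groups):
    """l_{2,1} mixed norm: sum over groups of the l2 norm within each group.

    `groups` is a list of lists of (real or complex) coefficients.
    The grouping itself is an assumption; it is not taken from the paper.
    """
    return sum(math.sqrt(sum(abs(c) ** 2 for c in g)) for g in groups)

def group_soft_threshold(groups, tau):
    """Proximal operator of tau * l_{2,1}.

    Each group is scaled toward zero; groups whose l2 norm is at most tau
    are set to zero entirely.
    """
    out = []
    for g in groups:
        norm = math.sqrt(sum(abs(c) ** 2 for c in g))
        scale = max(1.0 - tau / norm, 0.0) if norm > 0 else 0.0
        out.append([scale * c for c in g])
    return out

# Hypothetical coefficients: one strong group, one weak group, one empty group.
coeffs = [[3.0, 4.0], [0.3, 0.4], [0.0, 0.0]]
shrunk = group_soft_threshold(coeffs, tau=1.0)
```

With this choice of threshold, the strong group is shrunk but kept while the weak group is zeroed as a whole, which is the kind of structured sparsity that mixed norms are meant to promote.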


Audio decomposition, Mixed norms, Analysis prior, Synthesis prior, Primal-dual



We thank Prof. Barış Bozkurt, Bahcesehir University, Istanbul, Turkey, for his comments and for providing the signals used in the experiments. We also thank the reviewers for their constructive remarks.



Copyright information

© Springer-Verlag London 2013

Authors and Affiliations

  1. Istanbul Technical University, Istanbul, Turkey
  2. Bogazici University, Istanbul, Turkey
