Determined Blind Source Separation with Independent Low-Rank Matrix Analysis

Audio Source Separation

Abstract

In this chapter, we address the determined blind source separation problem and introduce an effective method that unifies independent vector analysis (IVA) and nonnegative matrix factorization (NMF). IVA is a state-of-the-art technique that exploits the statistical independence between source vectors. However, because the source model in IVA is based on a spherically symmetric multivariate distribution, IVA cannot exploit specific spectral structures, such as those of the various sounds that appear in music signals. To solve this problem, we introduce NMF as the source model in IVA so that it can capture these spectral structures. Because this approach naturally extends the source model from a vector to a low-rank matrix represented by NMF, the new method is called independent low-rank matrix analysis (ILRMA). We also reveal the relationship among IVA, ILRMA, and multichannel NMF (MNMF): IVA and ILRMA are each identical to a special case of MNMF that employs a rank-1 spatial model. Experimental results show the efficacy of ILRMA compared with IVA and MNMF in terms of separation accuracy and convergence speed.
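
To make the idea of a low-rank source model concrete, the following is a minimal sketch of fitting a nonnegative power spectrogram with an NMF model under the Itakura-Saito divergence. It illustrates only the NMF source model for a single source, not the full ILRMA algorithm (no demixing matrix is estimated), and it is not taken from the chapter. The function names, the number of bases, and the square-root multiplicative update are assumptions chosen for illustration.

```python
import numpy as np


def is_divergence(P, R):
    """Itakura-Saito divergence between a power spectrogram P and a model R."""
    ratio = P / R
    return float(np.sum(ratio - np.log(ratio) - 1.0))


def fit_low_rank_model(P, n_bases=4, n_iter=50, seed=0, eps=1e-12):
    """Approximate P (frequency x time) by the low-rank product T @ V.

    T holds spectral bases (frequency x n_bases) and V holds their
    time-varying activations (n_bases x time). Both stay nonnegative
    because each update multiplies by a positive factor.
    """
    rng = np.random.default_rng(seed)
    P = np.maximum(P, eps)  # assumption: floor the input to keep logs finite
    n_freq, n_frames = P.shape
    T = rng.uniform(0.1, 1.0, size=(n_freq, n_bases))
    V = rng.uniform(0.1, 1.0, size=(n_bases, n_frames))
    costs = []
    for _ in range(n_iter):
        # Multiplicative update of the bases (square-root variant).
        R = T @ V
        T *= np.sqrt(((P / R**2) @ V.T) / ((1.0 / R) @ V.T))
        # Multiplicative update of the activations with the refreshed model.
        R = T @ V
        V *= np.sqrt((T.T @ (P / R**2)) / (T.T @ (1.0 / R)))
        costs.append(is_divergence(P, T @ V))
    return T, V, costs


if __name__ == "__main__":
    # Hypothetical toy data: an exactly rank-2 "power spectrogram".
    rng = np.random.default_rng(1)
    P = rng.uniform(0.1, 1.0, size=(16, 2)) @ rng.uniform(0.1, 1.0, size=(2, 40))
    T, V, costs = fit_low_rank_model(P, n_bases=2)
    print("first cost:", costs[0], "last cost:", costs[-1])
```

In a method such as ILRMA, one model of this kind would be attached to each separated source, and its entries would act as time-frequency-varying variances in place of the single spherical model used by IVA.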

Acknowledgements

This work was partially supported by a Grant-in-Aid for JSPS Fellows (Grant Number 26·10796) and by the SECOM Science and Technology Foundation.

Corresponding author

Correspondence to Daichi Kitamura.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Kitamura, D., Ono, N., Sawada, H., Kameoka, H., Saruwatari, H. (2018). Determined Blind Source Separation with Independent Low-Rank Matrix Analysis. In: Makino, S. (eds) Audio Source Separation. Signals and Communication Technology. Springer, Cham. https://doi.org/10.1007/978-3-319-73031-8_6

  • DOI: https://doi.org/10.1007/978-3-319-73031-8_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-73030-1

  • Online ISBN: 978-3-319-73031-8

  • eBook Packages: Engineering, Engineering (R0)