Abstract
In this chapter, we address the determined blind source separation problem and introduce an effective new method that unifies independent vector analysis (IVA) and nonnegative matrix factorization (NMF). IVA is a state-of-the-art technique that exploits the statistical independence between source vectors. However, because the source model in IVA is based on a spherically symmetric multivariate distribution, IVA cannot exploit source-specific spectral structures, such as those of the various sounds that appear in music signals. To solve this problem, we introduce NMF as the source model in IVA to capture these spectral structures. Because this approach naturally extends the source model from a vector to a low-rank matrix represented by NMF, the new method is called independent low-rank matrix analysis (ILRMA). We also reveal the relationship between IVA, ILRMA, and multichannel NMF (MNMF): IVA and ILRMA are identical to a special case of MNMF that employs a rank-1 spatial model. Experimental results demonstrate the efficacy of ILRMA relative to IVA and MNMF in terms of separation accuracy and convergence speed.
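To make the idea of "NMF as the source model in IVA" concrete, the following is a minimal NumPy sketch, not the authors' reference implementation. The abstract does not state the update rules, so the sketch assumes the ones commonly associated with ILRMA in the literature: multiplicative Itakura-Saito NMF updates for the low-rank source variances, and iterative-projection updates for the frequency-wise demixing matrices. The function name `ilrma`, its parameters, and the toy data are hypothetical. Per-iteration normalization and the back-projection step that fixes the output scale are omitted for brevity.

```python
import numpy as np

def ilrma(X, n_bases=2, n_iter=30, seed=0, eps=1e-12):
    """Sketch of determined BSS with an NMF source model (assumed update rules).

    X: complex mixture STFT, shape (I, J, M) = (freq bins, frames, channels).
    Returns (Y, W, costs): separated STFT (I, J, M), demixing matrices
    (I, M, M) whose rows are w_n^H, and the cost before/after each iteration.
    """
    I, J, M = X.shape
    rng = np.random.default_rng(seed)
    W = np.tile(np.eye(M, dtype=complex), (I, 1, 1))
    T = rng.uniform(0.1, 1.0, (M, I, n_bases))   # NMF bases, one set per source
    V = rng.uniform(0.1, 1.0, (M, n_bases, J))   # NMF activations per source
    Y = np.einsum("inm,ijm->ijn", W, X)          # y_ij = W_i x_ij

    def cost():
        # Negative log-likelihood (up to constants) under the low-rank variance model
        R = T @ V + eps
        P = np.abs(Y).transpose(2, 0, 1) ** 2
        return np.sum(P / R + np.log(R)) - 2 * J * np.sum(np.linalg.slogdet(W)[1])

    costs = [cost()]
    for _ in range(n_iter):
        for n in range(M):
            P = np.abs(Y[:, :, n]) ** 2          # power spectrogram of source n
            # Source model: multiplicative Itakura-Saito NMF updates
            R = T[n] @ V[n] + eps
            T[n] *= np.sqrt(((P / R**2) @ V[n].T) / ((1.0 / R) @ V[n].T))
            T[n] = np.maximum(T[n], eps)
            R = T[n] @ V[n] + eps
            V[n] *= np.sqrt((T[n].T @ (P / R**2)) / (T[n].T @ (1.0 / R)))
            V[n] = np.maximum(V[n], eps)
            R = T[n] @ V[n] + eps
            # Spatial model: iterative projection for the n-th demixing filter
            U = np.einsum("ij,ijm,ijl->iml", 1.0 / R, X, X.conj()) / J
            e_n = np.zeros((I, M, 1), dtype=complex)
            e_n[:, n, 0] = 1.0
            w = np.linalg.solve(W @ U, e_n)      # w_n = (W_i U_i)^{-1} e_n
            scale = np.sqrt(np.real(np.swapaxes(w.conj(), 1, 2) @ U @ w)) + eps
            w = w / scale                        # enforce w_n^H U_i w_n = 1
            W[:, n, :] = w[:, :, 0].conj()
            Y[:, :, n] = np.einsum("im,ijm->ij", W[:, n, :], X)
        costs.append(cost())
    return Y, W, costs

# Toy usage: two sources with rank-1 variances, random instantaneous mixing per bin
rng = np.random.default_rng(1)
I, J, N = 16, 100, 2
var = rng.uniform(0.1, 1.0, (N, I, 1)) * rng.uniform(0.1, 1.0, (N, 1, J))
S = np.sqrt(var / 2) * (rng.standard_normal((N, I, J)) + 1j * rng.standard_normal((N, I, J)))
A = rng.standard_normal((I, N, N)) + 1j * rng.standard_normal((I, N, N))
X = np.einsum("imn,nij->ijm", A, S)              # x_ij = A_i s_ij
Y, W, costs = ilrma(X)
print(f"cost: {costs[0]:.1f} -> {costs[-1]:.1f}")
```

In this sketch, replacing the product `T[n] @ V[n]` with a variance that is constant across frequency bins in each frame would give an IVA-like source model, which is the sense in which the low-rank matrix generalizes the vector model.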
Keywords
- Music Signals
- Specific Spectral Structure
- General Source Model
- Frequency Domain ICA (FDICA)
- Symmetric Complex Gaussian Distribution
Acknowledgements
This work was partially supported by Grant-in-Aid for JSPS Fellows Grant Number 26·10796 and the SECOM Science and Technology Foundation.
Copyright information
© 2018 Springer International Publishing AG
About this chapter
Cite this chapter
Kitamura, D., Ono, N., Sawada, H., Kameoka, H., Saruwatari, H. (2018). Determined Blind Source Separation with Independent Low-Rank Matrix Analysis. In: Makino, S. (eds) Audio Source Separation. Signals and Communication Technology. Springer, Cham. https://doi.org/10.1007/978-3-319-73031-8_6
DOI: https://doi.org/10.1007/978-3-319-73031-8_6
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73030-1
Online ISBN: 978-3-319-73031-8
eBook Packages: Engineering, Engineering (R0)